A neural Open Information Extraction system, trained on our Squadie dataset and bootstrapped from the Graphene OpenIE system.
Model: Attention is All You Need (Transformer)
General training procedure: Initially train the transformer to mimic the Graphene OpenIE system (i.e. “bootstrapping”) until convergence. Then alternate with REINFORCE on our Squadie and NewsQA IE datasets to improve extraction quality. Rewards are given for (1) named entities being preserved in the output (as identified by an external NER system), (2) tuples being extracted, (3) correct causal and temporal tags, and (4) a negative reward for unresolved pronouns remaining in the output. (See MIXER training: https://arxiv.org/pdf/1511.06732.pdf)
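The reward shaping above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the weights, the `extraction` dict layout, and the pronoun list are all hypothetical choices, and a real system would match NER spans and tags against the model's output more carefully.

```python
import re

# Hypothetical reward weights; real values would need tuning.
W_NER, W_TUPLE, W_TAG, W_PRONOUN = 1.0, 1.0, 0.5, -0.5

# Small illustrative pronoun list; a real system might use spaCy's POS tags.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them", "his", "its", "their"}

def reward(extraction, ner_spans, gold_tags):
    """Score one predicted extraction.

    extraction: dict with 'tuples' (list of (arg1, rel, arg2) triples) and
                'tags' (predicted causal/temporal labels).
    ner_spans:  entity strings from an external NER system.
    gold_tags:  reference causal/temporal tags for this sentence.
    """
    text = " ".join(" ".join(t) for t in extraction["tuples"]).lower()
    r = 0.0
    # (1) reward named entities that survive into the output
    r += W_NER * sum(1 for ent in ner_spans if ent.lower() in text)
    # (2) reward each extracted tuple
    r += W_TUPLE * len(extraction["tuples"])
    # (3) reward correct causal/temporal tags
    r += W_TAG * sum(1 for p, g in zip(extraction["tags"], gold_tags) if p == g)
    # (4) penalize unresolved pronouns left in the output
    r += W_PRONOUN * sum(1 for tok in re.findall(r"[a-z]+", text) if tok in PRONOUNS)
    return r
```

Under this sketch, an extraction that keeps both entities, forms one tuple, and tags it correctly scores positively, while leftover pronouns pull the score down.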
- Make Squadie Dataset
- Make NewsQA Dataset
- Create a seq2seq with attention baseline on the datasets using OpenNMT (just for fun)
- Construct a database using the Graphene parser
- Train a transformer model on the Graphene database
- Use MIXER training to improve the model
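The MIXER step in the plan above anneals from pure cross-entropy (mimicking Graphene) to REINFORCE by handing an increasing number of trailing output positions over to the RL loss. A minimal sketch of that schedule, with hypothetical hyperparameter names and values (the MIXER paper tunes these per task):

```python
def mixer_schedule(seq_len, epoch, xent_epochs=10, delta=2, period=2):
    """Number of trailing output positions trained with REINFORCE this epoch.

    Following the MIXER recipe (Ranzato et al., 2015): pure cross-entropy
    for the first `xent_epochs` epochs, then every `period` epochs hand
    `delta` more trailing positions over to REINFORCE, until the whole
    sequence is trained with the RL objective.
    """
    if epoch < xent_epochs:
        return 0  # bootstrapping phase: cross-entropy only
    steps = (epoch - xent_epochs) // period + 1
    return min(seq_len, steps * delta)
```

At each epoch the training loop would then apply cross-entropy to the first `seq_len - mixer_schedule(...)` positions and REINFORCE (with the rewards described above) to the rest.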
- python 3
- pytorch (>0.4)
- pandas
- spacy
- csv
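An assumed environment setup for the dependencies above (package names are the standard PyPI ones; `csv` ships with the Python 3 standard library, so it needs no install):

```shell
pip install "torch>0.4" pandas spacy
# spaCy needs a language model for NER; the model name here is an assumption
python -m spacy download en_core_web_sm
```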