Data and code for ACL 2019 paper "Global Textual Relation Embedding for Relational Understanding"
We will release the full relation graph soon!
The filtered relation graph and pre-trained models can be downloaded via Dropbox.
-GloREPlus
 |-small_graph --- The filtered relation graph used to train the textual relation embedding in this work.
 |-train_data --- The filtered relation graph in the input format of the embedding model, for reference purposes.
 |-models --- The pre-trained textual relation embedding models.
The filtered relation graph has the following format, with three tab-separated columns:
textual_relation KB_relation global_co-occurrence_statistics
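As a minimal sketch of reading this file (the path GloREPlus/small_graph is taken from the listing above; parsing the statistics column as a single numeric value is an assumption about its exact contents):

# Minimal sketch: iterate over the filtered relation graph.
with open('GloREPlus/small_graph', encoding='utf-8') as f:
    for line in f:
        textual_relation, kb_relation, stats = line.rstrip('\n').split('\t')
        weight = float(stats)  # assumed: a single numeric co-occurrence value
        # ... use (textual_relation, kb_relation, weight) ...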
-code
 |-hyperparams.py --- Hyperparameter settings and input/output file names.
 |-train.py --- Train the textual relation embedding model.
 |-eval.py --- Generate textual relation embeddings with a pre-trained model.
 |-model.py --- The Transformer embedding model.
 |-baseline.py --- The RNN embedding model.
 |-data_utils.py --- Utility functions.
-data
 |-kb_relation2id.txt --- Target KB relations.
 |-vocab.txt --- Vocabulary file.
 |-test.txt --- A sample textual relation input file.
-model --- To store the pre-trained models.
-result --- To store the result embeddings.
Create two directories named model and result, to store the pre-trained models and the resulting textual relation embeddings, respectively:

$ mkdir model
$ mkdir result
Put the pre-trained embedding model under the model/ directory. Change the model_dir variable in hyperparams.py to the name of the pre-trained model you want to use.
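For reference, a hypothetical example of that setting (the model name below is a placeholder, not an actual release name):

# in hyperparams.py (placeholder value)
model_dir = 'model/your_pretrained_model'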
Prepare the input textual relations (parsed with universal dependencies) in a single file, with one textual relation per line. The tokens (including both lexical words and dependency relations) are separated by "##". Refer to the sample input file data/test.txt.
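A minimal sketch of producing such a file (the tokens below are illustrative placeholders, not actual vocabulary entries):

# Sketch: write one "##"-joined textual relation per line.
relations = [
    ['nsubj', 'born', 'nmod', 'in'],   # placeholder word/dependency tokens
    ['nsubj', 'work', 'nmod', 'for'],  # placeholder word/dependency tokens
]
with open('data/test.txt', 'w', encoding='utf-8') as f:
    for tokens in relations:
        f.write('##'.join(tokens) + '\n')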
Put the input textual relations file under the data/ directory as test.txt. Specify your output file name in the output_file variable in hyperparams.py. Then run eval.py to produce embeddings of the input textual relations:
$ python eval.py
The output file of textual relation embeddings has a format similar to word2vec.
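Assuming a word2vec-style text format (a header line with the entry count and vector dimension, then one line per relation: the "##"-joined relation followed by its space-separated vector values), a minimal loader could look like:

import numpy as np

def load_embeddings(path):
    # Assumed word2vec-like layout: "count dim" header, then
    # "<relation> v1 v2 ... v_dim" per line. Relation tokens are joined
    # by "##", so the key itself contains no spaces.
    embeddings = {}
    with open(path, encoding='utf-8') as f:
        count, dim = map(int, f.readline().split())
        for line in f:
            parts = line.rstrip('\n').split(' ')
            embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return embeddings

vecs = load_embeddings('result/embeddings.txt')  # placeholder output_file name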