Code and data for the paper "End-to-end Learning of Logical Rules for Enhancing Document-level Relation Extraction"
- Python 3.8
- pytorch==2.0.0
More prerequisites can be found in the original repositories.
We use four datasets in our experiments.
Datasets | Download Links (original) |
---|---|
DWIE | https://github.com/klimzaporojets/DWIE |
DocRED | https://github.com/thunlp/DocRED |
Re-DocRED | https://github.com/bigai-nlco/DocGNRE |
DocGNRE | https://github.com/bigai-nlco/DocGNRE |
We use five models in our experiments.
Models | Code Download Links (original) |
---|---|
LSTM | https://github.com/thunlp/DocRED |
Bi-LSTM | https://github.com/thunlp/DocRED |
GAIN | https://github.com/DreamInvoker/GAIN |
ATLOP | https://github.com/wzhouad/ATLOP |
DREEAM | https://github.com/YoumiMa/dreeam |
Data of word and char embeddings for the DocRED dataset can be downloaded in Google drive
Path for code: ./JMRL-LSTM-BiLSTM
The script for training on the DWIE dataset is:
python train.py --model_name LSTM --save_name checkpoint_LSTM --train_prefix dev_train --test_prefix dev_dev
The script for evaluation on the DWIE dataset is:
python3 test.py --model_name LSTM --save_name checkpoint_LSTM --train_prefix dev_train --test_prefix dev_dev --input_theta [theta]
Path for code: ./JMRL-LSTM-BiLSTM
The script for training on the DWIE dataset is:
python train.py --model_name BiLSTM --save_name checkpoint_BiLSTM --train_prefix dev_train --test_prefix dev_dev
The script for evaluation on the DWIE dataset is:
python3 test.py --model_name BiLSTM --save_name checkpoint_BiLSTM --train_prefix dev_train --test_prefix dev_dev --input_theta [theta]
Path for code: ./JMRL-GAIN
The script for training on the DWIE dataset is:
python -u train.py --train_set ../dataset_dwie/train_annotated.json --train_set_save ../dataset_dwie/prepro_data/train_BERT.pkl --dev_set ../dataset_dwie/dev.json --dev_set_save ../dataset_dwie/prepro_data/dev_BERT.pkl --test_set ../dataset_dwie/test.json --test_set_save ../dataset_dwie/prepro_data/test_BERT.pkl --use_model bert --model_name JMRL_GAIN_BERT_base_DWIE --lr 0.00002 --batch_size 4 --test_batch_size 4 --epoch 300 --test_epoch 1 --log_step 1 --save_model_freq 5 --negativa_alpha 4 --gcn_dim 808 --gcn_layers 2 --bert_hid_size 768 --bert_path ../PLM/bert-base-uncased --use_entity_type --use_entity_id --dropout 0.1 --activation relu --coslr
The script for evaluation on the DWIE dataset is:
python -u test.py --train_set ../dataset_dwie/train_annotated.json --train_set_save ../dataset_dwie/prepro_data/train_BERT.pkl --dev_set ../dataset_dwie/dev.json --dev_set_save ../dataset_dwie/prepro_data/dev_BERT.pkl --test_set ../dataset_dwie/test.json --test_set_save ../dataset_dwie/prepro_data/test_BERT.pkl --use_model bert --pretrain_model checkpoint/JMRL_GAIN_BERT_base_DWIE_best.pt --lr 0.00002 --batch_size 4 --test_batch_size 4 --epoch 300 --test_epoch 1 --log_step 1 --save_model_freq 5 --negativa_alpha 4 --gcn_dim 808 --gcn_layers 2 --bert_hid_size 768 --bert_path ../PLM/bert-base-uncased --use_entity_type --use_entity_id --dropout 0.1 --activation relu --coslr --input_theta [theta]
The script for training on the DOCRED dataset is:
python -u train.py --train_set ../dataset_docred/train_annotated.json --train_set_save ../dataset_docred/prepro_data/train_BERT.pkl --dev_set ../dataset_docred/dev.json --dev_set_save ../dataset_docred/prepro_data/dev_BERT.pkl --test_set ../dataset_docred/test.json --test_set_save ../dataset_docred/prepro_data/test_BERT.pkl --use_model bert --model_name JMRL_GAIN_BERT_base_DOCRED --lr 0.00002 --batch_size 4 --test_batch_size 4 --epoch 20 --test_epoch 1 --log_step 1 --save_model_freq 5 --negativa_alpha 4 --gcn_dim 808 --gcn_layers 2 --bert_hid_size 768 --bert_path ../PLM/bert-base-uncased --use_entity_type --use_entity_id --dropout 0.1 --activation relu --coslr
The script for evaluation on the DOCRED dataset is:
python -u test.py --train_set ../dataset_docred/train_annotated.json --train_set_save ../dataset_docred/prepro_data/train_BERT.pkl --dev_set ../dataset_docred/dev.json --dev_set_save ../dataset_docred/prepro_data/dev_BERT.pkl --test_set ../dataset_docred/test.json --test_set_save ../dataset_docred/prepro_data/test_BERT.pkl --use_model bert --pretrain_model checkpoint/JMRL_GAIN_BERT_base_DOCRED_best.pt --lr 0.00002 --batch_size 4 --test_batch_size 4 --epoch 20 --test_epoch 1 --log_step 1 --save_model_freq 5 --negativa_alpha 4 --gcn_dim 808 --gcn_layers 2 --bert_hid_size 768 --bert_path ../PLM/bert-base-uncased --use_entity_type --use_entity_id --dropout 0.1 --activation relu --coslr --input_theta [theta]
Path for code: ./JMRL-ATLOP
The script for both training and evaluation on the DWIE dataset is:
python -u train.py --dataset dwie --transformer_type bert --model_name_or_path ../PLM/bert-base-uncased --train_file train_annotated.json --dev_file dev.json --test_file test.json --save_path ../trained_model/model_JMRL_ALTOP_DWIE.pth --num_train_epochs 300.0 --train_batch_size 4 --test_batch_size 4 --seed 66 --num_class 66 --tau 1.0 --lambda_al 1.0
The script for both training and evaluation on the DocRED dataset is:
python -u train.py --dataset docred --transformer_type bert --model_name_or_path ../PLM/bert-base-uncased --train_file train_annotated.json --dev_file dev.json --test_file test.json --save_path ../trained_model/model_JMRL_ALTOP_DOCRED.pth --num_train_epochs 20.0 --train_batch_size 4 --test_batch_size 4 --seed 66 --num_class 97 --tau 0.2 --lambda_al 1.0
Path for code: ./JMRL-DREEAM
- The script for inferring on the distantly-supervised data:
bash scripts/infer_distant_roberta.sh ${name} ${load_dir} # for RoBERTa
where ${name}
is the logging name and ${load_dir}
is the directory that contains the checkpoint (Checkpoint for teacher model can be downloaded from: Google drive. The command will perform an inference run on train_distant.json
and record token importance as train_distant.attns saved under ${load_dir}
.
Note that you should alter model.py to model.py.bak for this step, as the teacher model do not involve rule reasoning.
- The script for utilizing the recorded token importance as supervisory signals for the self-training of the student model:
bash scripts/run_self_train_roberta.sh ${name} ${teacher_signal_dir} ${lambda} ${seed}
where ${name}
is the logging name, ${teacher_signal_dir}
is the directory that stores the train_distant.attns file, ${lambda} is the scaler than controls the weight of evidence loss, and ${seed}
is the value of random seed.
- The script for fine-tuning the model on human-annotated data:
bash scripts/run_finetune_roberta.sh ${name} ${student_model_dir} ${lambda} ${seed}
where ${name}
is the logging name and ${student_model_dir}
is the directory that stores the checkpoint of student model.
- The script for testing on the dev set:
bash scripts/isf_roberta.sh ${name} ${model_dir} dev
where ${name}
is logging name and ${model_dir}
is the directory that contains the checkpoint we are going to evaluate. The commands have two functions:
a. Perform inference-stage fusion on the development data, return the scores and dump the predictions into ${model_dir}/
;
b. Select the threshold and record it as ${model_dir}/thresh
.
- With ${model_dir}/thresh available, we can make predictions on test set:
bash scripts/isf_roberta.sh ${name} ${model_dir} test
where ${model_dir}
is the directory that contains the checkpoint we are going to evaluate.
Path for code: ./DocGNRE
bash scripts/run_roberta_gpt.sh ${name} ${lambda} ${seed}
where ${name} is the logging name, ${lambda} is the scaler that controls the weight of evidence loss, and ${seed} is the value of random seed.
The script for rule extraction:
python rule_extraction.py [model_path] [beam_size]
where [model_path]
is the path for the trained JMRL-enhanced model and [beam_size]
is the beam size.
Please consider citing the following paper if you find our codes helpful. Thank you!
@inproceedings{QiDW24,
author = {Kunxun Qi and Jianfeng Du and Hai Wan },
title = {End-to-end Learning of Logical Rules for Enhancing Document-level Relation Extraction},
booktitle = {Proceedings of the 62nd Annual Meeting of the Association for Computational
Linguistics},
pages = {7247--7263},
year = {2024}
}