Document-level relation extraction (RE) aims to identify the relations between entities throughout an entire document. It requires complex reasoning to synthesize various kinds of knowledge, such as coreference and commonsense. Large-scale knowledge graphs (KGs) contain a wealth of real-world facts and can provide valuable knowledge for document-level RE. In this paper, we propose an entity knowledge injection framework to enhance current document-level RE models. Specifically, we introduce coreference distillation to inject coreference knowledge, endowing an RE model with a more general capability of coreference reasoning. We also employ representation reconciliation to inject factual knowledge and aggregate KG representations and document representations into a unified space. Experiments on two benchmark datasets validate the generalization of our entity knowledge injection framework and its consistent improvement over several document-level RE models.
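As a rough, hypothetical sketch only (not KIRE's actual implementation), reconciling a document-derived entity representation with a KG-derived one into a unified space could be done with a learned gate; all names below are illustrative:

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Illustrative sketch: fuse a document-side entity representation with a
    KG-side one via a learned gate (hypothetical, not KIRE's code)."""

    def __init__(self, doc_dim: int, kg_dim: int, out_dim: int):
        super().__init__()
        self.doc_proj = nn.Linear(doc_dim, out_dim)
        self.kg_proj = nn.Linear(kg_dim, out_dim)
        self.gate = nn.Linear(2 * out_dim, out_dim)

    def forward(self, doc_repr: torch.Tensor, kg_repr: torch.Tensor) -> torch.Tensor:
        d = self.doc_proj(doc_repr)
        k = self.kg_proj(kg_repr)
        # Gate decides, per dimension, how much to trust each source.
        g = torch.sigmoid(self.gate(torch.cat([d, k], dim=-1)))
        return g * d + (1 - g) * k
```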
We use Python and PyTorch to develop the KIRE framework. The repository is organized as follows:
KIRE/
├── B4+KIRE/
│   ├── configs/ # code for running the model
│   ├── checkpoints/ # stores the trained models
│   ├── prepro_data/ # the preprocessed data
│   ├── model/ # the CNN, LSTM, BiLSTM, and Context-aware models
│   ├── knowledge_injection_layer/ # the knowledge injection module
│   └── scripts/ # code files corresponding to the sh files in the home directory
├── GLRE+KIRE/
│   ├── configs/ # configs used for the experiments
│   ├── data/ # datasets and corresponding data loading code
│   ├── data_processing/ # preprocessing code for the datasets
│   ├── knowledge_injection_layer/ # the knowledge injection module
│   ├── scripts/ # code files corresponding to the sh files in the home directory
│   └── (other directories contain the source code of the GLRE model)
├── SSAN+KIRE/
│   ├── data/ # datasets and corresponding generation code
│   ├── checkpoints/ # stores the trained models
│   ├── pretrained_lm/ # stores the pretrained language model
│   ├── knowledge_injection_layer/ # the knowledge injection module
│   └── (other directories and files contain the source code of the SSAN model)
└── ATLOP+KIRE/
    ├── data/ # datasets and corresponding generation code
    ├── knowledge_injection_layer/ # the knowledge injection module
    ├── scripts/ # sh files used for experiments under different settings
    └── (other directories and files contain the source code of the ATLOP model)
- Python (tested on 3.7.4)
- CUDA
- PyTorch (tested on 1.7.1)
- Transformers (tested on 2.11.0)
- numpy
- opt-einsum (tested on 3.3.0)
- ujson
- tqdm
- yamlordereddictloader (tested on 0.4.0)
- scipy (tested on 1.5.2)
- recordtype
- tabulate
- scikit-learn
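For convenience, the dependencies can be installed with pip; the version pins below simply mirror the tested versions above and may need adjustment for your CUDA setup:
>> pip install torch==1.7.1 transformers==2.11.0 numpy opt-einsum==3.3.0 ujson tqdm yamlordereddictloader==0.4.0 scipy==1.5.2 recordtype tabulate scikit-learn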
- Download processed data from figshare
- Download pretrained autoencoder model from figshare
- Download pretrained language model from huggingface
To run the off-the-shelf approaches and reproduce our experiments, we take the ATLOP model as an example.
Train the ATLOP and KIRE + ATLOP models on DocRED with the following commands:
>> sh scripts/run_docred_bert.sh # for ATLOP_BERT_base model
>> sh scripts/run_docred_bert_kire.sh # for ATLOP_BERT_base + KIRE model
The program will generate a test file result.json in the official evaluation format. You can compress it and submit it to CodaLab for the official test score.
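As a minimal sketch (assuming the standard DocRED submission conventions, in particular that the zip archive must contain a file named result.json), the predictions could be inspected and packaged like this:

```python
import json
import zipfile

# Inspect the generated predictions; in the official DocRED format each
# record is one predicted relation instance for an entity pair.
with open("result.json") as f:
    preds = json.load(f)
print(preds[0])  # e.g. {"title": "...", "h_idx": 0, "t_idx": 2, "r": "P17"}

# Package for submission (the file inside the zip keeps the name result.json).
with zipfile.ZipFile("result.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("result.json")
```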
Train the ATLOP and KIRE + ATLOP models on DWIE with the following commands:
>> sh scripts/run_dwie_bert.sh # for ATLOP_BERT_base model
>> sh scripts/run_dwie_bert_kire.sh # for ATLOP_BERT_base + KIRE model
The scripts to run other basic models with the KIRE framework can be found in their corresponding directories.
The following table shows the hyperparameter values used in the experiments (an illustrative config sketch follows the table).
Hyperparameter | Values |
---|---|
Batch size | 4 |
Learning rate | 0.0005 |
Gradient clipping | 10 |
Early stop patience | 10 |
Regularization | 0.0001 |
Dropout ratio | 0.2 or 0.5 |
Dimension of hidden layers in MLP | 256 |
Dimension of GloVe and Skip-gram | 100 |
Dimension of hidden layers in AutoEncoder | 50 |
Dimension, kernel size, and stride of CNN1D | 100, 3, 1 |
Number of R-GAT layers and heads | 3, 2 |
Number of aggregators | 2 |
Dimension of hidden layers in aggregation | 768 |
α1, α2, α3 | 1, 0.01, 0.01 |
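For reference only, the values above could be collected into a single Python dictionary; the key names below are illustrative and do not necessarily match the keys used in the configs/ files:

```python
# Hypothetical mapping of the hyperparameter table to config entries
# (key names are illustrative, not the actual KIRE config keys).
HYPERPARAMS = {
    "batch_size": 4,
    "learning_rate": 5e-4,
    "gradient_clipping": 10,
    "early_stop_patience": 10,
    "regularization": 1e-4,
    "dropout": [0.2, 0.5],            # depends on the base model
    "mlp_hidden_dim": 256,
    "word_embedding_dim": 100,        # GloVe / Skip-gram
    "autoencoder_hidden_dim": 50,
    "cnn1d": {"dim": 100, "kernel_size": 3, "stride": 1},
    "rgat": {"layers": 3, "heads": 2},
    "num_aggregators": 2,
    "aggregation_hidden_dim": 768,
    "alphas": [1, 0.01, 0.01],
}
```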
KIRE utilizes 7 basic document-level relation extraction models. The citation for each model corresponds to the paper describing the model.
Name | Citation |
---|---|
CNN | Yao et al., 2019 |
LSTM | Yao et al., 2019 |
BiLSTM | Yao et al., 2019 |
Context-aware | Yao et al., 2019 |
GLRE | Wang et al., 2020 |
SSAN | Xu et al., 2020 |
ATLOP | Zhou et al., 2020 |
KIRE chooses 3 basic knowledge injection models as competitors. The citation for each model corresponds to the paper describing the model.
Name | Citation |
---|---|
RESIDE | Vashishth et al., 2018 |
RECON | Bastos et al., 2019 |
KB-graph | Verlinden et al., 2021 |
KIRE selects two benchmark document-level relation extraction datasets: DocRED and DWIE. Their statistics are listed in the following tables (a rough sanity-check sketch follows the tables).
DocRED | Documents | Relation types | Instances | N/A instances |
---|---|---|---|---|
Training | 3,053 | 96 | 38,269 | 1,163,035 |
Validation | 1,000 | 96 | 12,332 | 385,263 |
Test | 1,000 | 96 | 12,842 | 379,316 |
DWIE | Documents | Relation types | Instances | N/A instances |
---|---|---|---|---|
Training | 544 | 66 | 13,524 | 492,057 |
Validation | 137 | 66 | 3,488 | 121,750 |
Test | 96 | 66 | 2,453 | 78,995 |
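As a rough sanity check (assuming the standard DocRED-style json layout with "vertexSet" for entities and "labels" for annotated relation instances, and a hypothetical file path), the document and instance counts can be recomputed as below; the N/A count is only approximate because an entity pair can carry multiple relations:

```python
import json

# Hypothetical path; adjust to where the processed data is stored.
with open("data/train_annotated.json") as f:
    docs = json.load(f)

n_docs = len(docs)
n_instances = sum(len(d["labels"]) for d in docs)
# Ordered entity pairs without an annotated relation (approximate N/A count).
n_pairs = sum(len(d["vertexSet"]) * (len(d["vertexSet"]) - 1) for d in docs)
print(n_docs, n_instances, n_pairs - n_instances)
```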
This project is licensed under the GPL License - see the LICENSE file for details
@inproceedings{KIRE,
author = {Xinyi Wang and
Zitao Wang and
Weijian Sun and
Wei Hu},
title = {Enhancing Document-level Relation Extraction by Entity Knowledge Injection},
booktitle = {ISWC},
year = {2022}
}