Document-level relation extraction (RE) aims to identify the relations between entities throughout an entire document. It requires complex reasoning to synthesize various kinds of knowledge, such as coreference and commonsense. Large-scale knowledge graphs (KGs) contain a wealth of real-world facts and can provide valuable knowledge for document-level RE. In this paper, we propose an entity knowledge injection framework to enhance current document-level RE models. Specifically, we introduce coreference distillation to inject coreference knowledge, endowing an RE model with a more general capability of coreference reasoning. We also employ representation reconciliation to inject factual knowledge and aggregate KG representations and document representations into a unified space. Experiments on two benchmark datasets validate the generalization of our entity knowledge injection framework and its consistent improvement over several document-level RE models.
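As a rough, hypothetical sketch only (not KIRE's actual implementation), reconciling a document-derived entity representation with a KG-derived one into a unified space could be done with a learned gate; all names below are illustrative:

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Illustrative sketch: fuse a document-side entity representation with a
    KG-side one via a learned gate (hypothetical, not KIRE's code)."""

    def __init__(self, doc_dim: int, kg_dim: int, out_dim: int):
        super().__init__()
        self.doc_proj = nn.Linear(doc_dim, out_dim)
        self.kg_proj = nn.Linear(kg_dim, out_dim)
        self.gate = nn.Linear(2 * out_dim, out_dim)

    def forward(self, doc_repr: torch.Tensor, kg_repr: torch.Tensor) -> torch.Tensor:
        d = self.doc_proj(doc_repr)
        k = self.kg_proj(kg_repr)
        # Gate decides, per dimension, how much to trust each source.
        g = torch.sigmoid(self.gate(torch.cat([d, k], dim=-1)))
        return g * d + (1 - g) * k
```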
We use Python and PyTorch to develop the KIRE framework. The repository is organized as follows:
KIRE/
├── B4+KIRE/
│   ├── configs/ # code for running the model
│   ├── checkpoints/ # stores the trained models
│   ├── prepro_data/ # the preprocessed data
│   ├── model/ # the CNN, LSTM, BiLSTM, and Context-aware models
│   ├── knowledge_injection_layer/ # the knowledge injection module
│   └── scripts/ # code files corresponding to the sh files in the home directory
├── GLRE+KIRE/
│   ├── configs/ # configs used for the experiments
│   ├── data/ # datasets and corresponding data loading code
│   ├── data_processing/ # preprocessing code for the datasets
│   ├── knowledge_injection_layer/ # the knowledge injection module
│   ├── scripts/ # code files corresponding to the sh files in the home directory
│   └── (other directories contain the source code of the GLRE model)
├── SSAN+KIRE/
│   ├── data/ # datasets and corresponding generation code
│   ├── checkpoints/ # stores the trained models
│   ├── pretrained_lm/ # stores the pretrained language model
│   ├── knowledge_injection_layer/ # the knowledge injection module
│   └── (other directories and files contain the source code of the SSAN model)
└── ATLOP+KIRE/
    ├── data/ # datasets and corresponding generation code
    ├── knowledge_injection_layer/ # the knowledge injection module
    ├── scripts/ # sh files used for experiments under different settings
    └── (other directories and files contain the source code of the ATLOP model)
- Python (tested on 3.7.4)
- CUDA
- PyTorch (tested on 1.7.1)
- Transformers (tested on 2.11.0)
- numpy
- opt-einsum (tested on 3.3.0)
- ujson
- tqdm
- yamlordereddictloader (tested on 0.4.0)
- scipy (tested on 1.5.2)
- recordtype
- tabulate
- scikit-learn
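For convenience, the dependencies can be installed with pip; the version pins below simply mirror the tested versions above and may need adjustment for your CUDA setup:
>> pip install torch==1.7.1 transformers==2.11.0 numpy opt-einsum==3.3.0 ujson tqdm yamlordereddictloader==0.4.0 scipy==1.5.2 recordtype tabulate scikit-learn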
- Download processed data from figshare
- Download pretrained autoencoder model from figshare
- Download pretrained language model from huggingface
To run the off-the-shelf approaches and reproduce our experiments, we take the ATLOP model as an example.
Train the ATLOP and KIRE + ATLOP models on DocRED with the following commands:
>> sh scripts/run_docred_bert.sh # for ATLOP_BERT_base model
>> sh scripts/run_docred_bert_kire.sh # for ATLOP_BERT_base + KIRE model
The program will generate a test file result.json in the official evaluation format. You can compress it and submit it to CodaLab for the official test score.
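As a minimal sketch (assuming the standard DocRED submission conventions, in particular that the zip archive must contain a file named result.json), the predictions could be inspected and packaged like this:

```python
import json
import zipfile

# Inspect the generated predictions; in the official DocRED format each
# record is one predicted relation instance for an entity pair.
with open("result.json") as f:
    preds = json.load(f)
print(preds[0])  # e.g. {"title": "...", "h_idx": 0, "t_idx": 2, "r": "P17"}

# Package for submission (the file inside the zip keeps the name result.json).
with zipfile.ZipFile("result.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("result.json")
```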
Train the ATLOP and KIRE + ATLOP models on DWIE with the following commands:
>> sh scripts/run_dwie_bert.sh # for ATLOP_BERT_base model
>> sh scripts/run_dwie_bert_kire.sh # for ATLOP_BERT_base + KIRE model
The scripts to run other basic models with the KIRE framework can be found in their corresponding directories.
The following table shows the hyperparameter values used in the experiments (an illustrative config sketch follows the table).
Hyperparameter | Values |
---|---|
Batch size | 4 |
Learning rate | 0.0005 |
Gradient clipping | 10 |
Early stop patience | 10 |
Regularization | 0.0001 |
Dropout ratio | 0.2 or 0.5 |
Dimension of hidden layers in MLP | 256 |
Dimension of GloVe and Skip-gram | 100 |
Dimension of hidden layers in AutoEncoder | 50 |
Dimension, kernel size, and stride of CNN1D | 100, 3, 1 |
Number of R-GAT layers and heads | 3, 2 |
Number of aggregators | 2 |
Dimension of hidden layers in aggregation | 768 |
α1, α2, α3 | 1, 0.01, 0.01 |
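For reference only, the values above could be collected into a single Python dictionary; the key names below are illustrative and do not necessarily match the keys used in the configs/ files:

```python
# Hypothetical mapping of the hyperparameter table to config entries
# (key names are illustrative, not the actual KIRE config keys).
HYPERPARAMS = {
    "batch_size": 4,
    "learning_rate": 5e-4,
    "gradient_clipping": 10,
    "early_stop_patience": 10,
    "regularization": 1e-4,
    "dropout": [0.2, 0.5],            # depends on the base model
    "mlp_hidden_dim": 256,
    "word_embedding_dim": 100,        # GloVe / Skip-gram
    "autoencoder_hidden_dim": 50,
    "cnn1d": {"dim": 100, "kernel_size": 3, "stride": 1},
    "rgat": {"layers": 3, "heads": 2},
    "num_aggregators": 2,
    "aggregation_hidden_dim": 768,
    "alphas": [1, 0.01, 0.01],
}
```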
KIRE utilizes 7 basic document-level relation extraction models. The citation for each model corresponds to the paper describing the model.
Name | Citation |
---|---|
CNN | Yao et al., 2019 |
LSTM | Yao et al., 2019 |
BiLSTM | Yao et al., 2019 |
Context-aware | Yao et al., 2019 |
GLRE | Wang et al., 2020 |
SSAN | Xu et al., 2020 |
ATLOP | Zhou et al., 2020 |
KIRE chooses 3 basic knowledge injection models as competitors. The citation for each model corresponds to the paper describing the model.
Name | Citation |
---|---|
RESIDE | Vashishth et al., 2018 |
RECON | Bastos et al., 2019 |
KB-graph | Verlinden et al., 2021 |
KIRE selects two benchmark document-level relation extraction datasets: DocRED and DWIE. Their statistics are listed in the following tables (a rough sanity-check sketch follows the tables).
DocRED | Documents | Relation types | Instances | N/A instances |
---|---|---|---|---|
Training | 3,053 | 96 | 38,269 | 1,163,035 |
Validation | 1,000 | 96 | 12,332 | 385,263 |
Test | 1,000 | 96 | 12,842 | 379,316 |
DWIE | Documents | Relation types | Instances | N/A instances |
---|---|---|---|---|
Training | 544 | 66 | 13,524 | 492,057 |
Validation | 137 | 66 | 3,488 | 121,750 |
Test | 96 | 66 | 2,453 | 78,995 |
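As a rough sanity check (assuming the standard DocRED-style json layout with "vertexSet" for entities and "labels" for annotated relation instances, and a hypothetical file path), the document and instance counts can be recomputed as below; the N/A count is only approximate because an entity pair can carry multiple relations:

```python
import json

# Hypothetical path; adjust to where the processed data is stored.
with open("data/train_annotated.json") as f:
    docs = json.load(f)

n_docs = len(docs)
n_instances = sum(len(d["labels"]) for d in docs)
# Ordered entity pairs without an annotated relation (approximate N/A count).
n_pairs = sum(len(d["vertexSet"]) * (len(d["vertexSet"]) - 1) for d in docs)
print(n_docs, n_instances, n_pairs - n_instances)
```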
This project is licensed under the GPL License - see the LICENSE file for details
@inproceedings{KIRE,
author = {Xinyi Wang and
Zitao Wang and
Weijian Sun and
Wei Hu},
title = {Enhancing Document-level Relation Extraction by Entity Knowledge Injection},
booktitle = {ISWC},
year = {2022}
}