DocuNet

This repository is the official implementation of DocuNet, the model proposed in the paper Document-level Relation Extraction as Semantic Segmentation, accepted to the IJCAI 2021 main conference.

❗NOTE: DocuNet is integrated into the knowledge extraction toolkit DeepKE.

Brief Introduction

The paper proposes DocuNet, the first model to formulate document-level relation extraction as a semantic segmentation task of the kind used in computer vision.
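To make the analogy concrete, here is a minimal sketch, not the authors' implementation. It assumes that the entity pairs of a document can be laid out as an N x N grid, so that assigning a relation to each cell resembles assigning a class to each pixel of a segmentation mask. The function name, entity names, and relation labels are all hypothetical.

```python
from typing import Dict, List, Tuple


def build_relation_grid(
    entities: List[str],
    relations: Dict[Tuple[str, str], str],
    no_relation: str = "NA",
) -> List[List[str]]:
    """Lay out entity-pair labels as a grid: rows are head entities, columns are tails."""
    index = {name: i for i, name in enumerate(entities)}
    # Every cell starts as "no relation", like a background pixel in a mask.
    grid = [[no_relation] * len(entities) for _ in entities]
    for (head, tail), label in relations.items():
        grid[index[head]][index[tail]] = label
    return grid


# Hypothetical entities and relation labels, for illustration only.
entities = ["Entity_A", "Entity_B", "Entity_C"]
relations = {
    ("Entity_A", "Entity_B"): "rel_1",
    ("Entity_C", "Entity_A"): "rel_2",
}

grid = build_relation_grid(entities, relations)
for row in grid:
    print(row)
```

In this picture, predicting the relation for every entity pair amounts to predicting a label for every cell of the grid.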

Requirements

To install requirements:

pip install -r requirements.txt

Training

To train the DocuNet model from the paper on the DocRED dataset, run this command:

>> bash scripts/run_docred.sh # use BERT/RoBERTa by setting --transformer-type

To train the DocuNet model from the paper on the CDR and GDA datasets, run these commands:

>> bash scripts/run_cdr.sh  # for CDR
>> bash scripts/run_gda.sh  # for GDA

Evaluation

To evaluate a trained model, set the --load_path argument in the training scripts. The program logs the evaluation results automatically. For DocRED, it also generates a test file, result.json, in the official evaluation format. You can compress this file and submit it to CodaLab for the official test score.
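The compression step can be sketched as follows. This is a minimal example, not part of the repository: the helper name compress_result and the archive name result.zip are hypothetical, and the location of result.json depends on how you ran the scripts, so adjust the paths as needed.

```python
import zipfile
from pathlib import Path


def compress_result(result_path: str = "result.json",
                    archive_path: str = "result.zip") -> Path:
    """Pack the generated prediction file into a zip archive for submission."""
    result = Path(result_path)
    if not result.is_file():
        raise FileNotFoundError(f"No prediction file found at {result}")
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file at the archive root, without its directory prefix.
        archive.write(result, arcname=result.name)
    return Path(archive_path)


if __name__ == "__main__":
    # Assumes result.json is in the current working directory.
    print(f"Wrote {compress_result()}")
```

Check the submission page for the archive layout it expects before uploading.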

Results

Our model achieves the following performance:

Document-level Relation Extraction on DocRED

Model	Ign F1 on Dev	F1 on Dev	Ign F1 on Test	F1 on Test
DocuNet-BERT (base)	59.86±0.13	61.83±0.19	59.93	61.86
DocuNet-RoBERTa (large)	62.23±0.12	64.12±0.14	62.39	64.55

Document-level Relation Extraction on CDR and GDA

Model	F1 on CDR	F1 on GDA
DocuNet-SciBERT (base)	76.3±0.40	85.3±0.50

Acknowledgement

Part of our code is borrowed from https://github.com/wzhouad/ATLOP; many thanks. For the detailed preprocessing of the GDA and CDR datasets, which produces the files train_filter.data, dev_filter.data and test_filter.data, refer to https://github.com/fenchri/edge-oriented-graph.

Papers for the Project & How to Cite

If you use or extend our work, please cite the paper as follows:

@inproceedings{ijcai2021-551,
  title     = {Document-level Relation Extraction as Semantic Segmentation},
  author    = {Zhang, Ningyu and Chen, Xiang and Xie, Xin and Deng, Shumin and Tan, Chuanqi and Chen, Mosha and Huang, Fei and Si, Luo and Chen, Huajun},
  booktitle = {Proceedings of the Thirtieth International Joint Conference on
               Artificial Intelligence, {IJCAI-21}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Zhi-Hua Zhou},
  pages     = {3999--4006},
  year      = {2021},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2021/551},
  url       = {https://doi.org/10.24963/ijcai.2021/551},
}
