
Injecting Knowledge Base Information into End-to-End Joint Entity and Relation Extraction and Coreference Resolution

Introduction

This repository contains the code to reproduce the results from the following paper:

@inproceedings{verlinden2021injecting,
  title={Injecting Knowledge Base Information into End-to-End Joint Entity and Relation Extraction and Coreference Resolution},
  author={Verlinden, Severine and Zaporojets, Klim and Deleu, Johannes and Demeester, Thomas and Develder, Chris},
  year={2021},
  booktitle={Proceedings of the 2021 Annual Meeting of the Association for Computational Linguistics (ACL 2021): Findings} 
}

Creating the Environment

We recommend creating a separate environment to run the code and then installing the packages in requirements.txt:

conda create -n e2e-kb-ie python=3.9
conda activate e2e-kb-ie
pip install -r requirements.txt
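
To confirm the environment is set up correctly, the minimal sketch below (a hypothetical helper script, not part of the repository) checks the Python version and verifies that every package listed in requirements.txt resolves in the active environment:

# check_env.py -- hypothetical helper (not part of the repository):
# verifies the Python version and that every package listed in
# requirements.txt resolves in the active conda environment.
import sys
from importlib import metadata

if sys.version_info[:2] != (3, 9):
    print(f"warning: expected Python 3.9, found {sys.version.split()[0]}")

missing = []
with open("requirements.txt") as f:
    for line in f:
        name = line.split("#")[0].strip()
        if not name:
            continue
        # keep only the distribution name (drop extras and version pins)
        for sep in ("[", "==", ">=", "<=", "~=", ">", "<"):
            name = name.split(sep)[0].strip()
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)

print("missing packages:", ", ".join(missing) if missing else "none")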

Downloading the Data

To download the datasets, embeddings, and dictionaries, execute the following script:

./download_data.sh

After running the script, the resulting directory structure should be as follows:

├── experiments
├── data
│   ├── datasets
│   │   ├── docred
│   │   └── dwie 
│   ├── dictionaries
│   └── embeddings
│       ├── kb
│       └── text
├── src
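
If something goes wrong during the download, the following minimal sketch (a hypothetical helper script, not included in the repository) checks that the directories from the tree above are present:

# check_data.py -- hypothetical helper (not part of the repository):
# confirms that ./download_data.sh produced the directory layout shown above.
from pathlib import Path

EXPECTED = [
    "experiments",
    "data/datasets/docred",
    "data/datasets/dwie",
    "data/dictionaries",
    "data/embeddings/kb",
    "data/embeddings/text",
    "src",
]

missing = [d for d in EXPECTED if not Path(d).is_dir()]
print("missing directories:", ", ".join(missing) if missing else "none")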

Training

The training script is located in src/train.py and takes two arguments:

1- --config_file: the configuration file for one of the experiments in the experiments directory (the names of the experiment config files are self-explanatory).
2- --path: the directory where the output model, results, and TensorBoard logs will be saved. For each experiment, a separate subdirectory named after the experiment is created automatically.

Example:

python src/train.py --config_file experiments/attention_dwie_kb_both.json --path results
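
To train all configurations in one go, a minimal sketch such as the following (a hypothetical helper, not part of the repository) can loop over the JSON config files in experiments/ and invoke the documented command for each:

# run_all_experiments.py -- hypothetical helper (not part of the repository):
# launches src/train.py sequentially for every config file in experiments/,
# writing outputs under results/ (one subdirectory per experiment).
import subprocess
from pathlib import Path

for config in sorted(Path("experiments").glob("*.json")):
    print(f"== training {config.stem} ==")
    subprocess.run(
        ["python", "src/train.py", "--config_file", str(config), "--path", "results"],
        check=True,
    )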

Evaluation

After training finishes, the prediction file for each experiment is saved as test.json inside its subdirectory of the results directory. To obtain the F1 scores, execute:

python src/evaluation_script.py --pred [path to prediction file] --gold data/datasets/dwie/data/annos_with_content/ --gold-filter test

Example:

python src/evaluation_script.py --pred results/attention_dwie_kb_both/test.json --gold data/datasets/dwie/data/annos_with_content/ --gold-filter test
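
To evaluate several trained models at once, a sketch along these lines (a hypothetical helper, not part of the repository) runs the evaluation script for every test.json found under results/:

# evaluate_all.py -- hypothetical helper (not part of the repository):
# runs the evaluation script on every test.json found under results/,
# using the DWIE gold annotations and the test split filter shown above.
import subprocess
from pathlib import Path

GOLD = "data/datasets/dwie/data/annos_with_content/"

for pred in sorted(Path("results").glob("*/test.json")):
    print(f"== evaluating {pred.parent.name} ==")
    subprocess.run(
        ["python", "src/evaluation_script.py",
         "--pred", str(pred),
         "--gold", GOLD,
         "--gold-filter", "test"],
        check=True,
    )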

The test.json file can also be obtained by explicitly loading and evaluating the trained model in the results directory:

python -u src/run_model.py --path [path to the results directory with serialized model savemodel.pth]

Example:

python -u src/run_model.py --path results/attention_dwie_kb_both/
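
If a results subdirectory contains a serialized model (savemodel.pth) but its test.json is missing, a small sketch like the one below (hypothetical, not part of the repository) can regenerate the predictions using the command above:

# regenerate_predictions.py -- hypothetical helper (not part of the repository):
# for every results subdirectory that has a serialized model (savemodel.pth)
# but no test.json yet, re-run src/run_model.py to produce the prediction file.
import subprocess
from pathlib import Path

for run_dir in sorted(Path("results").iterdir()):
    if not run_dir.is_dir():
        continue
    if (run_dir / "savemodel.pth").exists() and not (run_dir / "test.json").exists():
        subprocess.run(
            ["python", "-u", "src/run_model.py", "--path", str(run_dir)],
            check=True,
        )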
