ENGINE is an entity-aware model for article generation and retrieval. This repository contains the implementation of our paper. If you find this code useful in your research, please consider citing:
@inproceedings{
zhang2023show,
title={Show, Write, and Retrieve: Entity-aware Article Generation and Retrieval},
author={Zhongping Zhang and Yiwen Gu and Bryan A. Plummer},
booktitle={The 2023 Conference on Empirical Methods in Natural Language Processing},
year={2023},
url={https://openreview.net/forum?id=SlL3dr0Xa9}
}
The requirements for ENGINE can be installed by
pip install -r requirements.txt
We installed PyTorch for the CUDA 11.1 platform and used Python 3.8.8.
If you work on multiple projects on a single machine, you may want to create a separate Anaconda environment for ENGINE:
conda create -n engine python=3.8.8 &&
conda activate engine &&
pip install -r requirements.txt
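After installation, a quick sanity check can confirm that the environment matches the versions listed above. The snippet below is a minimal sketch: the helper name `report_env` is ours and not part of the repository, and it assumes the interpreter you pass in (default `python`) is the one from the active environment.

```shell
# Hypothetical helper (not part of the repository): print the Python version
# and, when PyTorch is importable, the CUDA version it was built against.
report_env() {
    python_bin="${1:-python}"
    # Fail early if the interpreter cannot be executed.
    "$python_bin" --version 2>/dev/null || { echo "python not found"; return 1; }
    # torch.version.cuda reports the CUDA version of the installed PyTorch build.
    "$python_bin" -c 'import torch; print("torch", torch.__version__, "cuda", torch.version.cuda)' 2>/dev/null \
        || echo "torch is not importable in this environment"
}

# Usage: report_env          (uses "python")
#        report_env python3  (uses an explicit interpreter)
```

If the reported versions differ from Python 3.8.8 and CUDA 11.1, the code may still run, but we have only tested the configuration stated above.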
Note: If you would like to use the CLIP-based NER from our paper, you will need to set up the CLIP environment with
pip install -r requirements_clip.txt
We performed our experiments on three datasets: GoodNews, VisualNews, and WikiText. We provide the versions we used for model training and evaluation through the following links. Compared to the original versions, we removed broken links and non-English articles.
Datasets | Google Drive Link |
---|---|
GoodNews | GoodNews Link |
VisualNews | VisualNews Link |
WikiText | WikiText Link |
If you would like to obtain the original data, please consider collecting it from the official websites: GoodNews, VisualNews, and WikiText.
Run the following script to finetune and evaluate GPT2:
cd code &&
sh scripts/run_train_gpt2_goodnews.sh &&
sh scripts/run_eval_gpt2_goodnews.sh
Run the following script to finetune and evaluate ENGINE:
sh scripts/run_train_gpt2_goodnews_NEboth_wcap.sh &&
sh scripts/run_eval_gpt2_goodnews_NEboth_wcap.sh
Run the following script to generate articles.
sh scripts/EXP1_NEboth_art_gen_goodnews.sh
Experiments on VisualNews can be performed by simply modifying the file paths in our goodnews scripts.
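One way to derive a VisualNews script is to copy a GoodNews script and rewrite its paths. The sketch below assumes that the dataset paths inside the script contain the substring `goodnews`, which may not hold for every script; the function name `make_visualnews_script` and the output file name are hypothetical and not part of the repository.

```shell
# Hypothetical helper: copy a GoodNews script and replace every occurrence of
# "goodnews" with "visualnews".
# ASSUMPTION: the paths in the script contain the substring "goodnews";
# adjust the sed pattern to match the actual paths in your checkout.
make_visualnews_script() {
    src="$1"
    dst="$2"
    [ -f "$src" ] || { echo "missing script: $src" >&2; return 1; }
    sed 's/goodnews/visualnews/g' "$src" > "$dst"
}

# Usage (run from the code directory):
# make_visualnews_script scripts/run_train_gpt2_goodnews.sh scripts/run_train_gpt2_visualnews.sh
```

Inspect the generated file before running it, since a blanket substitution also rewrites output directories and experiment names that contain the same substring.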
Download the ENGINE weights from this http URL.