The original project was tested in the following environment:
python==3.8.2
torch==1.8.1+cu111
pytorch-lightning==1.5.0
transformers==4.10.0
scikit-learn==1.2.1
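As a quick sanity check before training, the installed package versions can be compared against the list above. The following is a minimal sketch, not part of the project: the helper names `installed_version` and `find_mismatches` are hypothetical, and it assumes the distribution names on your system match the names listed above.

```python
from importlib import metadata  # in the standard library since Python 3.8

# Versions copied from the list above; distribution names are assumed to match.
EXPECTED = {
    "torch": "1.8.1+cu111",
    "pytorch-lightning": "1.5.0",
    "transformers": "4.10.0",
    "scikit-learn": "1.2.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


# Report every package that is missing or differs from the tested version.
for package, (wanted, found) in find_mismatches(EXPECTED).items():
    print(f"{package}: expected {wanted}, found {found or 'not installed'}")
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the tested setup.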
Please download the .jsonl files and store them under ./data.
The datasets can be downloaded using the links below (format conversion is required; the conversion code can be found in ./scripts/):
- Essays (Stab and Gurevych, 2017): link
- AbstRCT (Mayer et al., 2020): link
- ECHR (Poudyal et al., 2020): link
- CDCP (Park and Cardie, 2018; Niculae et al., 2017): link
- AMPERE++: link
Alternatively, you can download the processed data directly from link.
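Once the files are in place, it can help to confirm that each one parses before starting a run. The sketch below makes no assumption about the record schema, which is not described here; it only counts the JSON records per file. The helper names `read_jsonl` and `summarize` are hypothetical, and searching ./data recursively is an assumption about the directory layout.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Parse one JSON value per non-empty line of a .jsonl file."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize(datadir="./data"):
    """Return {relative file path: record count} for each .jsonl file found."""
    root = Path(datadir)
    if not root.is_dir():
        return {}
    return {
        path.relative_to(root).as_posix(): len(read_jsonl(path))
        for path in sorted(root.rglob("*.jsonl"))
    }


# Prints an empty dict if ./data does not exist or holds no .jsonl files.
print(summarize())
```

A `json.JSONDecodeError` raised here usually points to a truncated download or an unfinished format conversion.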
To train a standard supervised relation extraction model on CDCP:
SEED=1
DOMAIN="cdcp"
python -m argument_relation_transformer.train \
--datadir=./data \
--seed=${SEED} \
--dataset=${DOMAIN} \
--ckptdir=./checkpoints \
--exp-name=demo-${DOMAIN}_seed=${SEED} \
--warmup-steps=5000 \
--learning-rate=1e-5 \
--scheduler=constant \
--max-epochs=15 \
--window-size=10 \
--cl=1e-3
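To repeat the run above over several seeds, the same command can be generated programmatically. This is a sketch under stated assumptions: the helper `build_train_command` is hypothetical, the flags and their values are copied unchanged from the command above, and the extra seeds 2 and 3 are illustrative only.

```python
import shlex  # shlex.join requires Python 3.8+


def build_train_command(domain, seed, datadir="./data", ckptdir="./checkpoints"):
    """Return the training command shown above as an argument list."""
    return [
        "python", "-m", "argument_relation_transformer.train",
        f"--datadir={datadir}",
        f"--seed={seed}",
        f"--dataset={domain}",
        f"--ckptdir={ckptdir}",
        f"--exp-name=demo-{domain}_seed={seed}",
        "--warmup-steps=5000",
        "--learning-rate=1e-5",
        "--scheduler=constant",
        "--max-epochs=15",
        "--window-size=10",
        "--cl=1e-3",
    ]


# Print one shell-ready command per seed (seeds 2 and 3 are illustrative).
for seed in (1, 2, 3):
    print(shlex.join(build_train_command("cdcp", seed)))
```

The script only prints the commands. To launch a run from Python, pass one of the argument lists to `subprocess.run(cmd, check=True)`.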