Code for the EMNLP 2022 Main Conference paper "A Unified Positive-Unlabeled Learning Framework for Document-Level Relation Extraction with Different Levels of Labeling".
Our code is based on ATLOP. We sincerely thank the authors for their excellent work.
- Python (tested on 3.6.7)
- CUDA (tested on 11.0)
- PyTorch (tested on 1.7.1)
- Transformers (tested on 4.18.0)
- numpy (tested on 1.19.5)
- apex (tested on 0.1)
- opt-einsum (tested on 3.3.0)
- ujson
- tqdm
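To compare an environment against the versions listed above, a small check along these lines may help. It is only a sketch: `TESTED` and `installed_versions` are illustrative names that are not part of this repository, and the distribution names (for example `torch` for PyTorch and `opt-einsum`) are assumed to match the packages listed. `ujson` and `tqdm` have no tested version stated, so they are only checked for presence.

```python
# Sketch only: compare installed package versions with the tested versions.
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Assumption: on older Python the importlib_metadata backport is installed.
    from importlib_metadata import PackageNotFoundError, version

# Tested versions copied from the list above; None means no version was stated.
TESTED = {
    "torch": "1.7.1",
    "transformers": "4.18.0",
    "numpy": "1.19.5",
    "apex": "0.1",
    "opt-einsum": "3.3.0",
    "ujson": None,
    "tqdm": None,
}


def installed_versions(packages=TESTED):
    """Return {name: installed version, or None if the package is missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, got in installed_versions().items():
        want = TESTED[name] or "any"
        status = "missing" if got is None else got
        print(f"{name}: installed={status}, tested={want}")
```

A mismatch does not necessarily mean the code will fail; the versions above are only the ones the code was tested on.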
The DocRED dataset can be downloaded following the instructions at link.
The Re-DocRED dataset can be downloaded following the instructions at link.
The ChemDisGene dataset can be downloaded following the instructions at link.
SSR-PU
|-- dataset
| |-- docred
| | |-- train_annotated.json
| | |-- train_distant.json
| | |-- train_ext.json
| | |-- train_revised.json
| | |-- dev.json
| | |-- dev_ext.json
| | |-- dev_revised.json
| | |-- test_revised.json
| |-- chemdisgene
| | |-- train.json
| | |-- valid.json
| | |-- test.anno_all.json
|-- meta
| |-- rel2id.json
| |-- relation_map.json
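Before training, it can be useful to confirm that the files in the layout above are in place. The following is a minimal sketch, assuming it is run from the `SSR-PU` root directory; `EXPECTED_FILES` and `missing_files` are hypothetical names introduced for illustration and are not part of the repository.

```python
# Sketch only: report which files from the directory layout above are missing.
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_FILES = [
    "dataset/docred/train_annotated.json",
    "dataset/docred/train_distant.json",
    "dataset/docred/train_ext.json",
    "dataset/docred/train_revised.json",
    "dataset/docred/dev.json",
    "dataset/docred/dev_ext.json",
    "dataset/docred/dev_revised.json",
    "dataset/docred/test_revised.json",
    "dataset/chemdisgene/train.json",
    "dataset/chemdisgene/valid.json",
    "dataset/chemdisgene/test.anno_all.json",
    "meta/rel2id.json",
    "meta/relation_map.json",
]


def missing_files(root="."):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected files are present.")
```

Only the files needed for a given experiment have to be present; for example, the ChemDisGene files are not required when training on DocRED alone.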
Train a DocRED model with one of the following commands:
>> sh scripts/run_bert.sh # S-PU BERT
>> sh scripts/run_bert_rank.sh # SSR-PU BERT
>> sh scripts/run_roberta.sh # S-PU RoBERTa
>> sh scripts/run_roberta_rank.sh # SSR-PU RoBERTa
>> sh scripts/run_bert_rank_full.sh # SSR-PU BERT Fully supervised
>> sh scripts/run_roberta_rank_full.sh # SSR-PU RoBERTa Fully supervised
>> sh scripts/run_bert_rank_ext.sh # SSR-PU BERT Extremely unlabeled
>> sh scripts/run_roberta_rank_ext.sh # SSR-PU RoBERTa Extremely unlabeled
Train a ChemDisGene model with one of the following commands:
>> sh scripts/run_bio.sh # S-PU PubmedBERT
>> sh scripts/run_bio_rank.sh # SSR-PU PubmedBERT
Our model parameters can be downloaded from https://huggingface.co/wwwyyy/SSR-PU_DocRE/tree/main.
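As an alternative to downloading the files through the browser, the parameters could be fetched programmatically. This is a sketch under assumptions: it relies on the `huggingface_hub` package, which is not among the requirements listed for this repository, and the `checkpoints` target directory and the helper names `repo_id_from_url` and `download_checkpoints` are hypothetical.

```python
# Sketch only: fetch the released model parameters with huggingface_hub.
MODEL_URL = "https://huggingface.co/wwwyyy/SSR-PU_DocRE/tree/main"


def repo_id_from_url(url):
    """Extract the 'user/repo' id from a huggingface.co model URL."""
    path = url.split("huggingface.co/", 1)[1]
    user, repo = path.split("/")[:2]
    return f"{user}/{repo}"


def download_checkpoints(local_dir="checkpoints"):
    """Download every file in the model repository into local_dir."""
    # Imported lazily so repo_id_from_url works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id_from_url(MODEL_URL),
        local_dir=local_dir,  # hypothetical target directory
    )
```

Calling `download_checkpoints()` returns the local path of the downloaded snapshot. How the training scripts expect the checkpoint files to be named or located is not specified here, so consult the scripts before moving the files.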