EMNLP 2022: "A Unified Positive-Unlabeled Learning Framework for Document-Level Relation Extraction with Different Levels of Labeling"

SSR-PU

Code for the EMNLP 2022 main conference paper "A Unified Positive-Unlabeled Learning Framework for Document-Level Relation Extraction with Different Levels of Labeling".

Our code is adapted from ATLOP. We sincerely thank the ATLOP authors for their excellent work.

Requirements

  • Python (tested on 3.6.7)
  • CUDA (tested on 11.0)
  • PyTorch (tested on 1.7.1)
  • Transformers (tested on 4.18.0)
  • numpy (tested on 1.19.5)
  • apex (tested on 0.1)
  • opt-einsum (tested on 3.3.0)
  • ujson
  • tqdm
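
To compare an environment against the tested versions listed above, a small check along the following lines may help. It is only a sketch: the import names (`torch` for PyTorch, `opt_einsum` for opt-einsum) and the helper names `installed_version` and `report` are assumptions made for illustration and are not part of this repository. Some packages may not expose a `__version__` attribute, in which case the sketch reports "unknown".

```python
import importlib

# Versions the list above reports as tested; ujson and tqdm have no stated version.
TESTED = {
    "torch": "1.7.1",
    "transformers": "4.18.0",
    "numpy": "1.19.5",
    "apex": "0.1",
    "opt_einsum": "3.3.0",
    "ujson": None,
    "tqdm": None,
}


def installed_version(module_name):
    """Return the module's __version__, "unknown" if it has none,
    or None if the module cannot be imported at all."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        return None
    return getattr(module, "__version__", "unknown")


def report(tested=TESTED):
    """Build one human-readable line per package: installed vs. tested version."""
    lines = []
    for name, wanted in tested.items():
        found = installed_version(name)
        status = "missing" if found is None else found
        lines.append(f"{name}: installed={status}, tested={wanted or 'not stated'}")
    return lines
```

Calling `print("\n".join(report()))` prints one line per package. A mismatch does not necessarily mean the code will fail; the versions above are simply the ones the code was tested on.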

Dataset

The DocRED dataset can be downloaded following the instructions at link.

The Re-DocRED dataset can be downloaded following the instructions at link.

The ChemDisGene dataset can be downloaded following the instructions at link.

SSR-PU
 |-- dataset
 |    |-- docred
 |    |    |-- train_annotated.json
 |    |    |-- train_distant.json
 |    |    |-- train_ext.json
 |    |    |-- train_revised.json
 |    |    |-- dev.json
 |    |    |-- dev_ext.json
 |    |    |-- dev_revised.json
 |    |    |-- test_revised.json
 |    |-- chemdisgene
 |    |    |-- train.json
 |    |    |-- valid.json
 |    |    |-- test.anno_all.json
 |-- meta
 |    |-- rel2id.json
 |    |-- relation_map.json
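
Before training, it can be useful to confirm that the files are laid out as in the tree above. The following is a minimal sketch that assumes it is run from (or pointed at) the SSR-PU root directory; the names `EXPECTED` and `missing_files` are hypothetical helpers, not part of the repository.

```python
import os

# Expected files, copied from the directory tree above.
EXPECTED = {
    os.path.join("dataset", "docred"): [
        "train_annotated.json",
        "train_distant.json",
        "train_ext.json",
        "train_revised.json",
        "dev.json",
        "dev_ext.json",
        "dev_revised.json",
        "test_revised.json",
    ],
    os.path.join("dataset", "chemdisgene"): [
        "train.json",
        "valid.json",
        "test.anno_all.json",
    ],
    "meta": [
        "rel2id.json",
        "relation_map.json",
    ],
}


def missing_files(root="."):
    """Return the paths from the expected layout that do not exist under root."""
    missing = []
    for subdir, names in EXPECTED.items():
        for name in names:
            path = os.path.join(root, subdir, name)
            if not os.path.isfile(path):
                missing.append(path)
    return missing


if __name__ == "__main__":
    for path in missing_files("."):
        print("missing:", path)
```

If only one of the datasets is needed, the files of the other one will be reported as missing and can be ignored.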

Training and Evaluation

DocRED

Train a DocRED model with one of the following commands:

>> sh scripts/run_bert.sh  # S-PU BERT
>> sh scripts/run_bert_rank.sh  # SSR-PU BERT
>> sh scripts/run_roberta.sh  # S-PU RoBERTa
>> sh scripts/run_roberta_rank.sh  # SSR-PU RoBERTa
>> sh scripts/run_bert_rank_full.sh  # SSR-PU BERT Fully supervised
>> sh scripts/run_roberta_rank_full.sh  # SSR-PU RoBERTa Fully supervised
>> sh scripts/run_bert_rank_ext.sh  # SSR-PU BERT Extremely unlabeled
>> sh scripts/run_roberta_rank_ext.sh  # SSR-PU RoBERTa Extremely unlabeled

ChemDisGene

Train a ChemDisGene model with one of the following commands:

>> sh scripts/run_bio.sh  # S-PU PubMedBERT
>> sh scripts/run_bio_rank.sh  # SSR-PU PubMedBERT

Download Model Parameters

Our model parameters can be downloaded from https://huggingface.co/wwwyyy/SSR-PU_DocRE/tree/main.
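
As an alternative to downloading through the browser, the files could be fetched programmatically. The sketch below assumes the `huggingface_hub` package is installed; it is not listed in the requirements above and may need a newer Python than the tested 3.6.7. The helper names `repo_id_from_url` and `download_checkpoints` are hypothetical and not part of this repository.

```python
from urllib.parse import urlparse

# URL given above for the released model parameters.
MODEL_URL = "https://huggingface.co/wwwyyy/SSR-PU_DocRE/tree/main"


def repo_id_from_url(url):
    """Extract "<user>/<repo>" from a Hugging Face model page URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    return "/".join(parts[:2])


def download_checkpoints():
    """Download every file in the model repository into the local
    Hugging Face cache and return the path of the downloaded snapshot."""
    # Imported lazily so repo_id_from_url works without huggingface_hub
    # (an assumed extra dependency, see the note above).
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_from_url(MODEL_URL))
```

`download_checkpoints()` returns a local directory; how the training and evaluation scripts expect checkpoints to be named or placed is not stated here, so check the scripts before moving files.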
