DSCAU

This repository contains the code and datasets for the paper De-biasing Distantly Supervised Named Entity Recognition via Causal Intervention. Our work builds on BOND and PU-Learning, so the code here was created by modifying their implementations.

Framework

Data

We use the data from BOND. We build several sub-dictionaries by sampling entities from the global dictionary, where the probability of sampling each entity corresponds to its utterance frequency. As a result, each dataset (under DSCAU/dataset) contains several train*.json files, each generated from a single sub-dictionary.
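The sampling step described above can be sketched roughly as follows. This is only an illustration, not the script used to build the released train*.json files: the helper name `sample_sub_dictionary`, the toy entity counts, the sub-dictionary size, and the choice of weighted sampling without replacement are all assumptions.

```python
import random
from collections import Counter


def sample_sub_dictionary(entity_counts, size, seed=0):
    """Hypothetical helper: draw up to `size` distinct entities.

    Each draw picks an entity with probability proportional to its
    frequency among the entities that have not been picked yet
    (weighted sampling without replacement; an assumption).
    """
    rng = random.Random(seed)
    remaining = dict(entity_counts)
    chosen = []
    while remaining and len(chosen) < size:
        entities = list(remaining)
        weights = [remaining[e] for e in entities]
        pick = rng.choices(entities, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen


# Toy global dictionary: entity -> frequency (made-up numbers).
global_dictionary = Counter({"London": 50, "Paris": 30, "Obama": 15, "Reuters": 5})

# Several sub-dictionaries, one per seed; in this repository each
# sub-dictionary corresponds to one train*.json file.
sub_dictionaries = [
    sample_sub_dictionary(global_dictionary, size=2, seed=s) for s in range(3)
]
for index, sub_dictionary in enumerate(sub_dictionaries):
    print(index, sub_dictionary)
```

Frequent entities tend to appear in more sub-dictionaries than rare ones, which is the property the description relies on.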

Environment

We use Python 3.7. DSCAU/requirements_bond.txt lists the environment for BOND, and DSCAU/requirements_pul.txt lists the environment for PU-Learning.

Training

For BOND:

cd DSCAU/BOND/
./scripts/train_conll2003.sh
./scripts/train_twitter.sh
./scripts/train_webpage.sh
./scripts/train_wikigold.sh

For PU-Learning:

Download glove.6B.100d.txt first and move it to the directory DSCAU/PUL/data_bond/.

cd DSCAU/PUL/
./scripts/train_conll2003.sh
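Before running the PU-Learning training script, it may help to confirm that the GloVe file is in place and has the expected layout (one token followed by 100 space-separated floats per line). The check below is a minimal sketch and not part of the repository; the function name `looks_like_glove` is hypothetical, and the demo uses a toy file so that it runs without the real download.

```python
import tempfile
from pathlib import Path


def looks_like_glove(path, dim=100, max_lines=5):
    """Hypothetical check: the file exists and its first lines parse as
    `token v1 ... v_dim` with float values."""
    path = Path(path)
    if not path.is_file():
        return False
    checked = 0
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if checked >= max_lines:
                break
            # Split from the right so the token is kept in one piece.
            parts = line.rstrip("\n").rsplit(" ", dim)
            if len(parts) != dim + 1:
                return False
            try:
                [float(value) for value in parts[1:]]
            except ValueError:
                return False
            checked += 1
    return checked > 0


# Demo on a toy file with a single 100-dimensional vector.
with tempfile.TemporaryDirectory() as tmp:
    toy_file = Path(tmp) / "toy.100d.txt"
    toy_file.write_text("the " + " ".join(["0.1"] * 100) + "\n", encoding="utf-8")
    toy_ok = looks_like_glove(toy_file)
    missing_ok = looks_like_glove(Path(tmp) / "missing.txt")

print(toy_ok, missing_ok)
```

For the real file, the call would be `looks_like_glove("DSCAU/PUL/data_bond/glove.6B.100d.txt")`, with the path adjusted to your working directory.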

Evaluation

For BOND:

cd DSCAU/BOND/
./scripts/eval_conll2003.sh
./scripts/eval_twitter.sh
./scripts/eval_webpage.sh
./scripts/eval_wikigold.sh

For PU-Learning:

cd DSCAU/PUL/
./scripts/eval_conll2003.sh

Trained Models

You can download the following trained models and set SAVED_DIR in the corresponding eval_*.sh script to the downloaded model's path to obtain the results.

	CoNLL03	Twitter	Webpage	Wikigold
BOND	Download	Download	Download	Download
PU-Learning	Download	-	-	-

Citation

Please cite our ACL 2021 paper:

@inproceedings{zhang-etal-2021-de,
    title = "De-biasing Distantly Supervised Named Entity Recognition via Causal Intervention",
    author = "Zhang, Wenkai  and
      Lin, Hongyu  and
      Han, Xianpei  and
      Sun, Le",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
    year = "2021",
    publisher = "Association for Computational Linguistics",
    pages = "4803--4813"
}
