This is the repository for the EMNLP 2021 paper "Progressive Adversarial Learning for Bootstrapping: A Case Study on Entity Set Expansion".
The CoNLL and OntoNotes datasets can be downloaded from here; the external pre-training datasets can be downloaded from here.
After downloading, unarchive them and put them into the "dataset" folder in the root directory.
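As a quick sanity check after unpacking, the layout can be verified with a small shell function. This is only a sketch: `check_datasets` is a hypothetical helper, not part of the repo, and the directory names `dataset/CoNLL` and `dataset/OntoNotes` are assumed from the `--dataset` arguments used in the bootstrap commands in this README.

```shell
# Sketch: verify that the dataset folders referenced by the README commands exist.
# check_datasets is a hypothetical helper; the folder names are assumed from
# the --dataset arguments (dataset/CoNLL, dataset/OntoNotes).
check_datasets() {
    root="${1:-.}"      # repository root, defaults to the current directory
    missing=0
    for name in CoNLL OntoNotes; do
        if [ -d "$root/dataset/$name" ]; then
            echo "found: dataset/$name"
        else
            echo "missing: dataset/$name"
            missing=1
        fi
    done
    return "$missing"   # 0 if both folders exist, 1 otherwise
}
```

Calling `check_datasets` from the repository root prints one line per dataset and returns a non-zero status if either folder is missing.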
Run self-supervised pre-training, then supervised pre-training, as follows:
python -u pretrain_self.py --output_model_file models/pretrain_self_100_local --device 0 --local > logs/pretrain_self_100_local.txt
python -u pretrain_sup.py --input_model_file models/pretrain_self_100_local --output_model_file models/pretrain_self_100_sup_200_local --device 0 --local > logs/pretrain_self_100_sup_200_local.txt
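The commands in this README write model files under `models/` and redirect output into `logs/`. Shell redirection fails if the target directory does not exist, so if those folders are not already present in your checkout (an assumption; the repo may ship them), they can be created before running anything:

```shell
# Create the output folders used by the commands in this README.
# -p makes this a no-op if the folders already exist.
mkdir -p models logs
```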
To run bootstrapping without a pre-trained input model:
python -u bootstrap.py --dataset dataset/CoNLL --n_iter 20 --min_match 2 --device 0 --local > logs/conll_local.txt
or
python -u bootstrap.py --dataset dataset/OntoNotes --n_iter 20 --device 0 --local > logs/onto_local.txt
To run bootstrapping from a pre-trained model, pass it with --input_model_file:
python -u bootstrap.py --input_model_file models/ul_weight1e-1/pretrain_self_100_sup_200_local --dataset dataset/CoNLL --min_match 2 --n_iter 20 --device 0 --local > logs/conll_100_200_local.txt
or
python -u bootstrap.py --input_model_file models/ul_weight1e-2/pretrain_self_100_sup_200_local --dataset dataset/OntoNotes --n_iter 20 --device 0 --local > logs/onto_100_200_local.txt
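The four bootstrap commands differ only in the dataset, the optional `--input_model_file`, and `--min_match 2`, which appears only in the CoNLL commands. The sketch below prints (but does not run) the matching command for a given dataset. `bootstrap_cmd` is a hypothetical helper, not part of the repo; the flags are copied from the commands above, and it assumes the order of the flags does not matter to `bootstrap.py`.

```shell
# Sketch: print the bootstrap command for a dataset without running it.
# bootstrap_cmd is a hypothetical helper; flags are copied from the README.
bootstrap_cmd() {
    dataset="$1"        # CoNLL or OntoNotes
    model="${2:-}"      # optional path to a pre-trained model
    cmd="python -u bootstrap.py"
    if [ -n "$model" ]; then
        cmd="$cmd --input_model_file $model"
    fi
    cmd="$cmd --dataset dataset/$dataset"
    if [ "$dataset" = "CoNLL" ]; then
        # Only the CoNLL commands in the README pass --min_match.
        cmd="$cmd --min_match 2"
    fi
    echo "$cmd --n_iter 20 --device 0 --local"
}
```

For example, `bootstrap_cmd OntoNotes` prints the OntoNotes command without a pre-trained model; append a redirect such as `> logs/onto_local.txt` when running it.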