
AdaE2ML-XSF

Code repository for our EMNLP2023 paper "Adaptive End-to-End Metric Learning for Zero-Shot Cross-Domain Slot Filling"

Overall Framework

How to run

Under the cross-domain setting on SNIPS, we choose one target domain each time and combine the remaining domains as the source. For the zero-resource setting, we use CoNLL-2003 English NER as the source dataset and SciTech NER as the target dataset. For the cross-dataset setting, we use SNIPS as the source (target) dataset and ATIS as the target (source) dataset.

Preparations

1. Data Preparation

  • We put the raw SNIPS dataset under data/raw_dataset/original_snips_data. The following commands might be useful for preparation:
cd YOUR_PATH_TO/zero-shot-slu
mkdir data/snips
python data/raw_dataset/preprocess_snips.py
  • We also put the raw ATIS dataset under data/raw_dataset/atis. The following preprocessing commands are used for the cross-dataset setting:
cd YOUR_PATH_TO/zero-shot-slu
mkdir data/merge_dataset
mkdir data/merge_dataset/snips
mkdir data/merge_dataset/atis
python data/dataPreprocessingM.py
  • For the SciTech NER dataset, please refer to this link and put the dataset under the data/ner/tech folder.
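
After preprocessing, it can be useful to sanity-check the generated files. The following is a minimal hypothetical sketch: the tab-separated utterance/BIO-label file format and the read_slu_file name are our assumptions and may not match this repository's actual preprocessing output.

from typing import List, Tuple

def read_slu_file(path: str) -> List[Tuple[List[str], List[str]]]:
    # Hypothetical format: one utterance per line, the token sequence and
    # its BIO label sequence separated by a tab, e.g.
    #   "play a song\tO O B-music_item"
    samples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            tokens_str, labels_str = line.split("\t")
            tokens, labels = tokens_str.split(), labels_str.split()
            # every token must carry exactly one label
            assert len(tokens) == len(labels), f"misaligned line: {line!r}"
            samples.append((tokens, labels))
    return samples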

2. Pretrained Model Preparation

mkdir bert_model
cd bert_model
git lfs install
git clone https://huggingface.co/bert-base-uncased
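
After cloning, the checkpoint can be loaded from the local path. A minimal sketch using the Hugging Face transformers library; the exact path and loading code used by this repository may differ:

from transformers import BertModel, BertTokenizer

# Load the tokenizer and encoder from the locally cloned checkpoint
# (the path assumes the clone was made inside bert_model/).
tokenizer = BertTokenizer.from_pretrained("bert_model/bert-base-uncased")
encoder = BertModel.from_pretrained("bert_model/bert-base-uncased")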

3. Environments

We implement our method with PyTorch 1.8. The other required packages can be found in requirements.txt.
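
The dependencies can then be installed with pip, for example (assuming a build of PyTorch 1.8 is available for your platform):

pip install torch==1.8.0
pip install -r requirements.txt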

Key Configurations

  • --tgt_dm: SNIPS target domain (e.g. GetWeather)
  • --n_sample: number of samples used in the target domain; for K-shot, set n_sample to K (0 for zero-shot)
  • --cl: enable slot-level contrastive learning (CL)
  • --cl_type: slot-level CL metric function (cosine or euclidean)
  • --cl_temperature: slot-level CL temperature τ
  • --alpha: scalar coefficient of the typing loss (L_typ), default 1.0
  • --beta: scalar coefficient of the slot-CL loss (L_ctr), default 1.0
  • --model_ckpt: path for saving the model checkpoint
  • --vocab_ckpt: path for saving the vocabulary checkpoint
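
For orientation, the sketch below shows one way a slot-level contrastive loss with a configurable metric function and temperature could look, together with how the --alpha and --beta coefficients would weight the overall objective. It is a minimal illustration under our own assumptions (the function name, positive-pair construction, and combination formula are not taken from this repository), not the authors' implementation.

import torch
import torch.nn.functional as F

def slot_cl_loss(feats, labels, cl_type="cosine", temperature=0.5):
    # feats: (N, d) slot-token representations; labels: (N,) slot label ids
    if cl_type == "cosine":
        feats = F.normalize(feats, dim=-1)
        sim = feats @ feats.t()                 # cosine similarity
    else:                                       # "euclidean"
        sim = -torch.cdist(feats, feats) ** 2   # negated squared distance
    sim = sim / temperature
    n = feats.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=feats.device)
    sim = sim.masked_fill(self_mask, float("-inf"))   # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # positives: other tokens sharing the anchor's slot label
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    has_pos = pos.any(dim=1)                    # skip anchors without positives
    pos_log_prob = log_prob.masked_fill(~pos, 0.0).sum(dim=1)
    loss = -pos_log_prob[has_pos] / pos.sum(dim=1)[has_pos]
    return loss.mean()

# Hypothetical overall objective using the --alpha / --beta coefficients:
# total_loss = slot_loss + alpha * typing_loss + beta * slot_cl_loss(feats, labels)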

Zero-shot Cross-domain Slot Filling

Train our model for zero-shot adaptation to the GetWeather domain:

❱❱❱ python slu_e2e_bert_f2train.py --cuda 0 -lr 1e-3 --n_sample 0 --tgt_dm GetWeather --epoch 30 --dropout 0.1 --cl --cl_type cosine --cl_temperature 0.5 --model_ckpt loss_log/test.ckpt --vocab_ckpt loss_log/test_vocab.ckpt

Train our model without slot-level CL for zero-shot adaptation to the GetWeather domain:

❱❱❱ python slu_e2e_bert_f2train.py --cuda 0 -lr 1e-3 --n_sample 0 --tgt_dm GetWeather --epoch 30 --dropout 0.1 --model_ckpt ckpt/end2end_cl/bert_domain_atp0.ckpt --vocab_ckpt ckpt/vocab/bert_domain_atp0_vocab.ckpt

Few-shot Cross-domain Slot Filling

Train our model for 50-shot adaptation to the GetWeather domain:

❱❱❱ python slu_e2e_bert_f2train.py --cuda 0 -lr 1e-3 --n_sample 50 --tgt_dm GetWeather --epoch 30 --dropout 0.1 --cl --cl_type cosine --cl_temperature 0.5 --model_ckpt loss_log/test.ckpt --vocab_ckpt loss_log/test_vocab.ckpt

Cross-dataset setting

To train and evaluate our model under the cross-dataset scenario (i.e., SNIPS <-> ATIS), use the following commands:

❱❱❱ python f2xDataset.py --cuda 0 -lr 1e-3 --src snips --tgt atis --alpha 1.5 --beta 1.0 --epoch 30 --dropout 0.1 --cl --cl_type cosine --cl_temperature 0.5 --model_ckpt snips-atis.ckpt --vocab_ckpt snips-atis_vocab.ckpt
❱❱❱ python f2xDataset.py --cuda 0 -lr 1e-4 --src atis --tgt snips --batch_size 8 --alpha 2.0 --beta 2.0 --epoch 30 --dropout 0.1 --cl --cl_type euclidean --cl_temperature 0.5 --model_ckpt snips-atis.ckpt --vocab_ckpt snips-atis_vocab.ckpt

Cross-domain NER

Train our model for zero-resource adaptation to the SciTech domain:

❱❱❱ python ner_e2e_train.py --cuda 0 -lr 1e-3 --n_sample 0 --tgt_dm tech --epoch 30 --dropout 0.5 --cl --cl_type cosine --cl_temperature 0.1 --model_ckpt test_tech.ckpt --vocab_ckpt test_vocab.ckpt

Train our model without slot-level CL for zero-resource adaptation to the SciTech domain:

❱❱❱ python ner_e2e_train.py --cuda 0 -lr 1e-3 --n_sample 0 --tgt_dm tech --epoch 30 --dropout 0.5 --model_ckpt test_tech.ckpt --vocab_ckpt test_vocab.ckpt

Citation

If you use any source code or ideas from this repository in your work, please cite the following paper.

@inproceedings{shi-etal-2023-adaptive,
    title = "Adaptive End-to-End Metric Learning for Zero-Shot Cross-Domain Slot Filling",
    author = "Shi, Yuanjun  and
      Wu, Linzhi  and
      Shao, Minglai",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.387",
    doi = "10.18653/v1/2023.emnlp-main.387",
    pages = "6291--6301"
}
