
Description

More details will be provided soon!

Official repository of the paper "Promoting Generalized Cross-lingual Question Answering in Few-resource Scenarios via Self-knowledge Distillation", containing the implementation needed to reproduce its experiments.

Please refer to the preprint for more details about the fundamental ideas, the method, and the evaluation results: https://arxiv.org/abs/2309.17134

Installation

First, install the Python dependencies and extract the training and evaluation data:

bash ./setup.sh

The script creates a Python virtualenv in the venv directory and extracts all the necessary data into the corpora directory.
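After setup, activate the environment before running any training or evaluation commands (assuming the standard virtualenv layout created in the previous step):

source venv/bin/activate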

How to train

We provide simple scripts to train the models in the paper. For example, to train the overall best-performing model, referred to in the paper as mBERT-qa-en, skd, mAP@k, run the following steps:

  1. Standard cross-entropy fine-tuning of the mBERT model for the extractive QA task on the English SQuAD v1.1 training dataset:
bash train_qa_mbert.sh
  2. Cross-lingual fine-tuning of the resulting model, called mBERT-qa-en, using self-knowledge distillation with mAP@k loss coefficients:
bash train_skd_map.sh

The results will be stored in the runs directory along with the TensorBoard logs.
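To monitor training, you can point TensorBoard at the runs directory (assuming TensorBoard is installed in the environment):

tensorboard --logdir runs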

All other scripts fine-tune the mBERT-qa-en model with different methods. Note that each script uses a hyperparameter configuration corresponding to the best models; change the configuration inside the scripts to train different variants, as sketched below.
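As a minimal sketch of what such a configuration block might look like, the variable names below are hypothetical; check the actual scripts for the real ones:

# Hypothetical hyperparameter block inside a training script;
# the real variable names and values may differ.
MODEL_PATH=runs/mbert-qa-en   # checkpoint to fine-tune
LEARNING_RATE=3e-5            # optimizer learning rate
NUM_EPOCHS=3                  # number of fine-tuning epochs
TEMPERATURE=2                 # softmax temperature for self-distillation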

How to evaluate

We also provide scripts to evaluate the models after training. For example, to evaluate on the MLQA-test dataset, run:

bash eval_qa.sh <model_path_trained_model> mlqa-test

The evaluation results will be stored inside the trained model's directory under the name eval_results_mlqa-test.

It is also possible to choose another test set among the xquad, mlqa-dev, and tydiqa-goldp datasets, for example:
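To evaluate the same trained model on XQuAD (the model path below is illustrative):

bash eval_qa.sh runs/mbert-qa-en-skd-map xquad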

Available models

We uploaded the best-performing models to the Hugging Face Models Hub under the names mBERT-qa-en, skd and mBERT-qa-en, skd, mAP@k.

How to Cite

To cite our work, use the following BibTeX entry:

@misc{carrino2023promoting,
      title={Promoting Generalized Cross-lingual Question Answering in Few-resource Scenarios via Self-knowledge Distillation}, 
      author={Casimiro Pio Carrino and Carlos Escolano and José A. R. Fonollosa},
      year={2023},
      eprint={2309.17134},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
