Meta-XNLG


Sample Generations from Meta-XNLG

About

This repository contains the source code for the paper Meta-XNLG: A Meta-Learning Approach Based on Language Clustering for Zero-Shot Cross-Lingual Transfer and Generation, published in the Findings of the Association for Computational Linguistics: ACL 2022. If you have any questions, please feel free to open a GitHub issue or reach out to the first author at cs18resch11003@iith.ac.in.

Environment Setup and Downloads

To set up the environment, use the following conda commands:

conda env create --file environment.yml
conda activate py37_ZmBART

The code was tested with Python==3.8, PyTorch==1.8, and transformers==4.11. You can download the model checkpoint from here.
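A quick way to confirm the active environment matches those tested versions (a small sanity-check sketch, assuming the conda environment above is activated):

```python
# Print interpreter and library versions to compare against the tested ones
# (Python 3.8, PyTorch 1.8, transformers 4.11).
import sys

import torch
import transformers

print(sys.version.split()[0], torch.__version__, transformers.__version__)
```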

Training and Generation

Step-1: ZmT5 Checkpoint

ZmT5 is obtained by following the fine-tuning algorithm presented in ZmBART (see Step-1 there). If you wish to skip this step, you can directly download the ZmT5 checkpoint, which supports the 30 languages listed below:

| SN | ISO-3 | Language | SN | ISO-3 | Language | SN | ISO-3 | Language |
|---:|:------|:---------|---:|:------|:---------|---:|:------|:---------|
| 1 | eng | English | 11 | mar | Marathi | 21 | deu | German |
| 2 | hin | Hindi | 12 | nep | Nepali | 22 | fra | French |
| 3 | urd | Urdu | 13 | tam | Tamil | 23 | rus | Russian |
| 4 | tel | Telugu | 14 | pan | Punjabi | 24 | ces | Czech |
| 5 | tur | Turkish | 15 | swa | Swahili | 25 | vie | Vietnamese |
| 6 | fin | Finnish | 16 | spa | Spanish | 26 | tha | Thai |
| 7 | jpn | Japanese | 17 | ita | Italian | 27 | zho | Chinese (Sim) |
| 8 | kor | Korean | 18 | por | Portuguese | 28 | ind | Indonesian |
| 9 | guj | Gujarati | 19 | rom | Romanian | 29 | ell | Greek |
| 10 | ben | Bengali | 20 | nld | Dutch | 30 | ara | Arabic |

Step-2: Meta-Learning with Centroid Languages

In this section, we present the meta-learning training and generation pipeline for the abstractive text summarization task. We use the popular XL-SUM dataset. The underlying pre-trained checkpoint is ZmT5, and the meta-learning algorithm used is MAML.

export task_name="sum"
export input_data_dir_name="xlsum"

#input and output directories
export BASE_DIR='.'
export input_dir="XLSum_input/"
export output_dir="outputs/xlsum_out"

#model details
export model_type="t5" 
export model_chkpt="ZmT5_checkpoint"

export cache_dir='../cache_dir'
export config_file_name="auxi_tgt_lang_config" 

python train.py \
    --input_dir ${input_dir}${input_data_dir_name} \
    --output_dir ${output_dir} \
    --model_type $model_type \
    --model_chkpt $model_chkpt \
    --max_source_length 512 \
    --max_target_length 84 \
    --train_batch_size 4 \
    --learning_rate 1e-4 \
    --meta_lr 1e-5 \
    --weight_decay 0.01 \
    --adam_epsilon 1e-08 \
    --num_train_epochs 10 \
    --logging_steps 10 \
    --save_steps 1 \
    --cache_dir ${cache_dir} \
    --read_n_data_obj 1000  \
    --task_name ${task_name} \
    --freeze_embeds_and_decoder \
    --task_data_name ${input_data_dir_name} \
    --config_file_name ${config_file_name} \
    --n_inner_iter 2

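The MAML loop that train.py drives above alternates an inner adaptation step per centroid-language task with an outer meta-update. Below is a minimal first-order sketch of that pattern on toy regression data, in plain PyTorch rather than the repository's `higher`-based implementation; the task data, model, and hyperparameter values are illustrative only (the learning rates mirror --learning_rate, --meta_lr, and --n_inner_iter):

```python
import torch

torch.manual_seed(0)

model = torch.nn.Linear(4, 1)
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-5)  # --meta_lr
inner_lr = 1e-4                                           # --learning_rate
n_inner_iter = 2                                          # --n_inner_iter

def task_batch():
    # Toy stand-in for one centroid language's support/query batch.
    x = torch.randn(8, 4)
    return x, x.sum(dim=1, keepdim=True)

for step in range(3):  # meta-training steps
    meta_opt.zero_grad()
    for _ in range(2):  # tasks (centroid languages) per meta-batch
        # Inner loop: adapt a fast copy of the weights on support data.
        fast = {n: p.clone() for n, p in model.named_parameters()}
        for _ in range(n_inner_iter):
            x, y = task_batch()
            pred = torch.nn.functional.linear(x, fast["weight"], fast["bias"])
            loss = torch.nn.functional.mse_loss(pred, y)
            grads = torch.autograd.grad(loss, list(fast.values()))
            fast = {n: p - inner_lr * g
                    for (n, p), g in zip(fast.items(), grads)}
        # Outer step: evaluate the adapted weights on query data and
        # accumulate gradients into the original (meta) parameters.
        x, y = task_batch()
        pred = torch.nn.functional.linear(x, fast["weight"], fast["bias"])
        torch.nn.functional.mse_loss(pred, y).backward()
    meta_opt.step()
```

This is a first-order variant (inner-loop gradients are detached); the repository's actual implementation uses the `higher` library to track the inner loop differentiably.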
Step-3: Zero-Shot Generation for Target (Non-Centroid) Languages with Meta-XNLG

conda activate py37_ZmBART

export seed=1234

#input and output directories
export task_name="sum"
export input_data_dir_name="xlsum"
export BASE_DIR='.'
export input_dir="../XLSum_input/"
export output_dir="outputs/xlsum_14"
export gen_file_name="pred.tsv"
export cache_dir='../cache_dir'

# model settings
export model_type="t5" 
export model_chkpt="outputs/xlsum"

python train.py \
    --input_dir ${input_dir}${input_data_dir_name} \
    --output_dir ${output_dir} \
    --model_type ${model_type} \
    --model_chkpt ${model_chkpt} \
    --test_batch_size 32 \
    --max_source_length 512 \
    --max_target_length 84 \
    --length_penalty 0.6 \
    --beam_size 4 \
    --early_stopping \
    --num_of_return_seq 1 \
    --min_generated_seq_len 0 \
    --max_generated_seq_len 200 \
    --cache_dir ${cache_dir} \
    --read_n_data_obj -1 \
    --gen_file_name ${gen_file_name} \
    --task_name ${task_name} \
    --task_data_name ${input_data_dir_name} \
    --do_test 

Citation

@inproceedings{maurya-desarkar-2022-meta,
    title = "Meta-X$_{NLG}$: A Meta-Learning Approach Based on Language Clustering for Zero-Shot Cross-Lingual Transfer and Generation",
    author = "Maurya, Kaushal  and
      Desarkar, Maunendra",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2022",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-acl.24",
    doi = "10.18653/v1/2022.findings-acl.24",
    pages = "269--284",
}

The meta-learning implementation is built on the higher library and is inspired by X-MAML.
