
CLiCoTEA: Cross-Lingual Contextualised Token Embedding Alignment

This code reproduces the results from the ACL 2023 paper "Stop Pre-Training: Adapt Visual-Language Models to Unseen Languages".

Installation

These dependencies must be installed:

  • Hatch: for managing the Python package
  • gdown: for downloading the datasets
pip install hatch gdown

Prepare datasets

Download all datasets for training the Cross-Lingual Contextualised Token Embedding Alignment and the Zero-Shot Cross-Lingual transfer to downstream tasks:

bash scripts/datasets/download_datasets.sh data

The archive contains the original files from Flickr30k, SNLI and NLVR2, which are all in English. It also includes the translated files for each language required by the downstream tasks.

Note that the translation of the train/dev sets of the Flickr30k, SNLI and NLVR2 datasets was done with the Googletrans package, by running the following commands:

bash scripts/datasets/prepare_flickr30k.sh
bash scripts/datasets/prepare_snli.sh
bash scripts/datasets/prepare_nlvr2.sh
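
For reference, this translation step amounts to machine-translating each English sentence into the target language. The snippet below is a minimal sketch of that idea using the Googletrans API, not the project's actual script; the translate_captions helper and the example sentence are only illustrative.

# Minimal sketch of the translation step (not the project's actual script).
from googletrans import Translator  # pip install googletrans

translator = Translator()

def translate_captions(captions, dest="de"):
    """Machine-translate a list of English sentences into the language `dest`."""
    return [translator.translate(c, src="en", dest=dest).text for c in captions]

# Illustrative usage with a made-up caption:
print(translate_captions(["Two dogs are playing in the snow."], dest="de"))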

Compute token alignment with awesome-align model

bash scripts/alignment/token_alignment_flickr30k.sh
bash scripts/alignment/token_alignment_snli.sh
bash scripts/alignment/token_alignment_nlvr2.sh
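
These scripts wrap the awesome-align model. As background, the sketch below illustrates the general idea behind this kind of contextual-embedding word alignment: embed both sentences with multilingual BERT, compute the token similarity matrix, and keep the pairs that survive a softmax threshold in both directions. The model name, layer and threshold are illustrative defaults, not necessarily the settings used by the scripts.

# Illustrative sketch of contextual-embedding word alignment in the style of
# awesome-align; the model, layer and threshold below are assumptions.
import itertools
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased", output_hidden_states=True)
model.eval()

def word_embeddings(words, layer=8):
    """Embed a list of words; average sub-token vectors to get one vector per word."""
    sub_tokens = [tokenizer.tokenize(w) for w in words]
    ids = tokenizer.convert_tokens_to_ids(list(itertools.chain(*sub_tokens)))
    inputs = tokenizer.prepare_for_model(ids, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).hidden_states[layer][0][1:-1]  # drop [CLS]/[SEP]
    vectors, i = [], 0
    for toks in sub_tokens:
        vectors.append(hidden[i:i + len(toks)].mean(dim=0))
        i += len(toks)
    return torch.stack(vectors)

def align(src_words, tgt_words, threshold=1e-3):
    """Return (src_index, tgt_index) pairs kept by the softmax-intersection heuristic."""
    sim = word_embeddings(src_words) @ word_embeddings(tgt_words).T
    forward = torch.softmax(sim, dim=-1)   # source -> target probabilities
    backward = torch.softmax(sim, dim=0)   # target -> source probabilities
    keep = (forward * backward) > threshold
    return [(i, j) for i, j in keep.nonzero().tolist()]

print(align("a dog runs".split(), "ein Hund rennt".split()))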

This should create aligned word pairs in the data folder for each dataset, as follows:

data/
   flickr30k/
      word_pairs_dev_en-de.json
      word_pairs_dev_en-es.json
      word_pairs_dev_en-id.json
      word_pairs_dev_en-ru.json
      word_pairs_dev_en-tr.json
      word_pairs_train_en-de.json
      word_pairs_train_en-es.json
      word_pairs_train_en-id.json
      word_pairs_train_en-ru.json
      word_pairs_train_en-tr.json
   nlvr2/
      word_pairs_dev_en-id.json
      word_pairs_dev_en-sw.json
      word_pairs_dev_en-ta.json
      word_pairs_dev_en-tr.json
      word_pairs_dev_en-zh-cn.json
      word_pairs_train_en-id.json
      word_pairs_train_en-sw.json
      word_pairs_train_en-ta.json
      word_pairs_train_en-tr.json
      word_pairs_train_en-zh-cn.json
   snli/
      word_pairs_dev_en-ar.json
      word_pairs_dev_en-es.json
      word_pairs_dev_en-fr.json
      word_pairs_dev_en-ru.json
      word_pairs_train_en-ar.json
      word_pairs_train_en-es.json
      word_pairs_train_en-fr.json
      word_pairs_train_en-ru.json

Train CLiCoTEA

Train CLiCoTEA by running the following commands (default options can be modified in the bash script):

# train CLiCoTEA for image/text retrieval on flickr30k in German
bash scripts/embeddings/train_clicotea.sh flickr30k albef_retrieval flickr de

# train CLiCoTEA for visual reasoning on NLVR2 in Swahili
bash scripts/embeddings/train_clicotea.sh nlvr2 albef_nlvr nlvr sw

# train CLiCoTEA for visual entailment on SNLI in French
bash scripts/embeddings/train_clicotea.sh snli albef_classification ve fr

Note that we start from the pre-trained ALBEF models, which are available in the LAVIS package.
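
Conceptually, training fine-tunes a multilingual text encoder so that, for each aligned word pair, the contextual embedding of the target-language token moves towards the frozen embedding of its English counterpart from the ALBEF text encoder. The sketch below shows that objective with a plain MSE loss; the function and argument names are illustrative, and the actual loss and options are defined in the training script.

# Conceptual sketch of the token-embedding alignment objective (illustrative
# names; see scripts/embeddings/train_clicotea.sh for the actual options).
import torch
import torch.nn.functional as F

def alignment_loss(teacher_hidden, student_hidden, word_pairs):
    """teacher_hidden: [len_en, d] embeddings of the English sentence from the
    frozen ALBEF text encoder.
    student_hidden: [len_tgt, d] embeddings of the translated sentence from the
    trainable multilingual encoder.
    word_pairs: list of (english_index, target_index) token alignments."""
    en_idx = torch.tensor([i for i, _ in word_pairs], dtype=torch.long)
    tgt_idx = torch.tensor([j for _, j in word_pairs], dtype=torch.long)
    # Pull each aligned target-language token towards its English counterpart.
    return F.mse_loss(student_hidden[tgt_idx], teacher_hidden[en_idx].detach())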

Zero-shot transfer to unseen languages

  1. Download the images of the downstream tasks from their official websites.

Text data can be downloaded from the IGLUE Benchmark with:

bash scripts/zero-shot/download_datasets.sh
  2. Run zero-shot evaluation:

DATA_DIR="<path to folder containing test files>"
LANG="<language to test>"
FLICKR30K_IMAGE_ROOT="<path to Flickr30k image folder>"
COCO_IMAGE_ROOT="<path to COCO image folder>"
MARVL_IMAGE_ROOT="<path to MaRVL image folder>"
PATH_TO_CHECKPOINT="<path to model checkpoint>"
  • Retrieval task on xFlickrCO
bash scripts/zero-shot/zeroshot_retrieval.sh $DATA_DIR $LANG $FLICKR30K_IMAGE_ROOT $COCO_IMAGE_ROOT $PATH_TO_CHECKPOINT
  • Visual entailment task on XVNLI
bash scripts/zero-shot/zeroshot_ve.sh $DATA_DIR $LANG $FLICKR30K_IMAGE_ROOT $PATH_TO_CHECKPOINT
  • Visual reasoning task on MaRVL
bash scripts/zero-shot/zeroshot_vr.sh $DATA_DIR $LANG $MARVL_IMAGE_ROOT $PATH_TO_CHECKPOINT

Running tests

Running all tests:

hatch run test:run

Or running a specific test:

hatch run test:run -k test_get_token_pairs

Citation

Please cite as:

@inproceedings{clicotea,
    title = "Stop Pre-Training: Adapt Visual-Language Models to Unseen Languages",
    author = "Karoui, Yasmine  and
      Lebret, R{\'e}mi  and
      Foroutan Eghlidi, Negar  and
      Aberer, Karl",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-short.32",
    pages = "366--375",
}
