Attention Realignment and Pseudo-Labelling for Interpretable Cross-Lingual Classification of Crisis Tweets
Purpose: A cross-lingual neural network model built on XLM-R, with the capability to attend to similar words across languages (e.g., *dlo* in Haitian Creole versus *water* in English).
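The core idea of attending to semantically similar tokens can be illustrated with a minimal dot-product-attention sketch. This is purely illustrative and not the paper's exact realignment procedure; the embeddings and names below are toy placeholders:

```python
import numpy as np

def attention_weights(token_embs, query):
    """Softmax over dot products between a query vector and each token embedding."""
    scores = token_embs @ query          # one similarity score per token
    scores = scores - scores.max()       # shift for numerical stability
    w = np.exp(scores)
    return w / w.sum()

# Toy vocabulary with one-hot-like embeddings so similarities are unambiguous.
embs = np.eye(5, 8)          # 5 tokens, 8-dimensional embeddings
query = 3.0 * embs[2]        # a query aligned with token 2 (think "water"/"dlo")
weights = attention_weights(embs, query)
```

With aligned multilingual embeddings (as XLM-R provides), the same query concentrates attention on translationally similar tokens in either language.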
http://ceur-ws.org/Vol-2657/paper3.pdf (KiML @ KDD 2020)
@inproceedings{krishnanAttentionRealignment,
  title={Attention Realignment and Pseudo-Labelling for Interpretable Cross-Lingual Classification of Crisis Tweets},
  author={Krishnan, Jitin and Purohit, Hemant and Rangwala, Huzefa},
  booktitle={Proceedings of the KDD Workshop on Knowledge-infused Mining and Learning},
  year={2020}
}
- Python 3.6, Keras, TensorFlow.
- Install fairseq for XLM-R. Apex is not needed.
Download the Appen dataset consisting of Multilingual Disaster Response Messages.
python get_xlmr_embeddings.py en train
python get_xlmr_embeddings.py en val
python get_xlmr_embeddings.py en test
python get_xlmr_embeddings.py ml train
python get_xlmr_embeddings.py ml val
python get_xlmr_embeddings.py ml test
This step produces 6 `.npy` files with embeddings and 6 `.txt` files with the corresponding tweets. Caching the embeddings speeds up training, since running XLM-R is slow.
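The cached splits can then be reloaded for training without re-running XLM-R. A minimal sketch of this save/load pattern (the file-naming scheme below is hypothetical; the actual scripts may differ):

```python
import numpy as np
import os
import tempfile

def save_split(path_prefix, embeddings, tweets):
    """Cache XLM-R embeddings (.npy) alongside the raw tweets (.txt)."""
    np.save(path_prefix + ".npy", embeddings)
    with open(path_prefix + ".txt", "w", encoding="utf-8") as f:
        f.write("\n".join(tweets))

def load_split(path_prefix):
    """Reload a cached split: an (n, d) embedding matrix and n tweet strings."""
    embeddings = np.load(path_prefix + ".npy")
    with open(path_prefix + ".txt", encoding="utf-8") as f:
        tweets = f.read().splitlines()
    return embeddings, tweets

# Round-trip a tiny synthetic split.
prefix = os.path.join(tempfile.mkdtemp(), "en_train")
save_split(prefix, np.zeros((2, 4)), ["need water", "bezwen dlo"])
X, texts = load_split(prefix)
```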
python baseline.py en ml
python modelA.py en ml
python modelB.py en ml
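Each script trains on the source language and evaluates on the target language (source-language performance is reported in parentheses below). The evaluation protocol can be sketched with a tiny NumPy logistic-regression stand-in for the real Keras models; this is purely illustrative, and the actual architectures live in the scripts above:

```python
import numpy as np

def train_logreg(X, y, lr=0.5, steps=200):
    """Fit a logistic-regression classifier by gradient descent (stand-in model)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return float((((X @ w) > 0).astype(int) == y).mean())

# Synthetic "source" and "target" splits sharing one embedding space,
# mimicking the en --> ml transfer setup.
rng = np.random.default_rng(0)
Xs = rng.normal(size=(200, 8)); ys = (Xs[:, 0] > 0).astype(int)
Xt = rng.normal(size=(100, 8)); yt = (Xt[:, 0] > 0).astype(int)
w = train_logreg(Xs, ys)
src_acc, tgt_acc = accuracy(w, Xs, ys), accuracy(w, Xt, yt)
```

Because XLM-R maps both languages into a shared space, a classifier trained only on source-language embeddings can transfer to the target, which is what the table below measures.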
Source --> Target (Source --> Source)

| S --> T | Baseline | Model A | Model B |
|---|---|---|---|
| en --> ml | 59.98 (80.57) | 62.53 (77.02) | 66.79 (82.39) |
| ml --> en | 60.93 (70.07) | 65.69 (63.50) | 70.95 (73.84) |
See the accompanying Jupyter notebook for the attention heat map visualization.
For help or issues, please submit a GitHub issue or contact Jitin Krishnan (jkrishn2@gmu.edu).