
Alignment of Word Embeddings

This directory provides code for learning alignments between word embeddings in different languages.

The code is in Python 3 and requires NumPy.

The script shows how to use this code to learn and evaluate a bilingual alignment of word embeddings.

The word embeddings used in [1] are available from the fastText project page, and the supervised bilingual lexicons from the MUSE project page.
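As a rough illustration of how such embeddings can be read, here is a minimal sketch of a loader for the fastText text format (a header line with the vocabulary size and dimension, then one word and its vector per line). The function name `load_vectors`, the `max_words` parameter, and the row normalisation are assumptions made for this example, not a description of the code in this directory.

```python
import io
import numpy as np

def load_vectors(lines, max_words=200000):
    """Parse embeddings in the fastText text (.vec) format.

    `lines` is any iterable of text lines: the first line holds
    "<vocab_size> <dim>", each following line a word and its floats.
    (Hypothetical helper written for this example.)
    """
    it = iter(lines)
    n, dim = map(int, next(it).split())
    words, vectors = [], []
    for line in it:
        if len(words) >= min(n, max_words):
            break
        parts = line.rstrip().split(" ")
        words.append(parts[0])
        vectors.append(np.array([float(v) for v in parts[1:]]))
    x = np.vstack(vectors)
    # L2-normalise each row so that dot products are cosine similarities.
    x /= np.linalg.norm(x, axis=1, keepdims=True) + 1e-8
    return words, x

# Tiny in-memory stand-in for a real .vec file.
demo = io.StringIO("2 3\nhello 1.0 0.0 0.0\nworld 0.0 2.0 0.0\n")
words, x = load_vectors(demo)
```

With a real file, the same function could be called on an open file handle, for example `load_vectors(open(path, encoding="utf-8"))`, where `path` is whatever embedding file you downloaded.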

Supervised alignment

The script aligns word embeddings from two languages using a bilingual lexicon as supervision. The details of this approach can be found in [1].
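To make the idea of lexicon-supervised alignment concrete, the sketch below solves the classic orthogonal Procrustes problem with NumPy and measures nearest-neighbour precision at 1. This is a simpler, well-known baseline, not the retrieval criterion described in [1]. The function names and the synthetic data are assumptions for illustration; row i of `x_src` and row i of `y_tgt` are assumed to be the vectors of a lexicon pair.

```python
import numpy as np

def procrustes(x_src, y_tgt):
    """Orthogonal R minimising ||x_src @ R.T - y_tgt||_F."""
    u, _, vt = np.linalg.svd(y_tgt.T @ x_src)
    return u @ vt

def precision_at_1(x_src, y_tgt, r):
    """Fraction of source rows whose nearest target row is their pair."""
    scores = (x_src @ r.T) @ y_tgt.T
    return float(np.mean(scores.argmax(axis=1) == np.arange(len(x_src))))

# Synthetic check: the "target language" is a rotated copy of the source.
rng = np.random.default_rng(0)
x = rng.standard_normal((50, 5))
x /= np.linalg.norm(x, axis=1, keepdims=True)
q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # ground-truth rotation
y = x @ q.T

r = procrustes(x, y)
p1 = precision_at_1(x, y, r)
```

On this noise-free toy problem the rotation is recovered exactly; real embeddings are only approximately related by a rotation, so the precision will be lower.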

Unsupervised alignment

The script aligns word embeddings from two languages without requiring any supervision. A separate script aligns multiple languages to a common space, also without supervision. The details of these approaches can be found in [2] and [3] respectively.

In addition to NumPy, the unsupervised methods require the Python Optimal Transport toolbox.
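The following sketch illustrates the general idea behind the Wasserstein Procrustes named in the title of [2]: alternate between a soft matching of the two point sets (an optimal transport plan) and an orthogonal Procrustes update. It is a simplified, full-batch version written for illustration. The identity initialisation, the fixed regularisation `reg`, the iteration counts, and all function names are assumptions, and it should not be read as the procedure of [2] or of the scripts here. POT provides a Sinkhorn solver (`ot.sinkhorn`); the one below is hand-written in NumPy only to keep the example self-contained.

```python
import numpy as np

def sinkhorn(cost, reg=0.05, n_iter=200):
    """Entropic OT plan between two uniform distributions (plain NumPy)."""
    n, m = cost.shape
    k = np.exp(-cost / reg)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    v = np.ones(m)
    for _ in range(n_iter):
        u = a / (k @ v)      # rescale rows towards marginal a
        v = b / (k.T @ u)    # rescale columns towards marginal b
    return u[:, None] * k * v[None, :]

def wasserstein_procrustes(x, y, n_outer=20, reg=0.05):
    """Alternate between a soft matching P and an orthogonal map R."""
    r = np.eye(x.shape[1])   # assumption: start from the identity
    for _ in range(n_outer):
        # Negative cosine similarity as cost (rows are unit-norm).
        cost = -(x @ r.T) @ y.T
        p = sinkhorn(cost, reg)
        # Procrustes step for the current soft matching.
        u, _, vt = np.linalg.svd(y.T @ p.T @ x)
        r = u @ vt
    return r, p

# Toy data: y is a shuffled, rotated copy of x (no lexicon is used).
rng = np.random.default_rng(0)
x = rng.standard_normal((30, 4))
x /= np.linalg.norm(x, axis=1, keepdims=True)
q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
y = (x @ q.T)[rng.permutation(30)]

r, p = wasserstein_procrustes(x, y)
```

The problem is non-convex, so a naive identity initialisation like this one can settle on a poor rotation; the result depends on the starting point and on `reg`.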


Wikipedia fastText embeddings aligned with our method can be found here.


If you use the supervised alignment method, please cite:

[1] A. Joulin, P. Bojanowski, T. Mikolov, H. Jegou, E. Grave, Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion

    title={Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion},
    author={Joulin, Armand and Bojanowski, Piotr and Mikolov, Tomas and J\'egou, Herv\'e and Grave, Edouard},
    booktitle={Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},

If you use the unsupervised bilingual alignment method, please cite:

[2] E. Grave, A. Joulin, Q. Berthet, Unsupervised Alignment of Embeddings with Wasserstein Procrustes

    title={Unsupervised Alignment of Embeddings with Wasserstein Procrustes},
    author={Grave, Edouard and Joulin, Armand and Berthet, Quentin},
    journal={arXiv preprint arXiv:1805.11222},

If you use the unsupervised multilingual alignment method, please cite:

[3] J. Alaux, E. Grave, M. Cuturi, A. Joulin, Unsupervised Hyperalignment for Multilingual Word Embeddings

    title={Unsupervised hyperalignment for multilingual word embeddings},
    author={Alaux, Jean and Grave, Edouard and Cuturi, Marco and Joulin, Armand},
    journal={arXiv preprint arXiv:1811.01124},