# CrossLingualELMo

Cross-Lingual Alignment of Contextual Word Embeddings

This repo will contain the code and models for the NAACL 2019 paper [Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing](https://arxiv.org/abs/1902.09492).

More pieces of the code will be released soon. Meanwhile, the ELMo models for several languages, along with their alignment matrices, are provided here.

## Aligned Multilingual Deep Contextual Word Embeddings

### Embeddings

The following models were trained on Wikipedia, and their second layer was aligned to English:

| Language | Model weights | Aligning matrix |
|----------|---------------|-----------------|
| en | weights.hdf5 | best_mapping.pth |
| es | weights.hdf5 | best_mapping.pth |
| fr | weights.hdf5 | best_mapping.pth |
| it | weights.hdf5 | best_mapping.pth |
| pt | weights.hdf5 | best_mapping.pth |
| sv | weights.hdf5 | best_mapping.pth |
| de | weights.hdf5 | best_mapping.pth |

Options file (for all models): `options.json`

To download all the ELMo models in the table, use `get_models.sh`.

To download all of the alignment matrices in the table, use `get_alignments.sh`.

### Generating anchors

Use the `gen_anchors.py` script to generate your own anchors. You will need a trained ELMo model, text files with one sentence per line, and a vocab file with one token per line listing the tokens that you wish to compute anchors for. Run `gen_anchors.py -h` for more details.
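For reference, an anchor for a token is its average contextual embedding over a corpus. The snippet below is only a minimal sketch of that computation, not the `gen_anchors.py` implementation; it assumes AllenNLP's `ElmoEmbedder`, the file names from the table above, and a hypothetical `corpus.txt`:

```python
# Minimal sketch: an anchor is the mean contextual embedding of a token over a corpus.
# ElmoEmbedder, the file names, and corpus.txt are assumptions, not the repo's script.
from collections import defaultdict

from allennlp.commands.elmo import ElmoEmbedder

embedder = ElmoEmbedder(options_file="options.json", weight_file="weights.hdf5")
sums, counts = defaultdict(lambda: 0.0), defaultdict(int)

with open("corpus.txt", encoding="utf-8") as corpus:    # one sentence per line
    for line in corpus:
        tokens = line.split()
        if not tokens:
            continue
        layer = embedder.embed_sentence(tokens)[1]       # one ELMo layer (index assumed)
        for token, vec in zip(tokens, layer):
            sums[token] = sums[token] + vec
            counts[token] += 1

anchors = {t: sums[t] / counts[t] for t in counts}       # average embedding per token
```

In practice you would keep only the tokens listed in your vocab file.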

## Usage

### Generating aligned contextual embeddings

Given the output of a specific layer from ELMo (the contextual embeddings), run:

```python
import numpy as np
import torch

aligning = torch.load(aligning_matrix_path)
aligned_embeddings = np.matmul(embeddings, aligning.transpose())
```

An example can be seen in `demo.py`.
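As a rough end-to-end illustration (a sketch under assumptions, not the repo's `demo.py`), the snippet below obtains per-layer embeddings with AllenNLP's `ElmoEmbedder`, picks one layer, and applies the alignment; the file names and the layer index are assumptions:

```python
import numpy as np
import torch
from allennlp.commands.elmo import ElmoEmbedder

# File names follow the table above; adjust the paths to the model you downloaded.
embedder = ElmoEmbedder(options_file="options.json", weight_file="weights.hdf5")
layers = embedder.embed_sentence(["Esto", "es", "una", "frase", "."])  # (3, num_tokens, 1024)
embeddings = layers[1]  # layer index assumed; the alignments target ELMo's second layer

aligning = torch.load("best_mapping.pth")  # alignment matrix (same usage as the snippet above)
aligned_embeddings = np.matmul(embeddings, aligning.transpose())
```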

### Using in a model

The models can be used with the AllenNLP framework: take any model that uses ELMo embeddings and replace the ELMo paths in its configuration with our provided models.

Each ELMo model was trained on the Wikipedia of its language. To align the models, add the following code to your model.

Load the alignment matrix in the `__init__()` function:

```python
# Load the provided .pth file and wrap it in a frozen linear layer.
aligning_matrix_path = ...  # path to the best_mapping.pth file for this language
self.aligning_matrix = torch.FloatTensor(torch.load(aligning_matrix_path))
self.aligning = torch.nn.Linear(self.aligning_matrix.size(0), self.aligning_matrix.size(1), bias=False)
self.aligning.weight = torch.nn.Parameter(self.aligning_matrix)
self.aligning.weight.requires_grad = False
```

Then apply the alignment to the embedded tokens in the `forward()` pass:

```python
embedded_text = self.aligning(embedded_text)
```

Note that our alignments were computed on the second layer of ELMo, and we had to do a few hacks to freeze the layer weights in the AllenNLP repo; we will release that code soon. However, an alignment can also be learned and applied for each layer separately, which preserves the original weighted sum of layers in the ELMo embedder (see the sketch below).
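For illustration, here is a minimal sketch of that per-layer variant. It assumes you already have one alignment matrix per ELMo layer (this repo only provides matrices for the second layer) and that the per-layer activations are available before the scalar mix; `PerLayerAligner` is a hypothetical helper, not part of this repo:

```python
import torch


class PerLayerAligner(torch.nn.Module):
    """Applies a separate frozen alignment to each ELMo layer, so the usual
    learned weighted sum over layers can still be taken afterwards."""

    def __init__(self, matrices):                    # one (d, d) tensor per ELMo layer
        super().__init__()
        self.aligners = torch.nn.ModuleList()
        for m in matrices:
            linear = torch.nn.Linear(m.size(0), m.size(1), bias=False)
            linear.weight = torch.nn.Parameter(m, requires_grad=False)
            self.aligners.append(linear)

    def forward(self, layer_activations):            # list of (batch, seq_len, d) tensors
        return [align(x) for align, x in zip(self.aligners, layer_activations)]
```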

## Citation

If you find this repo useful, please cite our paper.

```bibtex
@article{Schuster2019,
  title  = {Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing},
  author = {Schuster, Tal and Ram, Ori and Barzilay, Regina and Globerson, Amir},
  eprint = {arXiv:1902.09492v1},
  url    = {https://arxiv.org/pdf/1902.09492.pdf},
  year   = {2019}
}
```