Cross-Lingual Alignment of Contextual Word Embeddings
This repo will contain the code and models for the NAACL 2019 paper Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing.
More pieces of the code will be released soon. Meanwhile, the ELMo models for several languages, along with their alignment matrices, are provided here.
The following models were trained on Wikipedia, and the second layer of each was aligned to English:
Language | Model weights | Aligning matrix |
---|---|---|
en | weights.hdf5 | best_mapping.pth |
es | weights.hdf5 | best_mapping.pth |
fr | weights.hdf5 | best_mapping.pth |
it | weights.hdf5 | best_mapping.pth |
pt | weights.hdf5 | best_mapping.pth |
sv | weights.hdf5 | best_mapping.pth |
de | weights.hdf5 | best_mapping.pth |
Options file (for all models): options.json
To download all the ELMo models in the table, use get_models.sh. To download all of the alignment matrices in the table, use get_alignments.sh.
Use the gen_anchors.py script to generate your own anchors. You will need a trained ELMo model, text files with one sentence per line, and a vocab file with one token per line listing the tokens you wish to compute anchors for. Run gen_anchors.py -h for more details.
Given the output of a specific layer from ELMo (the contextual embeddings), apply the alignment with:

```python
import numpy as np
import torch

aligning = torch.load(aligning_matrix_path)  # a provided best_mapping.pth file
aligned_embeddings = np.matmul(embeddings, aligning.transpose())
```

An example can be seen in demo.py.
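As a self-contained sketch of the step above (a random matrix stands in for a downloaded best_mapping.pth, and the 1024-dimensional layer size is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

dim = 1024                                # assumed ELMo layer dimensionality
embeddings = rng.normal(size=(5, dim))    # 5 contextual token embeddings
aligning = rng.normal(size=(dim, dim))    # stand-in for a loaded alignment matrix

# The alignment is a single linear map applied to every embedding.
aligned_embeddings = np.matmul(embeddings, aligning.transpose())

# The mapping preserves the embedding dimensionality.
assert aligned_embeddings.shape == embeddings.shape
```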
The models can be used with the AllenNLP framework by simply using any model with ELMo embeddings and replacing the paths in the configuration with our provided models.
Each ELMo model was trained on the Wikipedia of its language. To align the models, you will need to add the following code to your model:
Load the alignment matrix in the `__init__()` function:

```python
aligning_matrix_path = ...  # path to a provided best_mapping.pth file
self.aligning_matrix = torch.FloatTensor(torch.load(aligning_matrix_path))
# Build a bias-free linear layer with the matrix's dimensions, then freeze it.
self.aligning = torch.nn.Linear(self.aligning_matrix.size(0), self.aligning_matrix.size(1), bias=False)
self.aligning.weight = torch.nn.Parameter(self.aligning_matrix)
self.aligning.weight.requires_grad = False
```
Then, simply apply the alignment to the embedded tokens in the `forward()` pass:

```python
embedded_text = self.aligning(embedded_text)
```
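Put together, the two snippets above amount to a small frozen linear layer. A minimal runnable sketch (the `AlignedModel` class name and the random 8x8 matrix standing in for a loaded alignment are illustrative assumptions):

```python
import torch

class AlignedModel(torch.nn.Module):
    """Hypothetical wrapper that applies a frozen alignment to embeddings."""

    def __init__(self, aligning_matrix: torch.Tensor):
        super().__init__()
        # Bias-free linear layer whose weight is the alignment matrix, frozen.
        self.aligning = torch.nn.Linear(
            aligning_matrix.size(0), aligning_matrix.size(1), bias=False
        )
        self.aligning.weight = torch.nn.Parameter(aligning_matrix)
        self.aligning.weight.requires_grad = False

    def forward(self, embedded_text: torch.Tensor) -> torch.Tensor:
        return self.aligning(embedded_text)

# A random square matrix stands in for a downloaded best_mapping.pth.
model = AlignedModel(torch.randn(8, 8))
out = model(torch.randn(2, 3, 8))  # (batch, tokens, dim)
assert out.shape == (2, 3, 8)
assert not model.aligning.weight.requires_grad
```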
Note that our alignments were done on the second layer of ELMo and we had to do a few hacks to freeze the layer weights in the AllenNLP repo. We will release that code soon. However, note that an alignment can be learned and applied for each layer separately to preserve the original weighted sum of layers in the ELMo embedder.
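The per-layer variant mentioned above could look like the following sketch, which aligns each layer with its own matrix before the usual learned weighted sum (the layer count, toy dimension, random matrices, and softmax-weighted scalar mix are all assumptions, modeled on the standard ELMo scalar mix):

```python
import torch

num_layers, dim = 3, 8  # assumed: 3 ELMo layers, toy dimension
layers = [torch.randn(2, 5, dim) for _ in range(num_layers)]  # (batch, tokens, dim) per layer

# One alignment matrix per layer (random stand-ins for learned alignments).
alignments = [torch.randn(dim, dim) for _ in range(num_layers)]

# Standard ELMo scalar mix: softmax-normalized layer weights and a scale.
scalar_weights = torch.softmax(torch.randn(num_layers), dim=0)
gamma = torch.tensor(1.0)

# Align each layer separately, then take the weighted sum as usual.
aligned_layers = [h @ a.t() for h, a in zip(layers, alignments)]
mixed = gamma * sum(w * h for w, h in zip(scalar_weights, aligned_layers))

assert mixed.shape == (2, 5, dim)
```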
If you find this repo useful, please cite our paper.
```bibtex
@article{Schuster2019,
  title  = {Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing},
  author = {Schuster, Tal and Ram, Ori and Barzilay, Regina and Globerson, Amir},
  eprint = {arXiv:1902.09492v1},
  url    = {https://arxiv.org/pdf/1902.09492.pdf},
  year   = {2019}
}
```