Cr5

This repository contains the code for the following paper, which proposes a novel approach to learning crosslingual word embeddings optimized for document-level aggregation.

"Crosslingual Document Embedding as Reduced-Rank Ridge Regression". Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. ACM, 2019.


Pretrained crosslingual embeddings

We also publish a dataset of pretrained word embeddings in 28 languages, where words are embedded in a shared latent space. The dataset is available here.
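As a minimal sketch of what "a shared latent space" makes possible, the example below aggregates word vectors into document vectors and compares documents across two languages. It does not use this repository's code or the published dataset: the toy 3-dimensional vectors, the `EN` and `DE` dictionaries, and the helper names `embed_document` and `cosine` are all hypothetical, and plain averaging is an assumed aggregation that may differ from the weighting and normalization used in the paper.

```python
import math

# Toy 3-dimensional vectors invented for illustration; NOT real Cr5 values.
# With the published embeddings, these dictionaries would be filled from the
# downloaded files instead (loading is not shown here).
EN = {"cat": [0.90, 0.10, 0.00], "dog": [0.80, 0.20, 0.10], "car": [0.00, 0.10, 0.90]}
DE = {"katze": [0.88, 0.12, 0.02], "hund": [0.79, 0.22, 0.08], "auto": [0.02, 0.09, 0.91]}


def embed_document(tokens, word_vectors):
    """Average the vectors of all known tokens; unknown tokens are skipped."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        raise ValueError("no token of the document is in the vocabulary")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


doc_en = embed_document(["cat", "dog"], EN)
doc_de_pets = embed_document(["katze", "hund"], DE)
doc_de_cars = embed_document(["auto"], DE)

# Because both languages live in the same space, the vectors are directly comparable.
print(cosine(doc_en, doc_de_pets) > cosine(doc_en, doc_de_cars))
```

Because the vectors of both languages share one space, no translation step is needed before comparing an English document with a German one; in this toy setup the two pet-related documents end up closer to each other than to the car-related one.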


If you find the provided resources useful, please cite the above paper. Here is a BibTeX entry you may use:

@inproceedings{josifoski-wsdm2019-cr5,
  title={Crosslingual Document Embedding as Reduced-Rank Ridge Regression},
  author={Josifoski, Martin and Paskov, Ivan S. and Paskov, Hristo S. and Jaggi, Martin and West, Robert},
  booktitle={Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining},
  organization={ACM},
  year={2019}
}

Any questions or suggestions?

Contact martin.josifoski@epfl.ch.