Code and data for the WSDM '19 paper "Crosslingual Document Embedding as Reduced-Rank Ridge Regression (Cr5)"

epfl-dlab/Cr5

Cr5

This repository contains the code for the following paper, which proposes a novel approach to learning crosslingual word embeddings optimized for document-level aggregation.

Martin Josifoski, Ivan S. Paskov, Hristo S. Paskov, Martin Jaggi, and Robert West. "Crosslingual Document Embedding as Reduced-Rank Ridge Regression". Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. ACM, 2019.


Pretrained crosslingual embeddings

We also publish a dataset of pretrained word embeddings in 28 languages, where words are embedded in a shared latent space. The dataset is available here.
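
As a rough illustration of what document-level aggregation in a shared latent space could look like, the sketch below sums word vectors per document and compares documents across languages by cosine similarity. Everything in it is an assumption made for illustration: the toy vectors, the tiny vocabularies, and the helper names `embed_document` and `cosine` are hypothetical and not taken from this repository, and no particular file format for the released embeddings is assumed.

```python
import numpy as np

# Toy stand-ins for pretrained crosslingual word vectors (hypothetical values,
# not the released embeddings). In a shared latent space, translations are
# expected to lie close to each other.
EMBEDDINGS = {
    "en": {
        "cat": np.array([0.9, 0.1, 0.0]),
        "sleeps": np.array([0.1, 0.8, 0.1]),
        "stock": np.array([0.0, 0.1, 0.9]),
    },
    "de": {
        "katze": np.array([0.8, 0.2, 0.0]),
        "schläft": np.array([0.2, 0.9, 0.0]),
        "aktie": np.array([0.1, 0.0, 0.9]),
    },
}


def embed_document(tokens, lang, embeddings=EMBEDDINGS):
    """Sum the vectors of known tokens (bag-of-words) and L2-normalize."""
    vocab = embeddings[lang]
    dim = len(next(iter(vocab.values())))
    vec = np.zeros(dim)
    for tok in tokens:
        word_vec = vocab.get(tok.lower())
        if word_vec is not None:  # out-of-vocabulary tokens are skipped
            vec += word_vec
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def cosine(u, v):
    """Cosine similarity, defined as 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


doc_en = embed_document(["The", "cat", "sleeps"], "en")
doc_de = embed_document(["Die", "Katze", "schläft"], "de")
doc_fin = embed_document(["Aktie"], "de")

# The German translation should score higher than the unrelated document.
print(cosine(doc_en, doc_de), cosine(doc_en, doc_fin))
```

With real embeddings, the dictionaries would be replaced by vectors loaded from the published dataset; the aggregation and comparison steps would stay the same under these assumptions.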


If you find the provided resources useful, please cite the above paper. Here is a BibTeX entry you may use:

@inproceedings{josifoski-wsdm2019-cr5,
  title={Crosslingual Document Embedding as Reduced-Rank Ridge Regression},
  author={Josifoski, Martin and Paskov, Ivan S. and Paskov, Hristo S. and Jaggi, Martin and West, Robert},
  booktitle={Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining},
  organization={ACM},
  year={2019}
}

Any questions or suggestions?

Contact martin.josifoski@epfl.ch.
