Skip to content

epfl-dlab/Cr5

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
src
 
 
 
 
 
 

Cr5

This repository contains the code for the following paper, which proposes a novel approach of learning crosslingual word embeddings optimized for document level aggregation.

"Crosslingual Document Embedding as Reduced-Rank Ridge Regression". Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. ACM, 2019.


Pretrained crosslingual embeddings

We also publish a dataset of pretrained word embeddings in 28 languages, where words are embedded in a shared latent space. The dataset is available here.


If you found the provided resources useful, please cite the above paper. Here's a BibTeX entry you may use:

@inproceedings{josifoski-wsdm2019-cr5,
  title={Crosslingual Document Embedding as Reduced-Rank Ridge Regression},
  author={Josifoski, Martin and Paskov, Ivan S. and Paskov, Hristo S. and Jaggi, Martin and West, Robert},
  booktitle={Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining},
  organization={ACM},
  year={2019}
}

Any questions or suggestions?

Contact martin.josifoski@epfl.ch.

About

Code and data for the WSDM '19 paper "Crosslingual Document Embedding as Reduced-Rank Ridge Regression (Cr5)"

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published