
Neural Cross-Lingual Named Entity Recognition with Minimal Resources

This is the code we used in our paper

Neural Cross-Lingual Named Entity Recognition with Minimal Resources

Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A. Smith, Jaime Carbonell

EMNLP 2018


Requirements

Python 2.7 or 3.6

PyTorch >= 0.3.0

Theano 1.0

Lasagne 0.2

The results reported in the paper were tuned and obtained with the NER model written in Theano/Lasagne. Everything else is in PyTorch. We also provide a PyTorch implementation of the NER model, which may produce slightly worse results because of implementation differences between the libraries, such as different weight initialization schemes.

Train Bilingual Word Embeddings

To train bilingual word embeddings, we use MUSE.

After installing MUSE, to get a mapping (e.g., en-es, identical character strings), first set VALIDATION_METRIC = 'mean_cosine-csls_knn_10-S2T-10000' in, and then run, for instance:

python --src_lang en --tgt_lang es --src_emb data/wiki.en.vec --tgt_emb data/ --n_refinement 3 --dico_train identical_char --max_vocab 100000

which will produce a mapping at a location such as /your_path/MUSE/dumped/debug/qbun3algl8/best_mapping.pth
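To make the role of the mapping concrete: it is a linear map applied to source embeddings so that they can be compared with target embeddings in a shared space. The sketch below is illustrative only. It uses toy 2-dimensional vectors and plain cosine similarity (not CSLS), and the helper names `apply_mapping`, `cosine`, and `nearest_target` are hypothetical; none of them come from this repository or from MUSE.

```python
import math


def apply_mapping(W, vec):
    """Multiply the matrix W (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in W]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_target(W, src_vec, tgt_emb):
    """Map a source vector with W and return the most similar target word."""
    mapped = apply_mapping(W, src_vec)
    return max(tgt_emb, key=lambda word: cosine(mapped, tgt_emb[word]))


# Toy data (hypothetical): a 90-degree rotation stands in for a learned mapping.
W = [[0.0, -1.0], [1.0, 0.0]]
src_emb = {"dog": [1.0, 0.0], "cat": [0.0, 1.0]}
tgt_emb = {"perro": [0.0, 1.0], "gato": [-1.0, 0.0]}

translations = {w: nearest_target(W, v, tgt_emb) for w, v in src_emb.items()}
```

With real embeddings the same idea applies at a much larger scale, where a vectorized library and a hubness-aware criterion such as CSLS would be the more practical choice.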

To create a word-to-word translation file, run:


Note: if your embedding file begins with a header line that gives the vocabulary size and the embedding dimension, such as 2519370 300, remove that line before you run this script (but keep it when running MUSE).
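Removing the header line can be scripted. The following is a minimal sketch, assuming the header consists of exactly two integers separated by whitespace; the function names and file paths are hypothetical and not part of this repository.

```python
import io


def has_header(first_line):
    """True if the line looks like '<vocab_size> <dim>' (exactly two integers)."""
    parts = first_line.split()
    return len(parts) == 2 and all(p.isdigit() for p in parts)


def strip_header(lines):
    """Return the lines without a leading size/dimension header, if present."""
    lines = list(lines)
    if lines and has_header(lines[0]):
        return lines[1:]
    return lines


def strip_header_file(src_path, dst_path):
    """Copy src_path to dst_path, dropping the header line if there is one."""
    with io.open(src_path, encoding="utf-8") as src, \
            io.open(dst_path, "w", encoding="utf-8") as dst:
        first = src.readline()
        if not has_header(first):
            dst.write(first)
        for line in src:
            dst.write(line)
```

For example, `strip_header_file("data/wiki.en.vec", "data/wiki.en.noheader.vec")` would write a copy without the header (the output path here is only an illustration). Keep the original file for MUSE.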

Data Format

We use the IOB2 tagging scheme, with NER data in the following format:

Peter B-PER

Blackburn I-PER
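A small reader for this format can help check data before training. The sketch below assumes one whitespace-separated token and tag per line, with a blank line between sentences (a CoNLL-style convention that is an assumption here); `read_iob2` and `extract_entities` are hypothetical helpers, not functions from this repository.

```python
def read_iob2(lines):
    """Group 'token tag' lines into sentences; a blank line ends a sentence."""
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        # Split on the last run of whitespace so the tag is always the final field.
        token, tag = line.rsplit(None, 1)
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence):
    """Return (entity_type, start, end) spans from IOB2 tags, end exclusive."""
    spans, start, etype = [], None, None
    for i, (_, tag) in enumerate(sentence):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            # The open entity ends just before this token.
            spans.append((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    if start is not None:
        spans.append((etype, start, len(sentence)))
    return spans


sentences = read_iob2(["Peter B-PER", "Blackburn I-PER", "", "BRUSSELS B-LOC"])
entities = [extract_entities(s) for s in sentences]
```

An `I-` tag that does not continue an open entity of the same type is not valid IOB2, and this sketch simply ignores it.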

Transfer Training Data
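As a rough illustration of the idea behind this step, the sketch below assumes that transfer means replacing each source-language token with its word-to-word translation while copying the tag unchanged. The lookup strategy (exact match, then lowercased match, otherwise keep the token) and the names `translate_token` and `transfer_sentence` are assumptions made for this example, not the repository's actual script.

```python
def translate_token(token, dictionary):
    """Look up a token, falling back to its lowercased form, else keep it."""
    if token in dictionary:
        return dictionary[token]
    return dictionary.get(token.lower(), token)


def transfer_sentence(sentence, dictionary):
    """Translate every token of an IOB2 sentence; tags are copied unchanged."""
    return [(translate_token(token, dictionary), tag) for token, tag in sentence]


# Toy en-es dictionary and sentence (hypothetical data).
dictionary = {"said": "dijo", "yesterday": "ayer"}
source = [("Peter", "B-PER"), ("Blackburn", "I-PER"), ("said", "O"), ("Yesterday", "O")]
transferred = transfer_sentence(source, dictionary)
```

Because the tags are copied position by position, the transferred sentence has the same length and the same entity spans as the source sentence.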

Simply run:


Train Cross-Lingual NER Model

For the Lasagne/Theano implementation, to reproduce our results, run:


For the PyTorch implementation, run:


