Seeing Language Through Character Level Taggers

Character Eyes

Code for our project analyzing character level taggers. This repository is a work in progress but contains some of our code and analysis. More will be added soon!

[Figure: example activations]


  • A fully character-level tagger model, implemented in DyNet. It supports asymmetric bi-directional RNNs, which we found affect performance differently depending on the linguistic properties of the language.
  • Pretrained models for 5 of our 24 languages
  • Ready-to-train datasets (from Universal Dependencies 2.3) for all 24 languages
  • This notebook reproduces some of the figures and charts in our paper.
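
To make "asymmetric" concrete, here is a minimal pure-Python sketch of a bidirectional character encoder whose forward and backward RNNs have different hidden sizes. It is not the repository's DyNet code: the simple tanh cell stands in for whatever recurrent unit the actual model uses, the weights are random and untrained, and every name and size (`make_rnn`, `run_rnn`, `encode_word`, 12 vs. 4 hidden units) is a hypothetical chosen for illustration.

```python
import math
import random


def make_rnn(input_dim, hidden_dim, rng):
    """Randomly initialise a simple tanh RNN cell (illustrative, untrained)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "W_x": mat(hidden_dim, input_dim),   # input-to-hidden weights
        "W_h": mat(hidden_dim, hidden_dim),  # hidden-to-hidden weights
        "b": [0.0] * hidden_dim,
        "hidden_dim": hidden_dim,
    }


def run_rnn(rnn, inputs):
    """Return the final hidden state after reading `inputs` in order."""
    h = [0.0] * rnn["hidden_dim"]
    for x in inputs:
        h = [
            math.tanh(
                sum(w * xi for w, xi in zip(rnn["W_x"][j], x))
                + sum(w * hi for w, hi in zip(rnn["W_h"][j], h))
                + rnn["b"][j]
            )
            for j in range(rnn["hidden_dim"])
        ]
    return h


def encode_word(word, char_emb, fwd, bwd):
    """Concatenate the forward state with the backward state of a word."""
    chars = [char_emb[c] for c in word]
    # The backward RNN reads the same characters in reverse order.
    return run_rnn(fwd, chars) + run_rnn(bwd, chars[::-1])


rng = random.Random(0)
CHAR_DIM = 8  # assumed character embedding size
char_emb = {c: [rng.uniform(-1, 1) for _ in range(CHAR_DIM)]
            for c in "abcdefghijklmnopqrstuvwxyz"}

# Asymmetric configuration: the two directions get different hidden sizes.
fwd = make_rnn(CHAR_DIM, 12, rng)
bwd = make_rnn(CHAR_DIM, 4, rng)

vec = encode_word("taggers", char_emb, fwd, bwd)
print(len(vec))  # 12 forward units + 4 backward units
```

Shifting units between `fwd` and `bwd` while holding the total fixed is one way to probe whether a language benefits more from prefix-side or suffix-side information.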

Coming Soon

  • Interactive Notebooks - play with character level representations on the fly!
  • Better dependency management (requirements.txt)
  • Storage size permitting, more pretrained models including asymmetric configurations

Much of the code is modified from Mimick, a character level system that can replace OOVs or UNKs with learned representations approximating a closed vocabulary set of word embeddings.

Citation format

When using our work, please use the following .bib entry:

  title={Character Eyes: Seeing Language through Character-Level Taggers},
  author={Pinter, Yuval and Marone, Marc and Eisenstein, Jacob},
  journal={arXiv preprint arXiv:1903.05041},