Slides from the fall 2018 version of CSCI 544
Bundle service and command-line interface for supporting worksheets in CodaLab.
Tokenizer developed by Ulf Hermjakob @ USC ISI
Universal Romanizer, i.e., converts any script to Roman (Latin) script
repository for the code and data behind the NL Seminar website
Slides from the fall 2017 iteration of CSCI 544
Java Utils for NLP research
Scorer for the TAC KBP EAL evaluations
TAC KBP Event Argument Extraction and Linking Shared Task
Smatch tool: evaluation of AMR semantic structures
Scripts for automating NMT for LORELEI
A tool for constructing and executing experiment pipelines on a cluster
Tiburon tree transducer toolkit
C++/CUDA toolkit for training sequence and sequence-to-sequence models across multiple GPUs
finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests
in pursuit of a universal tokenizer
Dataset published alongside the paper "Extracting Structured Scholarly Information from the Machine Translation Literature" in LREC 2016.
refactor of tokserver
tokenizer viz website server
English, French, and German tokenizer used by the Agile team in GALE/BOLT
LSTM Parser for Syntactic Parsing, Syntax Based MT, and AMR parsing
SHERG rule extraction and parsing tools
This is a repository for all the finite-state machines that are compatible with Carmel.
tools for deeplang project