UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files

UDPipe

UDPipe is a trainable pipeline for tokenization, tagging, lemmatization and dependency parsing of CoNLL-U files. UDPipe is language-agnostic and can be trained given annotated data in CoNLL-U format. Trained models are provided for nearly all UD treebanks. UDPipe is available as a binary for Linux/Windows/OS X, as a library for C++, Python, Perl, Java and C#, and as a web service. A third-party R package is also available on CRAN.
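
For readers unfamiliar with CoNLL-U, the following is a minimal sketch of how the format can be read using only the Python standard library. It does not use UDPipe itself. The helper `parse_conllu` and the `SAMPLE` sentence are hypothetical, written by hand for illustration, and are not UDPipe output. The sketch assumes plain ten-column token lines and ignores comment lines. In terms of the pipeline stages, tokenization fills the `form` column, tagging fills `upos`, `xpos` and `feats`, lemmatization fills `lemma`, and parsing fills `head` and `deprel`.

```python
# The ten tab-separated CoNLL-U columns, in order.
CONLLU_FIELDS = ("id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc")


def parse_conllu(text):
    """Parse CoNLL-U text into a list of sentences.

    Each sentence is a list of dicts keyed by CONLLU_FIELDS.
    Hypothetical helper for illustration; not part of UDPipe.
    """
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Comment lines (e.g. "# text = ...") are skipped here.
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError("expected 10 columns, got %d" % len(columns))
        current.append(dict(zip(CONLLU_FIELDS, columns)))
    if current:
        # Handle input that lacks a trailing blank line.
        sentences.append(current)
    return sentences


# Hand-written sample sentence (an assumption, not real UDPipe output).
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\tVBP\tMood=Ind|Tense=Pres|VerbForm=Fin\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_\n"
    "\n"
)

if __name__ == "__main__":
    for sentence in parse_conllu(SAMPLE):
        for token in sentence:
            print(token["id"], token["form"], token["lemma"],
                  token["upos"], token["head"], token["deprel"])
```

A `head` value of `0` marks the root of the dependency tree, and an underscore marks an unspecified field.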

UDPipe is free software distributed under the Mozilla Public License 2.0. The linguistic models are free for non-commercial use and distributed under the CC BY-NC-SA license, although for some models the original data used to create the model may impose additional licensing conditions. UDPipe is versioned using Semantic Versioning.

Copyright 2017 by Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Czech Republic.

The UDPipe website http://ufal.mff.cuni.cz/udpipe contains download links for both the released packages and the trained models, hosts the documentation, and offers an online demo.

The UDPipe development repository http://github.com/ufal/udpipe is hosted on GitHub.

Third-party contribution: instructions for building a UDPipe REST server as a Docker image are available at http://github.com/samisalkosuo/udpipe-rest-server-docker. The same repository also has instructions for training UDPipe language models using a Docker image.