TexNLP: Texas Natural Language Processing tools

This is the site for the TexNLP code used in the following papers:

  • Jason Baldridge. 2008. Weakly supervised supertagging with grammar-informed initialization. In Proceedings of COLING-2008. Manchester, UK.

  • Jason Baldridge and Alexis Palmer. 2009. How well does active learning actually work? Time-based evaluation of cost-reduction strategies for language documentation. In Proceedings of EMNLP-09. Singapore.

  • Alexis Palmer, Taesun Moon, Jason Baldridge, Katrin Erk, Eric Campbell, and Telma Can. 2010. Computational strategies for reducing annotation effort in language documentation: A case study in creating interlinear texts for Uspanteko. Linguistic Issues in Language Technology. 3(4):1-42.

The code supports supervised and semi-supervised training of Hidden Markov Models (HMMs) for tagging, as well as standard supervised Maximum Entropy Markov Models (using the TADM toolkit). There is additional support for working with Combinatory Categorial Grammar (CCG) categories, especially for supertagging with CCGbank.
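
For readers unfamiliar with HMM tagging, the supervised case can be sketched roughly as follows. This is not TexNLP's code or API (TexNLP itself is written mainly in Java); the function names `train_hmm` and `viterbi`, the add-alpha smoothing, and the toy training data are all assumptions made for illustration. The sketch covers only fully supervised estimation and decoding, not the semi-supervised training or the CCG-specific features.

```python
import math
from collections import Counter, defaultdict

START, STOP = "<s>", "</s>"  # hypothetical boundary symbols


def train_hmm(tagged_sentences, alpha=1.0):
    """Estimate add-alpha smoothed HMM parameters from (word, tag) sequences."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag] = count
    emit = defaultdict(Counter)   # emit[tag][word] = count
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
        trans[prev][STOP] += 1
    tags = sorted(emit)

    def log_trans(prev, tag):
        # Smooth over all tags plus the STOP symbol.
        total = sum(trans[prev].values())
        return math.log((trans[prev][tag] + alpha) / (total + alpha * (len(tags) + 1)))

    def log_emit(tag, word):
        # Smooth over the vocabulary plus one slot for unknown words.
        total = sum(emit[tag].values())
        return math.log((emit[tag][word] + alpha) / (total + alpha * (len(vocab) + 1)))

    return tags, log_trans, log_emit


def viterbi(words, tags, log_trans, log_emit):
    """Return the most probable tag sequence for `words` under the model."""
    if not words:
        return []
    # Each column maps tag -> (best log score ending in tag, previous tag).
    best = [{t: (log_trans(START, t) + log_emit(t, words[0]), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            score, prev = max((best[-1][p][0] + log_trans(p, t), p) for p in tags)
            column[t] = (score + log_emit(t, word), prev)
        best.append(column)
    # Pick the best final tag, including the transition to STOP.
    _, last = max((best[-1][t][0] + log_trans(t, STOP), t) for t in tags)
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])  # follow backpointers
    return list(reversed(path))


# Toy training data, invented for this example.
training = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]
tags, log_trans, log_emit = train_hmm(training)
print(viterbi(["the", "cat", "barks"], tags, log_trans, log_emit))
```

On this toy data the decoder tags the unseen sentence "the cat barks" as DT NN VBZ. A semi-supervised variant would typically replace the raw counts with expected counts from the forward-backward algorithm, which is omitted here.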

Please reference Baldridge (2008) if you use this software. Please note that it is not user-friendly and is poorly documented – please email Jason Baldridge (jbaldrid@mail.utexas.edu) if you have questions about getting things working.

Download: TexNLP v0.2.0

License: LGPL

Contributors: Jason Baldridge, Taesun Moon, Elias Ponvert

The development of this software and the research behind it were carried out as part of the EARL project, supported under NSF grant No. 06651988, "Reducing Annotation Effort in the Documentation of Languages using Machine Learning and Active Learning."
