Natural language processing framework for Ruby.


Treat is a framework for natural language processing and computational linguistics in Ruby. It provides a common API for a number of gems and external libraries for document retrieval, parsing, annotation, and information extraction.

Current features

  • Text extractors for PDF, HTML, XML, Word, AbiWord, OpenOffice and image formats (Ocropus).
  • Text retrieval with indexation and full-text search (Ferret).
  • Text chunkers, sentence segmenters, tokenizers, and parsers for several languages (Stanford & Enju).
  • Word inflectors, including stemmers, conjugators, declensors, and number inflection.
  • Lexical resources (WordNet interface, several POS taggers for English, Stanford taggers for several languages).
  • Language, date/time, topic words (LDA) and keyword (TF*IDF) extraction.
  • Serialization of annotated entities to YAML or XML formats, or to MongoDB.
  • Visualization in ASCII tree, directed graph (DOT) and tag-bracketed (standoff) formats.
  • Linguistic resources, including language detection and tag alignments for several treebanks.
  • Decision tree and multilayer perceptron classification (liblinear coming soon!).
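
To make the keyword (TF*IDF) extraction item above more concrete, here is a minimal plain-Ruby sketch of the underlying scoring idea. It is an illustration of the technique only and does not use or reproduce Treat's API: the `tokenize` and `tf_idf_keywords` helpers, the naive regex tokenizer, and the sample documents are all assumptions made for this example. It relies on `Enumerable#tally`, so it assumes Ruby 2.7 or later.

```ruby
# Illustrative only: a plain-Ruby TF*IDF scorer, not Treat's API.
# All method names and sample documents here are hypothetical.

# Split text into lowercase word tokens (a deliberately naive tokenizer).
def tokenize(text)
  text.downcase.scan(/[a-z']+/)
end

# Return the top `count` keywords of documents[index], ranked by TF*IDF.
def tf_idf_keywords(documents, index, count = 3)
  tokenized = documents.map { |doc| tokenize(doc) }
  target = tokenized[index]
  return [] if target.empty?

  # Document frequency: in how many documents does each word occur?
  doc_freq = Hash.new(0)
  tokenized.each { |tokens| tokens.uniq.each { |word| doc_freq[word] += 1 } }

  scores = target.tally.map do |word, occurrences|
    tf = occurrences.to_f / target.size                   # term frequency
    idf = Math.log(documents.size.to_f / doc_freq[word])  # inverse document frequency
    [word, tf * idf]
  end

  # Highest score first; ties are broken alphabetically for stable output.
  scores.sort_by { |word, score| [-score, word] }.first(count).map(&:first)
end

DOCS = [
  "the cat sat on the mat",
  "the dog chased the cat",
  "the parser parsed the sentence"
].freeze

puts tf_idf_keywords(DOCS, 0).inspect
# prints ["mat", "on", "sat"]
```

Words that occur in every document (here, "the") get an inverse document frequency of zero and drop to the bottom of the ranking, which is why TF*IDF tends to surface terms that are distinctive for a given document.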


Resources


License

This software is released under the GPL and includes software released under the GPL, Ruby, Apache 2.0, and MIT licenses.
