CL-NLP -- a Lisp NLP toolkit

Brief description

Eventually, CL-NLP will provide a comprehensive and extensible set of tools to solve natural language processing problems in Common Lisp.

The goals of the project include the following:

  • support for constructing arbitrary NLP pipelines on top of it
  • support for easy and fast experimentation and development of new models and approaches
  • a good framework for teaching NLP concepts

It comprises a number of utility (horizontal) and end-user (vertical) modules that implement the basic functions and provide a way to add your own extensions and models.

The utility layer includes:

  • tools for transforming raw natural language text, as well as various corpora, into a form suitable for further processing
  • basic support for language modelling
  • support for a number of linguistic concepts
  • support for working with machine learning models and a number of training algorithms

The end-user layer will provide:

  • POS taggers
  • constituency parsers
  • dependency parsers
  • other tools (to be added step by step; suggestions are welcome)

How to start working with CL-NLP

The project has already reached a stage of usefulness for me, the primary author: for instance, it supports my current language modelling experiments by providing easy access to treebanks and other utilities.

Yet, it is far from being production-ready. So, if you want to use it for production tasks, expect to bleed on the bleeding edge.

Otherwise, if you want to contribute to developing the toolkit, you're very welcome. Here are a few write-ups to give you a sense of the project and to help you get started:

You will probably also need to track the latest version of RUTILS from git.

For CL-NLP to reach v.0.1, which may be considered suitable for limited use by non-contributors, the following things need to be finished (work in progress):

  • implement a comprehensive test-suite and fix all bugs encountered in the process
  • describe available models and their quality metrics

Technical notes

Dependencies

For development:

License

The license of CL-NLP is Apache 2.0.

Specific models may have different licenses due to the limitations of the datasets they are built with. Please see the <model>.license file accompanying each model for details.

(c) 2013-2014, Vsevolod Dyomkin vseloved@gmail.com
