Natural Language Toolkit (NLTK)


Copyright (C) 2001-2011 NLTK Project

For license information, see LICENSE.txt

NLTK -- the Natural Language Toolkit -- is a suite of open source Python modules, data sets and tutorials supporting research and development in Natural Language Processing.


A substantial amount of documentation about how to use NLTK, including a textbook and API documentation, is available from the NLTK website:

  • The book covers a wide range of introductory topics in NLP, and shows how to do all the processing tasks using the toolkit.

  • The toolkit's reference documentation describes every module, interface, class, method, function, and variable in the toolkit. This documentation should be useful to both users and developers.

Mailing Lists

There are several mailing lists associated with NLTK:


If you would like to contribute to NLTK, please see


Have you found the toolkit helpful? Please support NLTK development by donating to the project via PayPal, using the link on the NLTK homepage.


  • NLTK source code is distributed under the Apache 2.0 License.
  • NLTK documentation is distributed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States license.
  • NLTK corpora are provided under the terms given in the README file for each corpus; all are redistributable, and available for non-commercial use.
  • NLTK may be freely redistributed, subject to the provisions of these licenses.


If you publish work that uses NLTK, please cite the NLTK book, as follows:

Bird, Steven, Edward Loper and Ewan Klein (2009).
Natural Language Processing with Python.  O'Reilly Media Inc.