NLTK for App Engine

Natural Language Toolkit (NLTK) for App Engine

I have modified NLTK so that it runs on Google Cloud Platform (App Engine). So far, tokenizing and part-of-speech (POS) tagging are working.

Quick Summary of Changes:

  • Changed path references to point at the relevant App Engine paths.
  • Removed support for the hunpos and Stanford taggers, because these modules need to spawn subprocesses.
  • Removed the downloader module; its GUI is not relevant on App Engine.

Running on App Engine:

Feel free to use the sample app located under the appengine directory as a basis for your project. It includes the Treebank part-of-speech tagger, but not the NLTK for App Engine or PyYAML libraries, which you need to add yourself. The steps to get NLTK for App Engine running are:

  1. Add the following entry to the base of your app.yaml:

         libraries:
         - name: numpy
           version: "1.6.1"

  2. Download PyYAML and copy its lib directory to your project root.

  3. Copy NLTK for App Engine to your project root, then import nltk and play on.
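Depending on how the PyYAML files end up in your project, the copied lib directory may need to be on Python's import path before `import yaml` resolves. The following is a minimal sketch under that assumption; the helper name `add_vendor_dir` and the `lib` directory name are illustrative and not part of NLTK for App Engine. If the `yaml` package itself sits directly in the project root, no path change should be needed.

```python
import os
import sys


def add_vendor_dir(project_root, dirname="lib"):
    """Put <project_root>/<dirname> at the front of sys.path, once.

    Hypothetical helper: returns the absolute path that was added
    (or that was already present).
    """
    vendor_path = os.path.abspath(os.path.join(project_root, dirname))
    if vendor_path not in sys.path:
        sys.path.insert(0, vendor_path)
    return vendor_path


# Possible use near the top of the app's main module (assumed layout):
# add_vendor_dir(os.path.dirname(os.path.abspath(__file__)))
# import yaml   # found under <project root>/lib, if that is where it lives
# import nltk   # found under <project root>/nltk
```

Calling the helper more than once is harmless, since the path is only inserted if it is not already present.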

Sample Code:

A sample App Engine app that uses the steps above is located under the appengine directory.
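For orientation, here is a rough sketch of the tokenize-then-tag flow, separate from the bundled sample app. It assumes this fork still exposes the standard `nltk.word_tokenize` and `nltk.pos_tag` functions and that the tagger model they rely on is available in your project. The function name `tag_text` and its optional arguments are hypothetical, added so the logic can be exercised without NLTK installed.

```python
def tag_text(text, tokenize=None, tag=None):
    """Tokenize text and return a list of (token, POS tag) pairs.

    tokenize and tag default to NLTK's word_tokenize and pos_tag
    (an assumption about this fork); pass replacements to test
    the flow without NLTK.
    """
    if tokenize is None or tag is None:
        import nltk  # imported lazily so this module loads without NLTK
        tokenize = tokenize or nltk.word_tokenize
        tag = tag or nltk.pos_tag
    tokens = tokenize(text)
    return tag(tokens)
```

Inside a request handler you would call `tag_text(...)` on the submitted text and write the resulting pairs to the response.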

Redistributing

NLTK for App Engine source code is distributed under the same license as the NLTK project, that is, the Apache 2.0 License.