I have tampered with the NLTK in order to get it running on Google cloud platform. So far, tokenizing and Part Of Speech (POS) tagging are working.
- Changed path references to a relevant app engine path.
- Removed support for hunpos, stanford taggers due to subprocess spawning requirements in these modules
- Removed downloader module; gui not relevant on app engine.
Feel free to use the sample app located under appengine directory as a basis for your project. It includes the Treebank Part of Speech Tagger but not the NLTK for App Engine or PyYAML libs. In any case, the steps to running NLTK for App Engine are:
- Add following entry to the base of your app.yaml
libraries: - name: numpy version: "1.6.1"
Download PyYAML and copy it's lib directory to your project root.
Copy NLTK for App Engine to your project root. import nltk and play on.
Sample App Engine app utilising above method(s) located under appengine directory
NLTK for App Engine source code is distributed under the same license as the NLTK project, that is the Apache 2.0 License.