PyCantonese: Cantonese Linguistics and NLP in Python
Download and install
PyCantonese is available through pip:
$ pip install -U pycantonese
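To confirm that the install landed in the environment you expect, a small check along these lines may help. This is only a sketch: it relies on the standard library's `importlib.metadata` (Python 3.8+), and the helper name `installed_version` is a hypothetical name chosen for illustration, not part of PyCantonese.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration; not part of PyCantonese.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pycantonese")
    if found is None:
        print("pycantonese is not installed in this environment")
    else:
        print(f"pycantonese {found} is installed")
```

Run it with the same Python interpreter that pip installed into; a mismatch between the two is a common reason for a package appearing to be missing.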
Setting up a Development Environment
The latest code under development is available on GitHub at pycantonese/pycantonese. To obtain this version for experimental features or for development:
$ git clone https://github.com/pycantonese/pycantonese.git
$ cd pycantonese
$ pip install -r requirements.txt
$ pip install -r dev-requirements.txt
$ python setup.py develop
To run tests:
$ py.test -vv --cov pycantonese pycantonese
$ flake8 pycantonese
Developer: Jackson L. Lee
A talk introducing PyCantonese:
Lee, Jackson L. 2015. PyCantonese: Cantonese linguistic research in the age of big data. Talk at the Childhood Bilingualism Research Centre, Chinese University of Hong Kong. September 15, 2015. Notes+slides
MIT License. Please see LICENSE.txt for details.
The HKCanCor dataset included in PyCantonese is substantially modified from its source in terms of format. The original dataset has a CC BY license. Please also see pycantonese/data/hkcancor/readme.md for details.