-
Notifications
You must be signed in to change notification settings - Fork 0
Czech morphology library, using data files compatible with PDT 2.0
License
qiq/Czech-morphology
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
czmorphology is a shared library to lemmatize/analyze words. It is a refactored code of Jan Hajic's binary and requires original data files from the binary distribution: *{ae.cpd,au.cpd,ag.txt,aw.cpd}, which is available e.g. in the PDT 2.0 (http://ufal.mff.cuni.cz/pdt2.0/). The output is equivalent to the output of the original binary distribution. The interface is quite simple: https://github.com/qiq/Czech-morphology/blob/master/src/czmorphology/interface.h Main benefits compared to the original code: - shared library code - 64-bit compatible - no memory leaks - faster initialization - UTF-8 input/output - thread safety (only one thread can lemmatize a word at tha same time, though) - output compatible with In order to compile/use it, configure; make; make install should be sufficient.
About
Czech morphology library, using data files compatible with PDT 2.0
Resources
License
Stars
Watchers
Forks
Releases
No releases published