List PyThaiNLP 2.0 #118

Closed
wannaphong opened this issue Sep 22, 2018 · 1 comment

Comments

@wannaphong
Member

wannaphong commented Sep 22, 2018

New evaluation corpus

New features

Bug fixes

Other improvements and optimizations

Name changes in API

  • Utility functions have been rearranged. Most of them, such as rank, find_keyword, collate, and the date and time functions, are now in the pythainlp.util module. (Utility functions: rearrange package locations + add thai_strftime() date and time formatter #160)
  • Some class and function names have changed since 1.7 to align with PEP 8 (Style Guide for Python Code), to be more explicit about what they do, or to be more consistent with related classes and functions. For example:
    • The thainer and thai2rom classes are now ThaiNameTagger and ThaiTransliterator (CapWords for class names).
    • The pythainlp.soundex.LK82, pythainlp.soundex.Udom83, and pythainlp.MetaSound functions are now pythainlp.soundex.lk82, pythainlp.soundex.udom83, and pythainlp.soundex.metasound (lowercase for function names; metasound also moves into the soundex module).
    • The collation, correction, and romanization functions are now collate, correct, and romanize: verb (action) forms, in line with the tokenize and summarize functions.
  • The pythainlp.corpus.alphabets, pythainlp.corpus.tone, etc. constants are now pythainlp.thai_consonants, pythainlp.thai_tonemarks, etc.
    • They are also now str instead of set.
    • This follows the example of string.ascii_letters, etc. A str also iterates slightly faster in the typical use case for these constants, where each character is one member.
  • These changes will break your code if it directly invokes those classes or functions. In general, only the class or function name changes; the arguments passed to the class or function should stay the same. Please refer to the API doc.
  • Internally, the corpus files have also been renamed (Naming convention for consistency วิธีการตั้งชื่อไฟล์ #141), but this should not have any effect on the API.
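
To gauge how much of a 1.7 codebase the renames above touch, something like the following sketch could help. It is a hypothetical helper, not part of PyThaiNLP: the `RENAMES` table is copied from the fully qualified names in the list above, while `find_old_names` and its return shape are assumptions made for illustration. It uses only the standard library and only reports matches; it does not rewrite anything.

```python
import re
from typing import List, Tuple

# Old name -> new name, taken from the rename list above.
# The table and the scanner are illustrative only, not a PyThaiNLP API.
RENAMES = {
    "pythainlp.soundex.LK82": "pythainlp.soundex.lk82",
    "pythainlp.soundex.Udom83": "pythainlp.soundex.udom83",
    "pythainlp.MetaSound": "pythainlp.soundex.metasound",
    "pythainlp.corpus.alphabets": "pythainlp.thai_consonants",
    "pythainlp.corpus.tone": "pythainlp.thai_tonemarks",
}

# Match an old name only when it is not part of a longer dotted name
# or identifier (e.g. do not match inside "pythainlp.corpus.tonemarks").
_PATTERN = re.compile(
    r"(?<![\w.])(" + "|".join(re.escape(name) for name in RENAMES) + r")(?!\w)"
)


def find_old_names(source: str) -> List[Tuple[int, str, str]]:
    """Return (line number, old name, suggested new name) for each hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits


if __name__ == "__main__":
    sample = "import pythainlp\ncode = pythainlp.soundex.LK82('a')\n"
    for lineno, old, new in find_old_names(sample):
        print(f"line {lineno}: {old} -> {new}")
```

The scanner only catches fully qualified references. Code written as `from pythainlp.soundex import LK82` would need a separate check, and the short names (thainer, collation, and so on) are left out of the table because they are too likely to collide with unrelated identifiers.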
@wannaphong wannaphong added the enhancement enhance functionalities label Sep 22, 2018
@wannaphong wannaphong added this to the 1.8 milestone Sep 22, 2018
@bact bact self-assigned this Nov 12, 2018
@bact bact changed the title List PyThaiNLP 1.8 List PyThaiNLP 2.0 Nov 25, 2018
@wannaphong
Member Author

PyThaiNLP 2.0 documentation #178
