Update 2.1 branch from dev #274
Merge from 2.0.5 release
issue_71: add documentation
1. Move all functions related to text processing to the new file `ulmfit/rules.py`.
2. For the rules imported from the `fastai` library, copy the code into the pythainlp library.
3. Use the code of the `BaseTokenizer` class from the `fastai` library.
Function ungroup_emoji has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring.
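One common way to lower a complexity score like this is to move the per-token splitting into a small helper so that the main function has no nested branches. The following is a minimal sketch of that idea, not the code in this PR: the names `split_emoji` and `ungroup_emoji_sketch` are hypothetical, and the emoji check is a rough assumption that covers only a few common Unicode blocks.

```python
import re

# Assumption: a rough emoji matcher covering a few common Unicode blocks.
# A real implementation would likely rely on a dedicated emoji library.
_EMOJI = re.compile("([\U0001F300-\U0001FAFF\u2600-\u27BF])")


def split_emoji(token):
    """Split one token so that every emoji becomes its own piece."""
    # re.split with a capturing group keeps the emoji; drop empty strings.
    return [piece for piece in _EMOJI.split(token) if piece]


def ungroup_emoji_sketch(tokens):
    """Flatten a token list, separating grouped emoji (hypothetical helper)."""
    return [piece for token in tokens for piece in split_emoji(token)]
```

With the branching hidden inside the regular expression, both functions are single comprehensions, which is the kind of shape that complexity checkers tend to score low.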
Changes in this commit:
- Assert the output of the method `get_ner` when the argument `tag` is set to True, for all 13 tags described in https://github.com/wannaphongcom/thai-ner/tree/master/model/1.2
- Assert the output of the method `get_ner` when the argument `pos` is set to True
- Assert the output of the method `get_ner` when the argument `pos` is set to False
As thainer version 1.2 provides different results from the previous version
- Add a new function `process_thai` to the `ulmfit` module. This function processes Thai text (for example, by replacing repetitive characters/words). More detail will be given in the release note.
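To illustrate what "replacing repetitive characters/words" can mean, here is a minimal sketch under stated assumptions. It is not the `process_thai` implementation: the function names, the `min_run` parameter, and the choice to collapse a run down to a single item (rather than emit a marker token) are all hypothetical.

```python
import re


def collapse_char_repeats(text, min_run=3):
    """Collapse a run of one non-space character repeated at least
    min_run times down to a single character (hypothetical rule)."""
    # (\S) captures one character; \1{n,} requires n further copies of it.
    pattern = re.compile(r"(\S)\1{%d,}" % (min_run - 1))
    return pattern.sub(r"\1", text)


def collapse_token_repeats(tokens):
    """Drop a token when it is identical to the token kept just before it."""
    kept = []
    for token in tokens:
        if not kept or kept[-1] != token:
            kept.append(token)
    return kept
```

For example, `collapse_char_repeats("มากกกกก")` gives `"มาก"`, and `collapse_token_repeats(["ดี", "ดี", "มาก"])` gives `["ดี", "มาก"]`. Word-level repetition is handled on a token list here because Thai text does not separate words with spaces.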
Changes in this commit:
- Add new test case: test_remove_space
- Add new test case: test_replace_wrep_post_nonum
- Add new test case: test_replace_wrep_post
- Add new test case: test_replace_rep_nonum
- Add new test case: test_replace_rep_after
- Remove a test case: test_replace_all_caps (as the function `replace_all_caps` was removed from /pythainlp/ulmfit/__init__.py)
Fix tokenization benchmark issue
…eploy docs) Update command_line.rst
Add test cases for NER
Hello @bact! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:
No description provided.