Fix empty string ('') added (in some cases) when using word_tokenize with join_broken_num=True #912

Merged
merged 2 commits into PyThaiNLP:dev from S2P2:fix-join-broken-num on May 11, 2024

Conversation

S2P2
Contributor

@S2P2 S2P2 commented May 10, 2024

fix empty string bug

What does this change

pythainlp/tokenize/_utils.py: add the check if connected_token: so that connected_token is appended to tokens_joined only when it is non-empty

What was wrong

#911

How this fixes it

An empty string is no longer appended to tokens_joined.
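
For illustration, here is a minimal sketch of the guard described above. It is not the actual code in pythainlp/tokenize/_utils.py: the function name join_broken_numbers, the regular expression, and the loop structure are assumptions made for this example. Only the names connected_token and tokens_joined come from this PR.

```python
import re
from typing import List

# Assumed pattern (illustrative only): digits joined by '.', ',' or ':',
# e.g. "1,000" or "12:30".
_NUM_WITH_SEPARATOR = re.compile(r"(\d+[.,:])+\d+")


def join_broken_numbers(segments: List[str]) -> List[str]:
    """Hypothetical helper: rejoin tokens that split a formatted number."""
    original = "".join(segments)
    matches = _NUM_WITH_SEPARATOR.finditer(original)
    tokens_joined: List[str] = []
    pos = 0  # character offset of segments[idx] within `original`
    idx = 0
    match = next(matches, None)
    while idx < len(segments) and match:
        if pos >= match.start():
            # Collect every segment that starts inside the matched span.
            connected_token = ""
            while pos < match.end() and idx < len(segments):
                connected_token += segments[idx]
                pos += len(segments[idx])
                idx += 1
            # The guard from this PR: skip the append when nothing was
            # collected (the span was already consumed by an earlier token).
            if connected_token:
                tokens_joined.append(connected_token)
            match = next(matches, None)
        else:
            tokens_joined.append(segments[idx])
            pos += len(segments[idx])
            idx += 1
    tokens_joined += segments[idx:]
    return tokens_joined


print(join_broken_numbers(["1", ",", "000"]))  # ['1,000']
print(join_broken_numbers(["a1.2b", "c"]))     # ['a1.2b', 'c']
```

In this sketch, the second call is the problematic case: the number 1.2 sits entirely inside the token a1.2b, so by the time the loop reaches the match there is nothing left to collect. Without the if connected_token: guard, the sketch would return ['a1.2b', '', 'c'].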

Your checklist for this pull request

🚨 Please review the guidelines for contributing to this repository.

  • Passed code styles and structures
  • Passed code linting checks and unit test

fix empty string bug
PyThaiNLP#911
@pep8speaks

pep8speaks commented May 10, 2024

Hello @S2P2! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-05-10 10:31:30 UTC

fix space before :

sonarcloud bot commented May 10, 2024

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@coveralls

Coverage Status

coverage: 79.093% (+0.03%) from 79.063%
when pulling dcd2b47 on S2P2:fix-join-broken-num
into a38fd5e on PyThaiNLP:dev.

@bact bact merged commit fd4175e into PyThaiNLP:dev May 11, 2024
10 of 13 checks passed
@bact
Member

bact commented May 11, 2024

Thank you.

@wannaphong wannaphong added this to the 5.0 milestone May 12, 2024
Development

Successfully merging this pull request may close these issues.

bug: empty string ('') added (in some cases) when using word_tokenize with join_broken_num=True
5 participants