Add Intl.Segmenter example to split #194
Codecov Report

    @@             Coverage Diff              @@
    ##           production     #194      +/-   ##
    ============================================
      Coverage      100.00%   100.00%
    ============================================
      Files             490       530      +40
      Lines            4426      1984    -2442
      Branches          396        85     -311
    ============================================
    - Hits             4426      1984    -2442
Hey, nice to see you back 😊😊 What do you think?
It still seems to me that for 99% of cases And the

    const str = "吾輩は猫である。名前はたぬき。";
    const segmenterJa = new Intl.Segmenter("ja-JP", { granularity: "word" });
    const segments = segmenterJa.segment(str);
    console.table(Array.from(segments));
    // [{segment: '吾輩', index: 0, input: '吾輩は猫である。名前はたぬき。', isWordLike: true},
    // etc.
    // ]

And it also seems that
Sorry I didn't approve it yet, I've been quite busy at work 😅
Recently found out that `Intl.Segmenter` provides a way to split a string into graphemes, just like `lodash.split` does.
No more ['f', 'o', 'o', '\uD83D', '\uDC68', '\u200D', '\uD83D', '\uDC68', '\u200D', '\uD83D', '\uDC66', '\u200D', '\uD83D', '\uDC66'], yay!
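For reference, here is a minimal sketch of the difference, assuming a runtime that ships `Intl.Segmenter` (recent browsers, or Node.js built with full ICU). The helper name `splitGraphemes` and the `"en"` default locale are made up for illustration and are not necessarily what this PR adds.

```javascript
// Hypothetical helper (name and default locale are assumptions, not from the PR):
// split a string into user-perceived characters (grapheme clusters).
function splitGraphemes(str, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  // segment() returns an iterable of { segment, index, input } objects.
  return Array.from(segmenter.segment(str), (s) => s.segment);
}

// "foo" followed by a family emoji: four code points joined by U+200D (ZWJ).
const family = "foo\u{1F468}\u200D\u{1F468}\u200D\u{1F466}\u200D\u{1F466}";

// split("") cuts between UTF-16 code units, tearing surrogate pairs apart.
console.log(family.split("").length); // 14

// The segmenter keeps the whole ZWJ sequence together as one grapheme.
console.log(splitGraphemes(family).length); // 4
```

The same segmenter can be reused across calls if construction cost matters; it is created per call here only to keep the sketch short.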