Issues: lighttransport/japanese-llama-experiment

Issues list

Implement exact dedup (at scale) (label: enhancement)
#11 opened Mar 10, 2024 by syoyo
1 of 2 tasks
Quote information
#10 opened Feb 9, 2024 by syoyo
Japanese sentence splitter
#9 opened Jan 14, 2024 by syoyo
[TODO] Implement exact dedup using suffix array (label: enhancement)
#7 opened Dec 10, 2023 by syoyo
Better filtering & dedup based on RedPajama v2 (label: enhancement)
#6 opened Dec 9, 2023 by syoyo
Discard Chinese kanji (label: enhancement)
#5 opened Nov 26, 2023 by syoyo
Japanese-specific line-wise filtering (label: enhancement)
#3 opened Nov 7, 2023 by syoyo