Right now it splits on word boundaries and limits the monolingual data to fewer than 100 "words". This needs to change to support a different segmentation strategy for CJK languages, maybe just a byte limit.
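To illustrate why a word-based limit does not work for CJK text, here is a rough sketch comparing a whitespace word count with a UTF-8 byte count. The helper names (`word_count`, `byte_length`, `within_byte_limit`) and the 500-byte threshold are assumptions made up for this example, not the project's actual code or a proposed value.

```python
# Hypothetical helpers: names and the 500-byte threshold are illustrative
# assumptions, not taken from the project's code.

def word_count(text: str) -> int:
    # Whitespace splitting only finds boundaries in space-delimited languages.
    return len(text.split())


def byte_length(text: str) -> int:
    # UTF-8 encoded size, which does not depend on word boundaries.
    return len(text.encode("utf-8"))


def within_byte_limit(text: str, max_bytes: int = 500) -> bool:
    # True when the encoded text fits in the (assumed) byte budget.
    return byte_length(text) <= max_bytes


english = "the quick brown fox"
chinese = "今天天气很好" * 50  # 300 characters with no spaces

print(word_count(english), byte_length(english))
print(word_count(chinese), byte_length(chinese))
print(within_byte_limit(english), within_byte_limit(chinese))
```

The Chinese string counts as a single "word" under whitespace splitting, so a 100-word cap never triggers, while a byte limit treats it like any other long line. A byte limit is not script-neutral though: these CJK characters take 3 bytes each in UTF-8, so a character-count limit may be worth considering as well.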