I'm hoping that we can get to the point where we fully support the following languages.
English
Spanish
German
French
Russian
Japanese
Hindi
Farsi
Chinese
Arabic
I started adding unit tests for these languages for a few tokenizers here: https://github.com/RubixML/ML/tree/master/tests/Tokenizers. However, it doesn't look like we support all of the languages. I only speak English, so it's hard for me to tell. Could we get some help from the community to verify that our Tokenizers support all of these languages and, if not, contribute a fix?
https://github.com/RubixML/ML/tree/master/src/Tokenizers
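For anyone who wants to help, here is a rough sketch of the kind of check I have in mind. It is written in Python only to keep it self-contained. The two patterns and the `tokenize` helper are hypothetical and are not taken from the Rubix ML source, so treat this as an illustration of the failure mode rather than a description of how our Tokenizers work.

```python
import re

# Hypothetical patterns for illustration only; not from the Rubix ML source.
ASCII_WORD = re.compile(r"[A-Za-z0-9']+")  # matches ASCII letters, digits, apostrophes
UNICODE_WORD = re.compile(r"\w+")  # \w is Unicode-aware for str patterns in Python 3


def tokenize(pattern, text):
    """Return every non-overlapping match of `pattern` in `text`."""
    return pattern.findall(text)


# Whitespace-delimited samples, so splitting on spaces gives the expected tokens.
samples = {
    "English": "The quick brown fox",
    "Spanish": "El niño comió",
    "German": "Die Straße ist groß",
    "Russian": "Привет мир",
}

for language, text in samples.items():
    expected = text.split()
    ascii_ok = tokenize(ASCII_WORD, text) == expected
    unicode_ok = tokenize(UNICODE_WORD, text) == expected
    print(f"{language}: ascii={ascii_ok} unicode={unicode_ok}")
```

The samples above are all whitespace-delimited, which is the easy case. Japanese and Chinese do not separate words with spaces, and scripts such as Hindi rely on combining marks, so passing a check like this one does not by itself show that a tokenizer handles those languages. That is where native speakers can help most.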
Thank you!