Hello, this is just some information for people who want to do NER.

I have found that the jieba tokenizer is not very good at tokenizing Chinese surnames. For example, "我姓林" ("my surname is Lin") is tokenized to "我" and "姓林", rather than splitting off the surname "林". So I want to use the yaha tokenizer instead.

So far I have written my own yaha_tokenizer.py and made the corresponding changes in registry.py. If you also want to do NER, you can find the two files in my repository: https://github.com/keineahnung2345/Rasa_NLU_Chi
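To give a rough idea of what such a tokenizer component has to produce, here is a minimal sketch of turning a list of segments into offset-annotated tokens, the shape Rasa NLU's tokenizers work with. The `Token` class and `tokens_from_segments` helper here are illustrative assumptions, not the actual Rasa NLU or yaha API; see the repository above for the real yaha_tokenizer.py and registry.py.

```python
# Illustrative sketch only: a Token with character offsets, loosely
# mirroring what a Rasa NLU tokenizer component returns. The real
# component would call yaha's segmenter instead of taking a segment list.

class Token:
    """A surface token plus its character offsets in the original text."""
    def __init__(self, text, offset):
        self.text = text
        self.offset = offset
        self.end = offset + len(text)

def tokens_from_segments(text, segments):
    """Re-attach character offsets to a segmenter's output."""
    tokens, start = [], 0
    for seg in segments:
        # Locate each segment in order, so repeated characters are handled.
        start = text.index(seg, start)
        tokens.append(Token(seg, start))
        start += len(seg)
    return tokens

# The segmentation we want for "我姓林" ("I" / "surname" / "Lin"),
# which a surname-aware segmenter such as yaha should produce:
tokens = tokens_from_segments("我姓林", ["我", "姓", "林"])
print([(t.text, t.offset, t.end) for t in tokens])
```

Whatever segmenter you plug in, keeping the offsets correct is what matters for NER, since entity spans are expressed in character positions of the original message.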