Chinese Search #3
Comments
I would say that it even works for Chinese. Stemming would simply do nothing, since the concept is not applicable to Chinese. If you take a look at the demo page, try searching for: 指原の乱. Regarding your second question, I am not sure what you meant. If you have some text in your database, then yes, it can be searched. Where else could the text be if not in the db?
After posting this issue, I read the project's code and found the answer to my second question there. I'm typing on an iPad, which is not convenient. Thanks for your reply.
I think the current tokenization process should also work for Chinese. It's a simple regular expression that breaks text into words. After that, each word is stemmed. Stemming applies to the Indo-European group of languages, not to Chinese, so the stemming step will simply leave Chinese words unchanged.
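To make the two steps described above concrete, here is a minimal Python sketch. It is not the project's code: the `\w+` pattern, the `toy_stem` suffix stripper, and the `analyze` helper are all assumptions made for illustration. It shows that a suffix-based stemmer leaves Chinese text untouched, and that a word-character regex keeps an unspaced Chinese phrase as a single token.

```python
import re


def tokenize(text):
    # Assumed tokenizer: every run of Unicode word characters is one token.
    return re.findall(r"\w+", text.lower())


def toy_stem(word):
    # Hypothetical, deliberately naive English suffix stripper.
    # A real stemmer (e.g. Porter) is far more involved.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text):
    # Tokenize first, then stem each token.
    return [toy_stem(token) for token in tokenize(text)]


print(analyze("Searching indexed documents"))
print(analyze("我喜欢搜索引擎"))
```

Under these assumptions the English input is split into three words and each is stemmed, while the Chinese phrase comes back as one unchanged token because it contains no spaces or punctuation to split on.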
Chinese is a bit more complex, and my test results are not good. I think a dedicated Chinese tokenizer/analyzer would be better.
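As a rough idea of what a dedicated Chinese tokenizer could look like, the sketch below uses overlapping character bigrams, a common dictionary-free approach for CJK text. This is a swapped-in technique for illustration, not something the project provides; the function name `cjk_bigram_tokenize` and the character range used (the basic CJK Unified Ideographs block only) are assumptions.

```python
import re

# Assumption: only the basic CJK Unified Ideographs block is treated as Chinese.
TOKEN_PATTERN = re.compile(r"[\u4e00-\u9fff]+|[a-z0-9]+")


def cjk_bigram_tokenize(text):
    tokens = []
    for match in TOKEN_PATTERN.finditer(text.lower()):
        run = match.group()
        if "\u4e00" <= run[0] <= "\u9fff":
            if len(run) == 1:
                # A lone character has no bigram, so keep it as-is.
                tokens.append(run)
            else:
                # Emit every overlapping pair of adjacent characters.
                tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        else:
            # Latin letters and digits stay as whole words.
            tokens.append(run)
    return tokens


print(cjk_bigram_tokenize("搜索引擎 search"))
```

With bigrams, a query for 搜索 can match a document containing 搜索引擎, which a single whole-phrase token would not allow. A dictionary-based segmenter would likely give more precise results at the cost of shipping a word list.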
Thanks for your answer.