No results found error when searching in Japanese #3769

Closed
hieunv5501 opened this issue May 23, 2023 · 4 comments
Labels
bug Something isn't working as expected language Anything related to languages

Comments

@hieunv5501

I am using Meilisearch with Japanese data. My data contains this sentence: 日東電工株式会社の経営者.
When I search with words like 日東, 日東電工, or 日東電工株式会社, I get no results.

Data
image

Search with words like: 日東, 日東電工, 日東電工株式会社
image
image

I need a solution to this, or a version that fixes it.

Meilisearch version: v0.29.3

@curquiza added the support (Issues related to support questions) and language (Anything related to languages) labels on May 23, 2023
@oluademola

Hello @hieunv5501. Sorry about that. It could be an issue with the Japanese tokenizer. However, your version of Meilisearch is quite old, and we have made some improvements since then. Could you please try the latest version? It might fix the issue for you.

@hieunv5501
Author

hieunv5501 commented May 24, 2023

I tried v1.1.1, but the problem is still not solved.

image

@ManyTheFish
Member

ManyTheFish commented May 24, 2023

Short version (@oluademola): this is a bug caused by uncertainty in language detection when the query contains no hiragana or katakana characters (related to #3565).

Hello @hieunv5501,
Your issue seems related to #3565. We found that if your dataset contains several documents without hiragana or katakana characters, Meilisearch struggles to tell Chinese from Japanese when the search query contains only kanji characters.
We haven't completely fixed the issue so far. However, there is a prototype that deactivates Chinese language detection, forcing Meilisearch to always treat CJ characters as Japanese characters. This prototype is up to date with the latest stable Meilisearch version, so don't hesitate to use it:
meilisearch/product#532 (comment)
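
To illustrate why the queries in this report are ambiguous, here is a minimal Python sketch that checks whether a string contains any kana. It is not Meilisearch's actual detection logic; the helper names `has_kana`, `is_han`, and `is_ambiguous_cj` are hypothetical, and the check only covers the basic Hiragana, Katakana, and CJK Unified Ideographs Unicode blocks.

```python
def has_kana(text: str) -> bool:
    """Return True if text contains at least one hiragana or katakana character."""
    # Hiragana block is U+3040..U+309F, Katakana block is U+30A0..U+30FF.
    return any(0x3040 <= ord(ch) <= 0x30FF for ch in text)


def is_han(ch: str) -> bool:
    """Return True if ch is in the basic CJK Unified Ideographs block."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


def is_ambiguous_cj(text: str) -> bool:
    """Return True if text is made only of Han characters (kanji/hanzi).

    Such text carries no kana to mark it as Japanese, so a detector
    that looks only at the characters cannot rule out Chinese.
    """
    chars = [ch for ch in text if not ch.isspace()]
    return bool(chars) and all(is_han(ch) for ch in chars)


if __name__ == "__main__":
    # The three queries and the indexed sentence from this report.
    for query in ["日東", "日東電工", "日東電工株式会社", "日東電工株式会社の経営者"]:
        label = "kanji only (ambiguous)" if is_ambiguous_cj(query) else "not kanji only"
        print(f"{query}: {label}")
```

Under these assumptions, the three failing queries are kanji only, while the indexed sentence contains the hiragana の, so the document and the queries can end up being treated as different languages.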

In addition, I want to let you know that we have a dedicated product discussion about Japanese language specificities and how Meilisearch could improve its support for them. We have some really active contributors there, so don't hesitate to follow the discussion, ask questions, or even suggest possible improvements!

meilisearch/product#532

Thank you for your report!

@ManyTheFish added the bug (Something isn't working as expected) label and removed the support (Issues related to support questions) label on May 24, 2023
@curquiza
Member

Thank you, Many, for the detailed answer. Closing this issue in favor of the issues and discussions Many linked.
Let us know, @hieunv5501, if we were wrong and it does not fix your issue.
