"More Like This" feature #466
Comments
Hey, this is not currently on the roadmap. MeiliSearch does have pagination if you want more results from a single search, but that's it.
Hello @tpayet, my use case is pretty simple. I am using "More Like This" to implement a very crude item-to-item [1] product recommendation based on a few actions a user can perform. For example, if "Product A" is similar to "Product B" and the user "Likes" "Product A", she might also like "Product B". Currently, I am using Whoosh [2] to perform this task. However, I am also planning to integrate Meilisearch in this project, so it would be very helpful to have such a feature in Meilisearch, which would let me remove my dependency on Whoosh. I guess I can implement something similar with Meilisearch by somehow extracting the "key terms" of a document and performing a search with those terms. [1] https://en.wikipedia.org/wiki/Item-item_collaborative_filtering
I understand ^^ It is not currently on our roadmap, but if you are willing to implement something, @Kerollmops can help you and guide you through understanding MeiliSearch :) Let me know if you are interested. We can invite you to a Slack, which will make it easier for us to collaborate.
@tpayet I could try, but, besides a few experiments, I don't really have much experience with Rust.
Hey @dimiro1, I thought a little bit about this "More Like This" feature. As I understand it, to find documents that look like a given document or a group of documents, we could take all the rarest words in the origin document or group of documents and search for all other documents that have those words too (dropping the most common words when not enough documents are found). A good improvement to this method could be to also use the synonyms of those rare words. The problem with this approach is that there is no way to get all the words related to a given document. MeiliSearch uses an inverted index: the key is the word and the value is the list of ids of the documents containing that word. So retrieving the words contained in a document or a group of documents can only be done by iterating through all the words and searching for those that are in the documents. It can take a huge amount of time, as it is O(n * m), where n is the number of words and m is the size of all the lists of document ids associated with each word.
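To make the idea above concrete, here is a minimal Python sketch of the rare-words approach over a toy inverted index. It is not MeiliSearch code: the names `build_inverted_index`, `words_of`, `more_like_this` and the `max_terms` parameter are hypothetical, tokenization is a plain whitespace split, and skipping words that occur only in the origin document is an assumption of this sketch. Note how `words_of` has to scan every word in the index, which is the expensive step described above.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each word to the set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def words_of(index, doc_id):
    """Recover a document's words by scanning every posting list.

    With only an inverted index there is no shortcut: every word
    in the index has to be visited.
    """
    return [word for word, ids in index.items() if doc_id in ids]


def more_like_this(index, doc_id, max_terms=3):
    """Rank other documents by how many of the rarest shared words they contain."""
    # Rarest words first: fewest documents in the posting list.
    candidates = sorted(words_of(index, doc_id), key=lambda w: (len(index[w]), w))
    # Assumption of this sketch: a word found only in the origin document
    # cannot match anything else, so it is skipped.
    terms = [w for w in candidates if len(index[w]) > 1][:max_terms]

    scores = defaultdict(int)
    for word in terms:
        for other in index[word]:
            if other != doc_id:
                scores[other] += 1
    # Highest score first, ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Raising `max_terms` plays the role of widening the search to more common words when too few documents come back.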
@Kerollmops Thanks for your reply, your explanation makes sense to me. A possible solution would be to start storing the keywords together with the documents. Unfortunately, this has the disadvantages of increasing the index size and requiring some sort of transaction mechanism to make sure the keywords stay in sync.
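A rough Python sketch of that suggestion, under stated assumptions: `TinyIndex` and its methods are hypothetical names, not part of MeiliSearch, and the "transaction" is simulated only by updating both structures inside the same method call. A real engine would need proper atomic writes.

```python
from collections import defaultdict


class TinyIndex:
    """Toy index that stores each document's words next to the inverted index."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> ids of documents containing it
        self.doc_words = {}               # doc id -> its words (the extra storage)

    def add(self, doc_id, text):
        # Drop any previous version first so both structures stay in sync.
        self.remove(doc_id)
        words = set(text.lower().split())
        for word in words:
            self.postings[word].add(doc_id)
        self.doc_words[doc_id] = words

    def remove(self, doc_id):
        # The stored words tell us exactly which posting lists to touch.
        for word in self.doc_words.pop(doc_id, set()):
            self.postings[word].discard(doc_id)
            if not self.postings[word]:
                del self.postings[word]

    def words_of(self, doc_id):
        # Direct lookup, no scan over every word in the index.
        return self.doc_words.get(doc_id, set())
```

The trade-off is visible here: `doc_words` duplicates information already held in `postings`, so the index grows, and every write has to update both or they drift apart.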
Hello, thanks for this feature proposal! I will close this issue in favor of our public roadmap. I invite everyone interested in this feature to update it on the roadmap.
Do you have any plans to implement a "More Like This"[1, 2] feature?
[1] https://lucene.apache.org/core/8_4_1/queries/org/apache/lucene/queries/mlt/MoreLikeThis.html
[2] https://whoosh.readthedocs.io/en/latest/api/searching.html?highlight=more%20like#whoosh.searching.Hit.more_like_this