Tantivy NDCG Benchmarking Information Retrieval (BEIR) #2455
Comments
I don't have time to debug this thing. One thing you can do is pick one specific example where tantivy is outperformed by BM25, and use "explain". The usual suspects are
Tantivy has an explain function that tells you precisely how it came up with a given score: https://docs.rs/tantivy/latest/tantivy/query/trait.Query.html#method.explain
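To make it easier to read an explanation and compare it against a reference BM25, here is a minimal Python sketch of the per-term breakdown such an explanation typically contains. It is not tantivy's code or output format: the function name `bm25_term_score`, the common `k1 = 1.2`, `b = 0.75` defaults, and the `(k1 + 1)` numerator are assumptions for illustration, and engines differ on exactly these details.

```python
import math


def bm25_term_score(freq, doc_len, avg_doc_len, num_docs, doc_freq,
                    k1=1.2, b=0.75):
    """Return the pieces of a single-term BM25 score (hypothetical helper).

    Assumes the common "1 + ..." idf variant and a (k1 + 1) numerator;
    a given engine may use a different variant.
    """
    # Inverse document frequency: rarer terms get a larger weight.
    idf = math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalisation: documents longer than average are penalised.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Saturating term-frequency component.
    tf_component = freq * (k1 + 1) / (freq + length_norm)
    return {"idf": idf, "tf_component": tf_component,
            "score": idf * tf_component}


if __name__ == "__main__":
    # Same term and frequency, but the second document is four times longer.
    short_doc = bm25_term_score(freq=2, doc_len=50, avg_doc_len=100,
                                num_docs=1000, doc_freq=10)
    long_doc = bm25_term_score(freq=2, doc_len=200, avg_doc_len=100,
                               num_docs=1000, doc_freq=10)
    print(short_doc)
    print(long_doc)
```

Comparing the idf, the term-frequency component, and the field length used for one query/document pair is usually enough to see where two engines diverge.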
Thank you @fulmicoton. I've gone through the usual suspects and verified each of them. I also ran the task against Lucene, which yielded very similar scores. It looks like this is working as intended.
As for why BEIR got such a high score: their "BM25" retrieval task is just a wrapper around Elasticsearch. I'm evaluating Elasticsearch now and will update the results soon.
I've updated the results with the Elasticsearch evaluation and increased the retrieval task complexity from a single field to multiple fields. The current results look reasonable, since Elasticsearch by default does a bit more than plain BM25. I'll contact BEIR about the specifics of their test, since their results look a bit too pretty. I'll close this issue. Thank you @fulmicoton!
I created a repo to evaluate Tantivy retrieval using measures such as NDCG, MAP, and recall. I'm following the BEIR method for this evaluation. In the project, I use tantivy to index and retrieve documents from multiple datasets. The retrieval results are saved to a TSV file and then loaded into Python for scoring with pytrec_eval (which is what BEIR is built upon).
My current results are suspiciously low compared to the BM25-flat baseline published on the BEIR leaderboard.
I was following the tantivy example to index and search; I'm not sure if this is the best way.
If you could have a look and let me know whether there's anything wonky with my retrieval implementation, that would be very much appreciated.
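For reference, the scoring step described above can be sketched in plain Python. This is only an illustration of NDCG@k and recall@k under stated assumptions (linear gain, log2 discount, `qrels` as a dict of document id to graded relevance); it is not the pytrec_eval implementation, and the function names are hypothetical.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranked_doc_ids, qrels, k=10):
    """NDCG@k for one query; unjudged documents count as relevance 0."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    # The ideal ranking lists the judged documents by decreasing relevance.
    ideal_gains = sorted(qrels.values(), reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(gains) / ideal if ideal > 0 else 0.0


def recall_at_k(ranked_doc_ids, qrels, k=10):
    """Fraction of the relevant documents that appear in the top k."""
    relevant = {doc_id for doc_id, rel in qrels.items() if rel > 0}
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_doc_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


if __name__ == "__main__":
    # Toy judgments and a toy ranking for a single query (made-up ids).
    qrels = {"d1": 2, "d2": 1}
    ranking = ["d3", "d1", "d2"]
    print(ndcg_at_k(ranking, qrels, k=10))
    print(recall_at_k(ranking, qrels, k=2))
```

Running a handful of queries through a small function like this and comparing against the library output is a quick way to rule out problems in the TSV loading or id matching, as opposed to the retrieval itself.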