Fix: Correct the calculation of the MRR metric in the RAG benchmark #1228
Conversation
User description
When calculating the MRR metric, we should compute only the reciprocal rank of the first correctly recalled document and return that score.
How to determine whether a document has been recalled correctly depends on the specifics of your documents; here we use a direct equality check, which is in fact not an optimal approach.
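As a minimal sketch of the behavior described above (not the repository's actual code), the per-query reciprocal rank and its average over queries might look like this; reciprocal_rank, retrieved, and results are hypothetical names, and the match rule is the direct equality check noted above:

```python
def reciprocal_rank(retrieved: list[str], reference_docs: list[str]) -> float:
    """Return 1/rank of the first correctly recalled document, else 0."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in reference_docs:  # direct equality match, per the caveat above
            return 1.0 / rank      # score only the first correct hit
    return 0.0


def mean_reciprocal_rank(results: list[tuple[list[str], list[str]]]) -> float:
    """Average the per-query reciprocal ranks over all queries."""
    if not results:
        return 0.0
    return sum(reciprocal_rank(r, refs) for r, refs in results) / len(results)
```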
Type
bug_fix
Description
Refactored the mean_reciprocal_rank function in base.py to improve the calculation logic, simplifying the handling of mrr_sum, which was previously conditional on reference_docs being non-empty.
Changes walkthrough
base.py — Refactor mean_reciprocal_rank Calculation Logic
metagpt/rag/benchmark/base.py
- Changed mean_reciprocal_rank to iterate over nodes first and then reference documents.
- Added a check for whether the node text is in doc.
- Accumulate mrr_sum upon finding a match, then stop iterating (see the sketch after this list).
- Removed the previously conditional handling of mrr_sum from the function.
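A hedged sketch of the refactored control flow described in the walkthrough, assuming each retrieved node exposes its text via node.text and that a match is detected with the containment check named above (the signature and attribute names are illustrative, not the repository's exact API):

```python
def mean_reciprocal_rank(nodes, reference_docs) -> float:
    """Reciprocal rank of the first retrieved node matching any reference doc."""
    mrr_sum = 0.0
    # Iterate over retrieved nodes first, then over the reference documents.
    for rank, node in enumerate(nodes, start=1):
        for doc in reference_docs:
            if node.text in doc:        # check whether the node text is in doc
                mrr_sum += 1.0 / rank   # accumulate on the first correct match
                return mrr_sum          # stop: only the first hit is scored
    return mrr_sum                      # no correct recall -> 0.0
```

Returning immediately after the first match is what fixes the bug: earlier ranks are no longer overwritten or double-counted, so each query contributes exactly one reciprocal-rank term.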