Add MRR to quepid as a communal scorer. #525

Merged
merged 4 commits into from
Jan 26, 2023

Conversation

david-fisher
Contributor

Description

Add reciprocal rank as a communal Scorer

Motivation and Context

Closes #523. Adds a useful metric for known-item search evaluation.

How Has This Been Tested?

Local install of Quepid, started with bin/setup_docker followed by bin/docker server.

Screenshots or GIFs (if appropriate):

Screen_Recording_2022-06-03_at_3_20_10_PM_AdobeCreativeCloudExpress
Screen Shot 2022-06-03 at 4 02 00 PM

Types of changes

  • [] Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves existing functionality)
  • New feature (non-breaking change which adds new functionality)
  • [] Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • [] My change requires a change to the documentation.
  • [] I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • [] I have added tests to cover my changes.
  • All new and existing tests passed.

@david-fisher david-fisher marked this pull request as ready for review June 3, 2022 20:10
@epugh epugh temporarily deployed to quepid-pr-525 June 3, 2022 21:22 Inactive
@epugh epugh changed the title #523 Add MRR to quepid as a communal scorer. Add MRR to quepid as a communal scorer. Jun 3, 2022
@epugh
Member

epugh commented Jun 3, 2022

I took the ticket number out of the title, because it gets confusing that the ticket isn't the PR, if that makes sense...

@epugh
Member

epugh commented Jun 3, 2022

Is this MRR or RR? It appears to say RR in the code?

epugh
epugh previously requested changes Jun 3, 2022
Member

@epugh epugh left a comment

Looks like this should either be rr@10.js or mrr@10.js???

@david-fisher
Contributor Author

> Looks like this should either be rr@10.js or mrr@10.js???

rr@10 for a single query, mrr@10 for a set of queries.

Much like AP@10 is actually MAP@10 for a collection of queries.

@david-fisher david-fisher requested a review from epugh June 4, 2022 16:56
@david-fisher
Contributor Author

> Looks like this should either be rr@10.js or mrr@10.js???

Did I misunderstand your change request? The new file is already named rr@10.js. What are you asking for here?

@epugh
Member

epugh commented Jun 7, 2022

> > Looks like this should either be rr@10.js or mrr@10.js???
>
> Did I misunderstand your change request? The new file is already named rr@10.js. What are you asking for here?

The title of the issue is "Add MRR", so maybe it should be "Add RR"?

@david-fisher
Contributor Author

> > > Looks like this should either be rr@10.js or mrr@10.js???
> >
> > Did I misunderstand your change request? The new file is already named rr@10.js. What are you asking for here?
>
> The title of the issue is "Add MRR", so maybe it should be "Add RR"?

RR for a single query, MRR for a set of queries. Quepid will display MRR: the JavaScript computes RR for each individual query, and Quepid averages those values together. In general, the aggregate name is used when referring to a metric that is being averaged across queries. Either name is fine in the issue.
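
To make the RR/MRR distinction concrete, here is a minimal standalone sketch. It is not the rr@10.js file from this PR and does not use Quepid's scorer API; the function names, the input shape (an array of ratings in rank order), and the rule that any rating above 0 counts as relevant are all assumptions made for illustration.

```javascript
// Hypothetical helper, NOT the rr@10.js added in this PR.
// ratings: judgments for one query's results, in rank order.
// Assumption: a rating greater than 0 means "relevant".
function reciprocalRankAtK(ratings, k) {
  const limit = Math.min(k, ratings.length);
  for (let i = 0; i < limit; i++) {
    if (ratings[i] > 0) {
      return 1 / (i + 1); // ranks are 1-based
    }
  }
  return 0; // no relevant result in the top k
}

// MRR is the arithmetic mean of the per-query RR values.
function meanReciprocalRank(ratingsPerQuery, k) {
  if (ratingsPerQuery.length === 0) return 0;
  const total = ratingsPerQuery
    .map((ratings) => reciprocalRankAtK(ratings, k))
    .reduce((sum, rr) => sum + rr, 0);
  return total / ratingsPerQuery.length;
}

const queries = [
  [1, 0, 0],    // first relevant result at rank 1, RR = 1
  [0, 1, 0],    // first relevant result at rank 2, RR = 1/2
  [0, 0, 0, 1], // first relevant result at rank 4, RR = 1/4
];
console.log(meanReciprocalRank(queries, 10)); // (1 + 1/2 + 1/4) / 3
```

The per-query function is the only metric; the second function just averages it, which is the step the comment above attributes to Quepid.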

@epugh
Member

epugh commented Jun 7, 2022

> > > > Looks like this should either be rr@10.js or mrr@10.js???
> > >
> > > Did I misunderstand your change request? The new file is already named rr@10.js. What are you asking for here?
> >
> > The title of the issue is "Add MRR", so maybe it should be "Add RR"?
>
> RR for a single query, MRR for a set of queries. Quepid will display MRR: the JavaScript computes RR for each individual query, and Quepid averages those values together. In general, the aggregate name is used when referring to a metric that is being averaged across queries. Either name is fine in the issue.

I guess I may need to take this on faith. When we refer to DCG, we have a file named DCG.js. I would assume that if we are referring to MRR, we would have a file named MRR.js, and if there was a separate metric called RR, then it would have a scorer named RR.js?

I don't mean to be obtuse here, but the goal of Quepid is to make metrics, etc., simple and something I can explain to everyday users. So I feel like if we are adding MRR to Quepid, then the file should be called mrr.js.

@epugh
Member

epugh commented Jun 7, 2022

Okay, now I am super confused. Is this MRR or RR? Or does this NOT follow the pattern that we have of P, AP, DCG, NDCG etc, and is a new naming pattern?

@david-fisher
Contributor Author

> I guess I may need to take this on faith. When we refer to DCG, we have a file named DCG.js. I would assume that if we are referring to MRR, we would have a file named MRR.js, and if there was a separate metric called RR, then it would have a scorer named RR.js?

There is only one metric, reciprocal rank. When we average the reciprocal rank scores for multiple queries, the result is called mean reciprocal rank.

From the perspective of what the code is computing, rr@10.js computes the reciprocal rank for a single query. The Quepid app then averages those RR values across the queries in the collection to produce the MRR score for the full set.

The same is true of the computation called AP@10: it computes a single value for a query, which is then averaged to produce what should be called MAP@10 in the Quepid display. One metric, averaged across multiple queries.
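
The AP/MAP relationship can be sketched the same way. This is an illustrative standalone example, not Quepid's AP@10 scorer: the function names are hypothetical, ratings above 0 are assumed to mean relevant, and AP is normalized by the number of relevant results found in the top k (other definitions divide by the total number of relevant documents for the query).

```javascript
// Hypothetical helper, NOT Quepid's AP@10 scorer.
// Assumption: a rating greater than 0 means "relevant", and AP is
// normalized by the number of relevant results seen in the top k.
function averagePrecisionAtK(ratings, k) {
  const limit = Math.min(k, ratings.length);
  let relevantSeen = 0;
  let precisionSum = 0;
  for (let i = 0; i < limit; i++) {
    if (ratings[i] > 0) {
      relevantSeen += 1;
      precisionSum += relevantSeen / (i + 1); // precision at this rank
    }
  }
  return relevantSeen === 0 ? 0 : precisionSum / relevantSeen;
}

// MAP is the arithmetic mean of the per-query AP values.
function meanAveragePrecision(ratingsPerQuery, k) {
  if (ratingsPerQuery.length === 0) return 0;
  let total = 0;
  for (const ratings of ratingsPerQuery) {
    total += averagePrecisionAtK(ratings, k);
  }
  return total / ratingsPerQuery.length;
}

// One query with perfect ranking (AP = 1) and one with no relevant
// results (AP = 0).
console.log(meanAveragePrecision([[1, 1], [0, 0]], 10)); // (1 + 0) / 2
```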

@david-fisher
Contributor Author

> Okay, now I am super confused. Is this MRR or RR? Or does this NOT follow the pattern that we have of P, AP, DCG, NDCG etc, and is a new naming pattern?

AP should be named MAP, mean average precision. All of the metric names have been established by the evaluation metric community. The names used by trec_eval should be considered canonical. Note that nDCG does not get called MnDCG when averaged, and P@k does not get called MP@k when averaged. Only Mean Reciprocal Rank and Mean Average Precision have an M-prefixed variant for the aggregate score across a set of queries.

@david-fisher david-fisher dismissed epugh’s stale review July 26, 2022 13:24

File is already named as suggested

@epugh epugh merged commit dfdcf78 into main Jan 26, 2023
Development

Successfully merging this pull request may close these issues.

Add MRR (mean reciprocal rank) as a scorer
2 participants