Problem with implementation of mAP@R? #331

Closed
drasros opened this issue May 25, 2021 · 3 comments

drasros commented May 25, 2021

Hi,

Thanks for open-sourcing this repo!
I want to report a problem which I noticed in the XBM repo, which uses code similar to yours for the mAP@R calculation (I am not sure which is based on which, so I am opening this issue on both).

I suspect there is a mistake in the implementation of mAP@R:
On line 99, you divide by max_possible_matches_per_row for each query (which for mAP@R is equal to R).
But the divisor should be the actual number of relevant items among the R top-ranked results for that query (as is done for mAP two lines later).
The Wikipedia page on AP gives this definition:
[image: screenshot of the average precision formula from Wikipedia]
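For reference, the screenshot presumably shows Wikipedia's standard definition of average precision, roughly:

AP = \frac{\sum_{k=1}^{n} P(k)\,\mathrm{rel}(k)}{\text{number of relevant items}}

where P(k) is the precision at cut-off k and rel(k) is 1 if the item at rank k is relevant, 0 otherwise.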
(By the way, for mAP, as computed on line 98, one has to decide what to do with queries that have no relevant items, for which AP is therefore undefined. You use max_possible_values_per_row=0 for these rows, but I am not sure this is the most standard choice; maybe it would be good to offer the option to set AP=0, or AP=1, or to exclude these rows...)

Regards,
A

drasros commented May 25, 2021

(To confirm this problem I ran some tests of the XBM implementation against the sklearn implementation. Your implementation is slightly different, so I don't claim with 100% certainty that it is wrong; I just suspect it and want to kindly report the suspicion.)

KevinMusgrave commented May 25, 2021

I want to report a problem which I noticed in the XBM repo, which uses code similar to yours for the mAP@R calculation (I am not sure which is based on which, so I am opening this issue on both).

The accuracy calculation code originates from this repo (pytorch-metric-learning)

I suspect there is a mistake in the implementation of mAP@R:
On line 99, you divide by max_possible_matches_per_row for each query (which for mAP@R is equal to R).
But the divisor should be the actual number of relevant items among the R top-ranked results for that query (as is done for mAP two lines later).

I think the implementation is correct. Here is the definition I'm using

[image: screenshot of the MAP@R definition]
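Assuming this screenshot shows the MAP@R definition from the "A Metric Learning Reality Check" paper, it is roughly:

\mathrm{MAP@R} = \frac{1}{R} \sum_{i=1}^{R} P(i), \quad P(i) = \begin{cases} \text{precision at } i, & \text{if the } i\text{-th retrieval is relevant} \\ 0, & \text{otherwise} \end{cases}

where R is the number of references that share the query's label.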

The divisor is different from the usual mAP. I probably could have chosen a better name than "mAP@R".
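For concreteness, a toy sketch (not the library's code; numbers are made up) of how the two divisors differ on a single query with R = 4 and relevance flags [1, 0, 1, 0]:

import numpy as np

rel = np.array([1, 0, 1, 0])                            # relevance of the R top-ranked neighbors
R = len(rel)
precision_at_i = np.cumsum(rel) / np.arange(1, R + 1)   # precision after each rank

# MAP@R-style AP: sum precision at the relevant positions, divide by R
ap_at_r = (precision_at_i * rel).sum() / R              # (1 + 2/3) / 4 ≈ 0.417

# Classic AP truncated at R: divide by the number of relevant items actually retrieved
ap_classic = (precision_at_i * rel).sum() / rel.sum()   # (1 + 2/3) / 2 ≈ 0.833

The numerator is identical in both cases; only the divisor changes, which is exactly the point of disagreement above.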

(By the way, for mAP, as computed on line 98, one has to decide what to do with queries that have no relevant items, for which AP is therefore undefined. You use max_possible_values_per_row=0 for these rows, but I am not sure this is the most standard choice; maybe it would be good to offer the option to set AP=0, or AP=1, or to exclude these rows...)

Good point. Actually, if you're using AccuracyCalculator (and not the mean_average_precision function directly), then queries with no relevant items should be filtered out in this function:

def try_getting_not_lone_labels(knn_labels, query_labels, not_lone_query_mask):
    if not any(not_lone_query_mask):
        return None, None
    return (
        knn_labels[not_lone_query_mask],
        query_labels[not_lone_query_mask],
    )

which is called here:

def calculate_mean_average_precision(
    self,
    knn_labels,
    query_labels,
    not_lone_query_mask,
    embeddings_come_from_same_source,
    **kwargs
):
    knn_labels, query_labels = try_getting_not_lone_labels(
        knn_labels, query_labels, not_lone_query_mask
    )
    if knn_labels is None:
        return 0
    return mean_average_precision(
        knn_labels,
        query_labels[:, None],
        embeddings_come_from_same_source,
        self.avg_of_avgs,
        self.label_comparison_fn,
    )
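For illustration, a rough usage sketch (this assumes the AccuracyCalculator interface of the version current at the time; argument and metric names may differ in other releases):

import numpy as np
from pytorch_metric_learning.utils.accuracy_calculator import AccuracyCalculator

# Toy data: 8 embeddings compared against themselves; labels 3 and 4 occur only once,
# so those two queries have no relevant items ("lone" queries).
embeddings = np.random.rand(8, 16).astype(np.float32)
labels = np.array([0, 0, 1, 1, 2, 2, 3, 4])

calculator = AccuracyCalculator(include=("mean_average_precision", "mean_average_precision_at_r"))
accuracies = calculator.get_accuracy(
    embeddings, embeddings, labels, labels, embeddings_come_from_same_source=True
)
# The two lone queries are masked out by try_getting_not_lone_labels before
# mean_average_precision is computed, so AP is never evaluated on them.
print(accuracies)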

drasros commented May 26, 2021

The divisor is different from the usual mAP. I probably could have chosen a better name than "mAP@R".

Thanks for your reply. I see. Yes, the name is somewhat misleading in the sense that this metric is not the mean of AP over the subset of R top-ranked candidates, but rather the full mAP with the sum truncated to the relevant items among the top R.

(The same confusion can arise for mAP@k: what do we take the sum over? I am not sure what the most standard convention is.)

@drasros drasros closed this as completed May 26, 2021
@KevinMusgrave KevinMusgrave added the question A general question about the library label May 26, 2021