Problem with implementation of mAP@R? #331

Closed
drasros opened this issue May 25, 2021 · 3 comments

drasros commented May 25, 2021

Hi,

Thanks for open-sourcing this repo!
I want to report a problem which I noticed in the XBM repo, which uses code similar to yours for the mAP@R calculation (I am not sure which is based on which, so I am opening this issue on both).

I suspect there is a mistake in the implementation of mAP@R:
On line 99, you divide by max_possible_matches_per_row for each query (which for mAP@R is equal to R).
But the divisor should be the actual number of relevant items among the R top-ranked results for that query (as is done for mAP two lines later).
The Wikipedia page on AP gives this definition:
[image: screenshot of the average precision formula from Wikipedia]
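For reference, the screenshot presumably shows Wikipedia's standard definition of average precision, roughly:

AP = \frac{\sum_{k=1}^{n} P(k)\,\mathrm{rel}(k)}{\text{number of relevant items}}

where P(k) is the precision at cut-off k and rel(k) is 1 if the item at rank k is relevant, 0 otherwise.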
(By the way, for mAP, as computed on line 98, one has to decide what to do with queries that have no relevant items, for which AP is therefore undefined. You use max_possible_values_per_row=0 for these rows, but I am not sure this is the most standard choice; maybe it would be good to offer the option to set AP=0, or AP=1, or to exclude these rows...)

Regards,
A

drasros commented May 25, 2021

(To confirm this problem I ran some tests of the XBM implementation against the sklearn implementation. Your implementation is slightly different, so I don't claim with 100% certainty that it is wrong; I just suspect it and want to kindly report the suspicion.)

KevinMusgrave commented May 25, 2021

I want to report a problem which I noticed in the XBM repo, which uses code similar to yours for the mAP@R calculation (I am not sure which is based on which, so I am opening this issue on both).

The accuracy calculation code originates from this repo (pytorch-metric-learning)

I suspect there is a mistake in the implementation of mAP@R:
On line 99, you divide by max_possible_matches_per_row for each query (which for mAP@R is equal to R).
But the divisor should be the actual number of relevant items among the R top-ranked results for that query (as is done for mAP two lines later).

I think the implementation is correct. Here is the definition I'm using

[image: screenshot of the MAP@R definition]
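Assuming this screenshot shows the MAP@R definition from the "A Metric Learning Reality Check" paper, it is roughly:

\mathrm{MAP@R} = \frac{1}{R} \sum_{i=1}^{R} P(i), \quad P(i) = \begin{cases} \text{precision at } i, & \text{if the } i\text{-th retrieval is relevant} \\ 0, & \text{otherwise} \end{cases}

where R is the number of references that share the query's label.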

The divisor is different from the usual mAP. I probably could have chosen a better name than "mAP@R".
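For concreteness, a toy sketch (not the library's code; numbers are made up) of how the two divisors differ on a single query with R = 4 and relevance flags [1, 0, 1, 0]:

import numpy as np

rel = np.array([1, 0, 1, 0])                            # relevance of the R top-ranked neighbors
R = len(rel)
precision_at_i = np.cumsum(rel) / np.arange(1, R + 1)   # precision after each rank

# MAP@R-style AP: sum precision at the relevant positions, divide by R
ap_at_r = (precision_at_i * rel).sum() / R              # (1 + 2/3) / 4 ≈ 0.417

# Classic AP truncated at R: divide by the number of relevant items actually retrieved
ap_classic = (precision_at_i * rel).sum() / rel.sum()   # (1 + 2/3) / 2 ≈ 0.833

The numerator is identical in both cases; only the divisor changes, which is exactly the point of disagreement above.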

(By the way, for mAP, as computed on line 98, one has to decide what to do with queries that have no relevant items, for which AP is therefore undefined. You use max_possible_values_per_row=0 for these rows, but I am not sure this is the most standard choice; maybe it would be good to offer the option to set AP=0, or AP=1, or to exclude these rows...)

Good point. Actually, if you're using AccuracyCalculator (and not the mean_average_precision function directly), then queries with no relevant items should be filtered out in this function:

def try_getting_not_lone_labels(knn_labels, query_labels, not_lone_query_mask):
    if not any(not_lone_query_mask):
        return None, None
    return (
        knn_labels[not_lone_query_mask],
        query_labels[not_lone_query_mask],
    )

which is called here:

def calculate_mean_average_precision(
    self,
    knn_labels,
    query_labels,
    not_lone_query_mask,
    embeddings_come_from_same_source,
    **kwargs
):
    knn_labels, query_labels = try_getting_not_lone_labels(
        knn_labels, query_labels, not_lone_query_mask
    )
    if knn_labels is None:
        return 0
    return mean_average_precision(
        knn_labels,
        query_labels[:, None],
        embeddings_come_from_same_source,
        self.avg_of_avgs,
        self.label_comparison_fn,
    )
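For illustration, a rough usage sketch (this assumes the AccuracyCalculator interface of the version current at the time; argument and metric names may differ in other releases):

import numpy as np
from pytorch_metric_learning.utils.accuracy_calculator import AccuracyCalculator

# Toy data: 8 embeddings compared against themselves; labels 3 and 4 occur only once,
# so those two queries have no relevant items ("lone" queries).
embeddings = np.random.rand(8, 16).astype(np.float32)
labels = np.array([0, 0, 1, 1, 2, 2, 3, 4])

calculator = AccuracyCalculator(include=("mean_average_precision", "mean_average_precision_at_r"))
accuracies = calculator.get_accuracy(
    embeddings, embeddings, labels, labels, embeddings_come_from_same_source=True
)
# The two lone queries are masked out by try_getting_not_lone_labels before
# mean_average_precision is computed, so AP is never evaluated on them.
print(accuracies)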

drasros commented May 26, 2021

The divisor is different from the usual mAP. I probably could have chosen a better name than "mAP@R".

Thanks for your reply. I see. Yes, the name is somewhat misleading in the sense that this metric is not the mean of AP over the subset of R top-ranked candidates, but rather the full mAP with the sum truncated to the relevant items among the top R.

(The same confusion can arise for mAP@k: what do we take the sum over? I am not sure what the most standard convention is.)

@drasros drasros closed this as completed May 26, 2021
@KevinMusgrave KevinMusgrave added the question A general question about the library label May 26, 2021