
Wrong Calculation of Mean Average Precision #1184

Closed
RostyslavUA opened this issue Aug 15, 2022 · 11 comments · Fixed by #1832 · May be fixed by dhananjaisharma10/metrics#1
Labels: bug / fix, help wanted

@RostyslavUA

RostyslavUA commented Aug 15, 2022

🐛 Bug

Wrong calculation of mean Average Precision (mAP). I get

{'map': tensor(1.),
 'map_50': tensor(1.),
 'map_75': tensor(-1),
 'map_small': tensor(1.),
 'map_medium': tensor(-1.),
 'map_large': tensor(-1.),
 'mar_1': tensor(1.),
 'mar_10': tensor(1.),
 'mar_100': tensor(1.),
 'mar_small': tensor(1.),
 'mar_medium': tensor(-1.),
 'mar_large': tensor(-1.),
 'map_per_class': tensor([ 1.,  1., -1., -1.]),
 'mar_100_per_class': tensor([ 1.,  1., -1., -1.])}

while I would expect 'map': tensor(0.5) and 'map_per_class': tensor([1., 1., 0., 0.]) (see the example below).

To Reproduce

Steps to reproduce the behavior...

import torch
from torchmetrics.detection.mean_ap import MeanAveragePrecision
metric = MeanAveragePrecision(iou_thresholds=[0.5], class_metrics=True)
preds = [
  dict(
    boxes=torch.Tensor([[0, 0, 20, 20],
                        [30, 30, 50, 50],
                        [70, 70, 90, 90],  # FP
                        [100, 100, 120, 120]]),  # FP
    scores=torch.Tensor([0.6, 0.6, 0.6, 0.6]),
    labels=torch.IntTensor([0, 1, 2, 3]),
  )

]

targets = [
  dict(
    boxes=torch.Tensor([[0, 0, 20, 20],
                        [30, 30, 50, 50]]),
    labels=torch.IntTensor([0, 1]),
  )
]
metric.update(preds, targets)
metric.compute()

Expected behavior

The AP for classes 2 and 3 must be 0. Thus, the mAP would become 0.5.
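For reference, the arithmetic behind that expectation, just the numbers above written out:

# Expected per-class AP: 1.0 for classes 0 and 1, 0.0 for classes 2 and 3
# (false positives only), averaged over all four classes.
expected_map = (1.0 + 1.0 + 0.0 + 0.0) / 4  # = 0.5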

Environment

  • TM version is 0.9.3 (installed with pip)
  • Python 3.7.4
  • Torch 1.10.0+cu111 (installed with pip)
  • Windows Subsystem for Linux
@RostyslavUA added the bug / fix and help wanted labels on Aug 15, 2022
@github-actions

Hi! Thanks for your contribution, great first issue!

@MihailMihaylov97

Any updates on this issue?

@dhananjaisharma10

dhananjaisharma10 commented Sep 10, 2022

Why does the above happen?

  • Precision and Recall are initialized as tensors containing -1s here.
  • When the code enters these nested for-loops, the static method MeanAveragePrecision.__calculate_recall_precision_scores is called. For class indices 2 and 3, this method does not change the corresponding locations in the Precision and Recall tensors (precision[:, :, class_idx, :, :] and recall[:, class_idx, :, :], respectively), because for the given example it returns here without making any changes. Hence, all entries for these classes are still -1.
  • Since all entries are -1, this call used for calculating the mean AP returns torch.tensor([-1.0]) (see the sketch below).
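A rough sketch of that behaviour (the tensor shapes are assumed for illustration and are not the real torchmetrics internals):

import torch

# Rough sketch of the behaviour described above; shapes are assumed for
# illustration and are not the real torchmetrics internals.
num_iou_thrs, num_rec_thrs, num_classes = 1, 101, 4
precision = -torch.ones((num_iou_thrs, num_rec_thrs, num_classes))
precision[:, :, 0] = 1.0  # class 0 is filled in during evaluation
precision[:, :, 1] = 1.0  # class 1 is filled in; classes 2 and 3 stay at -1

ap_per_class = []
for c in range(num_classes):
    vals = precision[:, :, c]
    vals = vals[vals > -1]  # only well-defined entries are averaged
    ap_per_class.append(vals.mean() if vals.numel() else torch.tensor(-1.0))
print(torch.stack(ap_per_class))  # tensor([ 1.,  1., -1., -1.])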

What can be done to solve this issue?

In such a case, the static method MeanAveragePrecision.__calculate_recall_precision_scores can be changed to modify precision and recall when there are no GT bboxes. Basically, your example represents an edge case where there cannot be any True Positives (TP), and hence precision must be zero. Recall is undefined in this case, since both the TP and FN counts are zero. Note that in your example you had FPs for classes 2 and 3; if they were not present, then precision would also be undefined for these classes.
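For illustration, a minimal sketch of that edge case with hypothetical counts (not torchmetrics code): predictions but no ground-truth boxes means every detection is a false positive, so precision is 0 while recall is 0/0 and therefore undefined.

# Hypothetical counts for class 2 in the example above (not torchmetrics code).
tp, fp, fn = 0, 1, 0  # one predicted box, no ground truth boxes for this class
precision = tp / (tp + fp) if tp + fp > 0 else None  # 0.0
recall = tp / (tp + fn) if tp + fn > 0 else None     # None, i.e. undefined
print(precision, recall)  # 0.0 None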

Hope it helps! 😄

@SkafteNicki added this to the v0.11 milestone on Sep 14, 2022
@Borda
Member

Borda commented Oct 19, 2022

What can be done to solve this issue?

@dhananjaisharma10 would you be interested to send a draft PR with a suggested fix? 🐰

@Borda modified the milestones: v0.11, v0.10 on Oct 19, 2022
@dhananjaisharma10

What can be done to solve this issue?

@dhananjaisharma10 would you be interested to send a draft PR with a suggested fix? 🐰

@Borda Sure, I can commence work on this.

@Borda
Member

Borda commented Nov 7, 2022

@dhananjaisharma10, how are you doing, can we help you with something? 🦦

@dhananjaisharma10

dhananjaisharma10 commented Nov 7, 2022

@Borda I was a little held up in the last 2-3 weeks, though I am relatively free now. Is it okay if I provide an update by this weekend?

EDIT: I have commenced work on it.

@Borda
Member

Borda commented Nov 7, 2022

Is it okay if I provide an update by this weekend?

Sure, whenever you have time :)

@dhananjaisharma10

@Borda could you please check the Draft PR and guide further. Thanks.

@Borda
Member

Borda commented Nov 9, 2022

@Borda could you please check the Draft PR and guide further. Thanks.

could you please open the PR against this repo? 🦦

@SkafteNicki
Member

Hi @RostyslavUA, thanks for raising this issue.
I am trying to revamp our MeanAveragePrecision implementation (PR #1832) to use pycocotools directly, and in that process I am checking all issues that have to do with the metric, like this one. What I am seeing is that the per-class mean average precision from pycocotools is also

torch.tensor([ 1.,  1., -1., -1.])

and not torch.tensor([1., 1., 0., 0.]) as you are expecting; thus the map value should actually be 1 and not 0.5.

The explanation:
First off, I am not an expert in object detection, but I have been looking at the original code for some time now, and this is just my explanation of why the results actually make sense. For the example you provide, we can quickly identify that the number of correct detections must be 1 for classes {0,1} and 0 for classes {2,3}. To calculate precision/recall we use the formulas:

precision = true detections / all detected boxes
recall = true detections / all ground truth boxes

Precision is fine: it becomes 1.0 for classes {0,1} and 0 for classes {2,3}. However, recall is an issue. For classes {0,1} it is still 1, but for classes {2,3} we would be calculating 0 / 0 because there are no ground truth boxes. Thus, for these two classes the recall is undefined, which means the precision-recall curve is undefined, which means the average precision for these two classes is undefined. This is the exact reason our current implementation (and pycocotools) assigns them a score of -1, which essentially means undefined. As map is defined as the mean over the class-wise average precision, that would naively mean averaging over two classes with a well-defined score and two classes with an undefined score, again resulting in something undefined. The official implementation gets around this problem by simply filtering away scores marked as undefined with -1:
https://github.com/cocodataset/cocoapi/blob/8c9bcc3cf640524c4c20a9c40e89cb6a2f2fa0e9/PythonAPI/pycocotools/cocoeval.py#L452-L455
Thus we arrive at a score of 1 and not 0.5.
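A minimal sketch of that filtering step (it mirrors the linked pycocotools logic rather than copying it): per-class scores of -1 mark undefined classes and are dropped before averaging.

import torch

# Minimal sketch of the averaging described above: -1 marks an undefined
# class and is filtered out before taking the mean, mirroring the linked
# pycocotools logic rather than copying it.
ap_per_class = torch.tensor([1., 1., -1., -1.])
valid = ap_per_class > -1
mean_ap = ap_per_class[valid].mean() if valid.any() else torch.tensor(-1.0)
print(mean_ap)  # tensor(1.) -- the undefined classes do not enter the average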

Hope this makes sense :]
