Over-estimated map value #1793

Closed
leomrtinez opened this issue May 16, 2023 · 3 comments · Fixed by #1832
Labels: bug / fix (Something isn't working), help wanted (Extra attention is needed), v0.11.x

@leomrtinez

leomrtinez commented May 16, 2023

Precision problem

First, thanks for your tool.
For my PhD research project I want to perform object detection on planetary surface images, so I implemented a Faster R-CNN object detection model, which I evaluate with torchmetrics.detection.mean_ap. To understand how the mAP is actually computed, I wrote a small script with arbitrary ground-truth and predicted bounding boxes. The problem is that the computed mAP is far from what I expected.

I use the mean_ap metric as follows:

from torchmetrics.detection.mean_ap import MeanAveragePrecision
eval_metrics = MeanAveragePrecision(iou_thresholds=[0.5])
eval_metrics.reset()
eval_metrics.update(preds, targets)
result = eval_metrics.compute()
map50 = result['map'].item()

Here is a figure of what I do:
[figure "case2_6": the ground-truth and predicted bounding boxes used in the example]
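
For reference, preds and targets are lists with one dict per image ("boxes" in xyxy pixel coordinates, plus "scores" and "labels" for the predictions). A minimal sketch of that structure with hypothetical coordinates, chosen only so that two predictions overlap a ground-truth box and two do not (the real boxes are the ones in the figure above):

import torch

# Hypothetical boxes (xyxy) mimicking the setup: 2 ground-truth boxes and
# 4 predictions, of which two overlap a ground truth and two do not.
preds = [
    {
        "boxes": torch.tensor([[10.0, 10.0, 110.0, 110.0],
                               [300.0, 300.0, 400.0, 400.0],
                               [205.0, 205.0, 305.0, 305.0],
                               [50.0, 400.0, 150.0, 500.0]]),
        "scores": torch.tensor([0.99, 0.70, 0.69, 0.50]),
        "labels": torch.tensor([0, 0, 0, 0]),
    }
]
targets = [
    {
        "boxes": torch.tensor([[10.0, 10.0, 110.0, 110.0],
                               [200.0, 200.0, 300.0, 300.0]]),
        "labels": torch.tensor([0, 0]),
    }
]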

Expected behavior

I understood that when several predictions match a single ground-truth box, it is the prediction with the highest score that is counted as the true positive, but here I don't understand the resulting value.

Following the definition of precision and recall found here: https://en.wikipedia.org/wiki/Precision_and_recall , in this situation I would expect a precision of 0.5 (or 0.66 if the prediction box with score 0.5 is not counted) and a recall of 1.0.

But when I print the resulting value, I get this:
print(map50)
0.8350

I really don't understand where this 0.835 comes from.

Environment

  • TorchMetrics version == 0.11.4
  • Python version == 3.9.16
  • PyTorch version == 2.0.0

Thanks for reading my problem :)

@leomrtinez leomrtinez added bug / fix Something isn't working help wanted Extra attention is needed labels May 16, 2023
@github-actions

Hi! Thanks for your contribution, great first issue!

@SkafteNicki SkafteNicki added this to the future milestone Jun 2, 2023
@tkupek
Contributor

tkupek commented Jun 5, 2023

Hi @leomrtinez, and thanks for your question.
To understand whether this is just a matter of explanation or a bug in the calculation, may I ask you to repeat the experiment with the official mAP implementation from the pycocotools package?
If it's more convenient for you, you can use the torchmetrics interface by switching to torchmetrics==0.6.0; we use pycocotools as the backend there.
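
For reference, a minimal sketch of such a cross-check directly against pycocotools (the boxes, category and image size below are hypothetical placeholders, chosen only to mimic the two-matching / two-non-matching setup; pycocotools expects boxes as [x, y, w, h]):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Hypothetical ground truth: one image with two boxes.
gt = {
    "images": [{"id": 1, "width": 640, "height": 640}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 10, 100, 100], "area": 10000, "iscrowd": 0},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [200, 200, 100, 100], "area": 10000, "iscrowd": 0},
    ],
    "categories": [{"id": 1, "name": "object"}],
}
coco_gt = COCO()
coco_gt.dataset = gt
coco_gt.createIndex()

# Hypothetical detections with the same confidences as in the example:
# the first and third overlap a ground-truth box, the other two do not.
dets = [
    {"image_id": 1, "category_id": 1, "bbox": [10, 10, 100, 100], "score": 0.99},
    {"image_id": 1, "category_id": 1, "bbox": [300, 300, 100, 100], "score": 0.70},
    {"image_id": 1, "category_id": 1, "bbox": [205, 205, 100, 100], "score": 0.69},
    {"image_id": 1, "category_id": 1, "bbox": [50, 400, 100, 100], "score": 0.50},
]
coco_dt = coco_gt.loadRes(dets)

coco_eval = COCOeval(coco_gt, coco_dt, iouType="bbox")
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()
print(coco_eval.stats[1])  # AP at IoU=0.50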

@SkafteNicki
Member

Hi @leomrtinez, thanks for raising this issue.

TL;DR: 0.8350 is the correct value in this case.

Long answer:

I highly recommend going through a tutorial like this one on how to manually calculate the mAP metric, but let me try to walk through it here.
The first step is to calculate the precision-recall curve, not just a single precision/recall pair. To do this we start by ranking the predicted boxes by confidence. I have done that below:
[figure: the example predictions numbered 1-4 in order of decreasing confidence]
We note that boxes 1 and 3 are true positives and boxes 2 and 4 are false positives. Thus, in order of decreasing confidence, the cumulative statistics are (recall that precision is TP / (TP + FP) and recall is TP / number of ground-truth boxes):

Box nb | Conf | Matches | Cumulative TP | Cumulative FP | Precision     | Recall
1      | 0.99 | TP      | 1             | 0             | 1/(1+0) = 1   | 1/2
2      | 0.70 | FP      | 1             | 1             | 1/(1+1) = 1/2 | 1/2
3      | 0.69 | TP      | 2             | 1             | 2/(2+1) = 2/3 | 2/2 = 1
4      | 0.50 | FP      | 2             | 2             | 2/(2+2) = 1/2 | 2/2 = 1
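
The same cumulative statistics can be sketched in a few lines of numpy (the 1/0 flags simply encode the TP/FP column of the table):

import numpy as np

# matches sorted by descending confidence: 1 = TP, 0 = FP (from the table above)
tp = np.array([1, 0, 1, 0])
n_gt = 2  # number of ground-truth boxes

cum_tp = np.cumsum(tp)
cum_fp = np.cumsum(1 - tp)
precision = cum_tp / (cum_tp + cum_fp)  # [1.0, 0.5, 0.667, 0.5]
recall = cum_tp / n_gt                  # [0.5, 0.5, 1.0, 1.0]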

I have plotted the resulting precision-recall curve, based on the values in the table, as the blue curve below:
[figure: the precision-recall curve (blue) and its interpolation (green)]

The next step is calculating the average precision, i.e. the area under this curve. This is done through interpolation, because the curve is a non-smooth function. You can read more about how the interpolation is done here, but in this specific case it corresponds to the green curve (the two straight lines). The area under the interpolated curve is simply the average of the two precision values:

(1 + 2/3) / 2 = 0.833

which roughly corresponds to the 0.8350 that you get (they are not exactly the same due to numerical details of how the interpolated curve is sampled).
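
As a sketch of where the small difference comes from (assuming the COCO convention, which torchmetrics follows to my understanding, of sampling the interpolated precision at 101 evenly spaced recall thresholds instead of integrating the curve exactly):

import numpy as np

# cumulative precision/recall from the table above (4 predictions, 2 GT boxes)
recall    = np.array([0.5, 0.5, 1.0, 1.0])
precision = np.array([1.0, 0.5, 2/3, 0.5])

# interpolation: precision at recall r = max precision at any recall >= r
interp = np.maximum.accumulate(precision[::-1])[::-1]

# sample the interpolated curve at 101 evenly spaced recall thresholds
rec_thr = np.linspace(0.0, 1.0, 101)
idx = np.searchsorted(recall, rec_thr, side="left")
sampled = interp[np.minimum(idx, len(recall) - 1)]  # recall reaches 1.0, so every threshold is covered

print(sampled.mean())  # ~0.835, matching the reported 0.8350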

What messes with your intuition is probably the interpolation at the end, which in some sense over-estimates the area under the curve. But it is the official way to calculate the metric :]

Hope this explains everything.

@Borda Borda modified the milestones: future, v1.0.0 Jun 16, 2023
@Borda Borda added the v0.11.x label Aug 25, 2023