
The way to set which object is TP when more than one detection overlapping a ground truth seems to be wrong #172

Closed
yijiew opened this issue Sep 22, 2022 · 2 comments

Comments

@yijiew

yijiew commented Sep 22, 2022

In the example section, it mentions:

In some images there are more than one detection overlapping a ground truth (Images 2, 3, 4, 5, 6 and 7). For those cases, the predicted box with the highest IOU is considered TP (e.g. in image 1 "E" is TP while "D" is FP because IOU between E and the groundtruth is greater than the IOU between D and the groundtruth). This rule is applied by the PASCAL VOC 2012 metric: "e.g. 5 detections (TP) of a single object is counted as 1 correct detection and 4 false detections".

I don't think we should decide which detection is the TP by IOU alone. The original PASCAL VOC 2012 protocol you cited says:

Detections output by a method were assigned to ground truth objects satisfying the overlap criterion in order ranked by the (decreasing) confidence output. Multiple detections of the same object in an image were considered false detections e.g. 5 detections of a single object counted as 1 correct detection and 4 false detections.

It means that we first fix an IOU threshold; every bbox that meets that threshold is a candidate, and among the candidates the one with the highest detection score is taken as the match. This makes more sense because, when we compute precision/recall, we are actually thresholding the confidence score, so a bbox whose score falls below the threshold effectively "disappears" from the image. Imagine two detections matching one ground truth: one with IOU 90% and confidence 0.2, the other with IOU 80% and confidence 0.8. With an IOU threshold of 0.5, both meet the overlap criterion. Under the highest-IOU rule, the 0.2-score box is the TP and the 0.8-score box is the FP. Now say we compute precision and recall at a confidence threshold of 0.5: the 0.2-score box disappears, so the ground truth goes undetected while the remaining 0.8-score detection is still counted as a false positive. That is wrong, because that detection is definitely a true positive.
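To make the comparison concrete, here is a minimal sketch (not this repository's actual code) of the confidence-ranked matching the PASCAL VOC quote describes. The `iou` helper, the `[x1, y1, x2, y2]` box format, and the example values are assumptions for illustration only:

```python
def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def match_detections(detections, ground_truths, iou_threshold=0.5):
    """detections: list of (box, score). Returns list of (score, is_tp)."""
    matched_gt = set()
    results = []
    # Rank detections by decreasing confidence, as in the PASCAL VOC protocol.
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_gt = 0.0, None
        for gt_idx, gt_box in enumerate(ground_truths):
            if gt_idx in matched_gt:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_gt = overlap, gt_idx
        if best_gt is not None and best_iou >= iou_threshold:
            matched_gt.add(best_gt)
            results.append((score, True))   # TP: highest-confidence match claims the GT
        else:
            results.append((score, False))  # FP: duplicate match or insufficient overlap
    return results

# The example from the comment: one ground truth, two overlapping detections.
gt = [[0, 0, 100, 100]]
dets = [
    ([0, 0, 95, 95], 0.2),   # IOU ~0.90 with the ground truth, low confidence
    ([0, 0, 110, 90], 0.8),  # IOU ~0.83 with the ground truth, high confidence
]
print(match_detections(dets, gt))
# -> [(0.8, True), (0.2, False)]: the high-confidence box is the TP,
#    so it still counts as a TP after thresholding scores at 0.5.
```

With the highest-IOU rule instead, the 0.2-score box would take the TP slot and the 0.8-score box would be marked FP, which produces the problem described above once low-score detections are thresholded away.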

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@Yonglin5170


I agree with you, and this calculation method is also used in YOLOv5, which confused me.
