Evaluation metrics implementation VS pycocotools #13363

Closed · 1 task done
aliencaocao opened this issue Jun 4, 2024 · 5 comments
Labels: question (Further information is requested)
@aliencaocao

Search before asking

Question

The mAP50 and mAP50:95 results I get from running pycocotools and Ultralytics' built-in model.val() are very different, and while they correlate most of the time, this is not always the case. What am I missing here?
I need a fair comparison against other models that use pycocotools for evaluation.

Additional

For pycocotools one would get a full printout like:

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.197
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.574
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.073
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.184
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.256
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.100
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.289
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.289
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.296
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.264

Is there any way I can reproduce this with Ultralytics' implementation?
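
For reference, the printout above comes from a standard COCOeval run along these lines (the file paths are placeholders for my own ground-truth and detection files):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO('instances_val.json')          # ground-truth annotations in COCO format
coco_dt = coco_gt.loadRes('detections.json')  # model detections in COCO results format
coco_eval = COCOeval(coco_gt, coco_dt, 'bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()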

@glenn-jocher (Member)

Hello,

Thank you for reaching out with your query regarding the differences in evaluation metrics between pycocotools and Ultralytics' built-in model.val() method.

The discrepancies you're observing might be due to several factors, including differences in the IoU thresholds, area ranges, and maximum detections (maxDets) settings used during evaluation. Ultralytics' model.val() method and pycocotools might not use identical default settings for these parameters.

To align the evaluation metrics more closely with those provided by pycocotools, you can adjust the IoU thresholds and other relevant parameters in the YOLOv8 validation configuration to match those used by pycocotools. This should help in achieving a fairer comparison between different models.

If you need specific guidance on how to adjust these settings or further assistance, please feel free to ask!

@aliencaocao (Author) commented Jun 4, 2024

Yes, what are the parameters used by Ultralytics, and how can I replicate them with pycocotools in another repo?

Or, if it's easier, how do I replicate pycocotools by changing the Ultralytics implementation?

@glenn-jocher (Member)

Hello!

To align the evaluation metrics between Ultralytics and pycocotools, you can adjust the parameters in Ultralytics' validation settings to match those typically used by pycocotools. Here are the key parameters you might consider:

  1. IoU Thresholds: pycocotools often uses a range of IoU thresholds from 0.50 to 0.95 for calculating average precision. Ensure your Ultralytics validation settings use the same range.

  2. Area Ranges: pycocotools categorizes object detections into small, medium, and large based on their area. You can specify similar categorizations in Ultralytics settings if not already set.

  3. Max Detections (maxDets): This parameter is crucial for calculating average recall. Set this to the same values used in pycocotools (e.g., 1, 10, 100).

To adjust these settings in Ultralytics, you can modify the validation configuration file or pass these parameters directly through the CLI or Python API. For example:

model.val(data='dataset.yaml', imgsz=640, conf=0.25, iou=0.6, max_det=100)

This should help you achieve comparable evaluation metrics between the two tools. If you need more specific adjustments or further assistance, please let me know! 🚀
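
If you would rather score the Ultralytics model with pycocotools itself, one option (a sketch with placeholder names, not an exact drop-in) is to export COCO-format predictions during validation via save_json=True and feed them to the same COCOeval calls you ran above:

from ultralytics import YOLO

model = YOLO('yolov8n.pt')  # placeholder weights; use your trained checkpoint
# save_json=True writes COCO-format detections (predictions.json) into the validation run directory
model.val(data='dataset.yaml', imgsz=640, save_json=True)

The resulting predictions.json can then be loaded with loadRes on your ground-truth COCO object and summarized with COCOeval. Note that the image IDs in predictions.json need to match the IDs in your ground-truth JSON for the evaluation to line up.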

@aliencaocao (Author)

Thank you.

@glenn-jocher (Member)

Hello,

Thank you for reaching out! To help us investigate the issue effectively, could you please provide a minimum reproducible code example? This will allow us to replicate the problem on our end and work towards a solution. You can find guidelines on how to create a minimum reproducible example here.

Additionally, please ensure that you are using the latest versions of torch and ultralytics. If you haven't already, you can upgrade your packages with the following commands:

pip install --upgrade torch
pip install --upgrade ultralytics

Once you've updated your packages and provided the reproducible code, we'll be able to dive deeper into the issue. If you have any other questions or need further assistance, feel free to ask! 😊
