
Add COCO evaluation metrics #111

Open
NielsRogge opened this issue May 3, 2021 · 12 comments

Comments

@NielsRogge
Contributor

NielsRogge commented May 3, 2021

I'm currently working on adding Facebook AI's DETR model (end-to-end object detection with Transformers) to HuggingFace Transformers. The model is working fine, but regarding evaluation, I'm currently relying on external CocoEvaluator and PanopticEvaluator objects which are defined in the original repository (here and here respectively).

Running these in a notebook gives you nice summaries like this:
[screenshot: the standard COCO AP/AR summary table printed by CocoEvaluator]

It would be great if we could import these metrics from the Datasets library, something like this:

import datasets

metric = datasets.load_metric('coco')

for model_inputs, gold_references in evaluation_dataset:
    model_predictions = model(model_inputs)
    metric.add_batch(predictions=model_predictions, references=gold_references)

final_score = metric.compute()

I think this would be great for object detection and semantic/panoptic segmentation in general, not just for DETR. Reproducing results of object detection papers would be way easier.

However, object detection and panoptic segmentation evaluation is a bit more complex than accuracy (it's more a summary of metrics at different thresholds than a single number). I'm not sure how to proceed here, but I'm happy to help make this possible.
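
For illustration, the output of metric.compute() for such a metric could be a dictionary mirroring pycocotools' 12-value summary rather than a single number. This is only a sketch of a possible return value, not an existing Datasets API, and the numbers are placeholders:

# Hypothetical return value of metric.compute() for COCO-style bbox evaluation;
# keys mirror pycocotools' summary table, values are made up.
final_score = {
    "AP": 0.421,        # AP @ IoU=0.50:0.95, area=all, maxDets=100
    "AP50": 0.623,      # AP @ IoU=0.50
    "AP75": 0.443,      # AP @ IoU=0.75
    "AP_small": 0.205,
    "AP_medium": 0.458,
    "AP_large": 0.611,
    "AR_1": 0.334,      # AR with at most 1 detection per image
    "AR_10": 0.528,
    "AR_100": 0.557,
    "AR_small": 0.312,
    "AR_medium": 0.602,
    "AR_large": 0.742,
}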

@NielsRogge added the "enhancement" label on May 3, 2021
@bhavitvyamalik

Hi @NielsRogge,
I'd like to contribute these metrics to datasets. Shall we start with CocoEvaluator first? Currently, how are you sending the ground truths and predictions to coco_evaluator?
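
For context, pycocotools works with the standard COCO formats: one dict per detection for predictions, and a full COCO annotation file for the ground truth. A minimal sketch (file name, IDs and boxes below are made up):

from pycocotools.coco import COCO

# Ground truth: a COCO-style JSON file with "images", "annotations" and "categories".
coco_gt = COCO("instances_val2017.json")  # placeholder path

# Predictions: the standard COCO results format, one dict per detection,
# boxes as [x, y, width, height] in absolute pixel coordinates.
predictions = [
    {"image_id": 42, "category_id": 18, "bbox": [258.2, 41.3, 348.3, 243.6], "score": 0.94},
    {"image_id": 42, "category_id": 1, "bbox": [61.0, 22.8, 504.0, 609.7], "score": 0.55},
]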

@NielsRogge
Contributor Author

NielsRogge commented Jun 2, 2021

Great!

Here's a notebook that illustrates how I'm using CocoEvaluator: https://drive.google.com/file/d/1VV92IlaUiuPOORXULIuAdtNbBWCTCnaj/view?usp=sharing

The evaluation is near the end of the notebook.
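
In short, the evaluation loop drives DETR's CocoEvaluator roughly as sketched below; the dataloader, the "pixel_values" key and the postprocess helper are placeholders for the notebook's own post-processing of model outputs into the COCO API format:

from coco_eval import CocoEvaluator  # copied from the DETR repo (datasets/coco_eval.py)

coco_evaluator = CocoEvaluator(coco_gt, ["bbox"])  # coco_gt is a pycocotools COCO object

for batch in val_dataloader:
    outputs = model(batch["pixel_values"])
    # convert raw outputs to {image_id: {"scores", "labels", "boxes"}} (placeholder helper)
    results = postprocess(outputs, batch)
    coco_evaluator.update(results)

coco_evaluator.synchronize_between_processes()
coco_evaluator.accumulate()
coco_evaluator.summarize()  # prints the AP/AR summary shown in the screenshot above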

@bhavitvyamalik

bhavitvyamalik commented Jun 3, 2021

I went through the code you mentioned and I think there are 2 options for how we can go ahead:

  1. Implement it the way the DETR authors have done (they rely heavily on the official implementation and focus on a torch dataset here). I feel ours should be something generic instead of PyTorch-specific.
  2. Do an implementation where the user converts their outputs and ground-truth annotations to a pre-defined format and then feeds them into our function to calculate the metrics (this looks very similar to what you proposed above).

In my opinion, the 2nd option looks very clean, but I'm still figuring out how it transforms the box coordinates of coco_gt, which you passed to CocoEvaluator (the ground truth for evaluation). Since your model output was already converted to the COCO API format, I faced few problems there.

@NielsRogge
Contributor Author

Ok, thanks for the update.

Indeed, the metrics API of Datasets is framework agnostic, so we can't rely on a PyTorch-only implementation.

This file is probably what we need to implement.
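
At its core, a framework-agnostic implementation would mostly wrap pycocotools' COCOeval, which only needs plain lists/dicts and JSON files. A minimal sketch (file names are placeholders):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("instances_val2017.json")       # ground truth, COCO annotation format (placeholder path)
coco_dt = coco_gt.loadRes("predictions.json")  # detections, COCO results format (placeholder path)

coco_eval = COCOeval(coco_gt, coco_dt, iouType="bbox")
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()    # prints AP/AR at the standard IoU thresholds and object sizes

stats = coco_eval.stats  # numpy array with the 12 summary values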

@mariosasko transferred this issue from huggingface/datasets on Jun 2, 2022
@lvwerra added the "metric request" label and removed the "enhancement" label on Aug 3, 2022
@kadirnar
Contributor

kadirnar commented Aug 8, 2022

Hi @lvwerra

Do you plan to add a 3rd party application for the COCO mAP metric?

@roboserg

roboserg commented Aug 7, 2023

Is there any update on this? What would be the recommended way of doing COCO eval with Huggingface?

@NielsRogge
Contributor Author

NielsRogge commented Aug 7, 2023

Yes there's an update on this. @rafaelpadilla has been working on adding native support for COCO metrics in the evaluate library, check the Space here: https://huggingface.co/spaces/rafaelpadilla/detection_metrics. For now you have to load the metric as follows:

import evaluate

evaluator = evaluate.load("rafaelpadilla/detection_metrics", json_gt=ground_truth_annotations, iou_type="bbox")

but this one is going to be integrated into the main evaluate library.

This is then leveraged to create the open object detection leaderboard: https://huggingface.co/spaces/rafaelpadilla/object_detection_leaderboard.

@rafaelpadilla

rafaelpadilla commented Aug 8, 2023

Yep, we intend to integrate it into the evaluate library.

Meanwhile, you can use it from here: https://huggingface.co/spaces/rafaelpadilla/detection_metrics

Update: the code with the evaluate AP metric and its variations was transferred to https://huggingface.co/spaces/hf-vision/detection_metrics

@maltelorbach

Hi,
running

import evaluate
evaluator = evaluate.load("hf-vision/detection_metrics", json_gt=ground_truth_annotations, iou_type="bbox")

results in the following error:

ImportError: To be able to use hf-vision/detection_metrics, you need to install the following dependencies['detection_metrics'] using 'pip install detection_metrics' for instance'

How do I load the metric from the hub? Do I need to download the content of that repository manually first?

I'm running evaluate==0.4.1.

@sushil-bharati

Ran into the same issue @maltelorbach posted on 12/14/2023

@sklum

sklum commented Jun 25, 2024

I spent some time digging into this. The issue is that the hf-vision/detection_metrics metric uses a local module for some COCO-related dependencies (it's called detection_metrics, which is why you get an ImportError of that flavor). I tried to restructure the Space to have a flat directory structure, but then ran into #189 because certain dependencies weren't being loaded (or downloaded). I gave up after that. It seems telling that the object detection example just rolls its own metric code with torchmetrics, so it's probably easiest to do that.

@NielsRogge
Contributor Author

NielsRogge commented Jun 28, 2024

Yes, for now we switched to using Torchmetrics, as it already provides a performant implementation with support for distributed training etc., so there's no need to duplicate it. cc @qubvel
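
For anyone landing here, a minimal sketch of that Torchmetrics route (the boxes, scores and labels below are placeholder values):

import torch
from torchmetrics.detection.mean_ap import MeanAveragePrecision

metric = MeanAveragePrecision(iou_type="bbox")

# One dict per image; boxes are (xmin, ymin, xmax, ymax) in absolute coordinates.
preds = [{
    "boxes": torch.tensor([[258.0, 41.0, 606.0, 285.0]]),
    "scores": torch.tensor([0.54]),
    "labels": torch.tensor([18]),
}]
target = [{
    "boxes": torch.tensor([[214.0, 41.0, 562.0, 285.0]]),
    "labels": torch.tensor([18]),
}]

metric.update(preds, target)
results = metric.compute()  # dict with map, map_50, map_75, map_small, mar_100, ...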
