
Speeding up mean_iou metric computation #569

Merged: 2 commits into huggingface:main (Apr 18, 2024)

Conversation

@qubvel qubvel (Member) commented Apr 8, 2024

While working with the semantic-segmentation example, I observed that the mean_iou metric takes a significant amount of time to compute (comparable to the training loop itself).

The cause is the conversion of the resulting numpy segmentation maps into the dataset format. Currently the mean_iou metric expects all segmentation arrays to be cast to datasets.Sequence(datasets.Sequence(datasets.Value("uint16"))), which means every single element of the arrays gets converted.

This PR speeds up mean_iou by changing the Features type to datasets.Image().
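
For illustration, here is a minimal sketch of the feature-spec change, assuming the features are declared roughly like this inside the metric script (the exact wiring in its `_info()` may differ):

```python
import datasets

# Previous (slow) spec: every pixel of every segmentation map is cast
# individually to uint16 through nested Sequence features.
slow_features = datasets.Features(
    {
        "predictions": datasets.Sequence(datasets.Sequence(datasets.Value("uint16"))),
        "references": datasets.Sequence(datasets.Sequence(datasets.Value("uint16"))),
    }
)

# New (fast) spec: each segmentation map is treated as a single image,
# so the whole numpy array is encoded in one step.
fast_features = datasets.Features(
    {
        "predictions": datasets.Image(),
        "references": datasets.Image(),
    }
)
```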

Here is a short script to measure computation time:

```python
import time
import numpy as np
import evaluate

image_size = 256
num_images = 100
num_labels = 10

# Prepare some random data
np.random.seed(4215)
references = np.random.rand(num_images, image_size, image_size) * (num_labels - 1)
predictions = np.random.rand(num_images, image_size, image_size) * (num_labels - 1)

references = references.round().astype(np.uint16)
predictions = predictions.round().astype(np.uint16)

# Load the slow and fast implementations
slow_iou = evaluate.load("mean_iou")  # the one from evaluate lib
faster_iou = evaluate.load("./metrics/mean_iou/")  # the local, modified one

# Track the time taken for each implementation
slow_iou_start = time.time()
slow_iou_results = slow_iou.compute(
    predictions=predictions,
    references=references,
    num_labels=num_labels,
    ignore_index=0,
    reduce_labels=False,
)
slow_iou_time = time.time() - slow_iou_start
slow_mean_iou = slow_iou_results["mean_iou"]
print(f"Slow IOU: {slow_mean_iou:.3f} in {slow_iou_time:.2f} seconds")

faster_iou_start = time.time()
faster_iou_results = faster_iou.compute(
    predictions=predictions,
    references=references,
    num_labels=num_labels,
    ignore_index=0,
    reduce_labels=False,
)
faster_iou_time = time.time() - faster_iou_start
faster_mean_iou = faster_iou_results["mean_iou"]
print(f"Faster IOU: {faster_mean_iou:.3f} in {faster_iou_time:.2f} seconds")

# Check results are the same
assert np.isclose(slow_mean_iou, faster_mean_iou), "IOU values do not match"

# >>> Slow IOU: 0.052 in 11.73 seconds
# >>> Faster IOU: 0.052 in 0.26 seconds
```

As a result, we get a 5-50x speedup in metric computation, depending on the number of images, the image size, and the number of classes.

P.S. This PR also fixes the broken example in the mean_iou README (#563).

```python
>>> import numpy as np
>>> mean_iou = evaluate.load("mean_iou")
>>> predicted = np.array([[2, 2, 3], [8, 2, 4], [3, 255, 2]])
>>> ground_truth = np.array([[1, 2, 2], [8, 2, 1], [3, 255, 1]])
>>> # old README call: bare 2D arrays (raises ValueError, see traceback below)
>>> results = mean_iou.compute(predictions=predicted, references=ground_truth, num_labels=10, ignore_index=255)
>>> # fixed call: each segmentation map wrapped in a list
>>> results = mean_iou.compute(predictions=[predicted], references=[ground_truth], num_labels=10, ignore_index=255)
```
@NielsRogge NielsRogge (Contributor) commented Apr 8, 2024

Would this be a breaking change? Because currently the metric works when passing 2D arrays for predicted and ground_truth, but this PR would require them to be lists of 2D arrays?

@qubvel qubvel (Member, Author)
It seems like it didn't work with 2D arrays originally either, only with a list of them.

I tested it locally and got the following error:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/piakubovskii/Projects/evaluate/.venv/lib/python3.10/site-packages/evaluate/module.py", line 450, in compute
    self.add_batch(**inputs)
  File "/Users/piakubovskii/Projects/evaluate/.venv/lib/python3.10/site-packages/evaluate/module.py", line 541, in add_batch
    raise ValueError(error_msg) from None
ValueError: Predictions and/or references don't match the expected format.
Expected format: {'predictions': Sequence(feature=Sequence(feature=Value(dtype='uint16', id=None), length=-1, id=None), length=-1, id=None), 'references': Sequence(feature=Sequence(feature=Value(dtype='uint16', id=None), length=-1, id=None), length=-1, id=None)},
Input predictions: [[  2   2   3]
 [  8   2   4]
 [  3 255   2]],
Input references: [[  1   2   2]
 [  8   2   1]
 [  3 255   1]]
```

There is also an issue opened three weeks ago with the same error: #563

Maybe backward compatibility was broken earlier.

@NielsRogge NielsRogge (Contributor)

Ok, cc'ing @lhoestq here, I assume we can safely merge it in that case

@lhoestq lhoestq (Member)

Indeed, metrics' .compute() takes lists of references and predictions (one per image), not just a single reference/prediction pair.
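
A minimal sketch of that call pattern, with made-up toy data (two 4×4 label maps; the keyword arguments mirror the benchmark script above):

```python
import numpy as np
import evaluate

mean_iou = evaluate.load("mean_iou")

# One segmentation map per image, passed as lists of 2D arrays.
predictions = [np.zeros((4, 4), dtype=np.uint16), np.ones((4, 4), dtype=np.uint16)]
references = [np.zeros((4, 4), dtype=np.uint16), np.ones((4, 4), dtype=np.uint16)]

results = mean_iou.compute(
    predictions=predictions,
    references=references,
    num_labels=2,
    ignore_index=255,
)
print(results["mean_iou"])  # perfect overlap in this toy example -> 1.0
```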

@lhoestq lhoestq (Member) left a comment
Nice improvement !

@NielsRogge NielsRogge (Contributor) left a comment

Ok great, let's merge (when the CI is green)

@qubvel qubvel mentioned this pull request Apr 10, 2024
@qubvel qubvel mentioned this pull request Apr 18, 2024
@lhoestq lhoestq merged commit edb83af into huggingface:main Apr 18, 2024
6 checks passed