
Speeding up mean_iou metric computation #569

Merged: 2 commits into huggingface:main (Apr 18, 2024)

Conversation

@qubvel qubvel (Member) commented Apr 8, 2024

While working with the semantic-segmentation example, I observed that the mean_iou metric takes a significant amount of time to compute (comparable to the training loop itself).

The cause is the conversion of the resulting numpy segmentation maps into the dataset format. Currently the mean_iou metric expects all segmentation arrays to be cast to datasets.Sequence(datasets.Sequence(datasets.Value("uint16"))), which means every single element of the arrays gets converted.

This PR speeds up mean_iou by changing the Features type to datasets.Image().
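
For illustration, here is a minimal sketch of the feature-spec change, assuming the features are declared roughly like this inside the metric script (the exact wiring in its `_info()` may differ):

```python
import datasets

# Previous (slow) spec: every pixel of every segmentation map is cast
# individually to uint16 through nested Sequence features.
slow_features = datasets.Features(
    {
        "predictions": datasets.Sequence(datasets.Sequence(datasets.Value("uint16"))),
        "references": datasets.Sequence(datasets.Sequence(datasets.Value("uint16"))),
    }
)

# New (fast) spec: each segmentation map is treated as a single image,
# so the whole numpy array is encoded in one step.
fast_features = datasets.Features(
    {
        "predictions": datasets.Image(),
        "references": datasets.Image(),
    }
)
```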

Here is a short script to measure computation time:

```python
import time
import numpy as np
import evaluate

image_size = 256
num_images = 100
num_labels = 10

# Prepare some random data
np.random.seed(4215)
references = np.random.rand(num_images, image_size, image_size) * (num_labels - 1)
predictions = np.random.rand(num_images, image_size, image_size) * (num_labels - 1)

references = references.round().astype(np.uint16)
predictions = predictions.round().astype(np.uint16)

# Load the slow and fast implementations
slow_iou = evaluate.load("mean_iou")  # the one from evaluate lib
faster_iou = evaluate.load("./metrics/mean_iou/")  # the local, modified one

# Track the time taken for each implementation
slow_iou_start = time.time()
slow_iou_results = slow_iou.compute(
    predictions=predictions,
    references=references,
    num_labels=num_labels,
    ignore_index=0,
    reduce_labels=False,
)
slow_iou_time = time.time() - slow_iou_start
slow_mean_iou = slow_iou_results["mean_iou"]
print(f"Slow IOU: {slow_mean_iou:.3f} in {slow_iou_time:.2f} seconds")

faster_iou_start = time.time()
faster_iou_results = faster_iou.compute(
    predictions=predictions,
    references=references,
    num_labels=num_labels,
    ignore_index=0,
    reduce_labels=False,
)
faster_iou_time = time.time() - faster_iou_start
faster_mean_iou = faster_iou_results["mean_iou"]
print(f"Faster IOU: {faster_mean_iou:.3f} in {faster_iou_time:.2f} seconds")

# Check results are the same
assert np.isclose(slow_mean_iou, faster_mean_iou), "IOU values do not match"

# >>> Slow IOU: 0.052 in 11.73 seconds
# >>> Faster IOU: 0.052 in 0.26 seconds
```

As a result, we get a 5-50x speedup in metric computation, depending on the number of images, the image size, and the number of classes.

P.S. This PR also fixes the broken example in the mean_iou README (#563).

```python
>>> import numpy as np
>>> mean_iou = evaluate.load("mean_iou")
>>> predicted = np.array([[2, 2, 3], [8, 2, 4], [3, 255, 2]])
>>> ground_truth = np.array([[1, 2, 2], [8, 2, 1], [3, 255, 1]])
>>> # old README call: bare 2D arrays (raises ValueError, see traceback below)
>>> results = mean_iou.compute(predictions=predicted, references=ground_truth, num_labels=10, ignore_index=255)
>>> # fixed call: each segmentation map wrapped in a list
>>> results = mean_iou.compute(predictions=[predicted], references=[ground_truth], num_labels=10, ignore_index=255)
```
@NielsRogge NielsRogge (Contributor) commented Apr 8, 2024

Would this be a breaking change? Because currently the metric works when passing 2D arrays for predicted and ground_truth, but this PR would require them to be lists of 2D arrays?

@qubvel qubvel (Member, Author)
It seems like it didn't work with 2D arrays originally either, only with a list of them.

I tested it locally and got the following error:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/piakubovskii/Projects/evaluate/.venv/lib/python3.10/site-packages/evaluate/module.py", line 450, in compute
    self.add_batch(**inputs)
  File "/Users/piakubovskii/Projects/evaluate/.venv/lib/python3.10/site-packages/evaluate/module.py", line 541, in add_batch
    raise ValueError(error_msg) from None
ValueError: Predictions and/or references don't match the expected format.
Expected format: {'predictions': Sequence(feature=Sequence(feature=Value(dtype='uint16', id=None), length=-1, id=None), length=-1, id=None), 'references': Sequence(feature=Sequence(feature=Value(dtype='uint16', id=None), length=-1, id=None), length=-1, id=None)},
Input predictions: [[  2   2   3]
 [  8   2   4]
 [  3 255   2]],
Input references: [[  1   2   2]
 [  8   2   1]
 [  3 255   1]]
```

There is also an issue opened three weeks ago with the same error: #563

Maybe backward compatibility was broken earlier.

@NielsRogge NielsRogge (Contributor)

Ok, cc'ing @lhoestq here, I assume we can safely merge it in that case

@lhoestq lhoestq (Member)

Indeed, metrics' .compute() takes lists of references and predictions (one per image), not just a single reference/prediction pair.
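
A minimal sketch of that call pattern, with made-up toy data (two 4×4 label maps; the keyword arguments mirror the benchmark script above):

```python
import numpy as np
import evaluate

mean_iou = evaluate.load("mean_iou")

# One segmentation map per image, passed as lists of 2D arrays.
predictions = [np.zeros((4, 4), dtype=np.uint16), np.ones((4, 4), dtype=np.uint16)]
references = [np.zeros((4, 4), dtype=np.uint16), np.ones((4, 4), dtype=np.uint16)]

results = mean_iou.compute(
    predictions=predictions,
    references=references,
    num_labels=2,
    ignore_index=255,
)
print(results["mean_iou"])  # perfect overlap in this toy example -> 1.0
```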

@lhoestq lhoestq (Member) left a comment
Nice improvement !

@NielsRogge NielsRogge (Contributor) left a comment

Ok great, let's merge (when the CI is green)

@qubvel qubvel mentioned this pull request Apr 10, 2024
@qubvel qubvel mentioned this pull request Apr 18, 2024
@lhoestq lhoestq merged commit edb83af into huggingface:main Apr 18, 2024
6 checks passed