
Fix rescale normalize inconsistencies in fast image processors #36388

Merged

Conversation

yonigozlan
Member

What does this PR do?

Improves handling of fused rescale and normalize, and allows normalizing integer tensors.
As discussed with @qubvel

@yonigozlan yonigozlan requested a review from qubvel February 25, 2025 04:43
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yonigozlan yonigozlan requested review from qubvel and removed request for qubvel February 25, 2025 19:40
Member

@qubvel qubvel left a comment


Thanks for the update!

```diff
@@ -587,17 +588,17 @@ def preprocess(
     image_mean = tuple(image_mean) if isinstance(image_mean, list) else image_mean
     image_std = tuple(image_std) if isinstance(image_std, list) else image_std

-    image_mean, image_std, interpolation = self._prepare_process_arguments(
+    image_mean, image_std, do_rescale, interpolation = self._prepare_process_arguments(
```
Member


I would rather unpack this to make it explicit, because it's still confusing what happens inside the function. We are also mixing two responsibilities, validation and modification, which is not best practice overall and can make this function unusable for other non-standard processors.

BTW, any reason not to do the image_mean, image_std, rescale_factor fusion inside the rescale_and_normalize function?

Member Author

@yonigozlan yonigozlan Feb 26, 2025


Agreed, I'll split this into more functions.

> BTW, any reason not to do the image_mean, image_std, rescale_factor fusion inside the rescale_and_normalize function?

Yes, the goal was to avoid recomputing the "rescaled" image_mean and image_std on every call to preprocess; computing them outside of rescale_and_normalize allows the use of lru_cache.

Member


Ok, we can also create a separate function and use it within rescale_and_normalize. I apologize if I seem a bit pushy, but it seems logical to assign this responsibility to the transform itself. This approach is fairly standard in other libraries (e.g., Albumentations), so it's better to follow a common pattern. Anyway, I'm happy to discuss if you think it's better to keep it separate.

```python
@lru_cache
def fuse_mean_std_and_rescale_factor(mean: Tuple[float, ...], std: Tuple[float, ...], factor: float):
    ...
```

Member Author


No, that sounds better indeed. Thanks!
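For illustration, a self-contained sketch of such a helper (the body here is hypothetical, not the merged implementation): the fused constants follow from the identity (x · factor − mean) / std = (x − mean/factor) / (std/factor), and lru_cache avoids recomputing them on every preprocess call.

```python
from functools import lru_cache
from typing import Optional, Tuple


@lru_cache
def fuse_mean_std_and_rescale_factor(
    mean: Optional[Tuple[float, ...]],
    std: Optional[Tuple[float, ...]],
    factor: float,
) -> Tuple[Optional[Tuple[float, ...]], Optional[Tuple[float, ...]]]:
    """Fold the rescale factor into the normalization constants, using
    (x * factor - mean) / std == (x - mean / factor) / (std / factor),
    so normalization can run directly on the unrescaled (e.g. uint8) tensor."""
    if mean is None or std is None:
        return mean, std
    fused_mean = tuple(m / factor for m in mean)
    fused_std = tuple(s / factor for s in std)
    return fused_mean, fused_std
```

Because the arguments are hashable tuples and a float, lru_cache can memoize the result across repeated preprocess calls with the same settings.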

Comment on lines +271 to +272

```python
# Fused rescale and normalize
image = self.rescale_and_normalize(image, do_rescale, rescale_factor, do_normalize, image_mean, image_std)
```
Member


Nice!
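As a rough illustration of the fused path under discussion, here is a minimal sketch assuming a CHW torch tensor (not the actual transformers implementation):

```python
import torch


def rescale_and_normalize(
    image: torch.Tensor,
    do_rescale: bool,
    rescale_factor: float,
    do_normalize: bool,
    image_mean: tuple,
    image_std: tuple,
) -> torch.Tensor:
    # When both steps are enabled, fold the rescale factor into the
    # normalization constants and make a single pass over the tensor;
    # casting to float first also makes integer inputs work.
    image = image.to(torch.float32)
    if do_rescale and do_normalize:
        mean = torch.tensor(image_mean).view(-1, 1, 1) / rescale_factor
        std = torch.tensor(image_std).view(-1, 1, 1) / rescale_factor
        return (image - mean) / std
    if do_rescale:
        return image * rescale_factor
    if do_normalize:
        mean = torch.tensor(image_mean).view(-1, 1, 1)
        std = torch.tensor(image_std).view(-1, 1, 1)
        return (image - mean) / std
    return image
```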

@yonigozlan
Member Author

@qubvel I made the requested changes :)
I'll need to merge PR #36406 before fixing siglip2 for this one, though

Member

@qubvel qubvel left a comment


Thanks for iterating! Some tests failed, please have a look

```diff
@@ -260,7 +260,7 @@ def preprocess(
     image_mean = tuple(image_mean) if isinstance(image_mean, list) else image_mean
     image_std = tuple(image_std) if isinstance(image_std, list) else image_std

-    image_mean, image_std, interpolation = self._prepare_process_arguments(
+    image_mean, image_std, do_rescale, interpolation = self._prepare_process_arguments(
```
Member

@qubvel qubvel Feb 27, 2025


I suppose this should be adjusted

@qubvel
Member

qubvel commented Feb 27, 2025

P.S. ahh, ok, I got your comment

Collaborator

@ArthurZucker ArthurZucker left a comment


Approving, but make sure args that are not default and need modifications are explicit args, otherwise we are complicating our life a bit

Comment on lines 600 to 609

```python
if size is not None:
    kwargs["size"] = SizeDict(**get_size_dict(size=size, default_to_square=default_to_square))
if crop_size is not None:
    kwargs["crop_size"] = SizeDict(**get_size_dict(crop_size, param_name="crop_size"))
if isinstance(image_mean, list):
    kwargs["image_mean"] = tuple(image_mean)
if isinstance(image_std, list):
    kwargs["image_std"] = tuple(image_std)
if data_format is None:
    kwargs["data_format"] = ChannelDimension.FIRST
```
Collaborator


If you modify them, why use kwargs and not an arg? Makes more sense to me

@ArthurZucker
Collaborator

Also, we should not have to fetch each kwarg individually to pass them to validation; this defeats the purpose of using kwargs.
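A simplified, hypothetical sketch of the explicit-args shape the reviewers are suggesting (the function name and the `"channels_first"` stand-in are invented for illustration; the real code uses `SizeDict`, `get_size_dict`, and `ChannelDimension.FIRST`):

```python
def prepare_preprocess_args(image_mean, image_std, data_format):
    """Hypothetical refactor sketch: take the values that need adjusting as
    explicit parameters and return the adjusted versions, instead of reading
    from and mutating a kwargs dict in place."""
    if isinstance(image_mean, list):
        image_mean = tuple(image_mean)
    if isinstance(image_std, list):
        image_std = tuple(image_std)
    if data_format is None:
        data_format = "channels_first"  # stand-in for ChannelDimension.FIRST
    return image_mean, image_std, data_format
```

The caller then reassigns the returned values explicitly, which makes the modification visible at the call site rather than hidden inside a kwargs mutation.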

@yonigozlan yonigozlan merged commit 79254c9 into huggingface:main Mar 13, 2025
23 checks passed