
Fix rescale normalize inconsistencies in fast image processors #36388

Merged

Conversation

yonigozlan
Member

What does this PR do?

Improves handling of fused rescale and normalize, and allows normalizing integer tensors.
As discussed with @qubvel

@yonigozlan yonigozlan requested a review from qubvel February 25, 2025 04:43
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yonigozlan yonigozlan requested review from qubvel and removed request for qubvel February 25, 2025 19:40
Member

@qubvel qubvel left a comment


Thanks for the update!

```diff
@@ -587,17 +588,17 @@ def preprocess(
     image_mean = tuple(image_mean) if isinstance(image_mean, list) else image_mean
     image_std = tuple(image_std) if isinstance(image_std, list) else image_std

-    image_mean, image_std, interpolation = self._prepare_process_arguments(
+    image_mean, image_std, do_rescale, interpolation = self._prepare_process_arguments(
```
Member


I would rather unpack this to make it explicit, because it's still confusing what happens inside the function. We are also mixing two responsibilities, validation and modification, which is not best practice overall and can make this function unusable for other non-standard processors.

BTW, any reason not to do the image_mean, image_std, rescale_factor fusion inside the rescale_and_normalize function?

Member Author

@yonigozlan yonigozlan Feb 26, 2025


Agreed, I'll split this into more functions.

> BTW, any reason not to do the image_mean, image_std, rescale_factor fusion inside the rescale_and_normalize function?

Yes, the goal was to avoid recomputing the "rescaled" image_mean and image_std on every call to preprocess; computing them outside of rescale_and_normalize allows the use of lru_cache.

Member


Ok, we can also create a separate function and use it within rescale_and_normalize. I apologize if I seem a bit pushy, but it seems logical to assign this responsibility to the transform itself. This approach is fairly standard in other libraries (e.g., Albumentations), so it's better to follow a common pattern. Anyway, I'm happy to discuss if you think it's better to keep it separate.

```python
@lru_cache
def fuse_mean_std_and_rescale_factor(mean: Tuple[float, ...], std: Tuple[float, ...], factor: float):
    ...
```

Member Author


No, that sounds better indeed. Thanks!
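For illustration, a self-contained sketch of such a helper (the body here is hypothetical, not the merged implementation): the fused constants follow from the identity (x · factor − mean) / std = (x − mean/factor) / (std/factor), and lru_cache avoids recomputing them on every preprocess call.

```python
from functools import lru_cache
from typing import Optional, Tuple


@lru_cache
def fuse_mean_std_and_rescale_factor(
    mean: Optional[Tuple[float, ...]],
    std: Optional[Tuple[float, ...]],
    factor: float,
) -> Tuple[Optional[Tuple[float, ...]], Optional[Tuple[float, ...]]]:
    """Fold the rescale factor into the normalization constants, using
    (x * factor - mean) / std == (x - mean / factor) / (std / factor),
    so normalization can run directly on the unrescaled (e.g. uint8) tensor."""
    if mean is None or std is None:
        return mean, std
    fused_mean = tuple(m / factor for m in mean)
    fused_std = tuple(s / factor for s in std)
    return fused_mean, fused_std
```

Because the arguments are hashable tuples and a float, lru_cache can memoize the result across repeated preprocess calls with the same settings.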

Comment on lines +271 to +272

```python
# Fused rescale and normalize
image = self.rescale_and_normalize(image, do_rescale, rescale_factor, do_normalize, image_mean, image_std)
```
Member


Nice!
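As a rough illustration of the fused path under discussion, here is a minimal sketch assuming a CHW torch tensor (not the actual transformers implementation):

```python
import torch


def rescale_and_normalize(
    image: torch.Tensor,
    do_rescale: bool,
    rescale_factor: float,
    do_normalize: bool,
    image_mean: tuple,
    image_std: tuple,
) -> torch.Tensor:
    # When both steps are enabled, fold the rescale factor into the
    # normalization constants and make a single pass over the tensor;
    # casting to float first also makes integer inputs work.
    image = image.to(torch.float32)
    if do_rescale and do_normalize:
        mean = torch.tensor(image_mean).view(-1, 1, 1) / rescale_factor
        std = torch.tensor(image_std).view(-1, 1, 1) / rescale_factor
        return (image - mean) / std
    if do_rescale:
        return image * rescale_factor
    if do_normalize:
        mean = torch.tensor(image_mean).view(-1, 1, 1)
        std = torch.tensor(image_std).view(-1, 1, 1)
        return (image - mean) / std
    return image
```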

@yonigozlan
Member Author

@qubvel I made the requested changes :)
I'll need to merge PR #36406 before fixing siglip2 for this one, though

Member

@qubvel qubvel left a comment


Thanks for iterating! Some tests failed, please have a look

```diff
@@ -260,7 +260,7 @@ def preprocess(
     image_mean = tuple(image_mean) if isinstance(image_mean, list) else image_mean
     image_std = tuple(image_std) if isinstance(image_std, list) else image_std

-    image_mean, image_std, interpolation = self._prepare_process_arguments(
+    image_mean, image_std, do_rescale, interpolation = self._prepare_process_arguments(
```
Member

@qubvel qubvel Feb 27, 2025


I suppose this should be adjusted

@qubvel
Member

qubvel commented Feb 27, 2025

P.S. ahh, ok, I got your comment

Collaborator

@ArthurZucker ArthurZucker left a comment


Approving, but make sure args that are not default and need modifications are explicit args, otherwise we are complicating our life a bit

Comment on lines 600 to 609

```python
if size is not None:
    kwargs["size"] = SizeDict(**get_size_dict(size=size, default_to_square=default_to_square))
if crop_size is not None:
    kwargs["crop_size"] = SizeDict(**get_size_dict(crop_size, param_name="crop_size"))
if isinstance(image_mean, list):
    kwargs["image_mean"] = tuple(image_mean)
if isinstance(image_std, list):
    kwargs["image_std"] = tuple(image_std)
if data_format is None:
    kwargs["data_format"] = ChannelDimension.FIRST
```
Collaborator


If you modify them, why use kwargs and not an arg? Makes more sense to me

@ArthurZucker
Collaborator

Also, we should not have to fetch each kwarg individually to pass them to validation; this defeats the purpose of using kwargs.
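A simplified, hypothetical sketch of the explicit-args shape the reviewers are suggesting (the function name and the `"channels_first"` stand-in are invented for illustration; the real code uses `SizeDict`, `get_size_dict`, and `ChannelDimension.FIRST`):

```python
def prepare_preprocess_args(image_mean, image_std, data_format):
    """Hypothetical refactor sketch: take the values that need adjusting as
    explicit parameters and return the adjusted versions, instead of reading
    from and mutating a kwargs dict in place."""
    if isinstance(image_mean, list):
        image_mean = tuple(image_mean)
    if isinstance(image_std, list):
        image_std = tuple(image_std)
    if data_format is None:
        data_format = "channels_first"  # stand-in for ChannelDimension.FIRST
    return image_mean, image_std, data_format
```

The caller then reassigns the returned values explicitly, which makes the modification visible at the call site rather than hidden inside a kwargs mutation.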

@yonigozlan yonigozlan merged commit 79254c9 into huggingface:main Mar 13, 2025
23 checks passed