Resolve mean/std swap for VITDet backbone #2087

ianstenbit · 2023-09-25T18:03:54Z

@tirthasheshpatel does your SAM demo perform as expected with this change?

tirthasheshpatel

Thanks @ianstenbit for the quick fix! Quoting from #2086 (comment):

> Here's the demo I usually run to test that the model is close to the original model at facebookresearch/segment_anything.

Just one non-blocking suggestion: what do you think about making the normalization step opt-in rather than always on? That way users can choose the preprocessing they want for their images.

ianstenbit · 2023-09-26T15:24:05Z

Here's the demo

@tirthasheshpatel what is the purpose of making this optional? You mean the mean/stddev bit I assume?
Throughout KerasCV we've standardized on 0-255 as the default input format, so unless there's a compelling reason I'd like to stick with the standard.

tirthasheshpatel · 2023-09-26T16:13:39Z

> what is the purpose of making this optional? You mean the mean/stddev bit I assume?

Yes, I was referring to the mean/stddev part. The reason is to give users more control over input preprocessing (which might be useful for training). I am not against the 0-255 input precedent (having that is great!); I'd just like something like an imagenet_rescaling=True argument that would let users control whether they want the rescaling or not. What do you think?
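To make the suggestion concrete, here is a minimal sketch of what such an opt-in flag could look like for a single pixel. The helper `preprocess_pixel` is hypothetical and is not part of KerasCV; the `imagenet_rescaling` name comes from the comment above, and the mean/std values are the standard ImageNet statistics for inputs scaled to [0, 1], which is an assumption about what the backbone would use.

```python
# ImageNet per-channel statistics for inputs scaled to [0, 1] (assumed values).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def preprocess_pixel(rgb, imagenet_rescaling=True):
    """Hypothetical helper: normalize one 0-255 RGB pixel, or pass it through."""
    if not imagenet_rescaling:
        # Opt-out path: the caller is responsible for any preprocessing.
        return tuple(float(c) for c in rgb)
    # Opt-in path: scale to [0, 1], subtract the mean, divide by the std.
    return tuple(
        (c / 255.0 - m) / s
        for c, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


white = (255, 255, 255)
rescaled = preprocess_pixel(white)
untouched = preprocess_pixel(white, imagenet_rescaling=False)
```

With the flag off, the pixel is returned unchanged as floats, so the caller would have to apply whatever normalization the weights expect.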

ianstenbit · 2023-09-26T16:33:36Z

> > what is the purpose of making this optional? You mean the mean/stddev bit I assume?
>
> Yes, I was referring to the mean/stddev part. The reason is to give users more control over input preprocessing (which might be useful for training). I am not against the 0-255 input precedent (having that is great!); I'd just like something like an imagenet_rescaling=True argument that would let users control whether they want the rescaling or not. What do you think?

I'm not sure what the value of this would be -- is there a reason that a user would care how the model internally does rescaling, as long as the API is clear about what is expected from the caller?

ianstenbit · 2023-09-26T16:33:40Z

/gcbrun

tirthasheshpatel · 2023-09-26T16:40:23Z

> I'm not sure what the value of this would be -- is there a reason that a user would care how the model internally does rescaling, as long as the API is clear about what is expected from the caller?

I just thought there would be cases where the ImageNet rescaling would mess a bit with padded inputs (making borders non-zero). There is a way to avoid that (using the mean as the padding value), but some users might find it unintuitive.

Thinking in the wild here :) Please feel free to make a decision based on your best judgment!
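The padding concern can be illustrated with a small sketch. It assumes the model normalizes 0-255 inputs with the standard ImageNet mean/std; `normalize_pixel` is a hypothetical helper, not the KerasCV implementation. A zero-valued padding pixel becomes non-zero after normalization, while a padding pixel set to the per-channel mean (in 0-255 units) lands at approximately zero.

```python
# ImageNet per-channel statistics for inputs scaled to [0, 1] (assumed values).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Hypothetical helper: ImageNet-normalize one 0-255 RGB pixel."""
    return tuple(
        (c / 255.0 - m) / s
        for c, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


# Padding with zeros: the border is no longer zero after normalization.
zero_pad = (0, 0, 0)
normalized_zero_pad = normalize_pixel(zero_pad)

# Padding with the mean (expressed in 0-255 units): the border normalizes
# to approximately zero, up to floating-point rounding.
mean_pad = tuple(255.0 * m for m in IMAGENET_MEAN)
normalized_mean_pad = normalize_pixel(mean_pad)
```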

ianstenbit · 2023-09-27T16:23:19Z

> > I'm not sure what the value of this would be -- is there a reason that a user would care how the model internally does rescaling, as long as the API is clear about what is expected from the caller?
>
> I just thought there would be cases where the ImageNet rescaling would mess a bit with padded inputs (making borders non-zero). There is a way to avoid that (using the mean as the padding value), but some users might find it unintuitive.
>
> Thinking in the wild here :) Please feel free to make a decision based on your best judgment!

Ahh I see your point here -- I think in this case it's okay. Zero padding will still end up being represented as the same pixel values as all-black pixels, which is logically the same as if no rescaling were done, it's just that the values aren't literally 0.
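A short sketch of this point, again assuming standard ImageNet normalization of 0-255 inputs and using a hypothetical `normalize_pixel` helper rather than the actual backbone code: a zero-padded border pixel and a genuinely black image pixel go through the same transform, so they remain indistinguishable afterwards even though neither is literally 0.

```python
# ImageNet per-channel statistics for inputs scaled to [0, 1] (assumed values).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Hypothetical helper: ImageNet-normalize one 0-255 RGB pixel."""
    return tuple(
        (c / 255.0 - m) / s
        for c, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


# One image row: a black pixel followed by a white pixel.
row = [(0, 0, 0), (255, 255, 255)]

# Zero-pad the row on the right by one pixel, then normalize every pixel.
padded_row = row + [(0, 0, 0)]
normalized_row = [normalize_pixel(p) for p in padded_row]

black_pixel, white_pixel, padding_pixel = normalized_row
padding_matches_black = padding_pixel == black_pixel
```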

tirthasheshpatel · 2023-09-27T16:25:02Z

> Zero padding will still end up being represented as the same pixel values as all-black pixels, which is logically the same as if no rescaling were done, it's just that the values aren't literally 0.

Ah OK, makes sense. Thanks for clarifying! Feel free to merge!

Resolve mean/std swap for VITDet backbone

10117c5

tirthasheshpatel mentioned this pull request Sep 25, 2023

Update VITDet to conform to KerasCV scaling standards #2086

Merged

tirthasheshpatel approved these changes Sep 25, 2023


ianstenbit marked this pull request as ready for review September 26, 2023 15:24

ianstenbit requested a review from jbischof September 26, 2023 15:24

ianstenbit requested a review from mattdangerw September 27, 2023 16:23

jbischof approved these changes Sep 28, 2023


ianstenbit merged commit 0edf88c into keras-team:master Sep 28, 2023
8 of 9 checks passed

ianstenbit deleted the vitdet-mean-std branch September 28, 2023 15:38

ghost pushed a commit to y-vectorfield/keras-cv that referenced this pull request Nov 16, 2023

Resolve mean/std swap for VITDet backbone (keras-team#2087)

0952ae3

yuvraj-wale pushed a commit to yuvraj-wale/keras-cv that referenced this pull request Feb 8, 2024

Resolve mean/std swap for VITDet backbone (keras-team#2087)

3feef5b
