james77777778
Contributor

The SD3 guide needs to be updated once there is a new release of keras-hub, since keras-team/keras-hub#1951 has been merged.

cc @divyashreepathihalli


  backbone = keras_hub.models.StableDiffusion3Backbone.from_preset(
-     "stable_diffusion_3_medium", height=512, width=512, dtype="float16"
+     "stable_diffusion_3_medium", image_shape=(512, 512, 3), dtype="float16"
Contributor

Why does the channel axis need to be specified? Other APIs that take an image_shape arg only use height/width.

Contributor Author

@james77777778 james77777778 Oct 25, 2024

Is that true? I think most of the backbones in keras-hub expect image_shape in (h, w, c) format.
https://github.com/search?q=repo%3Akeras-team%2Fkeras-hub%20image_shape&type=code

This change was requested by @divyashreepathihalli, and I agree that it is more consistent with other backbone APIs.

Additionally, even though most users won't do this, it is still valid to train a diffusion model on images that are not standard RGB.

EDITED:
We need to specify the channel axis to correctly instantiate the VAE image encoder.
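
As a rough illustration of the point above, here is a minimal sketch in plain Python (no keras-hub calls). All three helper functions are hypothetical and not part of keras-hub; the downsampling factor of 8, the 16 latent channels, and the 128 filters are assumptions chosen for illustration. It only shows that the shape of an encoder's first convolution kernel depends on the channel count, which height and width alone cannot supply.

```python
# Hypothetical helpers for illustration only; none of these exist in keras-hub.


def legacy_to_image_shape(height, width, channels=3):
    """Map the old height/width arguments to an (h, w, c) image_shape."""
    return (height, width, channels)


def latent_shape(image_shape, downsample=8, latent_channels=16):
    """Latent shape for a VAE that downsamples spatially by `downsample`.

    The factor of 8 and the 16 latent channels are assumed values.
    """
    height, width, _ = image_shape
    if height % downsample or width % downsample:
        raise ValueError("height and width must be divisible by downsample")
    return (height // downsample, width // downsample, latent_channels)


def first_conv_kernel_shape(image_shape, kernel_size=3, filters=128):
    """Kernel shape of an encoder's first conv layer (filters is assumed).

    The third entry is the number of input channels, so it cannot be
    determined from height and width alone.
    """
    in_channels = image_shape[-1]
    return (kernel_size, kernel_size, in_channels, filters)


rgb = legacy_to_image_shape(512, 512)
print(rgb)                                     # (512, 512, 3)
print(latent_shape(rgb))                       # (64, 64, 16)
print(first_conv_kernel_shape(rgb))            # (3, 3, 3, 128)
print(first_conv_kernel_shape((512, 512, 1)))  # (3, 3, 1, 128)
```

The single-channel case changes the kernel shape while the latent shape stays the same, which is why the channel axis has to be known when the encoder is built.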

Contributor

Ok, sounds good

@fchollet fchollet merged commit 17518e1 into keras-team:master Oct 26, 2024
2 checks passed
@james77777778 james77777778 deleted the update-sd3 branch December 27, 2024 07:26