Add Ernie-Image modular pipeline #13498
Maybe it could live as a custom pipeline on the Hub like https://huggingface.co/krea/krea-realtime-video/tree/main?

@sayakpaul @yiyixuxu I can follow the Krea pattern by keeping a minimal ErnieImageModularPipeline subclass and a MODULAR_PIPELINE_MAPPING entry in diffusers. Want me to restructure the PR that way?

Doing it the Krea way wouldn't require any changes to core Diffusers, no?

You were right, the fully hub-only pattern works. I moved everything to https://huggingface.co/akshan-main/ernie-image-modular and it loads end-to-end with zero diffusers changes:

    from diffusers.modular_pipelines import ModularPipeline
    pipe = ModularPipeline.from_pretrained("akshan-main/ernie-image-modular", trust_remote_code=True)
    pipe.load_components(trust_remote_code=True)

No custom ErnieImageModularPipeline class is needed. I inlined the pipeline properties (vae_scale_factor, num_channels_latents, text_in_dim) into the blocks via direct components.vae.config.* / components.transformer.config.* reads. Should I close this PR, @sayakpaul?

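For readers unfamiliar with the inlining mentioned above, here is a minimal sketch of what reading the pipeline properties straight from component configs could look like. It is not the code in the hub repo: the helper name `read_pipeline_properties`, the config field names (`block_out_channels`, `in_channels`, `text_in_dim`), the power-of-two scale-factor formula, and the mock values are all assumptions chosen for illustration, and the components are stand-ins built with `SimpleNamespace` so the snippet runs without downloading any model.

```python
from types import SimpleNamespace


def read_pipeline_properties(components):
    """Derive values that a pipeline subclass would otherwise expose as properties.

    Hypothetical helper: the config field names below are assumptions,
    not confirmed ERNIE-Image config keys.
    """
    vae_cfg = components.vae.config
    transformer_cfg = components.transformer.config
    return {
        # Assumption: each VAE stage after the first halves the spatial resolution.
        "vae_scale_factor": 2 ** (len(vae_cfg.block_out_channels) - 1),
        # Assumption: the latent channel count equals the transformer input channels.
        "num_channels_latents": transformer_cfg.in_channels,
        # Assumption: the text embedding width is stored under `text_in_dim`.
        "text_in_dim": transformer_cfg.text_in_dim,
    }


# Mock components with made-up values, standing in for loaded models.
components = SimpleNamespace(
    vae=SimpleNamespace(
        config=SimpleNamespace(block_out_channels=[128, 256, 512, 512])
    ),
    transformer=SimpleNamespace(
        config=SimpleNamespace(in_channels=16, text_in_dim=1024)
    ),
)

props = read_pipeline_properties(components)
print(props)
```

Because each block reads these values at call time, nothing has to be registered in core diffusers, which is the point of the hub-only pattern discussed here.
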
Hey @sayakpaul, over the last few weeks, I've profiled QwenImage and QwenImageEdit to identify cudaStreamSynchronize calls causing per-step latency, which led to the QwenImage RoPE sync fix (#13406, merged), plus modular pipelines for LTX (#13378, merged) and HunyuanVideo 1.5 (#13389, merged), and an HV1.5 I2V bug fix (#13439, merged). I've also published modular upscale hub blocks for SDXL, Flux1, and Z-Image, and just pushed an Ernie-Image modular hub repo. Would this qualify me for MVP recognition, or is there more I should do to get there? Happy to keep contributing either way, just wanted to check.

Ooh, but I'd really love to have official modular support for ERNIE-Image. It's a really good model, trained from scratch, and the Baidu team is committed to releasing more checkpoints and building their own community around it. I think we can release official blocks for text-to-image, image-to-image, and edit pipelines (not yet released) and encourage the community to build more custom stuff on the Hub using our official blocks.

What does this PR do?
Adds the modular pipeline for ErnieImage (ErnieImageAutoBlocks + ErnieImageModularPipeline).
Parity verified on A100, bf16, 50 steps, 1024x1024 with baidu/ERNIE-Image.
Colab Notebook: https://colab.research.google.com/gist/akshan-main/f25801763d573209464d6bfd685d708e/modular-ernie-image.ipynb
Addresses #13389 (comment).
Who can review?
@yiyixuxu @sayakpaul