Address ernie-image review findings #13577 #13663
akshan-main wants to merge 4 commits into huggingface:main
Can we also take the opportunity to swap the auto classes for the real classes? It gets confusing for users who pass the text encoder themselves (e.g. with quantization), and it is also annoying to get the warning every time.
@asomoza switched.
bn_mean = self.vae.bn.running_mean.view(1, -1, 1, 1).to(device)
bn_std = torch.sqrt(self.vae.bn.running_var.view(1, -1, 1, 1) + 1e-5).to(device)
Cast the dtype as well, to be safe and for consistency with the modular pipeline.
Suggested change:
- bn_mean = self.vae.bn.running_mean.view(1, -1, 1, 1).to(device)
- bn_std = torch.sqrt(self.vae.bn.running_var.view(1, -1, 1, 1) + 1e-5).to(device)
+ bn_mean = self.vae.bn.running_mean.view(1, -1, 1, 1).to(device=device, dtype=latents.dtype)
+ bn_std = torch.sqrt(self.vae.bn.running_var.view(1, -1, 1, 1) + 1e-5).to(device=device, dtype=latents.dtype)
There could also be a TODO regarding vae.config.batch_norm_eps: it should be used in the future if the checkpoint config is changed.
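To illustrate why the dtype cast matters, here is a minimal sketch rather than the pipeline's actual code. The helper name `bn_stats_for_latents`, the standalone `BatchNorm2d` standing in for `vae.bn`, and the `latents * bn_std + bn_mean` denormalization step are all assumptions made for the example.

```python
import torch


def bn_stats_for_latents(bn: torch.nn.BatchNorm2d, latents: torch.Tensor, eps: float = 1e-5):
    """Return BN mean/std broadcastable over NCHW latents, on their device and dtype.

    `eps` is hardcoded to 1e-5 here.
    TODO: read it from vae.config.batch_norm_eps once the checkpoint config is changed.
    """
    bn_mean = bn.running_mean.view(1, -1, 1, 1).to(device=latents.device, dtype=latents.dtype)
    bn_std = torch.sqrt(bn.running_var.view(1, -1, 1, 1) + eps).to(
        device=latents.device, dtype=latents.dtype
    )
    return bn_mean, bn_std


# Hypothetical stand-ins: a 4-channel BN layer and bfloat16 latents.
bn = torch.nn.BatchNorm2d(4)
latents = torch.randn(2, 4, 8, 8, dtype=torch.bfloat16)

bn_mean, bn_std = bn_stats_for_latents(bn, latents)
denormalized = latents * bn_std + bn_mean  # assumed denormalization step

# Without the cast, the float32 running stats promote the result to float32.
uncast_std = torch.sqrt(bn.running_var.view(1, -1, 1, 1) + 1e-5)
promoted = latents * uncast_std
```

With the cast, `denormalized` stays in the latents' dtype. Without it, PyTorch type promotion silently upcasts `promoted` to float32.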
images = (images.clamp(-1, 1) + 1) / 2
images = images.cpu().permute(0, 2, 3, 1).float().numpy()

if output_type == "pil":
    images = [Image.fromarray((img * 255).astype("uint8")) for img in images]
Can VaeImageProcessor be used here? cc @yiyixuxu: could enforcing VaeImageProcessor be another agent review rule?
Switched both standard and modular to VaeImageProcessor.postprocess. Also fixes output_type="pt" in the standard pipeline (was returning numpy).
What does this PR do?
Partial fix for #13577. Addresses items 1, 2, and 5, per @yiyixuxu's scope:
- […] ErnieImageAutoPromptEnhancerStep to ConditionalPipelineBlocks so use_pe=False actually skips the prompt enhancer (AutoPipelineBlocks selected on presence, not truthiness).
- […] 1e-5 (matches training; the hub config currently reports 1e-4).
- […] output_type="latent" so it runs maybe_free_model_hooks() and honors return_dict, matching the QwenImage/Flux2 pattern.

Before submitting
ernie-image model/pipeline review #13577
Who can review?
@yiyixuxu
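
Note for reviewers on the use_pe change: the difference between selecting on presence and selecting on truthiness can be sketched in plain Python. The two functions below are simplified, hypothetical stand-ins and not the diffusers `AutoPipelineBlocks` / `ConditionalPipelineBlocks` implementations; only the selection rule is modeled.

```python
from typing import Any, Dict, Optional


def select_on_presence(inputs: Dict[str, Any], trigger: str = "use_pe") -> Optional[str]:
    """Pick the enhancer block whenever the trigger input was passed at all."""
    return "prompt_enhancer" if inputs.get(trigger) is not None else None


def select_on_value(inputs: Dict[str, Any], trigger: str = "use_pe") -> Optional[str]:
    """Pick the enhancer block only when the trigger input is truthy."""
    return "prompt_enhancer" if inputs.get(trigger) else None


# Passing use_pe=False still counts as "present", so the enhancer would run.
presence_result = select_on_presence({"use_pe": False})
# Selecting on the value skips the enhancer, which is what the caller asked for.
value_result = select_on_value({"use_pe": False})
enabled_result = select_on_value({"use_pe": True})
```

Here `presence_result` is `"prompt_enhancer"` while `value_result` is `None`, which is the behavior gap this PR closes.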