Closed
Labels
bug: Something isn't working
Description
Describe the bug
Hi,
I tried the example from https://huggingface.co/docs/diffusers/main/en/api/pipelines/controlnet (reproduced below) with an inpainting checkpoint. The pipeline breaks with the error below: the UNet's first convolution has weights for 9 input channels, but the pipeline passes it a 4-channel input.
RuntimeError: Given groups=1, weight of size [320, 9, 3, 3], expected input[2, 4, 64, 64] to have 9 channels, but got 4 channels instead
I am raising this because the source code appears to already handle both 4-channel and 9-channel UNets:
https://github.com/huggingface/diffusers/blob/v0.26.2-patch/src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint.py#L1532
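To make the mismatch concrete, here is a minimal sketch of the channel arithmetic, assuming the linked branch concatenates the noisy latents (4 channels), the mask (1 channel) and the masked-image latents (4 channels) only when the UNet config reports 9 input channels. The helper names are hypothetical and not part of diffusers. The idea that the config and the loaded weights disagree is one possible explanation for the error, not something confirmed in this report.

```python
# Hypothetical helpers for illustration only; they are not diffusers APIs.
LATENT_CHANNELS = 4        # VAE latent channels in Stable Diffusion 1.x
MASK_CHANNELS = 1          # downsampled inpainting mask
MASKED_IMAGE_CHANNELS = 4  # VAE latents of the masked image


def channels_fed_to_unet(config_in_channels: int) -> int:
    """Channel count the denoising loop would pass to the UNet.

    Assumption: the pipeline concatenates mask and masked-image latents
    only when the UNet config reports the 9-channel inpainting layout.
    """
    inpaint_channels = LATENT_CHANNELS + MASK_CHANNELS + MASKED_IMAGE_CHANNELS
    if config_in_channels == inpaint_channels:
        return inpaint_channels
    return LATENT_CHANNELS


def check_unet_channels(config_in_channels: int, weight_in_channels: int) -> str:
    """Compare what the pipeline would feed with what the conv weights expect."""
    fed = channels_fed_to_unet(config_in_channels)
    if fed != weight_in_channels:
        return (
            f"mismatch: pipeline feeds {fed} channels, "
            f"but the first conv weight expects {weight_in_channels}"
        )
    return "ok"


# Config says 4 while the checkpoint weights are 9-channel: this is the
# combination that would produce the RuntimeError quoted above.
print(check_unet_channels(config_in_channels=4, weight_in_channels=9))
# Config and weights agree on the inpainting layout.
print(check_unet_channels(config_in_channels=9, weight_in_channels=9))
```

On a loaded pipeline, the two values could be read from `pipe.unet.config.in_channels` and `pipe.unet.conv_in.weight.shape[1]`. If they differ, the checkpoint was probably loaded with a config that does not match its weights.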
Reproduction
# !pip install transformers accelerate
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import (
    ControlNetModel,
    DPMSolverMultistepScheduler,
    StableDiffusionControlNetInpaintPipeline,
)
from diffusers.utils import load_image
init_image = load_image(
    "https://huggingface.co/datasets/diffusers/test-arrays/resolve/main/stable_diffusion_inpaint/boy.png"
)
init_image = init_image.resize((512, 512))
generator = torch.Generator(device="cpu").manual_seed(1)
mask_image = load_image(
    "https://huggingface.co/datasets/diffusers/test-arrays/resolve/main/stable_diffusion_inpaint/boy_mask.png"
)
mask_image = mask_image.resize((512, 512))
def make_canny_condition(image):
    # Build a 3-channel Canny edge map to use as the ControlNet conditioning image.
    image = np.array(image)
    image = cv2.Canny(image, 100, 200)
    image = image[:, :, None]
    image = np.concatenate([image, image, image], axis=2)
    image = Image.fromarray(image)
    return image
control_image = make_canny_condition(init_image)
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetInpaintPipeline.from_single_file(
    "./models/majicmixRealistic_v7-inpainting.safetensors",
    controlnet=controlnet,
    torch_dtype=torch.float16,
    safety_checker=None,
    requires_safety_checker=False,
)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, algorithm_type="sde-dpmsolver++", use_karras_sigmas=True
)
pipe.enable_model_cpu_offload()
# generate image
image = pipe(
    "a handsome man with ray-ban sunglasses",
    num_inference_steps=20,
    generator=generator,
    image=init_image,
    mask_image=mask_image,
    control_image=control_image,
).images[0]
Logs
No response
System Info
- diffusers version: 0.26.2
- Platform: Linux-5.15.0-1039-aws-x86_64-with-glibc2.31
- Python version: 3.11.7
- PyTorch version (GPU?): 2.1.2+cu121 (True)
- Huggingface_hub version: 0.20.3
- Transformers version: 4.37.2
- Accelerate version: 0.21.0
- xFormers version: 0.0.23.post1
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?