
[Bug]: IPAdapter, RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: c10::Half key.dtype: float and value.dtype: float instead. #2208

Closed
frankjiang opened this issue Nov 1, 2023 · 16 comments · Fixed by #2348
Labels: MacOS (MacOS related issue)

Comments

@frankjiang

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits of both this extension and the webui

What happened?

IPAdapter cannot run correctly.

Steps to reproduce the problem

  1. Img2Img
  2. ControlNet (latest)
  3. Choose IPAdapter
  4. Choose ip-adapter_clip_sd15 (default)
  5. Choose ip-adapter-plus-face_sd15 [71693645] (default)
  6. Add prompts
  7. Generate

What should have happened?

Images should be generated normally; instead, generation fails with a RuntimeError.

Commit where the problem happens

webui: 5ef669de080814067961f28357256e8fe27544f4
controlnet: 3011ff6

What browsers do you use to access the UI ?

No response

Command Line Arguments

None

List of enabled extensions

[screenshot of enabled extensions]

Console logs

*** Error completing request
*** Arguments: ('task(510w65ya0s7jt96)', 0, '', '', ['Asian Boy Portrait'], <PIL.Image.Image image mode=RGBA size=512x512 at 0x2A90DE920>, None, None, None, None, None, None, 20, 'DPM++ 2M Karras', 4, 0, 1, 1, 1, 7, 1.5, 0.75, 0, 512, 512, 1, 0, 0, 32, 0, '', '', '', [], False, [], '', <gradio.routes.Request object at 0x32fc100a0>, 0, False, '', 0.8, -1, False, -1, 0, 0, 0, False, 'MultiDiffusion', False, True, 1024, 1024, 96, 96, 48, 4, 'None', 2, False, 10, 1, 1, 64, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 512, 64, True, True, True, False, <scripts.animatediff_ui.AnimateDiffProcess object at 0x36fc1c880>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x32fb90400>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x32fb902b0>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x2afde87f0>, '* `CFG Scale` should be 2 or lower.', True, True, '', '', True, 50, True, 1, 0, False, 4, 0.5, 'Linear', 'None', '<p style="margin-bottom:0.75em">Recommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8</p>', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, False, 'positive', 'comma', 0, False, False, '', '<p style="margin-bottom:0.75em">Will upscale the image by the selected scale factor; use width and height sliders to set tile size</p>', 64, 0, 2, 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, None, None, False, None, None, False, None, None, False, 50) {}
    Traceback (most recent call last):
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/img2img.py", line 208, in img2img
        processed = process_images(p)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/processing.py", line 732, in process_images
        res = process_images_inner(p)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/extensions/sd-webui-controlnet/scripts/batch_hijack.py", line 42, in processing_process_images_hijack
        return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/processing.py", line 867, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/extensions/sd-webui-controlnet/scripts/hook.py", line 451, in process_sample
        return process.sample_before_CN_hack(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/processing.py", line 1528, in sample
        samples = self.sampler.sample_img2img(self, self.init_latent, x, conditioning, unconditional_conditioning, image_conditioning=self.image_conditioning)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 188, in sample_img2img
        samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/sd_samplers_common.py", line 261, in launch_sampling
        return func()
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 188, in <lambda>
        samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 594, in sample_dpmpp_2m
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/sd_samplers_cfg_denoiser.py", line 169, in forward
        x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
        eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
        return self.inner_model.apply_model(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/sd_hijack_utils.py", line 17, in <lambda>
        setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/sd_hijack_utils.py", line 26, in __call__
        return self.__sub_func(self.__orig_func, *args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/modules/sd_hijack_unet.py", line 48, in apply_model
        return orig_func(self, x_noisy.to(devices.dtype_unet), t.to(devices.dtype_unet), cond, **kwargs).float()
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
        x_recon = self.model(x_noisy, t, **cond)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1335, in forward
        out = self.diffusion_model(x, t, context=cc)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/extensions/sd-webui-controlnet/scripts/hook.py", line 858, in forward_webui
        raise e
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/extensions/sd-webui-controlnet/scripts/hook.py", line 855, in forward_webui
        return forward(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/extensions/sd-webui-controlnet/scripts/hook.py", line 762, in forward
        h = module(h, emb, context)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
        x = layer(x, context)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 334, in forward
        x = block(x, context=context[i])
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 269, in forward
        return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 121, in checkpoint
        return CheckpointFunction.apply(func, len(inputs), *args)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
        return super().apply(*args, **kwargs)  # type: ignore[misc]
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 136, in forward
        output_tensors = ctx.run_function(*ctx.input_tensors)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 273, in _forward
        x = self.attn2(self.norm2(x), context=context) + x
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/extensions/sd-webui-controlnet/scripts/controlmodel_ipadapter.py", line 246, in attn_forward_hacked
        out = out + f(self, x, q)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "/Users/frank/git/thirdparty/stable-diffusion-webui/extensions/sd-webui-controlnet/scripts/controlmodel_ipadapter.py", line 406, in forward
        ip_out = torch.nn.functional.scaled_dot_product_attention(q, ip_k, ip_v, attn_mask=None, dropout_p=0.0, is_causal=False)
    RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: c10::Half key.dtype: float and value.dtype: float instead.

---

Additional information

Also occurs in other ip-adapter models, e.g. ip-adapter-plus_sd15 [c817b455]
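For anyone reproducing this outside the webui: torch.nn.functional.scaled_dot_product_attention requires query, key, and value to share a dtype. A minimal sketch (placeholder shapes, not the webui's actual tensors) triggers the same error:

```python
import torch
import torch.nn.functional as F

# Arbitrary placeholder shapes; only the dtypes matter here.
q = torch.randn(1, 8, 77, 64, dtype=torch.half)   # fp16 query, as with an fp16 UNet on MPS
k = torch.randn(1, 8, 16, 64, dtype=torch.float)  # fp32 key, like the unconverted ip_k
v = torch.randn(1, 8, 16, 64, dtype=torch.float)  # fp32 value, like the unconverted ip_v

# Raises: RuntimeError: Expected query, key, and value to have the same dtype ...
F.scaled_dot_product_attention(q, k, v, attn_mask=None, dropout_p=0.0, is_causal=False)
```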

@Seal-Pavel

same issue

@undeadx1 commented Nov 5, 2023

Same issue too. Env: M1 Mac.

@Idmon commented Nov 6, 2023

Same here. IP-Adapter has been buggy and I can't get it to work.

@Osato28 commented Nov 16, 2023

Same here. M1 Mac 8GB, Sonoma 14.1.1.

Information that might be related: Sonoma has previously caused an fp16-related issue with NeuralNet on PyTorch 2.1.0, but that particular problem was solved by updating to 2.2.0.dev20231012. (Issue AUTOMATIC1111/stable-diffusion-webui#13419)

Attempted solutions:

  • Launching SD with --no-half "fixes" the problem by forcing all fp16 values into fp32, but it also slows down each iteration by 8-12 times (from 2 to 16-20 seconds, in my case); see the sketch after this list.
  • UPD: Tried enabling the "Upcast cross attention layer to float32" option in Settings -> Stable Diffusion. Didn't work.
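In tensor terms, all --no-half changes is that the query now arrives in fp32 like the IP-Adapter tensors, so the dtype check passes. A minimal sketch (not webui code) of the passing case:

```python
import torch
import torch.nn.functional as F

# With --no-half nothing is cast to fp16, so query, key, and value all
# stay torch.float32 and the dtype check passes (at a large speed cost).
q = torch.randn(1, 8, 77, 64)  # fp32 by default
k = torch.randn(1, 8, 16, 64)
v = torch.randn(1, 8, 16, 64)
out = F.scaled_dot_product_attention(q, k, v)
print(out.dtype)  # torch.float32
```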

@beltonk commented Nov 17, 2023

Same here. M1 Max

@beltonk commented Nov 17, 2023

This works for me:

Patching https://github.com/Mikubill/sd-webui-controlnet/blob/main/scripts/controlmodel_ipadapter.py#L430
to
ip_out = torch.nn.functional.scaled_dot_product_attention(q, ip_k.half(), ip_v.half(), attn_mask=None, dropout_p=0.0, is_causal=False)

to convert ip_k & ip_v from float to c10::Half by adding .half() to each.

Although I'm not sure if this is the right thing to do, I'm able to generate images with SD 1.5 and SDXL with style transfer using ControlNet + IP Adapter.
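For clarity, the change amounts to the following; the unpatched line is the one shown in the traceback above, and exact line numbers may drift between extension versions:

```python
# Unpatched: on MPS the query arrives as fp16 while ip_k/ip_v stay fp32, so SDPA raises.
ip_out = torch.nn.functional.scaled_dot_product_attention(
    q, ip_k, ip_v, attn_mask=None, dropout_p=0.0, is_causal=False)

# Patched: cast the IP-Adapter key/value tensors to half so all three dtypes match.
ip_out = torch.nn.functional.scaled_dot_product_attention(
    q, ip_k.half(), ip_v.half(), attn_mask=None, dropout_p=0.0, is_causal=False)
```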

@huchenlei (Collaborator)

> This works for me:
>
> Patching https://github.com/Mikubill/sd-webui-controlnet/blob/main/scripts/controlmodel_ipadapter.py#L430 to ip_out = torch.nn.functional.scaled_dot_product_attention(q, ip_k.half(), ip_v.half(), attn_mask=None, dropout_p=0.0, is_causal=False)
>
> to convert ip_k & ip_v from float to c10::Half by adding .half() to each.
>
> Although I'm not sure if this is the right thing to do, I'm able to generate images with SD 1.5 and SDXL with style transfer using ControlNet + IP Adapter.

Can anyone verify this solution on their Mac? I do not have a macOS machine to verify this patch. I will merge this patch into the main branch once it is verified.

@huchenlei added the MacOS label on Nov 21, 2023
@Osato28 commented Nov 21, 2023

> This works for me:
>
> Patching https://github.com/Mikubill/sd-webui-controlnet/blob/main/scripts/controlmodel_ipadapter.py#L430 to ip_out = torch.nn.functional.scaled_dot_product_attention(q, ip_k.half(), ip_v.half(), attn_mask=None, dropout_p=0.0, is_causal=False) to convert ip_k & ip_v from float to c10::Half by adding .half() to each.
>
> Can anyone verify this solution on their Mac? I do not have a macOS machine to verify this patch. I will merge this patch into the main branch once it is verified.

I can't compare the results to an Nvidia machine, so I'm going to post a detailed report with image samples just in case this fix caused some weirdness that I can't detect.

My apologies if this response is a bit long; I'd rather be thorough than miss something that an Nvidia owner would notice.

TL;DR:

  1. Tested on txt2img and img2img. Didn't find any issues.
  2. Outputs in both modes are highly accurate and reproducible.
  3. The slowdown due to IPAdapter seems to be within 15% of the original s/it value.

Testing parameters:

Processor: M1 8GB.

OS: Sonoma 14.1.1.

PyTorch version: 2.2.0.dev20231012

Webui arguments on launch: --skip-torch-cuda-test --upcast-sampling --opt-sub-quad-attention --use-cpu interrogate.

Resolutions: 512x512 and 512x768.

IPAdapter settings: ip-adapter_clip -> ip-adapter-plus-face_sd15, Low VRAM, Control Weight 0.7, Steps 0.5-1.0.


Attaching XY grids below to display the results.

Model: Deliberate v2.

Sampler: DPM++ 2M Karras, sampling steps: 20.

Prompt: female nurse, black hair.

Negative prompt: nsfw, disfigured, (deformed), ugly, saturated, doll, cgi, calligraphy, mismatched eyes, poorly drawn, b&w, blurry, missing, ((malformed)), ((out of frame)), model, letters, mangled, old, surreal, ((bad anatomy)), ((deformed legs)), ((deformed arms)).

IPAdapter image:

[attached reference image: image (22)]

  1. 512x512. No issues. Average time per iteration: 1.555 s/it without ControlNet, 1.6 s/it with IPAdapter.

[XY grid: xyz_grid-0001-2734938831]

  2. 512x768. No issues. Average time per iteration: 2.75 s/it without ControlNet, 2.965 s/it with IPAdapter.

[XY grid: xyz_grid-0002-2734938831]

  3. Reproducibility test: generating from the same seed three times, IPAdapter turned on, to see if outputs differ from each other. No issues.

[XY grid: xyz_grid-0003-2734938831]

  4. img2img test (using only one seed, testing for accuracy and reproducibility at the same time). No issues.

[XY grid: xyz_grid-0001-2734938831]

@beltonk commented Nov 21, 2023

@Osato28 So the fix works for you too, right? Do you spot anything weird in your generations?

Your generations look pretty cool to me. I'm bad at tuning settings for nice outputs...

If the output does work on Apple Silicon, my only concern is the --upcast-sampling, --no-half settings, etc. I have a feeling they are related to the error, and simply typecasting with .half() might break users not on Apple Silicon. I only have an M1 Max, so I'm unable to test on other PC/GPU/CPU setups...

By the way, my COMMANDLINE_ARGS is:

"--skip-torch-cuda-test --upcast-sampling --opt-sub-quad-attention --medvram --use-cpu Interrogate --no-half-vae --disable-safe-unpickle --autolaunch",

which I thought was optimized for Apple Silicon.
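One way to sidestep that concern (an untested sketch, not the patch that was merged; the ip_attention wrapper name is made up for illustration) would be to follow q's dtype instead of hard-coding .half(), so fp32 (--no-half) and fp16 setups both end up with matching dtypes:

```python
import torch
import torch.nn.functional as F

def ip_attention(q: torch.Tensor, ip_k: torch.Tensor, ip_v: torch.Tensor) -> torch.Tensor:
    """Hypothetical dtype-agnostic variant: cast key/value to whatever
    dtype the query arrives in, rather than hard-coding .half()."""
    return F.scaled_dot_product_attention(
        q, ip_k.to(dtype=q.dtype), ip_v.to(dtype=q.dtype),
        attn_mask=None, dropout_p=0.0, is_causal=False)

# fp16 query with fp32 key/value, the failing case from this issue
# (run on a backend with fp16 support, e.g. MPS or CUDA):
out = ip_attention(torch.randn(1, 8, 77, 64, dtype=torch.half),
                   torch.randn(1, 8, 16, 64),
                   torch.randn(1, 8, 16, 64))
print(out.dtype)  # torch.float16
```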

@Osato28 commented Nov 21, 2023

@beltonk I didn't spot anything weird and I can't test it on non-Apple Silicon.

Hence the overly detailed test results: I'm hoping that if there is anything weird, it will be caught by someone with a more traditional GPU.

Thank you for posting that fix, by the way. I couldn't make heads or tails of how IPAdapter worked, and I didn't have the courage to blindly typecast values until the error message went away.


Off-topic:

  1. Prettiness is not due to prompt engineering but due to the model, Deliberate v2. It's as stable and balanced as models get; it would probably give better results with a shorter negative prompt, I just stopped optimizing that prompt halfway.

  2. As for COMMANDLINE_ARGS, I simply kept the most minimal set that prevented crashes and kept performance reasonably high. I didn't optimize it besides that. --medvram does seem to improve performance with heavier ControlNet models, though; added it to my args, thank you.

But I'm afraid that both of those discussions are outside the scope of this issue.

If you wish to initiate testing on several Apple Silicon machines to find an optimal set of COMMANDLINE_ARGS, I think it would be better to start a separate discussion in the main AUTOMATIC1111 repo.

@axeldelafosse

Thank you @beltonk -- your fix worked for me too!

@Lichtfabrik

Thx @beltonk -- works for me as well!

@Osniackal

@beltonk's fix worked for me on an M2 Mac mini.

@MrSegundus

Worked here! (Mac, M2 / 1111 v 1.7)

@alamyrjunior

> This works for me:
>
> Patching https://github.com/Mikubill/sd-webui-controlnet/blob/main/scripts/controlmodel_ipadapter.py#L430 to ip_out = torch.nn.functional.scaled_dot_product_attention(q, ip_k.half(), ip_v.half(), attn_mask=None, dropout_p=0.0, is_causal=False)
>
> to convert ip_k & ip_v from float to c10::Half by adding .half() to each.
>
> Although I'm not sure if this is the right thing to do, I'm able to generate images with SD 1.5 and SDXL with style transfer using ControlNet + IP Adapter.

Which file should I change? I can't find controlmodel_ipadapter.py.

@xuyang16 commented Aug 5, 2024

Thank you, @huchenlei!
#2348
