
[Bug]: Removing --no-half causes errors under macOS with any Torch version, but with it and the latest nightly builds almost all samplers produce only noise #8555

Open
Vitkarus opened this issue Mar 12, 2023 · 37 comments · Fixed by #10201
Labels
bug-report Report of a bug, yet to be confirmed platform:mac Issues that apply to Apple OS X, M1, M2, etc

Comments

@Vitkarus

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What happened?

I've tested different versions of Torch to try to find one that works with --no-half, but no luck.

The 1.14.0.dev20221025 build I'm currently using works fine but throws errors without the --no-half argument. The latest nightly version, 2.1.0.dev20230312, seems to work with this argument and gives a really noticeable performance boost, but almost all samplers break on it.

My results

With --no-half:
There are no errors, but all samplers apart from DDIM and PLMS produce only noise as the final result; those two give normal pictures. The new UniPC sampler produces something that looks slightly better than pure noise, but is still really messy.

Without --no-half:
Errors when using everything except DDIM and PLMS. They also run around 40% faster than with --no-half.

Without --no-half and with --disable-nan-check:
Just black images instead of noise.

Steps to reproduce the problem

I was just changing startup arguments

What should have happened?

Other samplers should work too, I guess

Commit where the problem happens

3c922d9

What platforms do you use to access the UI?

MacOS

What browsers do you use to access the UI?

Mozilla Firefox

Command Line Arguments

--opt-sub-quad-attention --skip-torch-cuda-test --upcast-sampling --use-cpu interrogate --no-half

List of extensions

None

Console logs

Error completing request
Arguments: ('task(9rr6te8wtxyte2o)', 'watermelon', '', [], 20, 16, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/stable-diffusion-webui/modules/processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "/stable-diffusion-webui/modules/processing.py", line 635, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/stable-diffusion-webui/modules/processing.py", line 835, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 351, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 227, in launch_sampling
    return func()
  File "/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 351, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 553, in sample_dpmpp_sde
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 145, in forward
    devices.test_for_nans(x_out, "unet")
  File "/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)

modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

Additional information

Intel Mac with RX 6600 XT, macOS 13.2.1

@Vitkarus Vitkarus added the bug-report Report of a bug, yet to be confirmed label Mar 12, 2023
@Homemaderobot

Homemaderobot commented Mar 14, 2023

I'm getting similar results on my 32GB M1 Max - macOS 13.2.
Torch v1.12.1
Commit a9fed7c

Default Command Line Arguments: --upcast-sampling --no-half-vae --use-cpu interrogate

Problem is with v2-1_768 (2.0, 1.5 & 1.4 work fine). With SD2.1 it errors at 0% with all sampling methods (except DDIM, PLMS & UniPC):

Error completing request
Arguments: ('task(ou61msdjo5m7nj1)', 'photo of a man', '', [], 10, 15, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 768, 768, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "/Users/js/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/Users/js/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/Users/js/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/Users/js/stable-diffusion-webui/modules/processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "/Users/js/stable-diffusion-webui/modules/processing.py", line 636, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/Users/js/stable-diffusion-webui/modules/processing.py", line 836, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/Users/js/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 351, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/js/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 227, in launch_sampling
    return func()
  File "/Users/js/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 351, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/js/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/Users/js/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 594, in sample_dpmpp_2m
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/Users/js/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/js/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 145, in forward
    devices.test_for_nans(x_out, "unet")
  File "/Users/js/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

With DDIM, PLMS & UniPC it gets to 100% with black square and no saved image:

Error completing request
Arguments: ('task(ao35ifugi48be7m)', 'photo of a man', '', [], 10, 19, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 768, 768, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "/Users/js/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/Users/js/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/Users/js/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/Users/js/stable-diffusion-webui/modules/processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "/Users/js/stable-diffusion-webui/modules/processing.py", line 640, in process_images_inner
    devices.test_for_nans(x, "vae")
  File "/Users/js/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in VAE. Use --disable-nan-check commandline argument to disable this check.

I've tried with all extensions off and deleted the /venv directory, with the same results.

I've been absorbed in ControlNet with SD 1.5 for a while, so I'm not sure how long this has been a problem.

Thanks.

@yuhuihu

yuhuihu commented Mar 16, 2023


Exactly the same issue on my device.

@Scaperista

I have the same problem after installing the fresh torch 2.0 GA. Without --no-half I get
input types 'tensor<1x77x1xf16>' and 'tensor<1xf32>' are not broadcast compatible
and with --no-half most of the samplers just produce gibberish images full of noise.
This is on a non-M1 Mac running macOS 12.6.3.
Before the torch upgrade everything worked quite well, aside from some random MPS problems regarding memory watermarks.

@brkirch
Collaborator

brkirch commented Mar 22, 2023

1.14.0.dev20221025 I'm currently using works fine but throws errors without the --no-half argument.

So all samplers work correctly with that build? Would you be able to test builds from the dates in between that and 2.1.0.dev20230312 to determine the latest build that works correctly? I haven't been able to reproduce this issue, but if I knew the exact date of the last working PyTorch build, I could likely determine more about it and make a workaround.

Edit: Actually, if you could give me the traceback from the error you get when using 1.14.0.dev20221025 without --no-half, that may give me a better idea of what is going wrong.
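For anyone attempting the bisection requested here: dated PyTorch nightlies can be pinned by their .devYYYYMMDD version suffix against the nightly package index. The commands below are a sketch — the dates are placeholders, and not every date has a wheel for every platform, so check what the index actually serves (macOS wheels are published under the nightly "cpu" index):

```shell
# Run inside the webui's virtualenv (e.g. `source venv/bin/activate`).
# Pin one dated nightly build; move the date up or down to bisect.
pip install --pre \
    "torch==2.0.0.dev20230201" "torchvision==0.15.0.dev20230201" \
    --extra-index-url https://download.pytorch.org/whl/nightly/cpu

# Confirm which build is active, then relaunch the webui and test a
# k-diffusion sampler (e.g. Euler) at each step.
python -c "import torch; print(torch.__version__)"
```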

@Vitkarus
Author

@brkirch I'm not entirely sure how to use it with 1.14.0 now, because it's installed in my Conda environment and the WebUI always tries to create its own venv and install dependencies there. Maybe I can somehow force the WebUI to skip creating another venv? Before this I used the old version, which I've deleted.

@Scaperista

So I just tested some more torch versions from nightly, and the earliest version (torch-2.0.0.dev20230128, torchvision-0.15.0.dev20230128) showed the same effects for me as the latest one (torch-2.1.0.dev20230327, torchvision-0.16.0.dev2023032).
Without --no-half the latest nightly gives me RuntimeError: "upsample_nearest2d_channels_last" not implemented for 'Half', and with --no-half only noise is generated as the image, e.g. when using the Euler sampler.

@Awethon

Awethon commented Mar 29, 2023

@brkirch
Same for me, just pulled master branch and updated dependencies.
webui config is default
2.1_768 model, stable torch 2.0, M1 Max.

For DPM++ 2M/SDE Karras/non-Karras:

  |  File "/Users/username/Documents/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
  |    processed = process_images(p)
  |  File "/Users/username/Documents/stable-diffusion-webui/modules/processing.py", line 503, in process_images
  |    res = process_images_inner(p)
  |  File "/Users/username/Documents/stable-diffusion-webui/modules/processing.py", line 653, in process_images_inner
  |    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  |  File "/Users/username/Documents/stable-diffusion-webui/modules/processing.py", line 869, in sample
  |    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  |  File "/Users/username/Documents/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 358, in sample
  |    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  |  File "/Users/username/Documents/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 234, in launch_sampling
  |    return func()
  |  File "/Users/username/Documents/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 358, in <lambda>
  |    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  |  File "/Users/username/Documents/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
  |    return func(*args, **kwargs)
  |  File "/Users/username/Documents/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 128, in sample_euler
  |    denoised = model(x, sigma_hat * s_in, **extra_args)
  |  File "/Users/username/Documents/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
  |    return forward_call(*args, **kwargs)
  |  File "/Users/username/Documents/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 152, in forward
  |    devices.test_for_nans(x_out, "unet")
  |  File "/Users/username/Documents/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
  |    raise NansException(message)
  |modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

For DDIM:

  File "/Users/username/Documents/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/Users/username/Documents/stable-diffusion-webui/modules/processing.py", line 503, in process_images
    res = process_images_inner(p)
  File "/Users/username/Documents/stable-diffusion-webui/modules/processing.py", line 657, in process_images_inner
    devices.test_for_nans(x, "vae")
  File "/Users/username/Documents/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in VAE. Use --disable-nan-check commandline argument to disable this check.

@imtiger

imtiger commented Mar 31, 2023

I have the same problem after installing the fresh torch 2.0 GA, without --no-halfs I get the input types 'tensor<1x77x1xf16>' and 'tensor<1xf32>' are not broadcast compatible and with --no-halfs most of the samplers just produce some gibberish images with noise. This is on an non M1 Mac running macOS 12.6.3. Before the torch upgrade everything worked quite well aside from some random MPS problems regarding memory watermarks.

Me too.

@imtiger

imtiger commented Mar 31, 2023

So just tested some more torch versions from nightly and the earliest versio (torch-2.0.0.dev20230128 torchvision-0.15.0.dev20230128) showed the same effects for me, as the latest one (torch-2.1.0.dev20230327 torchvision-0.16.0.dev2023032). Without --no-halfs I get for the latest nightly: RuntimeError: "upsample_nearest2d_channels_last" not implemented for 'Half' and with --no-halfs only noise is generated as image e.g. when using the euler sampler.

Have you solved this? I also have this problem.

@GoJerry

GoJerry commented Apr 9, 2023

I also met the same RuntimeError problem. Can you tell me how you solved it? Thanks.
macOS, M1

@GoJerry

GoJerry commented Apr 9, 2023

So just tested some more torch versions from nightly and the earliest versio (torch-2.0.0.dev20230128 torchvision-0.15.0.dev20230128) showed the same effects for me, as the latest one (torch-2.1.0.dev20230327 torchvision-0.16.0.dev2023032). Without --no-halfs I get for the latest nightly: RuntimeError: "upsample_nearest2d_channels_last" not implemented for 'Half' and with --no-halfs only noise is generated as image e.g. when using the euler sampler.

Have you solved this? I also have this problem.

Me too!! How can this be solved?

@hstk30

hstk30 commented Apr 11, 2023

Error completing request
Arguments: ('task(vim8g0n4kh0utdt)', 'create a classic woman with the pearl necklace', '', [], 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/processing.py", line 503, in process_images
    res = process_images_inner(p)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/processing.py", line 653, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/processing.py", line 869, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 358, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 234, in launch_sampling
    return func()
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 358, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 152, in forward
    devices.test_for_nans(x_out, "unet")
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

Me too! On my Apple M1 Pro, 16GB.

@FreeBlues

Try torch 1.12.1.
Someone said that 1.12.1 is the only version that works on Mac; all other versions have issues.

@hstk30

hstk30 commented Apr 11, 2023

Try torch 1.12.1. Someone said that 1.12.1 is the only version that works on Mac; all other versions have issues.

Still doesn't work for me.

@digital-pers0n

Try torch 1.12.1. Someone said that 1.12.1 is the only version that works on Mac; all other versions have issues.

I think something is broken in Ventura. I can use torch 1.12 and 1.13 with no problems in Monterey (I tested an RX 570 8GB and an RX 6800 XT), and torch 2.x also works correctly if you launch SD with --use-cpu all.
In Ventura nothing works for me.
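Most of the reports in this thread (NansException under fp16, working results with --no-half or --upcast-sampling) fit the usual half-precision failure mode: float16 has both limited precision and a hard range of about 65504, so a large intermediate activation overflows and the resulting inf turns into NaN a few operations later. A small stdlib-only illustration — plain Python, not torch or webui code:

```python
import math
import struct

def to_half(x: float) -> float:
    """Round-trip a float through IEEE 754 binary16 ("half") storage."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# 1) Limited precision: binary16 keeps an 11-bit significand, so integers
#    above 2048 are no longer all exactly representable.
assert to_half(2048.0) == 2048.0
assert to_half(2049.0) == 2048.0  # rounds to the nearest even significand

# 2) Limited range: binary16 tops out around 65504. The stdlib refuses to
#    pack larger values; fp16 hardware instead saturates to inf.
try:
    struct.pack("<e", 1e5)
except OverflowError:
    print("1e5 overflows binary16")

# 3) Once any intermediate value is inf, ordinary sampler arithmetic
#    (differences, ratios) yields NaN, which the webui's check then trips on.
assert math.isnan(float("inf") - float("inf"))
```

This is why upcasting the fragile steps to float32 (--upcast-sampling, "Upcast cross attention layer to float32") or running everything in float32 (--no-half) makes the NaN errors disappear, at a memory and speed cost.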

@ZhelenZ

ZhelenZ commented Apr 28, 2023

It may be related to the nightly PyTorch version. I encountered the same issue, tried reverting back to my older version, and the problem was solved!
Check out the link below. It explains well how and why : )
#7453 (comment)

@congzai520

Me too.

@ZXBmmt

ZXBmmt commented May 7, 2023

Edit stable-diffusion-webui/webui-user.sh, find the COMMANDLINE_ARGS variable, and use the following configuration:
export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

@oivoodoo

oivoodoo commented May 7, 2023

Hi @ZXBmmt

it worked for me!

Thank you!

(Apple M1 Pro)

@Vitkarus
Author

Vitkarus commented May 7, 2023

I updated to macOS 13.3.1 and installed the latest commit (5ab7f21), but unfortunately things got even worse. Now even the DDIM sampler produces wrong images without --no-half.
(screenshot: 2023-05-07 17:14:24)

@iWooda

iWooda commented May 16, 2023

edit “stable-diffusion-webui/webui-user.sh”

find COMMANDLINE_ARGS variable

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

It didn't work for me.

@ctrlaltdylan

Same issue, macOS Monterey (12.6.2 (21G320)) on the master branch.

@huanDreamer

edit “stable-diffusion-webui/webui-user.sh”

find COMMANDLINE_ARGS variable

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

It works on my MacBook M1, thanks.

@finalcolor

Open the webui-macos-env.sh file with your text editor.

Change :
export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate"

To :
export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --no-half --use-cpu interrogate"

@iWooda

iWooda commented Jun 3, 2023

Open the webui-macos-env.sh file with your text editor.

Change : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate"

To : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --no-half --use-cpu interrogate"

Thanks, it worked. There is another solution that doesn't change any files: use the start command line
./webui.sh --precision full --no-half
That also worked for me.

@finalcolor

Great, you're very clever!

@finalcolor

So just tested some more torch versions from nightly and the earliest versio (torch-2.0.0.dev20230128 torchvision-0.15.0.dev20230128) showed the same effects for me, as the latest one (torch-2.1.0.dev20230327 torchvision-0.16.0.dev2023032). Without --no-halfs I get for the latest nightly: RuntimeError: "upsample_nearest2d_channels_last" not implemented for 'Half' and with --no-halfs only noise is generated as image e.g. when using the euler sampler.

Have you solved this? I also have this problem.

Open the webui-macos-env.sh file with your text editor.

Change :
export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate"

To :
export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --no-half --use-cpu interrogate"

The second solution: use the start command line:
./webui.sh --precision full --no-half

@Muscleape

edit “stable-diffusion-webui/webui-user.sh”

find COMMANDLINE_ARGS variable

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

Hi @ZXBmmt

It worked for me (macOS 13.2.1, 16GB, commit baf6946).

Thank you!

(Apple M1 Pro)

@howyeah

howyeah commented Jun 6, 2023

Open the webui-macos-env.sh file with your text editor.
Change : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate"
To : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --no-half --use-cpu interrogate"

Thanks, it worked. There is another solution that doesn't change any files: use the start command line ./webui.sh --precision full --no-half. That also worked for me.

Thank you so much! It worked.

@befriend1314

./webui.sh --no-half

This is how I solved it.

@matto80

matto80 commented Jun 9, 2023

thank you!!

@akx akx added the platform:mac Issues that apply to Apple OS X, M1, M2, etc label Jun 13, 2023
@adevart

adevart commented Jun 23, 2023

edit “stable-diffusion-webui/webui-user.sh”

find COMMANDLINE_ARGS variable

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

Thanks, this worked for me to fix the error. I didn't need the --use-cpu all part; that runs everything on the CPU, which is slower than the GPU. It's not as slow on the CPU as I thought it would be, though, and it only seems to use half of my CPU.

Time for the same 512x512 image was 1:20 on the CPU and 0:20 on the M1 Max GPU. That's roughly the same time as my Nvidia 3060 mobile GPU. The Nvidia GPU has its fan running loud, but the M1 Max barely gets warm; even after hundreds of images there was no fan noise at all. It's great that I can use Stable Diffusion on the Mac, and models like OpenJourney work the same as they do on Windows.

@mclark4386

--opt-split-attention --lowvram --no-half --use-cpu all

What were your command line args, if you don't mind?

@adevart

adevart commented Jun 23, 2023

--opt-split-attention --lowvram --no-half --use-cpu all

What were your command line args, if you don't mind?

I used:

export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half"

@NateLevin1

NateLevin1 commented Jun 25, 2023

This is the correct solution, according to how you are 'supposed' to do it:

Open webui-user.sh.

Replace the line that says:

#export COMMANDLINE_ARGS=""

With:

export COMMANDLINE_ARGS="$COMMANDLINE_ARGS --no-half"

Then everything works great!

@maxwwc

maxwwc commented Jul 24, 2023

edit “stable-diffusion-webui/webui-user.sh”

find COMMANDLINE_ARGS variable

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

You helped a lot. Thanks.

MacOS 12.6.6, Intel.

@AndiDog

AndiDog commented Dec 22, 2023

I had NansException and other errors on an M1 Pro with 32 GB RAM, but upgrading some packages helped (in the virtual environment: pip install --upgrade torch torchvision transformers). That is, use newer versions than those installed by the webui. SD XL 1.0 base + refiner work that way (txt2img and img2img refinement), whereas before I only got errors.
