
Apple MPS error in unet_2d_condition.py #358

Closed
FahimF opened this issue Sep 5, 2022 · 14 comments
Labels: bug Something isn't working

FahimF commented Sep 5, 2022

Describe the bug

When you use an LMSDiscreteScheduler on an Apple Silicon machine, you'll get the following error:
Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead

The offending line is 134 in unet_2d_condition.py.

The current code is:
timesteps = timesteps[None].to(sample.device)

Changing that to the following stops the crash:
timesteps = timesteps[None].long().to(sample.device)

However, you'd probably want to check whether the current device is MPS and only perform the dtype conversion in that case.
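A device-conditional version of that fix might look like the following sketch. The helper name is hypothetical; only the MPS check and the `.long()` cast come from the change described above:

```python
import torch

def move_timesteps(timesteps: torch.Tensor, sample: torch.Tensor) -> torch.Tensor:
    # MPS has no float64 support, so cast only when targeting that device;
    # on other devices the original dtype is left untouched.
    if sample.device.type == "mps":
        timesteps = timesteps.long()
    return timesteps[None].to(sample.device)
```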

Reproduction

When you use an LMSDiscreteScheduler on an Apple Silicon machine you should see the crash.

Logs

No response

System Info

The current main branch from the repo, since that appears to be different from the current release version (0.2.4?).

@FahimF FahimF added the bug Something isn't working label Sep 5, 2022

anton-l commented Sep 5, 2022

Hi @FahimF! There's ongoing work to support mps for schedulers, feel free to check in on the progress at #355
cc @pcuenca


pcuenca commented Sep 8, 2022

This will be fixed by 12f6670 when the branch is merged. Thanks again for your help @FahimF!


FahimF commented Sep 8, 2022

@pcuenca Thank you for the fixes! Much obliged! Will wait for the merge eagerly 😄

@pcuenca pcuenca closed this as completed in 5dda173 Sep 8, 2022

FahimF commented Sep 9, 2022

Unfortunately, with the 0.3.0 release installed, this issue crops up on line 95 in scheduling_utils.py. Sorry 😢

Update: tagging @pcuenca since the ticket is closed and not sure if anybody gets notified.


pcuenca commented Sep 9, 2022

Hi @FahimF! Works for me. Would you mind sharing a code snippet so I can try to reproduce? Also, some information about your setup could be useful. Thanks a lot!


FahimF commented Sep 9, 2022

@pcuenca Thank you for taking a look. Let me try to remove the fix I put in the code and come up with something simple to demonstrate the issue. I know at least one of those crashes happened while using the StableDiffusionImg2ImgPipeline, but I don't know whether that was a prerequisite. I was in a rush, so I simply fixed the code and moved on; I should have kept better records. Sorry.

Here's what I have in my console from that particular crash:

Traceback (most recent call last):
  File "/Users/fahim/miniforge3/envs/ml/lib/python3.8/tkinter/__init__.py", line 1892, in __call__
    return self.func(*args)
  File "gui.py", line 191, in generate_images
    result = pipe(prompt=g_prompt, init_image=image, strength=strength,
  File "/Users/fahim/miniforge3/envs/ml/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/Users/fahim/miniforge3/envs/ml/lib/python3.8/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py", line 205, in __call__
    init_latents = self.scheduler.add_noise(init_latents, noise, timesteps).to(self.device)
  File "/Users/fahim/miniforge3/envs/ml/lib/python3.8/site-packages/diffusers/schedulers/scheduling_lms_discrete.py", line 189, in add_noise
    sigmas = self.match_shape(self.sigmas[ts], noise)
  File "/Users/fahim/miniforge3/envs/ml/lib/python3.8/site-packages/diffusers/schedulers/scheduling_utils.py", line 95, in match_shape
    values = values.to(broadcast_array.device)
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.

I see from the error above that I was also using LMSDiscreteScheduler. Perhaps that would be enough to replicate the issue at your end? If not, please let me know and I'll come up with a simple bit of code since my current code has all sorts of other stuff like a tkinter GUI 😄

As far as setup goes: the latest PyTorch nightly (installed today), diffusers 0.3.0 (installed today), on a 2021 MBP. If you need any additional info (I don't know what would help and what won't), please let me know and I'll provide it.

Update:
As far as the code goes, I believe the most relevant bits would be the two following lines:

sched = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")
pipe = StableDiffusionImg2ImgPipeline.from_pretrained("stable-diffusion-v1-4", scheduler=sched).to(device)
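The underlying failure mode can be shown with a minimal standalone sketch (not the diffusers code itself): NumPy arrays default to float64, and a tensor created from one keeps that dtype, which an MPS device then refuses; casting to float32 before moving avoids the error:

```python
import numpy as np
import torch

# np.linspace produces float64, and torch.from_numpy preserves the dtype,
# so the resulting tensor cannot be moved to an MPS device as-is.
sigmas = np.linspace(0.1, 10.0, num=5)
t = torch.from_numpy(sigmas)

# Casting to float32 first makes the tensor MPS-safe.
t32 = t.to(torch.float32)
```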


pcuenca commented Sep 9, 2022

Thanks a lot, @FahimF, I'll test the image to image pipeline and report back.


FahimF commented Sep 12, 2022

Just updating that this issue might have been due to something in the PyTorch nightlies. I couldn't generate valid images via img2img either; after updating to a newer PyTorch nightly (with a clean install), that problem went away. I then retested this issue, and it's gone too.


pcuenca commented Sep 12, 2022

Actually I could reproduce this issue using the img2img pipeline as you said with the LMSDiscreteScheduler. I created a PR that we should merge soon :)


FahimF commented Sep 12, 2022

Cool 😄 I did see that you had a PR, but I'm letting you know in case I sent you on a wild-goose chase. I really have no idea what happened, but at least two bugs I had 3 days ago have disappeared with the PyTorch nightly from yesterday.


pcuenca commented Sep 12, 2022

That's interesting; PyTorch must have merged some fixes. We'll have to test again in case those ops are now falling back to CPU and performance degrades. Thanks!


FahimF commented Sep 12, 2022

Sure thing 😄 If you need any additional info, please let me know but the nightly build that I'm running where I don't have the issues is:
torch 1.13.0.dev20220911
torchaudio 0.13.0.dev20220911
torchvision 0.14.0.dev20220911


FahimF commented Sep 12, 2022

@pcuenca Sorry to bug you about a totally separate issue, but I tagged you in a closed ticket about an issue which was fixed but still persists (In a different file) here: #239 (comment)

Just mentioning since I don't know if you get notifications for closed tickets 😄 If you'd prefer that I create a new ticket for that, I can do so. Please let me know.


pcuenca commented Sep 13, 2022

Hi @FahimF, actually the branch in #450 fixes the other issue. I'll close this one and reopen #239 instead.

@pcuenca pcuenca closed this as completed Sep 13, 2022
pcuenca added a commit that referenced this issue Sep 14, 2022
* Fix LMS scheduler indexing in `add_noise` #358.

* Fix DDIM and DDPM indexing with mps device.

* Verify format is PyTorch before using `.to()`
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this issue Dec 25, 2023
* Initial support for mps in Stable Diffusion pipeline.

* Initial "warmup" implementation when using mps.

* Make some deterministic tests pass with mps.

* Disable training tests when using mps.

* SD: generate latents in CPU then move to device.

This is especially important when using the mps device, because
generators are not supported there. See for example
pytorch/pytorch#84288.

In addition, the other pipelines seem to use the same approach: generate
the random samples then move to the appropriate device.

After this change, generating an image in MPS produces the same result
as when using the CPU, if the same seed is used.
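The CPU-generator approach described above can be sketched as follows (the shapes and seed are illustrative, not taken from the pipeline):

```python
import torch

# Create the random latents on CPU with a seeded generator, then move them
# to the target device. This keeps results reproducible even on MPS, where
# device-side generators were not supported at the time.
device = "mps" if torch.backends.mps.is_available() else "cpu"
generator = torch.Generator(device="cpu").manual_seed(42)
latents = torch.randn((1, 4, 64, 64), generator=generator, device="cpu")
latents = latents.to(device)
```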

* Remove prints.

* Pass AutoencoderKL test_output_pretrained with mps.

Sampling from `posterior` must be done in CPU.

* Style

* Do not use torch.long for log op in mps device.

* Perform incompatible padding ops in CPU.

UNet tests now pass.
See pytorch/pytorch#84535
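A hedged sketch of that CPU fallback for padding (the helper is hypothetical, not the actual diffusers code):

```python
import torch
import torch.nn.functional as F

def pad_mps_safe(x: torch.Tensor, pad: tuple) -> torch.Tensor:
    # Hypothetical fallback: early MPS builds mishandled some padding ops
    # (see pytorch/pytorch#84535), so run the pad on CPU and move back.
    if x.device.type == "mps":
        return F.pad(x.to("cpu"), pad).to(x.device)
    return F.pad(x, pad)
```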

* Style: fix import order.

* Remove unused symbols.

* Remove MPSWarmupMixin, do not apply automatically.

We do apply warmup in the tests, but not during normal use.
This adopts some PR suggestions by @patrickvonplaten.

* Add comment for mps fallback to CPU step.

* Add README_mps.md for mps installation and use.

* Apply `black` to modified files.

* Restrict README_mps to SD, show measures in table.

* Make PNDM indexing compatible with mps.

Addresses huggingface#239.

* Do not use float64 when using LDMScheduler.

Fixes huggingface#358.

* Fix typo identified by @patil-suraj

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Adapt example to new output style.

* Restore 1:1 results reproducibility with CompVis.

However, mps latents need to be generated in CPU because generators
don't work in the mps device.

* Move PyTorch nightly to requirements.

* Adapt `test_scheduler_outputs_equivalence` to MPS.

* mps: skip training tests instead of ignoring silently.

* Make VQModel tests pass on mps.

* mps ddim tests: warmup, increase tolerance.

* ScoreSdeVeScheduler indexing made mps compatible.

* Make ldm pipeline tests pass using warmup.

* Style

* Simplify casting as suggested in PR.

* Add Known Issues to readme.

* `isort` import order.

* Remove _mps_warmup helpers from ModelMixin.

And just make changes to the tests.

* Skip tests using unittest decorator for consistency.

* Remove temporary var.

* Remove spurious blank space.

* Remove unused symbol.

* Remove README_mps.

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this issue Dec 25, 2023
* Fix LMS scheduler indexing in `add_noise` huggingface#358.

* Fix DDIM and DDPM indexing with mps device.

* Verify format is PyTorch before using `.to()`