indices should be either on cpu or on the same device as the indexed tensor (cpu) #239
Comments
@patil-suraj could you take a look here?
I downloaded the source and tracked the problem down to this line...
Solved: …
Is …
I think it would have been supported, but `device` was cuda and the code is doing something with numpy, which lives on the CPU.
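For reference, here is a minimal sketch (not taken from this issue, and assuming a CUDA device is available) that reproduces the same class of error: indexing a CPU tensor with an index tensor that lives on another device.

```python
import torch

# A CPU-side lookup table, similar in spirit to a scheduler's cached alphas.
alphas = torch.linspace(0.9, 0.1, steps=1000)

# A timestep index that lives on the GPU.
t = torch.tensor(500, device="cuda")

# alphas[t] raises:
#   "indices should be either on cpu or on the same device as the indexed tensor (cpu)"

# Moving the index back to the CPU (or the table onto the GPU) avoids the error.
value = alphas[t.cpu()]
print(value)
```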
This is working fine for me. Here's the code snippet I tried:

```python
import requests
from PIL import Image
from io import BytesIO
from torch import autocast
from image_to_image import *

device = "cuda"
pipei2i = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
    use_auth_token=True,
).to(device)

response = requests.get('https://pbs.twimg.com/media/Fa1_7_vWYAEwfX-.png')
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((512, 512))
init_image = preprocess(init_image)

prompt = "a cat, artstation"
with autocast("cuda"):
    outputs = pipei2i(prompt=prompt, init_image=init_image, strength=0.75,
                      num_inference_steps=75, guidance_scale=0.75)
```
Strange that it wouldn't on mine then! I got it to work with the `timesteps.cpu()` change anyway, so it's not a problem anymore.
@andydhancock thanks! On an Apple M1 it worked on the CPU after removing the FP16 options:

```python
import requests
from PIL import Image
from io import BytesIO
from torch import autocast
from image_to_image import *

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f'running on {device}')

pipei2i = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    #revision="fp16",
    #torch_dtype=torch.float16,
    use_auth_token=True
).to(device)

response = requests.get('https://pbs.twimg.com/media/Fa1_7_vWYAEwfX-.png')
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((512, 512))
init_image = preprocess(init_image)

prompt = "a cat, artstation"
samples = 1
outputs = []
if device == 'cuda':
    with autocast("cuda"):
        outputs = pipei2i(prompt=[prompt] * samples,
                          init_image=init_image,
                          #strength=0.75,
                          num_inference_steps=50,
                          guidance_scale=7.5)
else:
    outputs = pipei2i(prompt=[prompt] * samples,
                      init_image=init_image,
                      #strength=0.75,
                      num_inference_steps=50,
                      guidance_scale=7.5)
```

A question about the default values: `guidance_scale` defaults to 7.5, but your snippet passes 0.75; is that intentional?
Yeah, `guidance_scale` should be 7.5 and `strength` 0.75; that was a typo. They're variables in my actual code.
Wait, why was this closed? Shouldn't this be fixed in the source? Or is the Colab someone else's problem?
Doesn't this work for you @cherrerajobs?

```python
import torch
import requests
from PIL import Image
from io import BytesIO
from torch import autocast
from diffusers import StableDiffusionImg2ImgPipeline

device = "cuda"
pipei2i = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
    use_auth_token=True
).to(device)

response = requests.get('https://pbs.twimg.com/media/Fa1_7_vWYAEwfX-.png')
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((512, 512))

prompt = "a cat, artstation"
with autocast("cuda"):
    outputs = pipei2i(prompt=prompt, init_image=init_image, strength=0.75,
                      num_inference_steps=75, guidance_scale=0.75)
```

It should work now on master.
Using the code from the main branch, the following fails on Apple Silicon devices:

```python
import torch
import requests
from PIL import Image
from io import BytesIO
from torch import autocast
from diffusers import StableDiffusionImg2ImgPipeline

device = torch.device(
    "cuda" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)

pipei2i = StableDiffusionImg2ImgPipeline.from_pretrained(
    "stable-diffusion-v1-4",
).to(device)

response = requests.get('https://pbs.twimg.com/media/Fa1_7_vWYAEwfX-.png')
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((512, 512))

prompt = "a cat, artstation"
with autocast("cpu"):
    outputs = pipei2i(prompt=prompt, init_image=init_image, strength=0.75,
                      num_inference_steps=75, guidance_scale=0.75)
```

The error is as follows: …

It appears that … If I change references to …
Interesting! I don't have a Mac; @patil-suraj @pcuenca, do you have one? Could you give it a try maybe? :-)
@patrickvonplaten Thanks for looking into this. I'm able to get around this currently by modifying the code at line 268 in scheduling_pndm.py to do the following instead of the existing code:

…

Then I use …
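The replacement code itself wasn't captured above. Purely as an illustration (a hypothetical helper with assumed names, not the commenter's actual patch), this kind of workaround usually amounts to forcing the timestep onto the CPU, or converting it to a plain Python int, before it is used as an index:

```python
import torch

def as_cpu_index(timestep):
    # Hypothetical helper, not the actual diffusers code: make sure the timestep
    # can safely index CPU-side tensors or numpy arrays, whatever device it is on.
    if torch.is_tensor(timestep):
        return int(timestep.detach().cpu().item())
    return int(timestep)
```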
Thanks guys 😄 Is there anything I should have done differently about this bug report? Just asking in case I should have reported MPS issues elsewhere. I do have two other issues (both on MPS) which are not crashes, but the behaviour is different from what is expected (and from what is observed on non-MPS devices). So I'm just wondering whether I should report them now or wait until you're done with the MPS changes.
@FahimF Thanks for reporting this! By all means, do report any other issues you've found. If you are sure they are related to the mps device, …
@pcuenca Thanks, will do 🙂 And yes, both are mps issues.
A commit referencing this issue:

* Initial support for mps in Stable Diffusion pipeline.
* Initial "warmup" implementation when using mps.
* Make some deterministic tests pass with mps.
* Disable training tests when using mps.
* SD: generate latents in CPU then move to device. This is especially important when using the mps device, because generators are not supported there. See for example pytorch/pytorch#84288. In addition, the other pipelines seem to use the same approach: generate the random samples then move to the appropriate device. After this change, generating an image in MPS produces the same result as when using the CPU, if the same seed is used.
* Remove prints.
* Pass AutoencoderKL test_output_pretrained with mps. Sampling from `posterior` must be done in CPU.
* Style
* Do not use torch.long for log op in mps device.
* Perform incompatible padding ops in CPU. UNet tests now pass. See pytorch/pytorch#84535
* Style: fix import order.
* Remove unused symbols.
* Remove MPSWarmupMixin, do not apply automatically. We do apply warmup in the tests, but not during normal use. This adopts some PR suggestions by @patrickvonplaten.
* Add comment for mps fallback to CPU step.
* Add README_mps.md for mps installation and use.
* Apply `black` to modified files.
* Restrict README_mps to SD, show measures in table.
* Make PNDM indexing compatible with mps. Addresses #239.
* Do not use float64 when using LDMScheduler. Fixes #358.
* Fix typo identified by @patil-suraj. Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Adapt example to new output style.
* Restore 1:1 results reproducibility with CompVis. However, mps latents need to be generated in CPU because generators don't work in the mps device.
* Move PyTorch nightly to requirements.
* Adapt `test_scheduler_outputs_equivalence` to MPS.
* mps: skip training tests instead of ignoring silently.
* Make VQModel tests pass on mps.
* mps ddim tests: warmup, increase tolerance.
* ScoreSdeVeScheduler indexing made mps compatible.
* Make ldm pipeline tests pass using warmup.
* Style
* Simplify casting as suggested in PR.
* Add Known Issues to readme.
* `isort` import order.
* Remove _mps_warmup helpers from ModelMixin. And just make changes to the tests.
* Skip tests using unittest decorator for consistency.
* Remove temporary var.
* Remove spurious blank space.
* Remove unused symbol.
* Remove README_mps.

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
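One point in that commit message, generating the latents on the CPU and only then moving them to the device because torch generators are not supported on mps, can be sketched roughly as follows (assumed shape and names, not the actual pipeline code):

```python
import torch

device = "mps" if torch.backends.mps.is_available() else "cpu"

# The seeded generator lives on the CPU, where generators are supported.
generator = torch.Generator(device="cpu").manual_seed(0)

# Draw the random latents on the CPU so the same seed gives the same image...
latents = torch.randn((1, 4, 64, 64), generator=generator)

# ...then move them to the target device for the rest of the pipeline.
latents = latents.to(device)
```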
This issue appears to have cropped up again in the 0.3.0 release in … Update: tagging @pcuenca, since the ticket is closed and I'm not sure whether anybody gets notified.
Should we maybe open a new issue here stating that this concerns mostly "mps"?
I just opened #501. If we keep discovering mps issues we could create a label for that. |
Describe the bug
image_to_image.py line 92 throws the error above.
I've tried adding `.to(self.device)` to the three parameters.
The device should be 'cuda', though.
Reproduction
Logs
No response
System Info