
Fixes to run on CPU and MPS #36

Open

wants to merge 3 commits into main

Conversation

WojtekKowaluk

Some changes are required to run it on CPU and other devices.

@@ -30,6 +30,8 @@ def __init__(
    ):
        self.config = config
        self.device = device
+       if not torch.has_cuda:


However, this does not work when attempting to run on CPU with a CUDA-enabled build of PyTorch installed (an example use case: I have a CUDA device, but I want to generate a higher-resolution image and don't have enough VRAM to do that on the GPU). Maybe we should check the device directly (at least this workaround works for me)?

Suggested change
-        if not torch.has_cuda:
+        if device == "cpu":

@WojtekKowaluk
Author

Good point, but maybe device != "cuda" then, because the same is needed for the "mps" device (Apple Silicon); I don't know about others.
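
For context, here are the variants discussed so far, side by side (a sketch; the body of each branch is elided in the hunk above):

    import torch

    device = "mps"  # hypothetical configured device

    # original check: depends on how PyTorch was built, not on the requested device
    if not torch.has_cuda:
        ...

    # suggested change: check the configured device directly
    if device == "cpu":
        ...

    # refinement from this comment: treat every non-CUDA device (cpu, mps, ...) the same way
    if device != "cuda":
        ...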

@clarklight

clarklight commented Apr 17, 2023

I tried the above, but I am getting this error: RuntimeError: Expected one of cpu, cuda, ipu, xpu, mkldnn, opengl, opencl, ideep, hip, ve, fpga, ort, xla, lazy, vulkan, mps, meta, hpu, mtia, privateuseone device type at start of device string: CUDA

@trolley813

I tried the above, but I am getting this error: RuntimeError: Expected one of cpu, cuda, ipu, xpu, mkldnn, opengl, opencl, ideep, hip, ve, fpga, ort, xla, lazy, vulkan, mps, meta, hpu, mtia, privateuseone device type at start of device string: CUDA

As far as I understand (from the error message), you wrote CUDA in uppercase in your code, while PyTorch expects lowercase device names.

@clarklight

I tried the above, but I am getting this error: RuntimeError: Expected one of cpu, cuda, ipu, xpu, mkldnn, opengl, opencl, ideep, hip, ve, fpga, ort, xla, lazy, vulkan, mps, meta, hpu, mtia, privateuseone device type at start of device string: CUDA

As far as I understand (from the error message), you wrote CUDA in uppercase in your code, while PyTorch expects lowercase device names.

Thank you, I managed to get it to work. I was trying to push it to run on CPU, but in the end the time it takes on the M1 Mac is too crazy, around 40 minutes to process. I read in the other thread that it errors out due to low GPU RAM on a 1070 Ti; I ran it on my other Windows laptop and it also errored out due to low GPU RAM. Just leaving these notes for anyone else who reads this thread.

@CoruNethron

@WojtekKowaluk, thank you for this fix.

@clarklight, I've tested on an M1 SoC with 16 GB as well, and it achieves 8-10 seconds per iteration in my case, but you can try using the mps device to enable GPU acceleration on that SoC. That got me down to about 3 seconds per iteration, roughly three times faster with mps.

@clarklight

@CoruNethron

To run this on the Mac, I have to use the CPU, right, because there is no CUDA on the GPU? I just tested it again running on the CPU, and it is still at 120 seconds per iteration.
Here is the test code; I changed it to run on the CPU. Am I doing anything incorrectly?

from kandinsky2 import get_kandinsky2
model = get_kandinsky2('cpu', task_type='text2img', cache_dir='/tmp/kandinsky2', model_version='2.1', use_flash_attention=False)
images = model.generate_text2img(
    "red cat, 4k photo",
    num_steps=25,
    batch_size=1,
    guidance_scale=4,
    h=768, w=768,
    sampler='p_sampler',
    prior_cf_scale=4,
    prior_steps="5"
)

@CoruNethron

@clarklight there is no CUDA support in the GPU, that's correct. But there is support for another kind of acceleration on the GPU, mps, which can utilize the Apple silicon GPU with torch. So just change cpu to mps, the same way you previously changed cuda to cpu, and it should do the trick. I got about 3 times faster rendering. Also FYI, it takes about 1.25 seconds per iteration on my machine when the resolution is set to 512 by 512. Even faster than Stable Diffusion.
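
Following that advice, the test snippet above needs only its device string changed (everything else stays the same):

    from kandinsky2 import get_kandinsky2

    # same call as the cpu snippet above, with only the device string swapped
    model = get_kandinsky2('mps', task_type='text2img', cache_dir='/tmp/kandinsky2', model_version='2.1', use_flash_attention=False)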

@clarklight

clarklight commented Apr 22, 2023

@CoruNethron Sweet, thank you! I got it to work! Yes, it's around 1.3 seconds/it, but the output images are not real images, haha. I will try to figure out why.

@CoruNethron

CoruNethron commented Apr 22, 2023

@clarklight I took some ideas about image export with unique file names from here:
https://gist.github.com/FurkanGozukara/10bdc0435b708b26bd87a59b6c3d1bc7

@clarklight

clarklight commented Apr 22, 2023

@CoruNethron Most of my images are broken for some reason... but if I run it on the web version, it runs fine...
[attached image: boat]

trolley813 mentioned this pull request May 4, 2023
@maxnowack

I'm getting the following error if I try to use img2img with mps:

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/gradio/routes.py", line 412, in run_predict
    output = await app.get_blocks().process_api(
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/gradio/blocks.py", line 1299, in process_api
    result = await self.call_function(
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/gradio/blocks.py", line 1021, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/maxnowack/code/kubin/src/ui_blocks/i2i.py", line 65, in generate
    return generate_fn(params)
  File "/Users/maxnowack/code/kubin/src/webui.py", line 28, in <lambda>
    i2i_ui(generate_fn=lambda params: kubin.model.i2i(params), shared=ui_shared, tabs=ui_tabs)
  File "/Users/maxnowack/code/kubin/src/models/model_kd2.py", line 125, in i2i
    current_batch = self.kandinsky.generate_img2img(
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/kandinsky2/kandinsky2_1_model.py", line 466, in generate_img2img
    image = q_sample(
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/kandinsky2/utils.py", line 52, in q_sample
    _extract_into_tensor(sqrt_alphas_cumprod, t, x_start.shape) * x_start
  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/kandinsky2/model/utils.py", line 18, in _extract_into_tensor
    res = torch.from_numpy(arr).to(device=timesteps.device)[timesteps].float()
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
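
For reference, the failing line is the torch.from_numpy() call in _extract_into_tensor (kandinsky2/model/utils.py), which receives a float64 NumPy array. A minimal sketch of a possible fix, assuming the rest of the function follows the usual guided-diffusion shape, is to cast to float32 before the tensor is moved to the device:

    import numpy as np
    import torch

    def _extract_into_tensor(arr, timesteps, broadcast_shape):
        # mps cannot hold float64 tensors, so cast the numpy array to float32
        # before the result is moved onto timesteps.device
        res = torch.from_numpy(arr.astype(np.float32)).to(device=timesteps.device)[timesteps].float()
        while len(res.shape) < len(broadcast_shape):
            res = res[..., None]
        return res.expand(broadcast_shape)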

@WojtekKowaluk
Author

I have fixed that one, but I'm still getting other errors with img2img:

Traceback (most recent call last):
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/gradio/routes.py", line 412, in run_predict
    output = await app.get_blocks().process_api(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1299, in process_api
    result = await self.call_function(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1021, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/wojtek/Documents/kubin/src/ui_blocks/i2i.py", line 65, in generate
    return generate_fn(params)
  File "/Users/wojtek/Documents/kubin/src/webui.py", line 28, in <lambda>
    i2i_ui(generate_fn=lambda params: kubin.model.i2i(params), shared=ui_shared, tabs=ui_tabs)
  File "/Users/wojtek/Documents/kubin/src/models/model_kd2.py", line 127, in i2i
    current_batch = self.kandinsky.generate_img2img(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/kandinsky2/kandinsky2_1_model.py", line 474, in generate_img2img
    return self.generate_img(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/kandinsky2/kandinsky2_1_model.py", line 277, in generate_img
    samples, _ = sampler.sample(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/kandinsky2/model/samplers.py", line 178, in sample
    self.make_schedule(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/kandinsky2/model/samplers.py", line 104, in make_schedule
    "betas", to_torch(torch.from_numpy(self.old_diffusion.betas))
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/kandinsky2/model/samplers.py", line 101, in <lambda>
    to_torch = lambda x: x.clone().detach().to(torch.float32).to("cuda")
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

After I change the sampler to p_sampler, I get another one:

Traceback (most recent call last):
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/gradio/routes.py", line 412, in run_predict
    output = await app.get_blocks().process_api(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1299, in process_api
    result = await self.call_function(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1021, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/Users/wojtek/Documents/kubin/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/wojtek/Documents/kubin/src/ui_blocks/i2i.py", line 65, in generate
    return generate_fn(params)
  File "/Users/wojtek/Documents/kubin/src/webui.py", line 28, in <lambda>
    i2i_ui(generate_fn=lambda params: kubin.model.i2i(params), shared=ui_shared, tabs=ui_tabs)
  File "/Users/wojtek/Documents/kubin/src/models/model_kd2.py", line 141, in i2i
    saved_batch = save_output(self.output_dir, 'img2img', current_batch, params)
  File "/Users/wojtek/Documents/kubin/src/utils/file_system.py", line 38, in save_output
    params_as_json = None if params is None else json.dumps(params, skipkeys=True)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Image is not JSON serializable
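
That last error is not device-related: params contains a PIL.Image.Image, and json.dumps(..., skipkeys=True) only skips non-basic dict keys, not unserializable values. A sketch of one possible workaround (not necessarily the fix that landed in seruva19/kubin#80):

    import json

    params = {"prompt": "red cat, 4k photo", "init_image": object()}  # object() stands in for a PIL image

    # supply a fallback encoder for values json can't handle, instead of
    # relying on skipkeys=True, which only affects dict keys
    params_as_json = json.dumps(params, default=lambda o: f"<non-serializable: {type(o).__name__}>")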

@maxnowack

There are still some hardcoded references to cuda in the samplers. I think a solution might be to pass the configured device to the samplers and use that instead of cuda (see the sketch below). I'm quite inexperienced with PyTorch, so I'm not sure what the implications of this might be.
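
A sketch of that idea against the samplers.py line from the traceback above (hypothetical shape; the real make_schedule takes more arguments):

    import numpy as np
    import torch

    device = "cpu"  # the configured device, threaded in from the model instead of hardcoded

    # previously (samplers.py line 101): to_torch = lambda x: x.clone().detach().to(torch.float32).to("cuda")
    to_torch = lambda x: x.clone().detach().to(torch.float32).to(device)

    betas = np.linspace(1e-4, 2e-2, 1000)  # placeholder schedule; the real betas come from old_diffusion
    betas_t = to_torch(torch.from_numpy(betas))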

@WojtekKowaluk
Author

WojtekKowaluk commented May 20, 2023

I have fixed the samplers; as for the JSON error, I have fixed it here: seruva19/kubin#80

@WojtekKowaluk
Author

@CoruNethron Most of my images are broken for some reason... but if I run it on the web version, it runs fine... [attached image: boat]

Is this the plms_sampler? I think that one is broken; ddim_sampler and p_sampler should work fine :)

@ahmad88me

For Mac, MPS can be used. I've also created a pull request to handle mps (69759df):

    if torch.cuda.is_available():
        device = "cuda"
    elif torch.backends.mps.is_available():
        device = "mps"
    else:
        device = "cpu"
