AMD Windows - can't generate anything #624

Closed
yorai1212 opened this issue Oct 10, 2023 · 44 comments

@yorai1212 commented Oct 10, 2023

Downloaded the program, pasted the models (2 checkpoint files and one inpaint) (already had them downloaded).
Edited the run.bat file according to what it said under Windows(AMD GPUs)

Tried generating an image, and got an error. I'm pasting my entire CMD log.

C:\Fooocus_win64_2-1-25>.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
Found existing installation: torch 2.0.0
Uninstalling torch-2.0.0:
  Successfully uninstalled torch-2.0.0
Found existing installation: torchvision 0.15.1
Uninstalling torchvision-0.15.1:
  Successfully uninstalled torchvision-0.15.1
WARNING: Skipping torchaudio as it is not installed.
WARNING: Skipping torchtext as it is not installed.
WARNING: Skipping functorch as it is not installed.
WARNING: Skipping xformers as it is not installed.

C:\Fooocus_win64_2-1-25>.\python_embeded\python.exe -m pip install torch-directml
Requirement already satisfied: torch-directml in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (0.2.0.dev230426)
Collecting torch==2.0.0 (from torch-directml)
  Using cached torch-2.0.0-cp310-cp310-win_amd64.whl (172.3 MB)
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=15)")': /simple/torchvision/
Collecting torchvision==0.15.1 (from torch-directml)
  Using cached torchvision-0.15.1-cp310-cp310-win_amd64.whl (1.2 MB)
Requirement already satisfied: filelock in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.12.2)
Requirement already satisfied: typing-extensions in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (4.7.1)
Requirement already satisfied: sympy in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (1.12)
Requirement already satisfied: networkx in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.1)
Requirement already satisfied: jinja2 in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.1.2)
Requirement already satisfied: numpy in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (1.23.5)
Requirement already satisfied: requests in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (2.31.0)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (9.2.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from jinja2->torch==2.0.0->torch-directml) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (3.1.0)
Requirement already satisfied: idna<4,>=2.5 in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (2.0.3)
Requirement already satisfied: certifi>=2017.4.17 in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (2023.5.7)
Requirement already satisfied: mpmath>=0.19 in c:\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from sympy->torch==2.0.0->torch-directml) (1.3.0)
DEPRECATION: torchsde 0.2.5 has a non-standard dependency specifier numpy>=1.19.*; python_version >= "3.7". pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of torchsde or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Installing collected packages: torch, torchvision
  WARNING: The scripts convert-caffe2-to-onnx.exe, convert-onnx-to-caffe2.exe and torchrun.exe are installed in 'C:\Fooocus_win64_2-1-25\python_embeded\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed torch-2.0.0 torchvision-0.15.1

C:\Fooocus_win64_2-1-25>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
Already up-to-date
Update succeeded.
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.37
Inference Engine exists and URL is correct.
Inference Engine checkout finished for d1a0abd40b86f3f079b0cc71e49f9f4604831457.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 1024 MB, total RAM 32671 MB
Set vram state to: NORMAL_VRAM
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
model_type EPS
adm 2560
Refiner model loaded: C:\Fooocus_win64_2-1-25\Fooocus\models\checkpoints\sd_xl_refiner_1.0_0.9vae.safetensors
model_type EPS
adm 2816
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
missing {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: C:\Fooocus_win64_2-1-25\Fooocus\models\checkpoints\sd_xl_base_1.0_0.9vae.safetensors
LoRAs loaded: [('sd_xl_offset_example-lora_1.0.safetensors', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5)]
Fooocus Expansion engine loaded for privateuseone:0, use_fp16 = False.
loading new
App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\transformers\generation\utils.py:723: UserWarning: The operator 'aten::repeat_interleave.Tensor' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at D:\a\_work\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:17.)
  input_ids = input_ids.repeat_interleave(expand_size, dim=0)
[Prompt Expansion] New suffix: intricate, beautiful and elegant, highly detailed, digital painting, artstation, concept art, smooth and sharp focus, illustration, art by tian zi and WLOP and alphonse mucha
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: extremely detailed, digital painting, in the style of Fenghua Zhong and Ruan Jia and jeremy lipking and Peter Mohrbacher, mystical colors, rim light, beautiful Lighting, 8k, stunning scene, raytracing, octane, trending on artstation
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
Preparation time: 3.34 seconds
loading new
Moving model to GPU: 6.69 seconds
Traceback (most recent call last):
  File "C:\Fooocus_win64_2-1-25\Fooocus\modules\async_worker.py", line 565, in worker
    handler(task)
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Fooocus_win64_2-1-25\Fooocus\modules\async_worker.py", line 499, in handler
    imgs = pipeline.process_diffusion(
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Fooocus_win64_2-1-25\Fooocus\modules\default_pipeline.py", line 245, in process_diffusion
    sampled_latent = core.ksampler(
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Fooocus_win64_2-1-25\Fooocus\modules\core.py", line 270, in ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "C:\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\sample.py", line 97, in sample
    samples = sampler.sample(noise, positive_copy, negative_copy, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "C:\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\samplers.py", line 785, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler(), sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Fooocus_win64_2-1-25\Fooocus\modules\sample_hijack.py", line 105, in sample_hacked
    samples = sampler.sample(model_wrap, sigmas, extra_args, callback_wrap, noise, latent_image, denoise_mask, disable_pbar)
  File "C:\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\samplers.py", line 630, in sample
    samples = getattr(k_diffusion_sampling, "sample_{}".format(sampler_name))(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **extra_options)
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\k_diffusion\sampling.py", line 700, in sample_dpmpp_2m_sde_gpu
    noise_sampler = BrownianTreeNoiseSampler(x, sigma_min, sigma_max, seed=extra_args.get("seed", None), cpu=False) if noise_sampler is None else noise_sampler
  File "C:\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\k_diffusion\sampling.py", line 119, in __init__
    self.tree = BatchedBrownianTree(x, t0, t1, seed, cpu=cpu)
  File "C:\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\k_diffusion\sampling.py", line 85, in __init__
    self.trees = [torchsde.BrownianTree(t0, w0, t1, entropy=s, **kwargs) for s in seed]
  File "C:\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\k_diffusion\sampling.py", line 85, in <listcomp>
    self.trees = [torchsde.BrownianTree(t0, w0, t1, entropy=s, **kwargs) for s in seed]
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torchsde\_brownian\derived.py", line 155, in __init__
    self._interval = brownian_interval.BrownianInterval(t0=t0,
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py", line 540, in __init__
    W = self._randn(initial_W_seed) * math.sqrt(t1 - t0)
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py", line 234, in _randn
    return _randn(size, self._top._dtype, self._top._device, seed)
  File "C:\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torchsde\_brownian\brownian_interval.py", line 32, in _randn
    generator = torch.Generator(device).manual_seed(int(seed))
RuntimeError: Device type privateuseone is not supported for torch.Generator() api.
Total time: 10.15 seconds
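
For context, the failing call in the traceback is torch.Generator(device) with the DirectML device. A minimal repro sketch, assuming torch-directml 0.2.0 as shown in the log above (torch_directml.device() is that package's helper for getting the DML device):

import torch
import torch_directml

dml = torch_directml.device()  # resolves to the 'privateuseone:0' device

# torchsde's BrownianInterval seeds a per-device generator; on DirectML this raises:
# RuntimeError: Device type privateuseone is not supported for torch.Generator() api.
generator = torch.Generator(dml).manual_seed(42)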
@Ghochsi-Pfysche

I second this, but I am jealous that you have 25x faster Total time...
cmd log.txt

@yorai1212

I second this, but I am jealous that you have 25x faster Total time... cmd log.txt

Haha, let's first all get an image generated. Not sure those 10.15 seconds actually generated anything; no idea what's happening here.

@congzhengmian

AMD 8 GB cannot be used.

C:\Users\lz\Downloads\Fooocus_win64_2-1-25>.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
Found existing installation: torch 2.0.0
Uninstalling torch-2.0.0:
Successfully uninstalled torch-2.0.0
Found existing installation: torchvision 0.15.1
Uninstalling torchvision-0.15.1:
Successfully uninstalled torchvision-0.15.1
WARNING: Skipping torchaudio as it is not installed.
WARNING: Skipping torchtext as it is not installed.
WARNING: Skipping functorch as it is not installed.
WARNING: Skipping xformers as it is not installed.

C:\Users\lz\Downloads\Fooocus_win64_2-1-25>.\python_embeded\python.exe -m pip install torch-directml
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Requirement already satisfied: torch-directml in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (0.2.0.dev230426)
Collecting torch==2.0.0 (from torch-directml)
Using cached https://mirrors.aliyun.com/pypi/packages/87/e2/62dbdfc85d3b8f771bc4b1a979ee6a157dbaa8928981dabbf45afc6d13dc/torch-2.0.0-cp310-cp310-win_amd64.whl (172.3 MB)
Collecting torchvision==0.15.1 (from torch-directml)
Using cached https://mirrors.aliyun.com/pypi/packages/03/06/6ba7532c66397defffb79f64cac46f812a29b2f87a4ad65a3e95bc164d62/torchvision-0.15.1-cp310-cp310-win_amd64.whl (1.2 MB)
Requirement already satisfied: filelock in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.12.2)
Requirement already satisfied: typing-extensions in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (4.7.1)
Requirement already satisfied: sympy in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (1.12)
Requirement already satisfied: networkx in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.1)
Requirement already satisfied: jinja2 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.1.2)
Requirement already satisfied: numpy in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (1.23.5)
Requirement already satisfied: requests in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (2.31.0)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (9.2.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from jinja2->torch==2.0.0->torch-directml) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (3.1.0)
Requirement already satisfied: idna<4,>=2.5 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (2.0.3)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (2023.5.7)
Requirement already satisfied: mpmath>=0.19 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from sympy->torch==2.0.0->torch-directml) (1.3.0)
DEPRECATION: torchsde 0.2.5 has a non-standard dependency specifier numpy>=1.19.*; python_version >= "3.7". pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of torchsde or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at pypa/pip#12063
Installing collected packages: torch, torchvision
WARNING: The scripts convert-caffe2-to-onnx.exe, convert-onnx-to-caffe2.exe and torchrun.exe are installed in 'C:\Users\lz\Downloads\Fooocus_win64_2-1-25\python_embeded\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed torch-2.0.0 torchvision-0.15.1

C:\Users\lz\Downloads\Fooocus_win64_2-1-25>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
Already up-to-date
Update succeeded.
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.37
Inference Engine exists and URL is correct.
Inference Engine checkout finished for d1a0abd40b86f3f079b0cc71e49f9f4604831457.
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Using directml with device:
Total VRAM 1024 MB, total RAM 32688 MB
Set vram state to: NORMAL_VRAM
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
model_type EPS
adm 2560
Refiner model loaded: C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\models\checkpoints\sd_xl_refiner_1.0_0.9vae.safetensors
model_type EPS
adm 2816
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
missing {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\models\checkpoints\sd_xl_base_1.0_0.9vae.safetensors
LoRAs loaded: [('sd_xl_offset_example-lora_1.0.safetensors', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5)]
Fooocus Expansion engine loaded for privateuseone:0, use_fp16 = False.
loading new
App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 0.812
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 3.06
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
C:\Users\lz\Downloads\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\transformers\generation\utils.py:723: UserWarning: The operator 'aten::repeat_interleave.Tensor' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at D:\a\_work\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:17.)
  input_ids = input_ids.repeat_interleave(expand_size, dim=0)
[Prompt Expansion] New suffix: intricate, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, Unreal Engine 5, 8K, art by artgerm and greg rutkowski and alphonse mucha
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: extremely detailed eyes. By Makoto Shinkai, Stanley Artgerm Lau, WLOP, Rossdraws, James Jean, Andrei Riabovitchev, Marc Simonetti, krenz cushart, Sakimichan, D&D trending on ArtStation, digital art
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
Preparation time: 8.80 seconds
loading new
ERROR diffusion_model.output_blocks.1.1.transformer_blocks.2.ff.net.0.proj.weight Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available!
Traceback (most recent call last):
  File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\modules\async_worker.py", line 565, in worker
    handler(task)
  File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\modules\async_worker.py", line 470, in handler
    comfy.model_management.load_models_gpu([pipeline.final_unet])
  File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 397, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
  File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 286, in model_load
    raise e
  File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 282, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
  File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_patcher.py", line 161, in patch_model
    temp_weight = comfy.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
  File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 498, in cast_to_device
    return tensor.to(device, copy=copy).to(dtype)
RuntimeError: Could not allocate tensor with 26214400 bytes. There is not enough GPU video memory available!
Total time: 37.06 seconds

@MOVZX commented Oct 11, 2023

I solved the issue by editing this file:

.\python_embeded\Lib\site-packages\torchsde\_brownian\brownian_interval.py

Line 32

generator = torch.Generator(device).manual_seed(int(seed))

to

generator = torch.Generator().manual_seed(int(seed))
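
For context, a sketch of the surrounding helper, assuming brownian_interval.py matches torchsde 0.2.5 (only the generator line changes; the randn call is shown as-is for context):

import torch

def _randn(size, dtype, device, seed):
    # Before: generator = torch.Generator(device).manual_seed(int(seed))
    # This fails on DirectML because the privateuseone backend does not
    # implement the torch.Generator() API.
    # After: create the generator on the default (CPU) device instead.
    generator = torch.Generator().manual_seed(int(seed))
    return torch.randn(size, dtype=dtype, device=device, generator=generator)

Note this changes where the RNG lives, so a given seed may not reproduce the same noise as on a CUDA generator.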

@yorai1212 commented Oct 11, 2023

torchsde\_brownian

Don't have that folder..

Also, is it safe to edit the files without lllyasviel confirming we should do that?

Edit: found the file in C:\Fooocus_win64_2-1-25\python_embeded\Lib\site-packages\torchsde\_brownian.
Works fine now! Thank you!

@ZeroIQ1024

I solved the issue by editing this file:

.\python_embeded\Lib\site-packages\torchsde\_brownian\brownian_interval.py

Line 32

generator = torch.Generator(device).manual_seed(int(seed))

to

generator = torch.Generator().manual_seed(int(seed))

This worked for me too, thank you so much!

@EllaHGrace

There is no torchsde_brownian. What is happening here?

@EllaHGrace

it's torchsde\_brownian

@EllaHGrace

put a "" between "torchsde" and "_brownian"


@yorai1212

Okay no need to spam LMAO

@hbsszh commented Oct 13, 2023

AMD 8 GB cannot be used. … [full log quoted from @congzhengmian's comment above]

me too!

@f1am3d commented Oct 14, 2023

same here

@f1am3d commented Oct 14, 2023

This fix has worked, but 2.7 it/s on RX 7900 XTX...

@f1am3d commented Oct 14, 2023

This fix has worked, but 2.3 it/s on RX 7900 XTX...

@Maefreric commented Oct 21, 2023

AMD 8 GB cannot be used. … [full log quoted from @congzhengmian's comment above]

me too!

same.


A part I found interesting as well was the "Total VRAM 1024 MB" line; it's almost like it's grabbing the board's local memory or something. 🤔

There also seems to be a memory leak: just having the Gradio live window open, memory usage starts ramping and doesn't stop till I close it. 32 GB of RAM, too 🤔

@benjiwright

RuntimeError: Could not allocate tensor with 165150720 bytes. There is not enough GPU video memory available!

AMD Radeon RX 6700 XT

@wcalvert

After applying the fix above, now getting the allocation error:

RuntimeError: Could not allocate tensor with 6553600 bytes. There is not enough GPU video memory available!

RX 5700XT, Windows 10

@MatSkrzat

For the memory allocation problem I found a quick fix, but use it with caution.
Go to \Fooocus\backend\headless\fcbh\model_management.py.
In line 95, change mem_total = 1024 * 1024 * 1024 to mem_total = 8192 * 1024 * 1024.
I have a 16 GB VRAM GPU, so I assigned 8 GB; you can try to assign more.
Also, if you have something like 4 GB VRAM, I would suggest changing the line to mem_total = 2048 * 1024 * 1024, which will allocate 2 GB of VRAM. Of course you can try to use all your VRAM :)
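
For reference, a runnable sketch of what that constant does, assuming the fcbh backend mirrors ComfyUI's get_total_memory() logic (the function name below is a hypothetical standalone version; DirectML has no VRAM query API, so the real code falls back to a hard-coded figure, which is also why the logs above report "Total VRAM 1024 MB" regardless of the actual card):

def get_total_memory_directml():
    # DirectML cannot report VRAM, so a placeholder is returned.
    # Original value: 1024 * 1024 * 1024  ->  "Total VRAM 1024 MB" in the log.
    mem_total = 8192 * 1024 * 1024  # the edit: pretend we have 8 GiB
    return mem_total

print(get_total_memory_directml() // (1024 * 1024), "MB")  # -> 8192 MB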

@jbournonville commented Nov 30, 2023

I have done this but still get the memory error:
mem_total = 4096 * 1024 * 1024 with a Vega 64 8 GB

@MatSkrzat

I have done this but still get the memory error: mem_total = 8192 * 1024 * 1024 with a Vega 64 8 GB

Restart the app and check the TOTAL VRAM printed out on start. In my case, when I set it to 12 GB it looks like this:

[screenshot: startup log showing Total VRAM 12288 MB]

@jbournonville

I'm at 4096 on VRAM
[screenshot: startup log showing Total VRAM 4096 MB]

@MatSkrzat

I'm at 4096 on VRAM

Looks ok, I would try generating with different settings. Lower image resolution maybe. Sorry I can't help you more.

@wcalvert commented Nov 30, 2023

After making both fixes in this thread, I'm still getting the allocation error.

To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 8192 MB, total RAM 16318 MB
Set vram state to: NORMAL_VRAM
... snip ...
RuntimeError: Could not allocate tensor with 10485760 bytes. There is not enough GPU video memory available!

RX 5700XT, Win 10

@onurusluca

One image with all default settings took more than 4 minutes. Is this normal?

PC specs:

  • AMD Radeon RX 6800
  • AMD Ryzen 5 5600X
  • 32 GB RAM

Am I doing something wrong, or do I have a bottleneck?


Also the fix worked:
Edit file brownian_interval.py in \python_embeded\Lib\site-packages\torchsde\_brownian

Line 32
Change:
generator = torch.Generator(device).manual_seed(int(seed))

to:
generator = torch.Generator().manual_seed(int(seed))

@benjiwright commented Nov 30, 2023

@MatSkrzat Appreciate the suggestion. I tried bumping the memory

As suggested:

Total VRAM 8192 MB, total RAM 32700 MB

Matching the total memory in my video card (12GB)

mem_total = 1024 * 12 * 1024 * 1024 
Total VRAM 12288 MB, total RAM 32700 MB

Both yielded the same out-of-memory error:

RuntimeError: Could not allocate tensor with 165150720 bytes. There is not enough GPU video memory available!

@Lejoser40

RuntimeError: Could not allocate tensor with 6553600 bytes. There is not enough GPU video memory available!

AMD Radeon RX 6600, Windows 10

@ghost commented Nov 30, 2023

Same "out of memory" issue here after using both fixes GPU rx7900xt 20GB

@syedhammad74

I did the torchsde fix but I'm still getting this error:

E:\AI\Fooocus_win64_2-1-791>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py', '--directml']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.824
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 1024 MB, total RAM 16310 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Refiner unloaded.
Exception in thread Thread-2 (worker):
Traceback (most recent call last):
  File "threading.py", line 1016, in _bootstrap_inner
  File "threading.py", line 953, in run
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 25, in worker
    import modules.default_pipeline as pipeline
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 253, in <module>
    refresh_everything(
  File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 233, in refresh_everything
    refresh_base_model(base_model_name)
  File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 69, in refresh_base_model
    model_base = core.load_model(filename)
  File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\modules\core.py", line 152, in load_model
    unet, clip, vae, clip_vision = load_checkpoint_guess_config(ckpt_filename, embedding_directory=path_embeddings)
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sd.py", line 446, in load_checkpoint_guess_config
    model = model_config.get_model(sd, "model.diffusion_model.", device=inital_load_device)
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\supported_models.py", line 163, in get_model
    out = model_base.SDXL(self, model_type=self.model_type(state_dict, prefix), device=device)
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_base.py", line 243, in __init__
    super().__init__(model_config, model_type, device=device)
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_base.py", line 40, in __init__
    self.diffusion_model = UNetModel(**unet_config, device=device)
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\openaimodel.py", line 520, in __init__
    ResBlock(
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\openaimodel.py", line 190, in __init__
    operations.conv_nd(dims, self.out_channels, self.out_channels, 3, padding=1, dtype=dtype, device=device)
  File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\ops.py", line 18, in conv_nd
    return Conv2d(*args, **kwargs)
  File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 450, in __init__
    super().__init__(
  File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 137, in __init__
    self.weight = Parameter(torch.empty(
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 58982400 bytes.

@Drezir commented Dec 2, 2023

I did the torchsde fix but I'm still getting this error: … [full log quoted from @syedhammad74's comment above]

Same here
RX 580 8GB, Ryzen 5600, 32GB RAM

@strobya

strobya commented Dec 2, 2023

Same error after doing both steps.

.\python_embeded\Lib\site-packages\torchsde\_brownian\brownian_interval.py

Line 32, change

generator = torch.Generator(device).manual_seed(int(seed))

to

generator = torch.Generator().manual_seed(int(seed))
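
For anyone wondering why that line fails in the first place: DirectML's device type is privateuseone, which torch.Generator() does not accept, hence the "Device type privateuseone is not supported for torch.Generator() api" error. Seeding a CPU generator side-steps it; a minimal sketch (the seed value here is just an example):

```python
import torch

seed = 12345  # any integer seed

# Before: torch.Generator(device) raises the privateuseone error when
# device is a DirectML device.
# After: seed a CPU generator; tensors sampled with it can still be
# moved to the GPU afterwards.
generator = torch.Generator().manual_seed(int(seed))
noise = torch.randn(4, 8, 8, generator=generator)
print(noise.mean())
```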

Then I tried to allocate more VRAM:

Go to \Fooocus\backend\headless\fcbh\model_management.py

In line 95 change mem_total = 1024 * 1024 * 1024 to mem_total = 8192 * 1024 * 1024.
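
For context on what that constant does, a rough sketch (hypothetical, not the real model_management.py code): torch cannot query a DirectML adapter's memory, so Fooocus falls back to a hard-coded total, which is where the "Total VRAM 1024 MB" line in the logs comes from. Raising it keeps the vram-state heuristics from assuming a 1 GB card:

```python
# Hypothetical sketch of the fallback, not the actual Fooocus code.
def total_vram_bytes(is_directml: bool = True) -> int:
    if is_directml:
        # DirectML cannot report adapter memory; use a fixed value.
        # The default was 1024 * 1024 * 1024 (1 GB).
        return 8192 * 1024 * 1024
    import torch
    return torch.cuda.get_device_properties(0).total_memory

print(f"Total VRAM {total_vram_bytes() // (1024 * 1024)} MB")
```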

Still getting the same error: "Could not allocate tensor with 165150720 bytes. There is not enough GPU video memory available!"

Note: I remember messing with the file cli_args.py and I got it working, then stopped; wanted to mention it in case someone can make use of the info.

Change from

vram_group = parser.add_mutually_exclusive_group()
vram_group.add_argument("--gpu-only", action="store_true", help="Store and run everything (text encoders/CLIP models, etc... on the GPU).")
vram_group.add_argument("--highvram", action="store_true", help="By default models will be unloaded to CPU memory after being used. This option keeps them in GPU memory.")
vram_group.add_argument("--normalvram", action="store_true", help="Used to force normal vram use if lowvram gets automatically enabled.")
vram_group.add_argument("--lowvram", action="store_true", help="Split the unet in parts to use less vram.")
vram_group.add_argument("--novram", action="store_true", help="When lowvram isn't enough.")
vram_group.add_argument("--cpu", action="store_true", help="To use the CPU for everything (slow).")

To

vram_group = parser.add_mutually_exclusive_group()
vram_group.add_argument("--gpu-only", action="store_true", help="Store and run everything (text encoders/CLIP models, etc... on the GPU).")
vram_group.add_argument("--highvram", action="store_false", help="By default models will be unloaded to CPU memory after being used. This option keeps them in GPU memory.")
vram_group.add_argument("--normalvram", action="store_true", help="Used to force normal vram use if lowvram gets automatically enabled.")
vram_group.add_argument("--lowvram", action="store_false", help="Split the unet in parts to use less vram.")
vram_group.add_argument("--novram", action="store_false", help="When lowvram isn't enough.")
vram_group.add_argument("--cpu", action="store_true", help="To use the CPU for everything (slow).")

PC specs: 12 GB GPU & 32 GB RAM
Windows 11

@ljnath

ljnath commented Dec 2, 2023

I am trying to run this on my environment with an RX 5500M 4GB + Ryzen 5 5600H + 24GB RAM.
Initially I encountered both the "Device type privateuseone is not supported for torch.Generator() api" error and the "DefaultCPUAllocator: not enough memory: you tried.." error; both of these are fixed by the solutions above.

Now I am facing AssertionError: Torch not compiled with CUDA enabled; can anyone please assist?

W:\Fooocus_win64_2-1-791>python_embeded\python.exe -s Fooocus\entry_with_update.py --directml  --lowvram
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py', '--directml', '--lowvram']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.824
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 4096 MB, total RAM 23906 MB
Set vram state to: LOW_VRAM
Disabling smart memory management
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra keys {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: W:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [W:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [W:\Fooocus_win64_2-1-791\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [W:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 8429543823708178985
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] boat in the sea, cinematic, dramatic ambient light, detailed, dynamic, full intricate, elegant, highly elaborate, colorful, vivid, breathtaking, sharp focus, fine detail, symmetry, clear, artistic, color, altered, epic, romantic, scenic, background, professional, enhanced, calm, joyful, unique, awesome, creative, positive, lucid, loving, beautiful
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] boat in the sea, extremely detailed, magic, perfect, vibrant colors, dramatic, cinematic, artistic, complex, highly color balanced, enigmatic, sharp focus, open atmosphere, warm light, amazing composition, inspired, beautiful surreal, creative, positive, unique, joyful, very inspirational, inspiring, pure, thought, pristine, epic, hopeful, shiny, coherent, cute
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 19.68 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 256.0
Traceback (most recent call last):
  File "W:\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 803, in worker
    handler(task)
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "W:\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 735, in handler
    imgs = pipeline.process_diffusion(
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "W:\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 361, in process_diffusion
    sampled_latent = core.ksampler(
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "W:\Fooocus_win64_2-1-791\Fooocus\modules\core.py", line 315, in ksampler
    samples = fcbh.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "W:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sample.py", line 93, in sample
    real_model, positive_copy, negative_copy, noise_mask, models = prepare_sampling(model, noise.shape, positive, negative, noise_mask)
  File "W:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sample.py", line 86, in prepare_sampling
    fcbh.model_management.load_models_gpu([model] + models, model.memory_required(noise_shape) + inference_memory)
  File "W:\Fooocus_win64_2-1-791\Fooocus\modules\patch.py", line 494, in patched_load_models_gpu
    y = fcbh.model_management.load_models_gpu_origin(*args, **kwargs)
  File "W:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 410, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
  File "W:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 298, in model_load
    accelerate.dispatch_model(self.real_model, device_map=device_map, main_device=self.device)
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\big_modeling.py", line 371, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 536, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 536, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 506, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 155, in add_hook_to_module
    module = hook.init_hook(module)
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 253, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device)
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\utils\modeling.py", line 292, in set_module_tensor_to_device
    new_value = old_value.to(device)
  File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
Total time: 131.78 seconds
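
The tail of that traceback is accelerate's lowvram path moving modules with tensor.to(...), and the target device ends up routed through torch.cuda, which a DirectML build does not have; the assertion comes from there, not from the model files. A quick sanity check that the embedded Python really has the DirectML backend (run with python_embeded\python.exe; assumes torch-directml is installed):

```python
import torch
import torch_directml  # installed via: pip install torch-directml

dml = torch_directml.device()     # reported as privateuseone:0
print(torch.cuda.is_available())  # False on a CPU/DirectML-only build

x = torch.ones(2, 2, device=dml) * 3  # a simple op on the AMD GPU
print(x.cpu())
```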

@wcalvert

wcalvert commented Dec 2, 2023

@ljnath, try running run.bat instead.

@ljnath

ljnath commented Dec 2, 2023

@wcalvert, I tried that as well and I am facing the same error message.

@FelixSaville

I'm also having issues with CUDA on AMD hardware.

Hardware:

Windows 10 Pro - 22H2
32 GB RAM
Ryzen 5800X
Radeon 6700XT (12 GB VRAM)

Have done the following fixes:

Changed brownian_interval.py - generator = torch.Generator().manual_seed(int(seed)) - Line 32

Changed model_management.py - mem_total = 8192 * 1024 * 1024 - line 95

Changed cli_args.py (See below)

vram_group = parser.add_mutually_exclusive_group()
vram_group.add_argument("--gpu-only", action="store_true", help="Store and run everything (text encoders/CLIP models, etc... on the GPU).")
vram_group.add_argument("--highvram", action="store_false", help="By default models will be unloaded to CPU memory after being used. This option keeps them in GPU memory.")
vram_group.add_argument("--normalvram", action="store_true", help="Used to force normal vram use if lowvram gets automatically enabled.")
vram_group.add_argument("--lowvram", action="store_false", help="Split the unet in parts to use less vram.")
vram_group.add_argument("--novram", action="store_false", help="When lowvram isn't enough.")
vram_group.add_argument("--cpu", action="store_true", help="To use the CPU for everything (slow).")

Still getting the CUDA error when running run.bat, and I'm unsure why it's making a CUDA dependency call at all: CUDA is only supported on Nvidia, and AMD doesn't use it; we use ROCm...
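
On the ROCm point: the traceback earlier in this thread ends in torch\cuda\__init__.py because accelerate moves modules with tensor.to(device), and any device that resolves to CUDA (a bare integer index does, too) makes torch lazily initialize CUDA regardless of the GPU vendor. A minimal reproduction on any build without CUDA support:

```python
import torch

x = torch.ones(2)
try:
    x.to("cuda")  # a bare index like x.to(0) also means CUDA to torch
except (AssertionError, RuntimeError) as exc:
    print(exc)  # "Torch not compiled with CUDA enabled" on CPU/DML builds
```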

@mrtzhsmnn

mrtzhsmnn commented Dec 3, 2023

If I apply this fix, I get a warning that it is CPU rendering.

:\scratch\Fooocus_win64_2-1-791\Fooocus\modules\anisotropic.py:132: UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at D:\a\_work\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:17.)
  s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True)

Anybody having similar issues?

Edit:
I am running an RX 6900 XT and Windows 11.
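
That warning means the fused aten::std_mean op has no DML implementation, so that single call falls back to the CPU; it is not full CPU rendering, and everything else still runs on the GPU. For what it's worth, a hedged workaround sketch (not Fooocus's actual code; whether the separate reductions are DML-native depends on the torch-directml build) would be to split the fused call:

```python
import torch

def std_mean_split(g: torch.Tensor, dim=(1, 2, 3)):
    # Equivalent to torch.std_mean(g, dim=dim, keepdim=True), done as
    # two separate reductions, which a backend may implement even when
    # the fused op is missing.
    m = g.mean(dim=dim, keepdim=True)
    s = g.std(dim=dim, keepdim=True)
    return s, m

g = torch.randn(1, 4, 8, 8)
s, m = std_mean_split(g)
print(s.shape, m.shape)  # torch.Size([1, 1, 1, 1]) twice
```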

@KingDamo17

I have also applied the fixes:

Changed brownian_interval.py - generator = torch.Generator().manual_seed(int(seed)) - Line 32

Changed model_management.py - mem_total = 8192 * 1024 * 1024 - line 95

I'm still getting RuntimeError: Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available! Total time: 106.90 seconds.

I have an RX 580 and Windows 10. Does anyone know whether this means it's trying to allocate too much, or whether my hardware just isn't good enough? I don't have enough experience to know the difference. Thanks
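
Some rough arithmetic on that question (the parameter count is approximate): the failing allocation itself is only 50 MiB; it is likely the last straw after the SDXL UNet weights, which DirectML setups often keep in fp32, have already filled an 8 GB card.

```python
unet_params = 2.6e9  # SDXL UNet, approximate parameter count

print(f"fp32 weights: {unet_params * 4 / 2**30:.1f} GiB")  # ~9.7 GiB
print(f"fp16 weights: {unet_params * 2 / 2**30:.1f} GiB")  # ~4.8 GiB
print(f"failed alloc: {52428800 / 2**20:.0f} MiB")         # 50 MiB
```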

@wcalvert

wcalvert commented Dec 3, 2023

I found a comment in another issue speculating that the problem is due to a deeper code issue, memory allocation inside a loop, which causes the out-of-memory error:

#1078 (comment)
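
For illustration, the pattern that comment describes looks roughly like this (a schematic sketch, not the actual Fooocus code): allocating a fresh tensor on every step while keeping references around prevents the allocator from reusing memory, so usage climbs until an allocation fails.

```python
import torch

def leaky_steps(n: int):
    kept = []
    for _ in range(n):
        # A new ~50 MiB tensor per step with the reference retained:
        # memory grows linearly with the number of steps.
        kept.append(torch.empty(50 * 2**20 // 4, dtype=torch.float32))
    return kept

def reusing_steps(n: int):
    buf = torch.empty(50 * 2**20 // 4, dtype=torch.float32)
    for _ in range(n):
        buf.normal_()  # overwrite one preallocated buffer in place
    return buf
```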

@TDola

TDola commented Dec 5, 2023

I tried everything here too, on my 64 GB of RAM and 6700 XT. It always says it is out of memory before it even gets started. Hoping this gets fixed, because I was able to run it on my Mac and the results are lovely. But it's an hour per image.

@Osiris-Team

I even get a bluescreen of death named something like "error with gpu memory management".

@TMMSantos

I tried everything here too, on my 64 GB of RAM and 6700 XT. It always says it is out of memory before it even gets started. Hoping this gets fixed, because I was able to run it on my Mac and the results are lovely. But it's an hour per image.

I have exactly the same issue. I have a PC with 64 GB of RAM and a 6600M. Same memory problem. I was also able to run it on my Mac, but it takes over 1 hour per image.

@Rokurou-lmb

Following the advice from comment #1278 and reverting to an older version fixed the out-of-memory runtime error while generating images for me. It's a workaround until a newer version fixes the problem.

@aelfwyne

aelfwyne commented Mar 9, 2024

Editing the torchsde file did not work.
