
[Bug]: stable diffusion inside Docker crashes with error: ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed' #13785

Open
l0ggik opened this issue Oct 28, 2023 · 5 comments
Labels
bug-report Report of a bug, yet to be confirmed

Comments

@l0ggik

l0ggik commented Oct 28, 2023

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What happened?

I tried to use the rocm/pytorch Docker image, but when starting stable-diffusion-webui following the install instructions I get the following error: ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'

I tried downgrading to pytorch_lightning v1.7.7 and v1.6.5, but with no effect.

Has anyone else had this problem and found a solution?

Steps to reproduce the problem

  1. Install stable-diffusion-webui with Docker

What should have happened?

The webui should have started without errors.

Sysinfo

Linux Mint 20.1 Ulyssa
RX 580

What browsers do you use to access the UI ?

No response

Console logs

Python 3.9.5 (default, Nov 23 2021, 15:27:38) 
[GCC 9.3.0]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4
Launching Web UI with arguments: --precision full --no-half --skip-torch-cuda-test
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
Traceback (most recent call last):
  File "/dockerx/stable-diffusion-webui/launch.py", line 48, in <module>
    main()
  File "/dockerx/stable-diffusion-webui/launch.py", line 44, in main
    start()
  File "/dockerx/stable-diffusion-webui/modules/launch_utils.py", line 432, in start
    import webui
  File "/dockerx/stable-diffusion-webui/webui.py", line 13, in <module>
    initialize.imports()
  File "/dockerx/stable-diffusion-webui/modules/initialize.py", line 33, in imports
    from modules import shared_init
  File "/dockerx/stable-diffusion-webui/modules/shared_init.py", line 5, in <module>
    from modules import shared
  File "/dockerx/stable-diffusion-webui/modules/shared.py", line 5, in <module>
    from modules import shared_cmd_options, shared_gradio_themes, options, shared_items, sd_models_types
  File "/dockerx/stable-diffusion-webui/modules/sd_models_types.py", line 1, in <module>
    from ldm.models.diffusion.ddpm import LatentDiffusion
  File "/dockerx/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 20, in <module>
    from pytorch_lightning.utilities.distributed import rank_zero_only
ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'

Additional information

No response

@l0ggik added the bug-report label on Oct 28, 2023
@gsteinLTU

gsteinLTU commented Oct 29, 2023

I had to remove some extensions to get it working again after a similar error on a non-Docker install.

@TheNexter

TheNexter commented Nov 24, 2023

Same here on both the main and dev branches 👍

6600 XT, Ubuntu 23.10, using Docker

no module 'xformers'. Processing without...
Traceback (most recent call last):
  File "/dockerx/stable-diffusion-webui/launch.py", line 48, in <module>
    main()
  File "/dockerx/stable-diffusion-webui/launch.py", line 44, in main
    start()
  File "/dockerx/stable-diffusion-webui/modules/launch_utils.py", line 432, in start
    import webui
  File "/dockerx/stable-diffusion-webui/webui.py", line 13, in <module>
    initialize.imports()
  File "/dockerx/stable-diffusion-webui/modules/initialize.py", line 33, in imports
    from modules import shared_init
  File "/dockerx/stable-diffusion-webui/modules/shared_init.py", line 5, in <module>
    from modules import shared
  File "/dockerx/stable-diffusion-webui/modules/shared.py", line 5, in <module>
    from modules import shared_cmd_options, shared_gradio_themes, options, shared_items, sd_models_types
  File "/dockerx/stable-diffusion-webui/modules/sd_models_types.py", line 1, in <module>
    from ldm.models.diffusion.ddpm import LatentDiffusion
  File "/dockerx/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 20, in <module>
    from pytorch_lightning.utilities.distributed import rank_zero_only
ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'
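
Both tracebacks fail on the same import at line 20 of repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py: upstream pytorch-lightning removed the pytorch_lightning.utilities.distributed module, and rank_zero_only now lives in pytorch_lightning.utilities.rank_zero (the exact release that moved it, around the 1.9 line, is an assumption here). Besides downgrading, a minimal sketch of a compatibility patch for that import would be:

```python
# Sketch of a tolerant import for ddpm.py (assumption: in newer
# pytorch-lightning releases rank_zero_only moved to
# pytorch_lightning.utilities.rank_zero).
try:
    # Module layout used by older pytorch-lightning releases
    from pytorch_lightning.utilities.distributed import rank_zero_only
except ImportError:
    # Module layout used by newer releases
    from pytorch_lightning.utilities.rank_zero import rank_zero_only
```

This keeps the pinned-downgrade workaround unnecessary, at the cost of editing a vendored repository that updates may overwrite.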

@axxapy

axxapy commented Dec 28, 2023

Running pip install pytorch-lightning==1.6.5 helped me.

source

@hchasens

hchasens commented Mar 6, 2024

Can confirm it's still present in Docker when using ROCm.

@t3dc

t3dc commented May 7, 2024

Running into this same issue. As suggested, installing pytorch-lightning==1.6.5 helped, but only partly. Then I received:

No module named 'timm'

Running pip install timm got me past that but only to another crash:

AttributeError: 'NoneType' object has no attribute '_id'
Creating model from config: /dockerx/stable-diffusion-webui/configs/v1-inference.yaml
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
loading stable diffusion model: RuntimeError
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.9/lib/python3.9/threading.py", line 937, in _bootstrap
    self._bootstrap_inner()
  File "/opt/conda/envs/py_3.9/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/opt/conda/envs/py_3.9/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/dockerx/stable-diffusion-webui/modules/initialize.py", line 149, in load_model
    shared.sd_model  # noqa: B018
  File "/dockerx/stable-diffusion-webui/modules/shared_items.py", line 175, in sd_model
    return modules.sd_models.model_data.get_sd_model()
  File "/dockerx/stable-diffusion-webui/modules/sd_models.py", line 620, in get_sd_model
    load_model()
  File "/dockerx/stable-diffusion-webui/modules/sd_models.py", line 748, in load_model
    load_model_weights(sd_model, checkpoint_info, state_dict, timer)
  File "/dockerx/stable-diffusion-webui/modules/sd_models.py", line 393, in load_model_weights
    model.load_state_dict(state_dict, strict=False)
  File "/dockerx/stable-diffusion-webui/modules/sd_disable_initialization.py", line 223, in <lambda>
    module_load_state_dict = self.replace(torch.nn.Module, 'load_state_dict', lambda *args, **kwargs: load_state_dict(module_load_state_dict, *args, **kwargs))
  File "/dockerx/stable-diffusion-webui/modules/sd_disable_initialization.py", line 221, in load_state_dict
    original(module, state_dict, strict=strict)
  File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2139, in load_state_dict
    load(self, state_dict)
  File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2127, in load
    load(child, child_state_dict, child_prefix)  # noqa: F821
  File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2127, in load
    load(child, child_state_dict, child_prefix)  # noqa: F821
  File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2127, in load
    load(child, child_state_dict, child_prefix)  # noqa: F821
  [Previous line repeated 1 more time]
  File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2121, in load
    module._load_from_state_dict(
  File "/dockerx/stable-diffusion-webui/modules/sd_disable_initialization.py", line 225, in <lambda>
    linear_load_from_state_dict = self.replace(torch.nn.Linear, '_load_from_state_dict', lambda *args, **kwargs: load_from_state_dict(linear_load_from_state_dict, *args, **kwargs))
  File "/dockerx/stable-diffusion-webui/modules/sd_disable_initialization.py", line 191, in load_from_state_dict
    module._parameters[name] = torch.nn.parameter.Parameter(torch.zeros_like(param, device=device, dtype=dtype), requires_grad=param.requires_grad)
  File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/_meta_registrations.py", line 4820, in zeros_like
    res.fill_(0)
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3.
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.

Anyone fought their way past this? I'm on an RX 6750XT.
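
"HIP error: invalid device function" usually means the ROCm build of PyTorch ships no kernels for the card's GPU ISA. A commonly reported workaround for RDNA2 consumer cards (an assumption here, not verified on this exact card or driver stack) is to spoof the officially supported gfx1030 ISA before launching:

```shell
# Assumption: the RX 6750 XT reports an ISA (gfx1031) that ROCm
# builds of PyTorch ship no kernels for; override it to the closely
# related, officially supported gfx1030 ISA.
export HSA_OVERRIDE_GFX_VERSION=10.3.0
# Then relaunch the webui from the same shell inside the container,
# e.g.:
# python launch.py --precision full --no-half --skip-torch-cuda-test
```

If the override does not help, the card may need a PyTorch build compiled for its ISA.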
