
[Bug]: "Memory access fault by GPU node-1" error with RX 6600 on Linux #8139

Open
1 task done
Yumae opened this issue Feb 26, 2023 · 14 comments
Labels
bug-report Report of a bug, yet to be confirmed

Comments

@Yumae

Yumae commented Feb 26, 2023

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What happened?

When trying to generate pictures above a certain resolution, I get this error in the console window. I can consistently reproduce it by generating a picture larger than 768x1024/1024x768. I'm sure the card could go higher given its VRAM, since the KDE resource monitor shows VRAM usage never reaching 7 GB. In the screenshot you can see the generation process reach 100%, but when it tries to output the image it prints this error instead.
Screenshot_20230226_130106

Steps to reproduce the problem

Generate a picture with a resolution higher than 1024x768, for example 1280x768.

What should have happened?

It should output the picture and it should let me generate at higher resolutions as well.

Commit where the problem happens

3715ece

What platforms do you use to access the UI ?

Linux

What browsers do you use to access the UI ?

Mozilla Firefox

Command Line Arguments

export COMMANDLINE_ARGS="--medvram --listen"

List of extensions

wildcards
openpose-editor
stable-diffusion-webui-dataset-tag-editor
stable-diffusion-webui-images-browser
stable-diffusion-webui-pixelization

Console logs

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on anon user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Python 3.10.9 (main, Dec 19 2022, 17:35:49) [GCC 12.2.0]
Commit hash: 3715ece0adce7bf7c5e9c5ab3710b2fdc3848f39
Installing requirements for Web UI

Launching Web UI with arguments: --medvram --listen
No module 'xformers'. Proceeding without it.
Loading weights [c353313f5d] from /home/anon/stable-diffusion-webui/models/Stable-diffusion/AOM2-nutmegmixGav2+ElysV2.safetensors
Creating model from config: /home/anon/stable-diffusion-webui/configs/v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading VAE weights specified in settings: /home/anon/stable-diffusion-webui/models/VAE/sd15.vae.pt
Applying cross attention optimization (Doggettx).
Textual inversion embeddings loaded(25): chr-atagorq, chr-ayanami, chr-bremertonsummer, chr-honolulu, chr-shun, chr-shylily, chr-sirius, chr-stlouislux, chr-taihou, chr-yamashiro, chr-yukikazepan, ero-lactation, ero-doggystyle, ero-deepmissionary, spe-centaur, chr-nahida, spe-mothgirl, chr-okayu, spe-lamia, chr-senko, chr-i19, chr-lumine, chr-kashino, chr-yuudachi, EasyNegative
Model loaded in 18.8s (load weights from disk: 7.2s, create model: 1.1s, apply weights to model: 8.2s, apply half(): 0.6s, load VAE: 1.6s).
[tag-editor] Settings has been read from config.json
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
  0%|                                                                                  | 0/20 [00:00<?, ?it/s]MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx1030_14.kdb Performance may degrade. Please follow instructions to install: https://github.com/ROCmSoftwarePlatform/MIOpen#installing-miopen-kernels-package
100%|█████████████████████████████████████████████████████████████████████████| 20/20 [00:07<00:00,  2.66it/s]
100%|█████████████████████████████████████████████████████████████████████████| 20/20 [00:44<00:00,  2.23s/it]
Memory access fault by GPU node-1 (Agent handle: 0x559e97c5f090) on address 0x7f91800e1000. Reason: Page not present or supervisor privilege.


Warning: Program '/home/anon/stable-diffusion-webui/webui.sh' crashed.

Additional information

Distro: EndeavourOS (ArchLinux)
DE: KDE on X11
CPU: Ryzen 1600
GPU: RX 6600 (8GB VRAM)
RAM: 16GB

WebUI installed with the default script. I didn't mess with ROCm versions or any of that since it took care of that automatically.
Can generate pictures at or below 1024x768 with no problems. I get the same error both with and without highres fix enabled.
Screenshot_20230226_132136

Yumae added the bug-report label on Feb 26, 2023
@raff766

raff766 commented Mar 3, 2023

Can confirm, I'm having the same exact issue with my RX 6800 XT (16GB VRAM)

@Parzival1608vonKatze

Same here, exactly the same issue. (RX 6700XT 12GB)

@ishawn944

You can try the following commands:
sudo usermod -a -G video $USER
sudo usermod -a -G render $USER

Then set this environment variable for SD:
PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128
This makes the allocator garbage-collect GPU memory once it reaches 60% capacity and caps memory splits at 128 MB, which helps reduce fragmentation.

You may also need to add the --medvram flag.
These worked for my RX 6750 XT.
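
As a concrete sketch, the settings suggested above could be made persistent by putting them in webui-user.sh (the values are the ones from this comment; the file location assumes a default install):

```shell
# Sketch of a webui-user.sh excerpt (assumes the default webui install layout).
# Ask the HIP caching allocator to garbage-collect once usage passes 60%
# and cap allocation splits at 128 MB to reduce fragmentation.
export PYTORCH_HIP_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:128"

# Keep VRAM usage down during generation.
export COMMANDLINE_ARGS="--medvram"
```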

@Ridien

Ridien commented Mar 11, 2023

Running the WebUI using --no-half and --lowvram solved it for me.

@popemkt

popemkt commented Mar 19, 2023

Can confirm the same with RX 6800S 8GB

@mlrey7

mlrey7 commented Mar 30, 2023

Upgrading to PyTorch 2.0 and ROCm 5.4.2 fixed this for me. Using --opt-sub-quad-attention also really helps, alongside --medvram and PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128. All of these together let me hires-fix 512x768 at 1.85x (944x1420) on the RX 6600 (8GB VRAM).
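
For reference, an upgrade along these lines might look like the following sketch; the rocm5.4.2 index URL matches what other commenters in this thread use, but wheel availability for your Python version should be checked first:

```shell
# Sketch: install the ROCm 5.4.2 builds of torch/torchvision inside the
# webui's virtualenv. Run from the stable-diffusion-webui directory.
source venv/bin/activate
pip install --upgrade torch torchvision \
    --index-url https://download.pytorch.org/whl/rocm5.4.2
```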

@DGdev91
Contributor

DGdev91 commented Apr 6, 2023

I have the same problem on a 5700 XT using ROCm 5.4.2 and PyTorch 2.0.
Strangely, it works fine using PyTorch 1.13.1.

Same issue with both --medvram and --lowvram.

With PyTorch 2 I also tried --opt-sdp-attention, with no effect.

I also use --precision full and --no-half.

I finally tried export PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128; it did not help.

@egolfbr

egolfbr commented Apr 27, 2023


I have a very similar setup on an Ubuntu machine. I downgraded to PyTorch 1.13.1 and everything appears to be fine, except for a warning about a missing database file:

MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx1030_20.kdb Performance may degrade. Please follow instructions to install: https://github.com/ROCmSoftwarePlatform/MIOpen#installing-miopen-kernels-package

@DianaNites

@egolfbr That's a harmless warning from AMD; see https://github.com/ROCmSoftwarePlatform/MIOpen/blob/develop/doc/src/cache.md

TL;DR: AMD ROCm will compile and cache some GPU kernels in the background, but it also ships pre-compiled kernels for some cards. The build used with PyTorch 1.x does not seem to bundle a copy for your card, but the only effect should be that the first image you generate may be slow.

@skerit

skerit commented May 12, 2023

Had the same error while using a LoRA model, but I was still on torch 1.12.
Upgrading to 1.13.1 fixed it for me.

@torgeir

torgeir commented May 27, 2023

The following diff fixed an issue similar to the OP's:

index 49a426ff..03b57253 100644
--- a/webui-user.sh
+++ b/webui-user.sh
@@ -10,7 +10,7 @@
 #clone_dir="stable-diffusion-webui"
 
 # Commandline arguments for webui.py, for example: export COMMANDLINE_ARGS="--medvram --opt-split-attention"
-#export COMMANDLINE_ARGS=""
+#export COMMANDLINE_ARGS="--reinstall-torch"
 
 # python3 executable
 #python_cmd="python3"
@@ -27,6 +27,9 @@
 # install command for torch
 #export TORCH_COMMAND="pip install torch==1.12.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113"
 
+# https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/8139
+export PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.8,max_split_size_mb:128
+
 # Requirements file to use for stable-diffusion-webui
 #export REQS_FILE="requirements_versions.txt"
--- a/webui.sh
+++ b/webui.sh
@@ -119,7 +119,7 @@ esac
 if echo "$gpu_info" | grep -q "AMD" && [[ -z "${TORCH_COMMAND}" ]]
 then
     # AMD users will still use torch 1.13 because 2.0 does not seem to work.
-    export TORCH_COMMAND="pip install torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2 --index-url https://download.pytorch.org/whl/rocm5.2"
+    export TORCH_COMMAND="pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm5.4.2"
 fi  
 
 for preq in "${GIT}" "${python_cmd}"

Arch, RX6800XT

@Essoje

Essoje commented May 30, 2023

Confirming the above solved the 'Memory access fault by GPU node-1' problem on my machine.
However, while the above works without a problem on a clean installation, I had to additionally pass the --ignore-installed flag to the pip install command, as follows.

TORCH_COMMAND="pip install --ignore-installed torch torchvision --index-url https://download.pytorch.org/whl/rocm5.4.2"

Manjaro, RX6900XT

@lufixSch

lufixSch commented Dec 2, 2023

Just wanted to add, for anyone finding this.

sudo usermod -a -G video $USER
sudo usermod -a -G render $USER

For some reason I got this error after adding my user to the video and render groups.
When I removed the groups, everything worked again:

sudo usermod -r -G video $USER
sudo usermod -r -G render $USER
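
To check which state your user is actually in, you can list its supplementary groups (a standard command, not specific to this workaround); look for video and render in the output:

```shell
# Print the current user's supplementary group names, one line of
# space-separated names; 'video' and 'render' will appear here if the
# user has been added to those groups.
id -nG
```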

@juipeltje

I'm having the same problem with Fooocus running on Void Linux with a 6950 XT. I've tried pretty much every solution in this thread to no avail, but what seems to work as a workaround for me is running it with --always-no-vram and --always-offload-from-vram. I'm not sure if A1111 has similar flags available, but it may be worth a shot. It's a little slower than using VRAM, but it still easily beats running on the CPU, and at least now I can leave it generating a batch of images without it crashing every other image. If you have the extra system RAM available, it might be a good bandaid solution.
