[help-with-local-system]: Segmentation fault RX580 #12376
Comments
This error comes with ROCm 5.6; try using ROCm 5.5.0. I'm using Linux Mint 21.2, but I had the same error with an RX 580. Edit: 5.5.3 also does not work. Only 5.5.0 did. |
So I guess I'll have to attempt downgrading it somehow, or switch to a Debian-based distro. |
Following the same instructions as OP, downgrading python-pytorch-opt-rocm to any non-current version triggers the "Torch is not able to use GPU" runtime error instead of the seg fault, which appears on the current version for me as well. |
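For anyone trying to tell the two failure modes apart (the segfault versus "Torch is not able to use GPU"), here is a minimal sketch that runs the torch import in a child interpreter, so a crash there does not take down the shell or the webui launcher. The helper name `run_probe` and the `TORCH_PROBE` string are made up for illustration and are not part of the webui. It assumes a POSIX system, where a child killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys

# Hypothetical probe: prints the torch version, the HIP version the build
# targets (None on non-ROCm builds), and whether a GPU is visible to torch.
TORCH_PROBE = (
    "import torch; "
    "print(torch.__version__, torch.version.hip, torch.cuda.is_available())"
)


def run_probe(code: str, timeout: float = 120.0) -> str:
    """Run `code` in a separate Python process and classify how it ended."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    if proc.returncode == 0:
        return "ok"
    # On POSIX, a child killed by signal N has returncode -N.
    if proc.returncode == -signal.SIGSEGV:
        return "segfault"
    return f"failed (exit code {proc.returncode})"
```

Calling `run_probe(TORCH_PROBE)` from the same Python environment the webui uses should return `"segfault"` if the torch build itself crashes, and `"ok"` if the import works. In the `"ok"` case the GPU may still not be detected, so the probe's printed output is worth reading as well.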
Here are the versions I currently have installed; maybe this helps:
python version:
rocm versions:
Command-line args in webui-user.sh:
And you have to set the system variable (or set it every time before starting the webui):
In Ubuntu / Linux Mint you also need:
|
@viebrix |
I tried downgrading more packages that did not match my working Manjaro system, but that didn't work out. Still getting "Torch is not able to use GPU". |
I may have been missing some libraries for amdgpu. I did a fresh Arch install manually and installed:
Now, instead of crashing before it even finishes starting up, it starts the webui but then crashes right afterwards, before I can do anything.
|
I'm also having the exact same error on Garuda (Arch/KDE). It starts the webui and then crashes with:
I have an AMD 9700xtx. |
I do have integrated graphics as well. I managed to get it working last night on CPU by downgrading to Python 3.10 with pyenv, forcing torch to rocm5.4.2, and using:
No luck getting my graphics card selected/working, though :( |
Using ROCm version 5.5.0 fixed the segfault for me (RX 580):
In this case webui.sh does not need to be touched. |
For those still following this using Arch Linux like in the OP, the cause is this issue affecting python-pytorch-rocm and python-pytorch-opt-rocm. |
It seems there's an update for python-pytorch-opt-rocm/python-pytorch-rocm to 2.0.1-11, but I have not had the opportunity to check whether it fixes this yet. |
It's working properly with python-pytorch-opt-rocm 2.1.0-1 and python-torchvision 0.15.2-1. |
Is there an existing issue for this?
What happened?
This is my second time installing SD for an RX 580. I have been following these instructions: #4870 (comment)
It worked flawlessly on my first system, which runs Manjaro. However, I can't seem to make it work on another machine with the same GPU running Arch. It fails with a segmentation fault before the webui even finishes starting up.
Does anyone have any idea of what I could try, or what's causing this problem?
Steps to reproduce the problem
What should have happened?
The webui starts up properly without a segmentation fault, just like on my Manjaro setup.
Version or Commit where the problem happens
68f336b
What Python version are you running on ?
Python 3.11.x (above, no supported yet)
What platforms do you use to access the UI ?
Linux
What device are you running WebUI on?
AMD GPUs (RX 5000 below)
Cross attention optimization
Automatic
What browsers do you use to access the UI ?
Mozilla Firefox
Command Line Arguments
List of extensions
No
Console logs
Additional information
Both systems are running it on Python 3.11.3.
Both systems seem to be unable to detect the GPU if I run it on Python 3.10.x.
Arch hardware: Xeon E5 2650 v3, RX 580 2048SP 8GB VRAM, 32GB RAM.
Manjaro hardware: Ryzen 3 1200, RX 580 2048SP 8GB VRAM, 16GB RAM.
This is my journalctl log:
https://justpaste.it/5cd49
coredump.txt
manjaro $pacman -Q | grep rocm:
manjaro $pacman -Q | grep torchvision:
python-torchvision 0.15.1-2
arch $pacman -Q | grep rocm:
arch $pacman -Q | grep torchvision:
python-torchvision 0.15.2-1
I tried matching the package versions, but it just turned into another error about not being able to detect the GPU.
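For comparing the two machines, a small sketch like the one below can diff saved `pacman -Q` output from each system instead of checking it by eye. The function names are hypothetical, and the sample strings only contain the python-torchvision lines quoted above. It assumes each line has the usual `name version` shape.

```python
from typing import Dict, Optional, Tuple


def parse_pacman_q(text: str) -> Dict[str, str]:
    """Parse `pacman -Q` style output ("name version" per line) into a dict."""
    packages: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            name, version = parts
            packages[name] = version
    return packages


def diff_versions(
    a: Dict[str, str], b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return packages whose versions differ; None marks a missing package."""
    diff: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            diff[name] = (a.get(name), b.get(name))
    return diff


# Sample data taken from the torchvision lines in this report.
MANJARO = "python-torchvision 0.15.1-2\n"
ARCH = "python-torchvision 0.15.2-1\n"
```

With the sample data, `diff_versions(parse_pacman_q(MANJARO), parse_pacman_q(ARCH))` reports python-torchvision as the one differing package. Feeding it the full `pacman -Q | grep rocm` output from both systems would list every ROCm package that does not match.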