Pytorch binaries not working on arch4edu ROCm #26

Closed
FranGamer1892 opened this issue Feb 1, 2023 · 6 comments

FranGamer1892 commented Feb 1, 2023

Hello, I installed the ROCm stack from arch4edu and it seems to be working (rocminfo detects my RX 580). However, upon installing and testing torch (installed from the wheels provided here), this error pops up.

Traceback (most recent call last):
  File "pytest.py", line 4, in <module>
    import torch
  File "/home/fran/.local/lib/python3.8/site-packages/torch/__init__.py", line 199, in <module>
    from torch._C import *  # noqa: F403
ImportError: /home/fran/.local/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so: undefined symbol: zgetrs_

I tried building torch myself, but it didn't go so well, haha.
I attempted to follow this, but I can't find the Arch equivalents of the packages installed by apt, for instance. Proceeding to build pytorch anyway results in a bunch of errors, and I couldn't really tell what the problem was.

@FranGamer1892 changed the title from "Pytorch binaries not working on" to "Pytorch binaries not working on arch4edu ROCm" Feb 1, 2023
@xuhuisheng
Owner

xuhuisheng commented Feb 1, 2023

Which version of ROCm do you use?
I have not used Arch before, but I can test the relevant ROCm version with pytorch this weekend.

The log says the zgetrs_ function cannot be found, which may be caused by an incompatible API.
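
For context, zgetrs_ is the Fortran-style name of a LAPACK routine, so one way to narrow this down is to check whether the LAPACK or BLAS libraries installed on the system export that symbol at all. The following is a minimal sketch using only the standard library; the library names "lapack" and "openblas" are assumptions about what might be installed, not something confirmed in this thread.

```python
import ctypes
import ctypes.util


def exports_symbol(library, symbol):
    """Return True if the shared library can be loaded and exports `symbol`."""
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        return False  # library is missing or could not be loaded
    # ctypes raises AttributeError when a symbol is not found,
    # which hasattr() turns into False.
    return hasattr(handle, symbol)


if __name__ == "__main__":
    # Candidate library names are assumptions; adjust to the local system.
    for name in ("lapack", "openblas"):
        path = ctypes.util.find_library(name)
        if path is None:
            print(f"{name}: not found by the dynamic linker")
        else:
            found = exports_symbol(path, "zgetrs_")
            print(f"{name} ({path}): zgetrs_ exported = {found}")
```

If none of the installed libraries export the symbol, the wheel was probably built against a LAPACK provider that this system does not have; that is a guess, not a confirmed diagnosis.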

@FranGamer1892
Author

I am not sure where to check, so I'll just send you the output from various commands, sorry haha

Thanks!

https://paste.debian.net/1269416/
https://paste.debian.net/1269417/
https://paste.debian.net/1269418/
https://paste.debian.net/1269419/

@xuhuisheng
Owner

Looks like Version: 5.4.0-1.

I will give it a try this weekend.

@FranGamer1892
Author

FranGamer1892 commented Feb 2, 2023 via email

@FranGamer1892
Author

FranGamer1892 commented Feb 2, 2023

Hello, I got around to building torch by myself and when I try testing it, this error pops up:

/usr/include/c++/12.2.0/bits/stl_vector.h:1123: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = const char*; _Alloc = std::allocator<const char*>; reference = const char*&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
Aborted

This is the same error I had with the python-pytorch-rocm package from my distro's repositories. Unfortunately I can't find anything about it online. I built torch from release/1.12 using python 3.8. I had to set BUILD_TEST=OFF, otherwise I couldn't build, and I had to change many things in the source code, for instance because my GCC version is too new.
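
As a rough illustration of the build settings mentioned above, the sketch below assembles an environment for a source build. BUILD_TEST=OFF comes from this comment; USE_ROCM and PYTORCH_ROCM_ARCH are PyTorch build variables that are assumed here to be relevant, and gfx803 is assumed as the target for the RX 580. The helper name `rocm_build_env` is hypothetical, and the exact build command used in this thread is not known.

```python
import os


def rocm_build_env(base_env=None, gpu_arch="gfx803"):
    """Return a copy of the environment with ROCm build settings applied."""
    env = dict(os.environ if base_env is None else base_env)
    env["USE_ROCM"] = "1"                 # assumption: request a ROCm build
    env["BUILD_TEST"] = "OFF"             # from the comment above: skip building tests
    env["PYTORCH_ROCM_ARCH"] = gpu_arch   # assumption: gfx803 for the RX 580
    return env


if __name__ == "__main__":
    env = rocm_build_env()
    for key in ("USE_ROCM", "BUILD_TEST", "PYTORCH_ROCM_ARCH"):
        print(f"{key}={env[key]}")
    # The build itself would then be started from the PyTorch source tree, e.g.
    # subprocess.run(["python", "setup.py", "bdist_wheel"], env=env, check=True)
```

Returning a copy rather than mutating os.environ keeps the settings scoped to the build process.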

PS: This is the script I'm testing torch with, I think I got it from you haha
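
The script referred to here is not reproduced in the thread. The following is not that script, only a minimal sketch of what a smoke test for a ROCm build of torch could look like; the function name `torch_smoke_test` is hypothetical. ROCm builds expose the GPU through the `torch.cuda` API, and `torch.version.hip` is None on non-ROCm builds.

```python
import importlib


def torch_smoke_test():
    """Try to import torch and run one tiny GPU op; report instead of raising."""
    report = {"import_ok": False, "error": None}
    try:
        torch = importlib.import_module("torch")
    except Exception as exc:  # ImportError, including undefined-symbol failures
        report["error"] = f"{type(exc).__name__}: {exc}"
        return report
    report["import_ok"] = True
    report["torch_version"] = torch.__version__
    report["hip_version"] = getattr(torch.version, "hip", None)
    report["gpu_available"] = torch.cuda.is_available()
    if report["gpu_available"]:
        x = torch.ones(4, device="cuda")  # allocate on the GPU
        report["gpu_sum"] = float(x.sum().item())
    return report


if __name__ == "__main__":
    for key, value in torch_smoke_test().items():
        print(f"{key}: {value}")
```

Note that a hard abort inside native code, such as the libstdc++ assertion failure quoted above, terminates the interpreter and cannot be caught by the try/except here.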

@FranGamer1892
Author

Closing this, as there is now a gfx803-compatible pytorch package in the Arch repos; last time I checked it was in [community-testing]. Using a pytorch package built from source is also possible, but it's trickier to accomplish because of gcc/g++ 12 and other Arch-specific issues.
