Error on installation #83795

Open
Zernez opened this issue Aug 20, 2022 · 6 comments
Labels
module: rocm AMD GPU support for Pytorch triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

Zernez commented Aug 20, 2022

🐛 Describe the bug

Hello, after "python setup.py install" the script shows this error and the installation fails:

fatal error: error in backend: Cannot select: intrinsic %llvm.amdgcn.ds.bpermute
clang-14: error: clang frontend command failed with exit code 70 (use -v to see invocation)
AMD clang version 14.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-5.2.3 22324 d6c88e5a78066d5d7a1e8db6c5e3e9884c6ad10e)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm/hip/../llvm/bin
clang-14: note: diagnostic msg: Error generating preprocessed source(s).
CMake Error at torch_hip_generated_cub-RadixSortPairs.hip.o.cmake:200 (message):
Error generating file
/home/ferna/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/hip/./torch_hip_generated_cub-RadixSortPairs.hip.o

Is something missing? Thanks.

Versions

roc-5.2.3
python 3.9

cc @malfet @seemethere @jeffdaily @sunway513 @jithunnair-amd @ROCmSupport @KyleCZH

@mikaylagawarecki mikaylagawarecki added module: build Build system issues triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module labels Aug 23, 2022
@malfet malfet added module: rocm AMD GPU support for Pytorch and removed module: build Build system issues labels Aug 23, 2022
Contributor

malfet commented Aug 23, 2022

Looks like a ROCm compiler crash to me...

Contributor

amdfaa commented Aug 23, 2022

This is Faa Diallo from the ROCm team. Can you provide the output from the following script for info about the environment: https://github.com/pytorch/pytorch/blob/master/torch/utils/collect_env.py

Also, have you tried using the ROCm release containers?
For 5.2.3, this page might be helpful:
https://docs.amd.com/bundle/ROCm-Release-Notes-v5.2.3/page/ROCm_v5.2.3_Release_Notes.html

Author

Zernez commented Aug 25, 2022

Thanks for replying. I ran the script:

PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.4 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: 10.0.0-4ubuntu1
CMake version: version 3.19.6
Libc version: glibc-2.31
Python version: 3.9.12 (main, Apr 5 2022, 06:56:58) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.15.0-46-generic-x86_64-with-glibc2.31
Is CUDA available: N/A
CUDA runtime version: Could not collect
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A
Versions of relevant libraries:
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.21.5
[pip3] numpydoc==1.2
[conda] blas 1.0 mkl
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-include 2022.0.1 h06a4308_117
[conda] mkl-service 2.4.0 py39h7f8727e_0
[conda] mkl_fft 1.3.1 py39hd3c417c_0
[conda] mkl_random 1.2.2 py39h51133e4_0
[conda] numpy 1.21.5 py39he7a7128_1
[conda] numpy-base 1.21.5 py39hf524024_1
[conda] numpydoc 1.2 pyhd3eb1b0_0

And with "apt show rocm-libs -a":

Package: rocm-libs
Version: 5.2.3.50203-109
Priority: optional
Section: devel
Maintainer: ROCm Libs Support <rocm-libs.support@amd.com>
Installed-Size: 13.3 kB
Depends: hipblas (= 0.51.0.50203-109), hipfft (= 1.0.8.50203-109), hipsolver (= 1.4.0.50203-109), hipsparse (= 2.1.0.50203-109), miopen-hip (= 2.17.0.50203-109), rccl (= 2.12.12.50203-109), rocalution (= 2.0.2.50203-109), rocblas (= 2.44.0.50203-109), rocfft (= 1.0.17.50203-109), rocrand (= 2.10.9.50203-109), rocsolver (= 3.18.0.50203-109), rocsparse (= 2.2.0.50203-109), rocm-core (= 5.2.3.50203-109), hipblas-dev (= 0.51.0.50203-109), hipcub-dev (= 2.10.12.50203-109), hipfft-dev (= 1.0.8.50203-109), hipsolver-dev (= 1.4.0.50203-109), hipsparse-dev (= 2.1.0.50203-109), miopen-hip-dev (= 2.17.0.50203-109), rccl-dev (= 2.12.12.50203-109), rocalution-dev (= 2.0.2.50203-109), rocblas-dev (= 2.44.0.50203-109), rocfft-dev (= 1.0.17.50203-109), rocprim-dev (= 2.10.9.50203-109), rocrand-dev (= 2.10.9.50203-109), rocsolver-dev (= 3.18.0.50203-109), rocsparse-dev (= 2.2.0.50203-109), rocthrust-dev (= 2.10.9.50203-109), rocwmma-dev (= 0.7.0.50203-109)
Homepage: https://github.com/RadeonOpenCompute/ROCm
Download-Size: 986 B
APT-Manual-Installed: yes
APT-Sources: https://repo.radeon.com/rocm/apt/5.2.3 ubuntu/main amd64 Packages
Description: Radeon Open Compute (ROCm) Runtime software stack

Contributor

amdfaa commented Aug 26, 2022

Can you try the following before running the setup script:

python tools/amd_build/build_amd.py
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
I'd advise you to use the 5.2.3 release container mentioned in my above comment.
Let me know if that works!
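
The `export` line above relies on the shell's `${VAR:-default}` expansion: it uses `CONDA_PREFIX` when that variable is set and non-empty, and otherwise falls back to the parent of the directory that contains the `conda` executable. A minimal Python sketch of the same resolution logic, for illustration only; the function name `resolve_cmake_prefix_path` and its parameters are hypothetical and not part of PyTorch's build tooling:

```python
import os
import shutil


def resolve_cmake_prefix_path(env=None, conda_executable=None):
    """Mirror ${CONDA_PREFIX:-"$(dirname $(which conda))/../"}.

    Hypothetical helper: `env` and `conda_executable` exist only so the
    logic can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    prefix = env.get("CONDA_PREFIX", "")
    if prefix:
        # ':-' keeps the variable when it is set AND non-empty.
        return prefix
    if conda_executable is None:
        conda_executable = shutil.which("conda")
    if conda_executable is None:
        raise RuntimeError("CONDA_PREFIX is unset and 'conda' is not on PATH")
    # dirname(<root>/bin/conda) + "/../" points at the conda root.
    return os.path.dirname(conda_executable) + "/../"
```

Calling `resolve_cmake_prefix_path()` with no arguments reads the real environment, which is a quick way to check what value the build would pick up before running `python setup.py install`.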

Author

Zernez commented Aug 30, 2022

It's still not working... Thanks anyway.

I tried using the container, but another error occurred:
"docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]."
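
Docker typically reports this message when `--gpus` is passed without an NVIDIA container runtime installed. The thread does not show the command that was run, so that cause is an assumption. ROCm containers are usually started by passing the AMD device nodes through instead. A hedged sketch that assembles such a command; the helper name `rocm_docker_command` is hypothetical and the image tag is a placeholder:

```python
import shlex


def rocm_docker_command(image, extra_args=()):
    """Build a `docker run` argv that exposes AMD GPU device nodes.

    Assumes the host has the amdgpu/ROCm kernel driver loaded, so that
    /dev/kfd and /dev/dri exist. No `--gpus` flag is used.
    """
    return [
        "docker", "run", "-it", "--rm",
        "--device=/dev/kfd",     # ROCm compute interface
        "--device=/dev/dri",     # GPU render nodes
        "--group-add", "video",  # group that usually owns the device nodes
        *extra_args,
        image,
    ]


# Placeholder image tag; substitute the release container you intend to use.
cmd = rocm_docker_command("rocm/pytorch:latest")
print(shlex.join(cmd))
```

The sketch only prints the command; it does not start a container, so it is safe to run on a machine without Docker.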

My GPU is an R9 290X, which the official ROCm list places under "GPUs that are enabled, but which AMD does not officially support". I don't know what that means.
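
One way to find out which architecture the installed ROCm stack actually detects is to look for the `gfx` target name in the output of `rocminfo` and compare it against the support list. A minimal sketch, assuming `rocminfo` prints target names such as `gfx906`; the helper name `gfx_targets` is hypothetical and the sample text is illustrative, not output from the machine in this issue:

```python
import re

# Matches ISA target names such as "gfx906" or "gfx90a".
_GFX_RE = re.compile(r"\bgfx[0-9a-f]+\b")


def gfx_targets(rocminfo_output):
    """Return the unique gfx target names found in the text, in order."""
    seen = []
    for name in _GFX_RE.findall(rocminfo_output):
        if name not in seen:
            seen.append(name)
    return seen


# Illustrative text only; real `rocminfo` output is much longer.
sample = """
  Name:                    gfx906
  Name:                    amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-
"""
print(gfx_targets(sample))  # ['gfx906']
```

On a real system the input could come from `subprocess.run(["rocminfo"], capture_output=True, text=True).stdout`.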

@hongxiayang
Collaborator

Please check the latest documentation about supported GPUs: https://rocm.docs.amd.com/en/latest/release/gpu_os_support.html
