[rocm-opencl-runtime] Blender demo file eventually crashes while compiling kernels #258
Comments
I had this happen with Blender, but I'm not sure which demo file (maybe classroom) - did you try any others? And what GPU do you have? I think it's worth noting that HIP also didn't work, and would segfault when I tried a simple program. In both cases, after the program crashed, it left my GPU (Vega 56) in a strange power state (I think) where it electrically whined, loudly. I haven't touched it since, but something is definitely wrong. |
Hey @kode54. I can reproduce our problem:
I'm using |
@c0d3st0rm Could you please open a separate issue for HIP? Can you run the HIP-Examples?
|
@tpkessler it will be a while before I can I think. The |
You can use the binary packages hosted by
|
I'll build it from source now, and see if I can recreate the issue - what information do you need? |
So even just running clinfo, my GPU starts whining like crazy and the GPU tach (see Vega reference cards) is stuck at a constant 50%. Something is definitely wrong. It starts when I run, and doesn't stop - I'll run the HIP examples soon. |
Here's the logs (I can't upload it as a file for some reason):

HIP-Examples logs
```
$ ./test_all.sh
==== vectorAdd ====
==== gpu-burn ====
==== strided-access ====
Using device: Vega 10 XL/XT [Radeon RX Vega 56/64]
stride time GB/sec
0 0.000713 336.606
==== rtm8 ====
==== reduction ====
./reduction 8388608
./reduction 16777216
./reduction 33554432
./reduction 67108864
./reduction 134217728
./reduction 268435456
./reduction 536870912
==== mini-nbody ====
==== add4 ====
Running kernels 10 times
Running kernels 10 times
Running kernels 10 times
Running kernels 10 times
==== cuda-stream ====
Function Rate (GiB/s) Avg time(s) Min time(s) Max time(s)
Copy: 316.1695 0.00327369 0.00316286 0.00348091
==== Rodinia ====
```
|
Also, I dumped
Looks like the GPU might not be clocking down? Edit: additional info from
Also, I think this warrants a separate issue by this point? |
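For anyone who wants to check the "not clocking down" suspicion on their own card, here is a minimal sketch. It assumes the amdgpu driver exposes its shader-clock table at `/sys/class/drm/card0/device/pp_dpm_sclk` (the card index may differ on your system) and that each line looks like `1: 991Mhz *`, with `*` marking the active level. The function names are made up for illustration and are not part of any ROCm tool.

```python
import re
from pathlib import Path

# Assumption: card0 is the Vega; adjust the index if you have several GPUs.
SCLK_PATH = Path("/sys/class/drm/card0/device/pp_dpm_sclk")

# Matches lines such as "2: 1590Mhz *" (index, clock in MHz, optional active marker).
LEVEL_RE = re.compile(r"^\s*(\d+):\s*(\d+)\s*MHz\s*(\*)?\s*$", re.IGNORECASE)


def parse_dpm_levels(text):
    """Return a list of (index, mhz, is_active) tuples parsed from the table text."""
    levels = []
    for line in text.splitlines():
        match = LEVEL_RE.match(line)
        if match:
            levels.append(
                (int(match.group(1)), int(match.group(2)), match.group(3) is not None)
            )
    return levels


def active_level(levels):
    """Return the tuple marked active, or None if no level is marked."""
    for level in levels:
        if level[2]:
            return level
    return None


def at_highest_level(levels):
    """True if the active level is the highest clock in a multi-level table."""
    active = active_level(levels)
    if active is None:
        return False
    return len(levels) > 1 and active[1] == max(mhz for _, mhz, _ in levels)


if __name__ == "__main__":
    # Only reads sysfs when the file is actually present.
    if SCLK_PATH.exists():
        parsed = parse_dpm_levels(SCLK_PATH.read_text())
        print("active:", active_level(parsed), "at highest:", at_highest_level(parsed))
```

A single reading only shows the current level. To tell whether the card is really stuck, run it a few times while the GPU is idle and see whether the active level ever drops.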
These other issues might be related, the symptoms seem to be the same: |
OK, the HIP examples work. Given the related issues you found, it seems to be an upstream issue. Would you mind reporting your problem there? Have you tried setting the DPM states as suggested here? |
I have, and it makes no difference. I'll report this upstream - sorry to spam this thread. OpenCL now seems to work too, although I was experiencing the same issue as the OP. |
So regarding the Blender crashes - I wasn't wrong. OpenCL on my Vega with ROCm 3.5.0 still crashes just after compiling (or failing to compile) the render kernels for the classroom demo:
and then leaves my Vega in the weird power state. |
I opened an issue upstream regarding the blender issues. Today I got an answer:
|
@kode54 @c0d3st0rm @tpkessler Has anyone been able to reproduce the error with the latest rocm-opencl-runtime? If not, or if the error still exists, I'll close this issue with an upstream-wontfix label within a week. |
With the latest runtime ( |
Right, gotcha, I'll keep using opencl-amd version 19.50 until some day in the next 15-20 years when I may be able to afford a new gpu again. Then maybe rocm and amd and blender will have their shit together. |
Attempting to use
blender 17:2.83-1
to render the Blender 2.81 - The Junk Shop demo from here: https://www.blender.org/download/demo-files/
Using:
llvm-amdgpu 3.50-2
comgr 3.50-2
rocm-opencl-runtime 3.50-1
(modified to use Release build, but the original did the same thing)
Results in Blender crashing partway through compiling the render kernels, never achieving any render output. This used to work with
rocm-opencl-runtime 3.30-1
on the same system and GPU.