Darktable opencl results are wrong with rocm (compared to amdgpu-pro and others) #704
Comments
+1, affected by this issue. The current rocm-opencl package for Ubuntu 18.10 (as of 20190314) is affected. The same .cl code works fine with NVIDIA OpenCL; the issue only happens with rocm-opencl.
@RvRijsselt if you get a chance, can you look into the discussion at darktable-org/darktable#3756
@RvRijsselt hi again - when you get a chance, would you be able to take a look at the following discussion: One of the ROCm devs is asking for specific tests with the problematic OpenCL kernel that you may have already done during your extensive tests and analysis. Thanks again!
Have you verified the discussion link provided above? Is this still an issue? If not, can we please close the issue? Thanks!
I could not reproduce the issue with a test project and:
Thank You! |
The Local Contrast module in Darktable uses an OpenCL kernel to apply a local Laplacian filter. As far as I know, the OpenCL version has always produced the same output as the CPU-based algorithm. This is also the case with the amdgpu-pro drivers, which give very nice results. With ROCm, however, the results are, to put it mildly, very ugly.
Bug 12423 @ Darktable
So far we have localized the issue to somewhere in the laplacian_assemble kernel.
Note that this kernel is run a number of times on differently sized versions of the same image, and the results are merged into one final output. This means that any error quickly propagates into a big artifact in the end result. With ROCm the results already differ from amdgpu-pro at the smallest image scale (8x6 pixels). This was tested by dumping all the inputs and outputs and comparing the results from both OpenCL drivers.
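For anyone who wants to repeat the comparison, here is a minimal sketch of how two dumped buffers could be diffed. It assumes each buffer was written to disk as raw little-endian float32 values; that format, the function names `load_f32` and `compare_buffers`, and the tolerances are all hypothetical and not taken from Darktable or from the original test setup.

```python
import math
import struct


def load_f32(path):
    """Read a raw dump of little-endian float32 values (assumed dump format)."""
    with open(path, "rb") as f:
        data = f.read()
    count = len(data) // 4
    return list(struct.unpack("<%df" % count, data[: count * 4]))


def compare_buffers(ref, test, rel_tol=1e-4, abs_tol=1e-6):
    """Compare two equally sized float buffers element by element.

    Returns a dict with the element count, the number of mismatches,
    the first mismatch as (index, ref_value, test_value) or None,
    and the largest absolute difference among finite pairs.
    """
    if len(ref) != len(test):
        raise ValueError("buffer sizes differ: %d vs %d" % (len(ref), len(test)))
    mismatches = 0
    first = None
    max_abs_diff = 0.0
    for i, (a, b) in enumerate(zip(ref, test)):
        if math.isnan(a) and math.isnan(b):
            continue  # both drivers produced NaN here: treated as equal
        if math.isfinite(a) and math.isfinite(b):
            max_abs_diff = max(max_abs_diff, abs(a - b))
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches += 1
            if first is None:
                first = (i, a, b)
    return {
        "count": len(ref),
        "mismatches": mismatches,
        "first": first,
        "max_abs_diff": max_abs_diff,
    }
```

Running this on the 8x6 scale first keeps the output small enough to inspect by hand. A tolerance is used instead of exact equality because small rounding differences between drivers are expected, while a miscompiled kernel should show up as differences far above that.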
The compiler option used in Darktable is -cl-fast-relaxed-math. Removing it has no effect. I have also tested a couple of other settings, -cl-denorms-are-zero and -cl-no-signed-zeros, but they did not change the results either.
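To make the option testing systematic, something like the following could generate every combination of build options to try. This is only a sketch: the helper name `option_sets` is hypothetical, and -cl-opt-disable is an extra standard OpenCL build option added here as a suggestion, not one that was tested in this report.

```python
from itertools import combinations

# The first three options are the ones mentioned in this report.
# -cl-opt-disable is an additional suggestion (assumption), included
# because it turns off compiler optimizations entirely.
OPTIONS = [
    "-cl-fast-relaxed-math",
    "-cl-denorms-are-zero",
    "-cl-no-signed-zeros",
    "-cl-opt-disable",
]


def option_sets(options):
    """Yield every subset of the options as one build-option string."""
    for size in range(len(options) + 1):
        for combo in combinations(options, size):
            yield " ".join(combo)


if __name__ == "__main__":
    for opts in option_sets(OPTIONS):
        print(repr(opts))
```

Each string would be passed as the build options when compiling the kernel, and the output dumped and compared per combination. If the artifacts disappear only with -cl-opt-disable, that would point at an optimization pass rather than at the math flags.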
I have run out of ideas on how to check why the results are different. It looks more like an issue in the compiled binary than in the kernel itself.
Package: rocm-libs version: 2.1.96
Package: rocm-dkms version: 2.1.96
Package: rocm-opencl version: 1.2.0-2019020110
The issue has also been reported with different GPUs: AMD RX 560, RX 570, Vega 56.