Wrong CUDA backend results (while interpreter ones are good) #141
Comments
I've simplified the example. Now all the code is in Main.hs (between lines 49 and 65). The error is still the same.
It doesn't print anything for me... does that mean it is correct? What GPU are you using?
Actually, if I add
for both the CUDA and interpreter backends. This might be #58?
@tmcdonell: I'm sorry - of course the line The issue #58 might be related to this one. I'm surprised that you are getting good results while I'm getting wrong ones. My card has a compute capability of 1.2; maybe this is related?
I guess that is it, as I have a compute capability 3.0 card. I never tracked down the root of the problem with #58. Perhaps texture references for 8- and 16-bit types need to be handled specially in CUDA? Maybe it is a CUDA bug? Testing whether 8-bit texture references work when using plain CUDA is probably the best first step.
@tmcdonell: I do not know plain CUDA well. I'll check what I can do on this topic :)
I think this was just a bug with the (now dead)
Hi!
I've got an example where I ran into a strange problem. I'm reading a BMP image (Word32 pixels) and decomposing it into 4 channels (r, g, b and a, each Word8). Then I compose the channels back into Word32, and the result differs from the original image when using the CUDA backend but is identical when using the interpreter.
Could you look at this example and tell me whether this is a bug in Accelerate, or whether some operations are not allowed in the CUDA backend and only work in the interpreter by accident?
Example: https://github.com/wdanilo/AccelerateHS-tests/tree/master/channels
It is based on the Canny Accelerate example (for parsing the command-line arguments) and there is not much code: all example-specific code is placed inside the 'src' directory.
There is also a sample image attached.
To run the example, use:
./dist/build/test/test small.bmp small2.bmp
The program prints its values to the screen.
When running on CUDA, we get:
Image {_channels = fromList [("rgba",Raw Array (Z :. 3 :. 3) [0,0,0,33153,8454273,8487168,0,0,0])]})
and with the interpreter:
Right (Image {_channels = fromList [("rgba",Raw Array (Z :. 3 :. 3) [4278255615,4294902015,4294967040,4278222976,4286578816,4286611456,4278190335,4278255360,4294901760])]})
(which is the correct value)
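For reference, the round trip described in this report can be sketched in plain Haskell, without Accelerate. This is only an illustration of what "decompose into four Word8 channels and compose back" should do on any correct backend; it is not the code from the linked repository. The names `unpackChannels`, `packChannels` and `roundTrips` are hypothetical, and the byte order (lowest byte first) is an assumption.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word32, Word8)

-- | Split a packed 32-bit pixel into four 8-bit channels.
-- Assumption: channels are taken lowest byte first; the real
-- example may order r, g, b and a differently.
unpackChannels :: Word32 -> (Word8, Word8, Word8, Word8)
unpackChannels px = (byteAt 0, byteAt 8, byteAt 16, byteAt 24)
  where
    -- Shift the wanted byte down to the low end, mask it, then narrow.
    byteAt :: Int -> Word8
    byteAt n = fromIntegral ((px `shiftR` n) .&. 0xFF)

-- | Recombine four 8-bit channels into one 32-bit pixel.
-- Each channel is widened to Word32 *before* shifting, so no bits
-- are lost by shifting inside an 8-bit value.
packChannels :: (Word8, Word8, Word8, Word8) -> Word32
packChannels (c0, c1, c2, c3) =
      widen c0
  .|. (widen c1 `shiftL` 8)
  .|. (widen c2 `shiftL` 16)
  .|. (widen c3 `shiftL` 24)
  where
    widen :: Word8 -> Word32
    widen = fromIntegral

-- | The property the report expects to hold for every pixel.
roundTrips :: Word32 -> Bool
roundTrips px = packChannels (unpackChannels px) == px
```

Loaded in GHCi, `roundTrips` holds for every `Word32`, including the nine pixel values in the interpreter output above. A backend that returns different values after the same round trip is therefore not behaving like this reference.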