Fix for AMD OCL crashing on windows (Or I believe so) #5

Closed
kiljacken opened this Issue Aug 1, 2011 · 19 comments

@kiljacken

The OCL compiler is crashing on Windows, and I believe this can be fixed by adding the flag -fbin-amdil to the compiler options. But I'm not sure how that is done in C code, as I have only tried it using AMD's KernelAnalyzer.
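For reference, in C the OpenCL compiler options are a single space-separated string passed as the `options` (fourth) argument of `clBuildProgram`. Below is a minimal sketch of assembling such a string. The helper name `append_build_flag` is hypothetical and not taken from oclvanitygen, and whether the AMD runtime accepts `-fbin-amdil` at this point, or whether it changes the crash, is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: append one compiler flag to a space-separated
 * options string. Returns 0 on success, -1 if the buffer is too small.
 */
static int append_build_flag(char *opts, size_t cap, const char *flag)
{
    assert(opts != NULL && flag != NULL);

    size_t used = strlen(opts);
    /* One extra byte for the separating space when opts is non-empty. */
    size_t need = strlen(flag) + (used ? 1 : 0);

    /* The final +1 accounts for the terminating NUL. */
    if (used + need + 1 > cap)
        return -1;

    if (used)
        opts[used++] = ' ';
    memcpy(opts + used, flag, strlen(flag) + 1);
    return 0;
}

/*
 * The finished string would then be handed to the OpenCL runtime, e.g.:
 *
 *     clBuildProgram(program, 1, &device, opts, NULL, NULL);
 *
 * (program and device are assumed to have been created elsewhere.)
 */
```

Starting from an empty buffer, calling the helper once with `"-fbin-amdil"` yields an options string containing just that flag; further flags are appended with a single space between them.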

@samr7
Owner
samr7 commented Aug 1, 2011

Are you seeing oclvanitygen crash on Windows?

Indeed the -fbin-amdil flag makes it possible to get past the "Can't find the IL for Cypress" error in KernelAnalyzer. But after that, if I paste the oclvanitygen kernel, and try to enable the #pragma unroll directives, it crashes hard. So I'm a little bit skeptical that the -fbin-amdil flag will make a difference.

Along the same lines, oclvanitygen will always enable the loop unrolling directives, and will readily crash in the kernel compile step if told to use AMD's CPU OpenCL device. So I'm going to guess that the CPU OpenCL device has bugs related to loop unrolling. It doesn't seem to affect any of the GPU devices though, and without loop unrolling, the performance is terrible.

@kiljacken

Strange, as oclvanitygen crashes on my GPU. Windows 7 Ultimate 64-bit with a Radeon 4000 series card of some sort. Can I do anything to help resolve this issue?

@samr7
Owner
samr7 commented Aug 1, 2011

Ah, gotcha!

Can you paste the console output from oclvanitygen, when you get it to crash, with the -v flag added?

You're probably the first to try it on a Radeon HD 4000 card.

@kiljacken

Okay, output of oclvanitygen.exe -v -w 32 -d 0 1emiL

Prefix difficulty: 264104224 1emiL
Difficulty: 264104224
Device: ATI RV710
Vendor: Advanced Micro Devices, Inc.
Driver: CAL 1.4.1417
Profile: FULL_PROFILE
Version: OpenCL 1.0 AMD-APP-SDK-v2.4 (650.9)
Max compute units: 2
Max workgroup size: 128
Global memory: 268435456
Max allocation: 134217728
Compiling kernel...

Hope that will help :) If not feel free to tell me what else I can get ya

@kiljacken kiljacken closed this Aug 1, 2011
@kiljacken kiljacken reopened this Aug 1, 2011
@kiljacken

Derp, don't mind my misclick

@samr7
Owner
samr7 commented Aug 2, 2011

To try to diagnose this problem, the latest version 0.16 has a safe mode flag (-S) that disables some optimizations, including loop unrolling.
The binaries are posted: http://www.sendspace.com/file/ozixwu

@kiljacken

Okay, that works the first time I use it, but if I run it again without removing the .oclbin file, it crashes the next time I try to start it.

@samr7
Owner
samr7 commented Aug 2, 2011

Ah crap I thought that was fixed!
Does it crash with an access violation, or does it fail semi-gracefully with an OpenCL error, e.g.:

clBuildProgram: CL_BUILD_PROGRAM_FAILURE
Build log:
Internal error: Link failed.
Make sure the system setup is correct.

@kiljacken

It manages to tell me this:
Prefix difficulty: 4329577 1emiL
Difficulty: 4329577
Device: ATI RV710
Vendor: Advanced Micro Devices, Inc.
Driver: CAL 1.4.1417
Profile: FULL_PROFILE
Version: OpenCL 1.0 AMD-APP-SDK-v2.4 (650.9)
Max compute units: 2
Max workgroup size: 128
Global memory: 268435456
Max allocation: 134217728
OpenCL compiler flags:
Loading kernel binary 2ec1061a6fca3f4ee2f76048286f1c0c.oclbin
Build log:

Grid size: 1024x512
Modular inverse: 64 threads, 8192 ops each
Using GPU prefix matcher

And then Windows tells me it stopped working.

@kiljacken kiljacken closed this Aug 2, 2011
@kiljacken kiljacken reopened this Aug 2, 2011
@kiljacken

Derp, another misclick

@samr7
Owner
samr7 commented Aug 2, 2011

So far this has me completely stumped. I don't have a Radeon card to test on Windows. So I used the CPU driver and went through the oclbin save/restore code again, and was not able to find an issue with loading and saving binaries of various sizes.

I posted a Windows build with some extra code to output sizes and MD5 hashes of binaries when it loads and saves them here:
http://www.sendspace.com/file/2rk0y9

Make sure to delete your oclbins first, and run it with -S and -v as before. If it reports different hashes, take an md5sum of the oclbin file and we will find out which end is broken. Otherwise, we will need a better idea.
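The check described above, printing the size and a hash of the binary on both the save path and the load path, can be sketched roughly as follows. This is not the code from the posted build: the function names are hypothetical, and FNV-1a is used in place of MD5 purely to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* FNV-1a 64-bit hash, standing in for MD5 in this sketch. */
static uint64_t fnv1a64(const unsigned char *buf, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/* Write buf to path and report its size and hash. Returns 0 on success. */
static int save_binary(const char *path, const unsigned char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);
    FILE *f = fopen(path, "wb");          /* "b" matters on Windows */
    if (!f)
        return -1;
    size_t n = fwrite(buf, 1, len, f);
    if (fclose(f) != 0 || n != len)
        return -1;
    fprintf(stderr, "saved  %zu bytes, hash %016llx\n",
            len, (unsigned long long)fnv1a64(buf, len));
    return 0;
}

/* Read up to cap bytes from path. Returns the byte count, or -1 on error. */
static long load_binary(const char *path, unsigned char *buf, size_t cap)
{
    assert(path != NULL && buf != NULL);
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    size_t n = fread(buf, 1, cap, f);
    int err = ferror(f);
    fclose(f);
    if (err)
        return -1;
    fprintf(stderr, "loaded %zu bytes, hash %016llx\n",
            n, (unsigned long long)fnv1a64(buf, n));
    return (long)n;
}
```

If the two reported lines differ in size or hash, the file was altered between save and load. If they match, as turned out to be the case here, the stored file itself is intact and the fault lies elsewhere.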

@kiljacken

Nope, no different hashes. This is getting strange

@samr7
Owner
samr7 commented Aug 2, 2011

Wow, you are fast!
This is starting to look like some sort of silent memory corruption bug.
Are you able to run other OpenCL apps that save/restore binaries, like phoenix miner?

@kiljacken

So Phoenix miner does this too? Then I guess I know why it was crashing. I use DiabloMiner, but it doesn't save the bin so it works just fine

@samr7
Owner
samr7 commented Aug 2, 2011

Try running it with AMD's OpenCL CPU device, and see if you can reproduce the binary load/save problem.
To use the CPU device, aside from the -d1 or whatever, you'll probably want the following command line arguments:
-g256x1 -w1

If this problem is specific to the Catalyst driver running on your hardware, perhaps it will need to go on a list of platforms where binary load/save should be avoided.

@kiljacken

Strange, binary loading works just fine on CPU.

@samr7
Owner
samr7 commented Aug 2, 2011

Ok, I'll push a change to blacklist RV710 for stored program binaries.
Do you know if this problem exists for any other HD 4000 series devices?
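A device blacklist of this kind can be as simple as a substring match on the device name, which an application would normally obtain from `clGetDeviceInfo` with `CL_DEVICE_NAME` (reported as "ATI RV710" in the logs above). The sketch below is an illustration only, not the change that went into git; `binary_cache_blacklist` and `binary_cache_allowed` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list; RV710 is the only device reported as affected here. */
static const char *const binary_cache_blacklist[] = { "RV710" };

/* Returns 1 if stored program binaries may be used for this device, else 0. */
static int binary_cache_allowed(const char *device_name)
{
    assert(device_name != NULL);

    size_t n = sizeof(binary_cache_blacklist) /
               sizeof(binary_cache_blacklist[0]);
    for (size_t i = 0; i < n; i++) {
        /* Substring match, so vendor prefixes such as "ATI " are ignored. */
        if (strstr(device_name, binary_cache_blacklist[i]) != NULL)
            return 0;
    }
    return 1;
}
```

When the function returns 0, the caller would skip both loading and saving the .oclbin file and compile the kernel from source on every run.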

@kiljacken

No, I don't know if it exists for other devices, but I guess we'll find out if people report it

@samr7
Owner
samr7 commented Aug 2, 2011

The RV710 blacklist change is in git now, and I posted a build at:
http://www.sendspace.com/file/1u17w5

HD 4000 probably runs a lot slower in safe mode, and we're going to have to get rid of it one way or another. On GT200, safe mode reduces speed to about 1/2. On HD 5000, it's closer to 1/20th.

@colindean colindean added a commit to colindean/vanitygen that referenced this issue Jan 27, 2014
@colindean colindean Add Litecoin pattern 5fa4223
@kiljacken kiljacken closed this Feb 26, 2015