error: work group size exceeds the maximum default — Win10 64bit #7

Closed
Semnodime opened this issue Apr 29, 2018 · 5 comments

@Semnodime

After wrestling with g++ I finally got the executable built, but launching it throws an error.

.\gpuowl.exe -device 0

gpuOwL v2.0-dbc5a01-mod GPU Mersenne primality checker
Pitcairn-16x 860-@1:0.0 AMD Radeon HD 7800 Series
Note: using long carry and fused tail kernels
OpenCL compilation error -11 (args  -DEXP=2976221u  -I. -cl-fast-relaxed-math -cl-kernel-arg-info )
".\gpuowl.cl", line 34: warning: OpenCL extension is now part of core
  #pragma OPENCL EXTENSION cl_khr_fp64 : enable
                           ^

".\gpuowl.cl", line 454: error: work group size exceeds the maximum default
          value for the selected device
  KERNEL(512) fft4K(P(T2) io, Trig smallTrig) {
  ^

".\gpuowl.cl", line 619: error: work group size exceeds the maximum default
          value for the selected device
  KERNEL(512) square(P(T2) io, Trig bigTrig)  { csquare(512, 4096, 625, io, bigTrig); }
  ^

".\gpuowl.cl", line 621: error: work group size exceeds the maximum default
          value for the selected device
  KERNEL(512) multiply(P(T2) io, CP(T2) in, Trig bigTrig)  { cmul(512, 4096, 625, io, in, bigTrig); }
  ^

".\gpuowl.cl", line 663: error: work group size exceeds the maximum default
          value for the selected device
  KERNEL(512) autoConv(P(T2) io, Trig smallTrig, P(T2) bigTrig) {
  ^

4 errors detected in the compilation of "C:\Users\\AppData\Local\Temp\OCL2284T1.cl".
Frontend phase failed compilation.


Bye

It does seem to work on my Intel HD Graphics, though (I let it run for one minute while experimenting with -device), but obviously I want to run it on a proper graphics card.

Could you help me to get your program to hunt for a prime?

@preda
Owner

preda commented Jun 2, 2018

If you want to run it on an Intel HD GPU, it may be good to find a way to raise the group size limit above 256, in particular to 512. I expect there is some way to do that, e.g. by setting an environment variable. This depends on the OpenCL driver (Intel), which I'm not familiar with.

But I really do need the 512 group size. I do not plan to reduce it, because it's important for performance on AMD GPUs.

@preda
Owner

preda commented Jun 6, 2018

Right now gpuOwl requires a workgroup size (WG) of 512. Closing, as I don't think there's anything I can do here except look up Intel's documentation on how to enable that.

@preda closed this as completed Jun 6, 2018

@Semnodime
Author

Semnodime commented Jun 7, 2018

Sorry, but I didn't describe the issue clearly enough:

It does work on my integrated Intel GPU, but it does not work on my AMD Radeon HD 7850 graphics card. The error log is from the AMD card (as the first few lines of the log also state!)

@preda
Owner

preda commented Jun 7, 2018

Yes, sorry for misreading. IMO it's a driver issue. For some reason the driver does not accept a workgroup size of 512, which gpuOwl needs. I don't think there's much I can do to fix this in the program.

OTOH I'm fairly sure I once saw an environment variable that lifts this restriction (256) on some AMD drivers, but annoyingly I can't find it now by web search.

@preda
Owner

preda commented Jun 7, 2018

Found it: GPU_MAX_WORKGROUP_SIZE
https://community.amd.com/thread/166244

So try running after an export like:
export GPU_MAX_WORKGROUP_SIZE=512
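
Note that `export` is POSIX shell syntax. The issue was reported from a Windows prompt, where the equivalents are `set GPU_MAX_WORKGROUP_SIZE=512` in cmd.exe or `$env:GPU_MAX_WORKGROUP_SIZE = "512"` in PowerShell. For a POSIX-style shell (Linux, or a bash environment on Windows), here is a minimal sketch that sets the variable for a single command only rather than for the whole session. `run_with_wg` is a hypothetical helper name, not part of gpuOwl, and whether a given AMD driver actually honors the variable is an assumption based on the linked thread.

```shell
#!/bin/sh
# run_with_wg SIZE COMMAND [ARGS...]
# Hypothetical helper (not part of gpuOwl): runs COMMAND with
# GPU_MAX_WORKGROUP_SIZE set to SIZE in that command's environment only.
# The calling shell's environment is left untouched.
run_with_wg() {
    wg_size="$1"
    shift
    # `env` passes the variable to the child process without exporting it here.
    env GPU_MAX_WORKGROUP_SIZE="$wg_size" "$@"
}

# Example invocation (commented out so the snippet is safe to source):
# run_with_wg 512 ./gpuowl -device 0
```

If the compilation errors persist with the variable set, the driver in use probably does not read it, and the limit would have to be lifted some other way.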
