Error building kernelBayesian.cpp for GPU on macOS #48

Open
a-hurst opened this issue Apr 10, 2018 · 2 comments

a-hurst commented Apr 10, 2018

I'm testing out first-level analysis using BROCCOLI with some data I'd previously analyzed with FSL. When I run FirstLevelAnalysis using my CPU as the OpenCL device, everything builds perfectly fine and the analysis seems to run as intended. When I try running the same analysis with my GPU, however, I get the following error:

Source build error for kernelBayesian.cpp is CL_BUILD_PROGRAM_FAILURE 
One or several kernels were not created correctly, check buildInfo* !

Looking at the build log for that file for my GPU, it contains just this single line:

ptxas error   : Program using constant pointers passed as entry function parameter cannot use cvta.const

The only reference I could turn up to this error on Google was another OpenCL developer having the same issue on a Mac, but he never posted a solution. Here's my GetOpenCLInfo output:

Device info 
 
---------------------------------------------
Platform number: 0
---------------------------------------------
Platform vendor: Apple
Platform name: Apple
Platform extentions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event
Platform profile: FULL_PROFILE
---------------------------------------------

---------------------------------------------
Device number: 0
---------------------------------------------
Device vendor: Intel
Device name: Intel(R) Core(TM) i7-4771 CPU @ 3.50GHz
Hardware version: OpenCL 1.2 
Software version: 1.1
OpenCL C version: OpenCL C 1.2 
Device extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_APPLE_fp64_basic_ops cl_APPLE_fixed_alpha_channel_orders cl_APPLE_biased_fixed_point_image_formats cl_APPLE_command_queue_priority
Global memory size in MB: 8192
Size of largest memory object in MB: 2048
Global memory cache size in KB: 0
Local memory size in KB: 32
Constant memory size in KB: 64
Parallel compute units: 8
Clock frequency in MHz: 3500
Max number of threads per block: 1024
Max number of threads in each dimension: 1024 1 1

---------------------------------------------
Device number: 1
---------------------------------------------
Device vendor: NVIDIA
Device name: GeForce GTX 780M
Hardware version: OpenCL 1.2 
Software version: 10.30.25 355.11.10.10.30.120
OpenCL C version: OpenCL C 1.2 
Device extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_APPLE_fp64_basic_ops cl_khr_fp64 cl_khr_3d_image_writes cl_khr_depth_images cl_khr_gl_depth_images cl_khr_gl_msaa_sharing cl_khr_image2d_from_buffer cl_APPLE_ycbcr_422 cl_APPLE_rgb_422 
Global memory size in MB: 4096
Size of largest memory object in MB: 1024
Global memory cache size in KB: 0
Local memory size in KB: 48
Constant memory size in KB: 64
Parallel compute units: 8
Clock frequency in MHz: 648
Max number of threads per block: 1024
Max number of threads in each dimension: 1024 1024 64
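
When comparing the CPU and GPU entries in a listing like the one above, it can help to pull each device's fields into a structure that is easy to diff. The following is a minimal sketch in Python, assuming the GetOpenCLInfo output has been saved as text in the same "Key: value" layout shown here. The function name `parse_devices` is hypothetical and not part of BROCCOLI, and the sample text is an abbreviated copy of the listing above.

```python
def parse_devices(text):
    """Split GetOpenCLInfo-style output into one dict per device."""
    devices = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the dashed separator rows.
        if not line or set(line) == {"-"}:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            # Lines without a colon (e.g. "Device info") carry no field.
            continue
        key, value = key.strip(), value.strip()
        if key == "Device number":
            # A new device section starts here.
            current = {"Device number": int(value)}
            devices.append(current)
        elif current is not None:
            # Platform-level fields before the first device are ignored.
            current[key] = value
    return devices


# Abbreviated sample in the layout of the listing above.
sample = """\
---------------------------------------------
Platform number: 0
---------------------------------------------
Platform vendor: Apple
---------------------------------------------
Device number: 0
---------------------------------------------
Device vendor: Intel
Device name: Intel(R) Core(TM) i7-4771 CPU @ 3.50GHz
Local memory size in KB: 32
Max number of threads in each dimension: 1024 1 1

---------------------------------------------
Device number: 1
---------------------------------------------
Device vendor: NVIDIA
Device name: GeForce GTX 780M
Local memory size in KB: 48
Max number of threads in each dimension: 1024 1024 64
"""

devices = parse_devices(sample)
for dev in devices:
    print(dev["Device number"], dev["Device vendor"], dev["Device name"])
```

The device number printed in the first column is the index shown in the listing, so the same script can be used to confirm which entry is the CPU and which is the GPU before rerunning the analysis.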

I've tried installing the NVIDIA Web Drivers and CUDA drivers for my GPU to see if that would make a difference, but without any luck. I'm running macOS 10.13.4 (High Sierra). Any idea what's going on?

Thanks in advance!

wanderine (Owner) commented Apr 16, 2018 via email

a-hurst (Author) commented Apr 16, 2018

Hmm, that's as I suspected. Perhaps Apple's OpenCL implementation (or Nvidia's macOS implementation; it worked fine on my CPU, and I haven't tested an AMD card yet) is pickier about things that the Linux and Windows implementations are more forgiving of. Since I'd prefer not to dual-boot for something like this, I'll try creating an account on the Nvidia developer forums where the other developer reported the same issue and see if I get anywhere. I'll also ask around some OpenCL IRC channels to see if anyone has thoughts on this, and report back here once I have time.
