
[WiP] OpenCL support #582

Closed
wants to merge 82 commits into from

Conversation

@sklam (Member) commented Jul 2, 2014

OpenCL support

  • numba.ocl namespace
  • OpenCL driver bindings
  • code generation through SPIR 1.2
  • vectorize support

Only tested on Linux + AMD hardware

The lack of generic address space support makes high-level programming difficult. The current implementation assumes all arrays are in the global memory space. The lack of inttoptr/ptrtoint also makes arrays with a non-contiguous layout (NumPy order 'A') impossible. These limitations pose a barrier to further implementation.

We can provide this as experimental support. OpenCL 2.0 will have a generic address space and may make inttoptr/ptrtoint possible.

TODO:

  • clean up user interface
  • math functions
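
For context, here is a hypothetical usage sketch of the vectorize support described above. The target name 'ocl' and the dispatch behaviour are assumptions modelled on the existing CUDA target, not details confirmed from this branch.

```python
# Hypothetical sketch only: assumes this branch registers a vectorize
# target named 'ocl', analogous to target='cuda'.
import numpy as np
from numba import vectorize

@vectorize(['float32(float32, float32)'], target='ocl')
def add(a, b):
    return a + b

x = np.arange(16, dtype=np.float32)
y = np.ones_like(x)
print(add(x, y))  # would compile through SPIR 1.2 and run on the OpenCL device
```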

Oscar Villellas added 30 commits April 28, 2014 20:11
ctypes wrapping seems to be working on Mac OS X.
It is now possible to call a selected number of
functions through the wrapper.
Automatic error handling of error codes is working.
Platform extracts its devices. Devices export their
parameters.
It is now possible to build very simple OpenCL
programs with this driver.
A new sample that mimics Apple's "OpenCL hello world" example.
Added whatever was needed for the example to work :)
Better to use "array.nbytes" than to rely on len(array.data), since
array.data is a memoryview in Python 3 that returns the number of
elements instead of the byte length (as happens on 2.7, where data is
a buffer). A short illustration follows this commit list.
- "b" prefix for opencl program string
- tweaks to allow for a different image format in skimage.data.moon().
  Made more robust.
There is a framework now to handle the get info for the different
OpenCL objects. They are exported as (read-only) properties and they
don't cache results (caching support could be added if needed, on a
per-property basis).
Now it is possible to get devices by type using type-specific
accessors. No state is preserved, so queries will be repeated each
time the property is accessed. That shouldn't be an issue though,
and caching can be implemented on an as-needed basis.

Also adds error code enums
They will be returned as a Python list, querying the length of the
array in the same way as is done for string arguments.
Instead of dumping a numeric error code, it gets translated
to a string containing the CL_WHATEVER that describes the error.
That symbol is what appears in API descriptions.

A function, opencl_strerror(code), handles the conversion and is
available.
This makes creation of a call to a kernel in a program very compact.
This allows device memory copying.
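
As a small illustration of the "array.nbytes" commit above (plain NumPy, nothing OpenCL-specific):

```python
# Why array.nbytes is preferred over len(array.data): on Python 3,
# ndarray.data is a memoryview whose len() is the element count along
# the first axis, while on Python 2 it was a buffer whose len() was the
# byte length.
import numpy as np

a = np.zeros(4, dtype=np.float64)
print(len(a.data))  # 4 on Python 3 (elements); 32 on Python 2 (bytes)
print(a.nbytes)     # 32 on both: always the byte length
```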
@breznak commented Feb 16, 2015

This looks exciting, @sklam 👍. Any progress?

@seibert (Contributor) commented Feb 16, 2015

This work is blocked pending support of the SPIR 2.0 standard (which is not yet finalized) in OpenCL runtimes. We really need the support for generic pointer address spaces in order for this Numba target to integrate into the general compiler pipeline.

@seibert (Contributor) commented Jul 8, 2015

I think it provides a path in the future for OpenCL support in Numba, but it will require the OpenCL 2.1 standard to be approved and implemented. (Historically, it has taken a long time for OpenCL standards to be implemented broadly, even if we ignore NVIDIA, who is typically the slowest since they already have CUDA.)

We also require LLVM to add the necessary SPIR-V support, but it sounds like that is already actively being investigated by developers on the LLVM mailing list.

@seibert (Contributor) commented Sep 15, 2015

Can we pull the ocl-compiler branch from Siu's repo into Numba (to preserve this bit of history), but then close this PR? We won't be coming back to OpenCL until we see how SPIR-V is implemented in OpenCL 2.1.

@sklam (Member, Author) commented Nov 3, 2016

The work is preserved as https://github.com/numba/numba/tree/ocl-compiler

@sklam closed this Nov 3, 2016
@ARF1 (Contributor) commented Oct 21, 2019

@seibert I stumbled across this old issue. In various posts from 2015 you mentioned several blocking requirements. Do you have the time to write a short follow-up with the state of availability of these requirements now (4 years later)?

I imagine other Numba users searching for "numba" and "OpenCL" will land here as well and would benefit from a short update, though I understand if you do not have the time...

@seibert (Contributor) commented Oct 21, 2019

What has changed most with OpenCL is that the current standard has moved from SPIR (which is an LLVM-based IR) to SPIR-V (which is not LLVM at all).

Someone could write a Numba target for OpenCL, but would need to confirm the following:

  • Is there a reliable path to translate something like LLVM IR into SPIR-V? The Numba pipeline is pretty focused on emitting LLVM IR, so making it do something entirely different would require a significantly larger amount of work.

  • Does SPIR-V work on a wide enough range of hardware / software OpenCL implementations to be interesting for users? (Apple's dropped OpenCL, so macOS is out unless someone has a SPIR-V to Metal compiler.) Most important is probably NVIDIA and AMD GPUs + x86 multicore CPU on Linux and Windows. Note that we were frequently surprised by OpenCL bugs during past OpenCL development, so we're a bit cautious.

Basically, we're not actively looking at OpenCL support (with all the other things on our todo list), but if someone wants to work on this, we would try to figure out a clear target extension API so that such an extension could develop asynchronously from the core Numba repository.

@ARF1 (Contributor) commented Oct 21, 2019

@seibert Thank you very much for taking the time to explain this so clearly.

@stuartarchibald mentioned this pull request May 7, 2020
@thipokKub commented Dec 2, 2021

Hello, it has been 2 years since the last post, but I think there might now be a solution for converting LLVM IR to SPIR-V using KhronosGroup/SPIRV-LLVM-Translator. As for SPIR-V support, I think it might be viable now, as multiple vendors also support Vulkan (both NVIDIA and AMD). Moreover, for Apple there is the KhronosGroup/MoltenVK project, which translates Vulkan to Metal. So maybe it is now possible to target modern GPUs?

I could be wrong though, because I'm not sure about the technical details. If so, please correct me.

@gmarkall (Member) commented

> we would try to figure out a clear target extension API so that such an extension could develop asynchronously from the core Numba repository.

A quick 2022 update: the target extension API is pretty much complete and the CUDA target uses it, so it should be possible to add out-of-tree support for an OpenCL backend to Numba from 0.56 onwards. Unfortunately it's not really documented, but one could follow the pattern adopted by the CUDA target; the main classes whose implementations to follow would be CudaCodeLibrary, JITCUDACodegen, CUDABackend, CUDACompiler, CUDATarget, and CUDADispatcher.
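
To make that mapping a bit more tangible, here is a rough, purely hypothetical skeleton of how an out-of-tree OpenCL target could be laid out. The OpenCL* class names are invented for illustration, the bodies are stubs, and a real implementation would subclass the corresponding Numba base classes rather than plain object.

```python
# Illustrative skeleton only: each stub mirrors the role of the CUDA
# class named in the comment above; none of these names exist in Numba.

class OpenCLCodeLibrary:    # counterpart of CudaCodeLibrary
    pass

class JITOpenCLCodegen:     # counterpart of JITCUDACodegen
    pass

class OpenCLBackend:        # counterpart of CUDABackend
    pass

class OpenCLCompiler:       # counterpart of CUDACompiler
    pass

class OpenCLTarget:         # counterpart of CUDATarget
    pass

class OpenCLDispatcher:     # counterpart of CUDADispatcher
    pass
```

The point of the target extension API is that classes like these could live in a separate package and register themselves with Numba, rather than being maintained inside the core repository.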
