Texture memory? #46

Closed
ChrisRackauckas opened this issue Jun 27, 2019 · 12 comments · Fixed by #206
Labels
cuda kernels, enhancement

Comments

@ChrisRackauckas
Member

We would like to make use of the texture memory functionality for high-dimensional interpolations. @vchuravy said it would be easy 👍.

@cdsousa
Contributor

cdsousa commented Jun 27, 2019

Refs JuliaGPU/CUDAnative.jl#158

@vchuravy
Member

Tim and I discussed a better interface to texture memory, and I think what Chris really wants is access to the hardware interpolation of 3D texture memory.

@cdsousa
Contributor

cdsousa commented Jun 28, 2019

Yes, I think I understand: something like the tex2D and tex3D fetch operations, right? (JuliaGPU/CUDAnative.jl#158 (comment))

@cdsousa
Contributor

cdsousa commented Oct 2, 2019

I guess that to support this, there must be a way to allocate, free, refer to and initialize texture memory, right? Would that require support in CuArrays.jl or only in CUDAnative.jl?

@maleadt
Member

maleadt commented Oct 2, 2019

The host-side API calls (allocate, initialize, free) would end up in CUDAdrv, while referring to and using texture memory would probably be implemented in CUDAnative.jl. We would also need to think about an appropriate abstraction, e.g. some sort of device-side array type that supports interpolations, and about how to tie it into the existing CuArray/CuDeviceArray infrastructure.
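
To make that concrete, here is a very rough sketch of what such a split could look like (all names here are hypothetical; none of this exists in CUDAdrv.jl or CUDAnative.jl yet):

```julia
# Hypothetical sketch only: a host-side handle owning the driver-level objects,
# and a lightweight device-side counterpart used inside kernels.

# Host side (would live in CUDAdrv): owns the CUDA array allocation and the
# texture object created over it (cuArray3DCreate / cuTexObjectCreate).
mutable struct TextureHandle{T,N}
    array::Ptr{Cvoid}     # CUarray
    texobject::UInt64     # CUtexObject
    dims::NTuple{N,Int}
end

# Device side (would live in CUDAnative): only the 64-bit texture object handle
# is needed for fetches, so this is what a kernel argument would convert to.
struct DeviceTexture{T,N}
    handle::UInt64
    dims::NTuple{N,Int}
end

# The interesting part is indexing with floating-point coordinates, which would
# lower to a hardware-interpolated texture fetch (PTX/NVVM intrinsic) on device.
Base.getindex(t::DeviceTexture{T,2}, x::Real, y::Real) where {T} =
    error("device-only operation: lowered to a texture fetch inside kernels")
```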

@cdsousa
Contributor

cdsousa commented Oct 11, 2019

I've done some experiments and I have been able to create texture objects with CUDAdrv.jl and then use interpolated texture fetches with CUDAnative.jl. I have a repo with that, where the main file is
https://github.com/cdsousa/cuda_julia_experiments/blob/master/cudatextureobj_cudanative.jl

I've used CBindingGen.jl and CBinding.jl to interface to some CUDA Driver API structs (see https://github.com/cdsousa/cuda_julia_experiments/blob/master/cudadrvbindings.jl). Also, I've used LLVM NVVM calls (https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html#nvvm-intrin-texture-surface) since I don't know how to write assembly PTX code (JuliaGPU/CUDAnative.jl#158 (comment) and https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#texture-instructions).
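
For reference, the fetch part boils down to something like the following (a hedged sketch of the llvmcall approach; the exact IR and the [4 x float] return handling are my own assumptions, and it only makes sense when compiled for the NVPTX target through CUDAnative.jl):

```julia
# Sketch: fetching a float4 texel from a unified 2D texture object via the NVVM
# intrinsic. `texhandle` is the CUtexObject created host-side with cuTexObjectCreate.
# Device-only code; calling it on the CPU would fail at code generation.
@inline function tex2d_fetch(texhandle::Int64, x::Float32, y::Float32)
    Base.llvmcall((
        """
        declare {float, float, float, float} @llvm.nvvm.tex.unified.2d.v4f32.f32(i64, float, float)
        """,
        """
        %val = call {float, float, float, float} @llvm.nvvm.tex.unified.2d.v4f32.f32(i64 %0, float %1, float %2)
        %e0 = extractvalue {float, float, float, float} %val, 0
        %e1 = extractvalue {float, float, float, float} %val, 1
        %e2 = extractvalue {float, float, float, float} %val, 2
        %e3 = extractvalue {float, float, float, float} %val, 3
        %t0 = insertvalue [4 x float] undef, float %e0, 0
        %t1 = insertvalue [4 x float] %t0, float %e1, 1
        %t2 = insertvalue [4 x float] %t1, float %e2, 2
        %t3 = insertvalue [4 x float] %t2, float %e3, 3
        ret [4 x float] %t3
        """),
        NTuple{4,Float32}, Tuple{Int64,Float32,Float32},
        texhandle, x, y)
end
```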

@maleadt
Member

maleadt commented Oct 16, 2019

Sweet! I haven't taken a close look yet, but it's good to see that you can get there with plain llvmcalls.

Out of curiosity, why did you pick CBinding* over plain Clang.jl? I just yesterday made the CUDAdrv.jl wrappers autogenerated (using Clang.jl), so that should make it easier to do further experimentation: JuliaGPU/CUDAdrv.jl#157

@cdsousa
Contributor

cdsousa commented Oct 16, 2019

I picked CBinding just because I knew neither of them, and CBinding seemed simple and did the job.
But having the bindings in CUDAdrv.jl will be much better!

The LLVM NVVM texture-fetch intrinsics do not seem to be as complete as their PTX counterparts, e.g., there is only llvm.nvvm.tex.unified.2d.v4*, versus PTX tex.2d.v1*, tex.2d.v2*, and tex.2d.v4*.

I'm starting to develop a small package to handle CUDA textures. I'm planning a CuTextureMemory{T, N} type to wrap "CUDA arrays", a CuTextureSampler{T, N} type (sub-type of Sampler, https://github.com/JuliaGPU/GPUArrays.jl/blob/master/src/abstractarray.jl#L5) to wrap "CUDA texture objects" linked to CuTextureMemorys, and also a CuArraySampler{T, N} type (also a sub-type of Sampler) to wrap "CUDA texture objects" linked to CuArrays.

The reason for having both CuTextureSampler and CuArraySampler is that CUDA texture objects can wrap either "CUDA array memory" (with better 2D/3D spatial fetch performance) or normal GPU memory (see https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TEXOBJECT.html#group__CUDA__TEXOBJECT_1g1f6dd0f9cbf56db725b1f45aa0a7218a and https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TYPES.html#group__CUDA__TYPES_1g9f0a76c9f6be437e75c8310aea5280f6).
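
In rough code terms, the plan is something along these lines (a sketch only; field names and details are not settled, and I'm using a local stand-in for the GPUArrays Sampler supertype to keep it self-contained):

```julia
# Sketch of the planned type layout; nothing here is final.

# Local stand-in for GPUArrays.jl's abstract Sampler type mentioned above.
abstract type AbstractSampler{T,N} end

# Wraps an opaque "CUDA array" allocation (cuArray3DCreate), i.e. the memory
# layout optimized for 2D/3D spatially-local texture fetches.
mutable struct CuTextureMemory{T,N}
    handle::Ptr{Cvoid}     # CUarray
    dims::NTuple{N,Int}
end

# Texture object (CUtexObject) bound to a CuTextureMemory: best fetch performance.
mutable struct CuTextureSampler{T,N} <: AbstractSampler{T,N}
    texobject::UInt64
    memory::CuTextureMemory{T,N}
end

# Texture object bound to an ordinary CuArray (linear GPU memory): no copy into
# a CUDA array needed, but without the spatial-locality benefits.
mutable struct CuArraySampler{T,N,A} <: AbstractSampler{T,N}
    texobject::UInt64
    parent::A              # the wrapped CuArray
end
```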

Unfortunately, I'm not sure my little free time will allow me to even depart from this initial phase.

@cdsousa
Contributor

cdsousa commented Nov 3, 2019

Hi @ChrisRackauckas, @maleadt and @vchuravy, I'm pleased to announce that I have made a working prototype package for using CUDA textures from CUDAnative.jl:

https://github.com/cdsousa/CuTextures.jl

Just scroll down the README to see what it can already do 😉

I think I have already done many of the "hard" parts; what remains is to add some flexibility and to check, and possibly improve, the code's performance.
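
For a quick impression without clicking through, usage looks roughly like this (an approximate sketch; the actual names and kernel syntax are in the README and may still change):

```julia
# Approximate usage sketch; see the CuTextures.jl README for the real API.
using CuArrays, CUDAnative

img = CuArray(rand(Float32, 512, 512))   # source image in ordinary GPU memory
texmem = CuTextureArray(img)              # copy into a "CUDA array" (hypothetical name)
tex = CuTexture(texmem)                   # texture object with interpolated fetches (hypothetical name)

function warp_kernel(dst, tex)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    j = (blockIdx().y - 1) * blockDim().y + threadIdx().y
    if i <= size(dst, 1) && j <= size(dst, 2)
        # hardware-interpolated fetch at fractional coordinates
        dst[i, j] = tex[(i - 0.5f0) / size(dst, 1), (j - 0.5f0) / size(dst, 2)]
    end
    return nothing
end

dst = CuArray{Float32}(undef, 512, 512)
@cuda threads=(16, 16) blocks=(32, 32) warp_kernel(dst, tex)
```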

@maleadt
Member

maleadt commented Nov 4, 2019

That's awesome! I need to give it a closer look, but it looks great.
Do you want to keep it as a separate package, or is it OK with you to merge this functionality into CUDAnative once it's full-featured enough?

@cdsousa
Contributor

cdsousa commented Nov 4, 2019

Thanks, I would indeed prefer it merged.
BTW, I forgot to mention that I used a development version of CUDAdrv, which is recorded in the Manifest.

@cdsousa
Contributor

cdsousa commented Nov 4, 2019

Let me note that I'm completely OK with changing and improving names, APIs, internals...

maleadt transferred this issue from JuliaGPU/CUDAnative.jl on May 27, 2020
maleadt added the cuda kernels and enhancement labels on May 27, 2020