Add texture support from CuTextures.jl #206
Conversation
Codecov Report
@@ Coverage Diff @@
## master #206 +/- ##
==========================================
- Coverage 82.30% 81.97% -0.34%
==========================================
Files 143 145 +2
Lines 9315 9529 +214
==========================================
+ Hits 7667 7811 +144
- Misses 1648 1718 +70
Continue to review full report at Codecov.
Well, the part that I was least satisfied with was the … Also, the way that …
Didn't you use the call overload (…)?
Yeah, I used to have that design; it's perfectly fine.
That's possible too, we'd then just need to track that configuration on the host- and device-side objects, either by storing a flag or by splitting the types into two versions.
Did I start with the wrong code then? Your master branch still has that design: https://github.com/cdsousa/CuTextures.jl/blob/6838fff61ffe32a5bb8344ca1c6e6a3dca3fd16b/src/native.jl#L62-L70
Yeah, that's what I do now: a field CPU-side and a typevar in the device type. But I need to play with the code some more to get a feel for that API :-)
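A minimal sketch of that design, with made-up type names (not the actual PR types): the flag is a plain field on the host wrapper, and it is lifted into a type parameter when the device-side object is constructed, so device code can specialize on it.

```julia
# Sketch only; HostTex/DeviceTex are illustrative stand-ins.
struct HostTex{T,N}
    handle::UInt64
    dims::NTuple{N,Int}
    normalized_coordinates::Bool   # runtime flag, cheap to change on the host
end

struct DeviceTex{T,N,NC}           # NC::Bool baked into the device-side type
    dims::NTuple{N,Int}
    handle::UInt64
end

# Lift the runtime value into the type when handing the object to the device.
to_device(t::HostTex{T,N}) where {T,N} =
    DeviceTex{T,N,t.normalized_coordinates}(t.dims, t.handle)

# Device code can then branch on NC at compile time:
coord(::DeviceTex{T,N,true},  i, n) where {T,N} = (i - 0.5f0) / n   # normalized
coord(::DeviceTex{T,N,false}, i, n) where {T,N} = Float32(i)        # unnormalized
```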
Ahh, sorry for the confusion, I overlooked the …
About the automatic cast/reconstruct/…
I'm OK with that, we can always do that explicitly (however, I think we cannot reinterpret a CuDeviceArray inside a kernel, can we?). On the other hand, we will still have an automatic mapping between NTuple{4, Float32} and float4, right? In that case, there must be a rationale for choosing NTuple{4, Float32} rather than something else (e.g., SIMD.Vec{4,Float32}).
No; I could add that, but it seems simpler to reinterpret back when moving data off the GPU?
I'm not entirely sure what you mean; currently, using sources of type NTuple{4, T} will construct a 4-channel texture, and accessing that device-side just gives you that NTuple. (Thinking of it, maybe it would be convenient to allow passing the channel to getindex to get a plain value?)
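For what it's worth, a rough sketch of the "reinterpret when moving data off the GPU" route (array names are made up): the 4-channel elements stay NTuple{4,Float32} on the device, and the flat Float32 view is only taken host-side.

```julia
using CUDA

# 16 four-channel texels, stored as NTuple{4,Float32} on host and device.
texels   = [ntuple(i -> Float32(i), 4) for _ in 1:16]
d_texels = CuArray(texels)

# ... d_texels would serve as the source of a 4-channel texture in a kernel ...

# Move the data off the GPU first, then reinterpret on the host; no device-side
# reinterpret of a CuDeviceArray is needed.
back = Array(d_texels)             # Vector{NTuple{4,Float32}}
flat = reinterpret(Float32, back)  # 64-element Float32 view of the same bytes
```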
Not a big deal, I was just not sure why an NTuple{4, T} is the Julia-side version of a CUDA float4 (despite the fact that I actually used that equivalence in my original code). I guess that is the usual and right choice.
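For reference, the equivalence only relies on the two having the same memory layout; a quick plain-Julia check (nothing here is specific to this PR):

```julia
# NTuple{4,Float32} is an isbits type of four packed 32-bit floats, i.e. the same
# 16-byte x/y/z/w layout as CUDA C's float4.
T = NTuple{4,Float32}
@assert isbitstype(T)
@assert sizeof(T) == 16
@assert Tuple(fieldoffset.(T, 1:4)) == (0, 4, 8, 12)
```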
src/texture.jl
Outdated

```julia
Base.size(tm::CuTexture) = size(tm.mem)

Adapt.adapt_storage(::Adaptor, t::CuTexture{T,N}) where {T,N} =
    CuDeviceTexture{T,N,t.normalized_coordinates}(size(t.mem), t.handle)
```
Can this introduce any kind of type instability?
I didn't use a solution like this in the first place for fear of type instability, but now I guess that maybe the kernel launch works as a function barrier, is that right?
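A plain-Julia analogue of that function-barrier argument (DevTex, use, and launch are made-up names, no CUDA involved): constructing a type from a runtime Bool is type-unstable at the call site, but the callee, which plays the role of the kernel here, is compiled for a concrete argument type.

```julia
struct DevTex{NC} end              # stand-in for a device-side type with a Bool parameter

use(::DevTex{NC}) where {NC} = NC ? 1f0 : 0f0   # "kernel": specializes on NC at compile time

function launch(normalized::Bool)
    t = DevTex{normalized}()       # type-unstable: the type depends on a runtime value...
    return use(t)                  # ...but `use` (like a kernel launch) acts as the barrier
end

launch(true)    # 1.0f0
launch(false)   # 0.0f0
```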
Fixes #46
@cdsousa: I'm still getting to grips with your design and the full extent of the relevant CUDA APIs, so I expect to do some work here before merging. Feel free to comment though! I've already changed some things to be closer to the CUDA C defaults.