Cubic texture interpolation #460

Merged
merged 8 commits into from
Oct 1, 2020
Conversation

maleadt
Member

@maleadt maleadt commented Sep 30, 2020

No description provided.

@codecov

codecov bot commented Sep 30, 2020

Codecov Report

Merging #460 into master will increase coverage by 0.04%.
The diff coverage is 78.94%.

@@            Coverage Diff             @@
##           master     #460      +/-   ##
==========================================
+ Coverage   80.49%   80.54%   +0.04%     
==========================================
  Files         166      166              
  Lines        8835     8852      +17     
==========================================
+ Hits         7112     7130      +18     
+ Misses       1723     1722       -1     
Impacted Files          Coverage Δ
test/texture.jl         91.66% <50.00%> (-4.63%) ⬇️
src/texture.jl          86.84% <92.30%> (+2.22%) ⬆️
lib/cublas/CUBLAS.jl    80.30% <0.00%> (+3.03%) ⬆️
lib/curand/random.jl    92.30% <0.00%> (+3.07%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@maleadt
Member Author

maleadt commented Sep 30, 2020

@cdsousa I found the reason for the broken texture test: tex1D only supports texture arrays, not linear memory. See here: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#texture-object-api-appendix. While tex2D supports both linear and array sources, you need to use tex1Dfetch for texture objects bound to linear memory. However, that's pretty useless: you can only fetch, no filters, clamping, etc. So I removed that support (you can easily do the same with ldg anyhow), and made it a compilation failure.

After fixing that I introduced a bunch of new broken tests though, as I can't seem to validate the result of my cubic interpolation against Interpolations.jl. Even though everything looks OK:

Nearest: (image: nearest)

Linear: (image: linear)

Cubic: (image: cubic)

@maleadt maleadt added the "cuda array" and "enhancement" labels Sep 30, 2020
@cdsousa
Contributor

cdsousa commented Sep 30, 2020

Um OK, there was no problem at all.

Regarding the cubic interpolation differences, could it be that the low-resolution weights (9 bits only, https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#linear-filtering) have enough impact to make the cubic interpolation comparison fail, while still leaving the linear one within tolerance?
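
A rough, CPU-only sketch of the effect being asked about. The linked section of the CUDA programming guide describes the linear-filtering weight as a 9-bit fixed-point value with 8 fractional bits; the helper names `quantize` and `lerp` and the round-to-nearest rule are assumptions made for illustration, not CUDA.jl or hardware API.

```julia
# Quantize a fractional weight to 8 fractional bits.
# Assumption: round to nearest; the hardware rounding mode is not stated here.
quantize(α; bits=8) = round(α * 2^bits) / 2^bits

# Linear interpolation between two samples with weight α in [0, 1].
lerp(a, b, α) = (1 - α) * a + α * b

a, b = 4f0, 8f0
for α in (0.25, 1/3, 0.7)
    exact = lerp(a, b, α)
    approx = lerp(a, b, quantize(α))
    println("α = ", α, ": exact = ", exact, ", quantized = ", approx,
            ", abs. error = ", abs(exact - approx))
end
```

Under that assumption, weights that are multiples of 1/256 (such as the quarter steps in 1:0.25:10) quantize exactly, while other weights are off by at most 1/512, which bounds the error of one linear fetch by |b - a| / 512. A cubic filter assembled from several linear fetches at non-representable offsets would pick up such errors, which may explain small deviations but not large ones.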

@maleadt
Member Author

maleadt commented Sep 30, 2020

@cdsousa
Contributor

cdsousa commented Sep 30, 2020

https://github.com/cdsousa/CuTextures.jl/blob/08403490794f808086a6a6c8d093b5094f35ab42/test/runtests.jl#L112-L114

Oops, I wanted to say: "Um OK, there was no problem after all." Unlike what I had thought 😄

@maleadt
Member Author

maleadt commented Sep 30, 2020

That makes more sense :-)

Re cubic, I don't think it's accuracy related, but I haven't dug too deeply.

Set-up:

julia> using CUDA, Interpolations

# source values
julia> src = Float32[2^i for i in 1:10]
10-element Array{Float32,1}:
    2.0
    4.0
    8.0
   16.0
   32.0
   64.0
  128.0
  256.0
  512.0
 1024.0

# indices we'll interpolate
julia> idx = collect(1:0.25:10)
37-element Array{Float64,1}:
  1.0
  1.25
  1.5
  1.75
  2.0
  2.25
  2.5
  2.75
  3.0
  3.25
  3.5
  3.75
  4.0
  4.25
  4.5
  4.75
  5.0
  5.25
  5.5
  5.75
  6.0
  6.25
  6.5
  6.75
  7.0
  7.25
  7.5
  7.75
  8.0
  8.25
  8.5
  8.75
  9.0
  9.25
  9.5
  9.75
 10.0

Interpolations.jl:

julia> int = interpolate(src, BSpline(Cubic(Line(OnGrid()))));

julia> dst = similar(src, size(idx));

julia> dst .= int.(idx)
37-element Array{Float32,1}:
    2.0
    2.4186783
    2.8698852
    3.3861496
    3.9999998
    4.7426863
    5.6403437
    6.717829
    7.9999995
    9.516826
   11.318738
   13.461281
   16.0
   19.002508
   22.584702
   26.874544
   31.999998
   38.098137
   45.342453
   53.91554
   64.0
   75.85494
   90.045494
  107.2133
  128.00002
  152.98212
  182.4756
  216.73128
  255.99998
  301.21658
  356.0521
  424.86154
  511.99997
  620.1514
  745.3159
  881.82245
 1023.99994

So at integral coordinates, e.g. idx[5] = 2.0, dst[5] = 3.9999998f0 which is very close to src[2] = 4f0.

Now the GPU (ignore values at the boundary because the GPU clamps here, and I don't know what Interpolations.jl does):

julia> gpu_idx = CuArray(idx);

julia> gpu_dst = CuArray{Float32}(undef, size(idx));

julia> gpu_src = CuArray(src);

julia> gpu_tex = CuTexture(CuTextureArray(gpu_src); interpolation=CUDA.CubicInterpolation());

julia> broadcast!(gpu_dst, gpu_idx, Ref(gpu_tex)) do idx, tex
           tex[idx]
       end
37-element CuArray{Float32,1}:
   2.3333335
   2.6453452
   3.0859375
   3.6417644
   4.3346357
   5.1520996
   6.1289062
   7.2785645
   8.669271
  10.304199
  12.2578125
  14.557129
  17.338543
  20.608398
  24.515625
  29.114258
  34.677086
  41.216797
  49.03125
  58.228516
  69.35417
  82.43359
  98.0625
 116.45703
 138.70834
 164.86719
 196.125
 232.91406
 277.4167
 329.73438
 392.25
 465.82812
 554.8334
 656.9271
 762.5
 860.69794
 939.00006

Here gpu_dst[5] is 4.3346357f0, whereas src[2] is 4f0. Maybe I'm misunderstanding something though.
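
One way to probe the mismatch on the CPU is to evaluate a uniform cubic B-spline directly on the raw samples, without any prefiltering, and with sample indices clamped at the boundary. This is only a sketch under the assumption that the GPU path behaves like that; `bspline_weights` and `naive_cubic` are hypothetical helper names, not part of CUDA.jl or Interpolations.jl.

```julia
# Uniform cubic B-spline basis weights for a fractional offset t in [0, 1).
function bspline_weights(t)
    w0 = (1 - t)^3 / 6
    w1 = (3t^3 - 6t^2 + 4) / 6
    w2 = (-3t^3 + 3t^2 + 3t + 1) / 6
    w3 = t^3 / 6
    return (w0, w1, w2, w3)
end

# Evaluate the unprefiltered spline at a 1-based coordinate x,
# clamping out-of-range sample indices to the array bounds (assumption).
function naive_cubic(src, x)
    i = floor(Int, x)
    t = x - i
    n = length(src)
    sample(k) = src[clamp(k, 1, n)]
    w = bspline_weights(t)
    return sum(w[j] * sample(i - 2 + j) for j in 1:4)
end

src = Float32[2^i for i in 1:10]
idx = collect(1:0.25:10)
cpu_dst = Float32[naive_cubic(src, x) for x in idx]
```

At integer coordinates the weights reduce to 1/6, 4/6, 1/6, so `cpu_dst[5]` is (2 + 4*4 + 8) / 6 = 26/6 ≈ 4.3333, which is within about 0.1% of the GPU value 4.3346357 and not equal to `src[2]`. That would be consistent with the texture path evaluating an unprefiltered B-spline rather than an interpolating one.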

@maleadt maleadt merged commit 6f214bf into master Oct 1, 2020
@maleadt maleadt deleted the tb/texture branch October 1, 2020 09:11
@stillyslalom
Contributor

stillyslalom commented Jan 18, 2021

Re: accuracy, naive cubic splines can't guarantee that the interpolant will pass through the control points - a prefiltering step is required (http://www.dannyruijters.nl/docs/cudaPrefilter3.pdf). Interpolations.jl performs the prefiltering step, but I don't see it anywhere here.
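
A minimal CPU sketch of what such a prefiltering step could look like. It uses a direct tridiagonal solve rather than the recursive filter described in the linked paper, and it assumes clamped boundaries, which is not the `Line()` boundary condition used in the Interpolations.jl call above. The function name `prefilter` is hypothetical.

```julia
using LinearAlgebra

# Solve for coefficients c such that the unprefiltered B-spline built on c
# passes through the samples f: (c[i-1] + 4c[i] + c[i+1]) / 6 == f[i].
# Assumption: clamped boundaries, i.e. c[0] = c[1] and c[n+1] = c[n].
function prefilter(f::AbstractVector{T}) where {T<:AbstractFloat}
    n = length(f)
    d = fill(T(4) / 6, n)
    d[1] = d[end] = T(5) / 6      # the clamped neighbor folds into the diagonal
    off = fill(T(1) / 6, n - 1)
    return Tridiagonal(off, d, copy(off)) \ f
end

src = Float32[2^i for i in 1:10]
coefs = prefilter(src)

# Spline values at the integer nodes, using the node weights 1/6, 4/6, 1/6
# on the clamped (padded) coefficient array.
padded = [coefs[1]; coefs; coefs[end]]
at_nodes = [(padded[i] + 4padded[i + 1] + padded[i + 2]) / 6 for i in 1:length(src)]
```

With these coefficients, `at_nodes` reproduces `src` up to Float32 rounding. Presumably the texture would then be built from `coefs` instead of `src`, so that the hardware-filtered B-spline passes through the original samples; that step is not shown here.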
