
Introduce cuFFT plan cache; switch to auto-managed memory. #1734

Merged — 1 commit merged into master on Jan 20, 2023

Conversation

maleadt (Member) commented on Jan 19, 2023

This PR introduces a cache for cuFFT handles, much like CuPy recently did.
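The idea can be sketched as a small LRU cache keyed by the plan parameters (shape, element type, transform direction): when the same transform is requested again, the existing handle is reused instead of paying the cost of plan creation, and the least recently used handle is destroyed once the cache overflows. This is a minimal illustration in Python, not the actual CUDA.jl implementation; `PlanCache`, `create`, and `destroy` are hypothetical names standing in for the cuFFT plan-creation and plan-destruction calls.

```python
from collections import OrderedDict

class PlanCache:
    """LRU cache for expensive-to-create handles (e.g. cuFFT plans).

    `create(key)` builds a new handle for a plan description,
    `destroy(handle)` releases one when it is evicted.
    """

    def __init__(self, create, destroy, maxsize=32):
        self._cache = OrderedDict()  # key -> handle, in LRU order
        self._create = create
        self._destroy = destroy
        self._maxsize = maxsize

    def get(self, key):
        # key describes the transform, e.g. (shape, dtype, direction)
        if key in self._cache:
            self._cache.move_to_end(key)  # mark as most recently used
            return self._cache[key]
        handle = self._create(key)        # cache miss: build a new plan
        self._cache[key] = handle
        if len(self._cache) > self._maxsize:
            # evict and release the least recently used handle
            _, old = self._cache.popitem(last=False)
            self._destroy(old)
        return handle
```

With this structure, repeated `fft` calls on same-shaped arrays hit the cache and skip plan creation entirely, which is what eliminates the millisecond-level overhead visible in the "Before" benchmark below.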

Before:

julia> @benchmark fft($A)
BenchmarkTools.Trial: 1166 samples with 1 evaluation.
 Range (min … max):  2.479 ms … 281.481 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     3.999 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.274 ms ±   8.145 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

                    ▃ ▁  ▁ ▂▄█▇▃▃▅▂▄▃▃ ▁ ▁▂▃▂▅ ▁  ▁
  ▃▁▁▁▂▁▂▃▁▃▄▃▆▇▇████▇██▆███████████████▇███████▇██▇▇▇▄▅▄▃▃▃▃ ▅
  2.48 ms         Histogram: frequency by time        5.26 ms <

 Memory estimate: 2.81 KiB, allocs estimate: 64.

After:

julia> @benchmark fft($A)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   84.129 μs … 805.442 μs  ┊ GC (min … max):  0.00% … 91.37%
 Time  (median):     275.457 μs               ┊ GC (median):    87.83%
 Time  (mean ± σ):   276.244 μs ±   9.010 μs  ┊ GC (mean ± σ):  87.78% ±  0.98%

                                                           █▂
  ▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂██▄▂ ▂
  84.1 μs          Histogram: frequency by time          286 μs <

Reference:

❯ python wip.py
fft_func            :    CPU:   94.263 us   +/-16.328 (min:   68.750 / max:  421.615) us     GPU-0:  110.159 us   +/-31.509 (min:   42.400 / max: 2295.744) us

Closes #1682.

@maleadt added the cuda libraries (Stuff about CUDA library wrappers.) and performance (How fast can we go?) labels on Jan 19, 2023
codecov bot commented on Jan 19, 2023

Codecov Report

Base: 59.69% // Head: 59.71% // This PR increases project coverage by +0.02% 🎉

Coverage data is based on head (d90a32c) compared to base (c35bc08).
Patch coverage: 93.52% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1734      +/-   ##
==========================================
+ Coverage   59.69%   59.71%   +0.02%     
==========================================
  Files         150      150              
  Lines       12067    12077      +10     
==========================================
+ Hits         7203     7212       +9     
- Misses       4864     4865       +1     
Impacted Files           Coverage Δ
lib/utils/cache.jl       94.28% <90.90%> (-1.55%) ⬇️
lib/cufft/fft.jl         83.92% <93.54%> (-3.67%) ⬇️
lib/cufft/wrappers.jl    94.11% <93.81%> (-5.89%) ⬇️



@maleadt maleadt merged commit 72b0796 into master Jan 20, 2023
@maleadt maleadt deleted the tb/cufft_handle_cache branch January 20, 2023 12:57
simonbyrne pushed a commit to simonbyrne/CUDA.jl that referenced this pull request Nov 13, 2023
Successfully merging this pull request may close these issues.

CUDA.jl cuFFT underperforming against CuPy cuFFT