
Add a cuFFT plan cache #3730

Merged · 50 commits · Sep 18, 2020
Changes from 42 commits

Commits (50)
7eb8985
[WIP] checking in...
leofang Aug 5, 2020
3cdd1f2
[WIP] a very simple LRU
leofang Aug 5, 2020
e91cc6e
[WIP] separate linked list from the cache object; improve
leofang Aug 5, 2020
4224ad2
refactored a bit, seems complete
leofang Aug 5, 2020
ea26609
add thread-local storage and helpers from scipy/scipy#12512
leofang Aug 5, 2020
4863ce6
relax size/memsize constraints; add is_enabled flag (WIP)
leofang Aug 6, 2020
d7e65f1
add TODOs
leofang Aug 6, 2020
62f5699
hook up the cache with fft modules; all tests passed
leofang Aug 6, 2020
d7d8cbd
bug fix: forgot to cache the plans...
leofang Aug 6, 2020
011222c
fixes for 1. empty plan, 2. multi-gpu plan
leofang Aug 6, 2020
2ad410b
[WIP] per-device cache
leofang Aug 7, 2020
737d0ab
per-thread, per-device cache!
leofang Aug 7, 2020
583826d
expose scipy apis to cp.fft.config; fix flake8
leofang Aug 7, 2020
561085e
couple get_fft_plan to cache
leofang Aug 7, 2020
3838bc5
fix flake8
leofang Aug 7, 2020
a613866
support caching multi-gpu plans
leofang Aug 7, 2020
7ba50ae
fix flake8
leofang Aug 7, 2020
24c7c64
order class methods (def -> cdef -> cpdef)
leofang Aug 7, 2020
2d157fc
traverse the list from the end
leofang Aug 7, 2020
e123b51
fix ordering
leofang Aug 7, 2020
49fcccd
refactor to simplify PlanCache usage, see below
leofang Aug 9, 2020
edc958f
add show_plan_cache_info(); nicer formatting
leofang Aug 9, 2020
d1eb466
add simple tests
leofang Aug 9, 2020
d8f09ef
[WIP] add cache tests
leofang Aug 9, 2020
763e2b2
improve tests
leofang Aug 9, 2020
21d8440
fix multi-gpu removal bug; more tests
leofang Aug 10, 2020
faad22a
fix flake8
leofang Aug 10, 2020
dac7cb0
more tests
leofang Aug 10, 2020
2b40d6f
test bug fixed
leofang Aug 10, 2020
0f16c04
Merge branch 'master' into cufft_cache
leofang Aug 10, 2020
4bb33b7
ensure multi-gpu memsize is properly set
leofang Aug 10, 2020
79e061b
minor clean-up
leofang Aug 10, 2020
4f91e6e
decouple get_fft_plan from cache
leofang Aug 10, 2020
40565d4
add docstring to PlanCache
leofang Aug 10, 2020
8fa3a9a
add a few Sphinx links
leofang Aug 10, 2020
f482f06
update docstring
leofang Aug 10, 2020
06dcb25
one more test
leofang Aug 10, 2020
2d3a94b
try documenting PlanCache?
leofang Aug 10, 2020
495a116
nitpick: ensure _get_plan_memsize() always gets the device (not neces…
leofang Aug 11, 2020
90c5081
Apply suggestions from code review
leofang Aug 11, 2020
4a55a84
add memsize test
leofang Aug 11, 2020
c4c8549
work around cuda 11 bug
leofang Aug 18, 2020
8ccafd6
Merge branch 'master' into cufft_cache
leofang Aug 28, 2020
e70bd05
Merge branch 'master' into cufft_cache
leofang Sep 9, 2020
b8218d6
make the cache module private
leofang Sep 9, 2020
590647b
fix _util
leofang Sep 9, 2020
13b2686
fix docs
leofang Sep 9, 2020
bfd1de7
record hits/misses
leofang Sep 14, 2020
8d0705c
add cleanup routine to break circular ref
leofang Sep 17, 2020
3784845
use weakref
leofang Sep 17, 2020
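The commit history above outlines the design: a per-thread, per-device LRU cache for cuFFT plans, with hit/miss counters added near the end. A minimal Python sketch of that idea (the names `PlanCache` and `get_plan_cache` here are illustrative only, not this PR's actual Cython implementation):

```python
import threading
from collections import OrderedDict


class PlanCache:
    """A bounded LRU cache keyed by plan parameters (sketch only)."""

    def __init__(self, size=16):
        self.size = size
        self._cache = OrderedDict()  # insertion order doubles as LRU order
        self.hits = 0
        self.misses = 0

    def get(self, key):
        try:
            plan = self._cache.pop(key)
        except KeyError:
            self.misses += 1
            return None
        # Re-insert so the reused plan moves to the most-recently-used end.
        self._cache[key] = plan
        self.hits += 1
        return plan

    def put(self, key, plan):
        self._cache.pop(key, None)
        self._cache[key] = plan
        while len(self._cache) > self.size:
            self._cache.popitem(last=False)  # evict least-recently-used


# One cache per (thread, device) pair, as the commit messages describe.
_thread_local = threading.local()


def get_plan_cache(device_id):
    caches = getattr(_thread_local, 'caches', None)
    if caches is None:
        caches = _thread_local.caches = {}
    if device_id not in caches:
        caches[device_id] = PlanCache()
    return caches[device_id]
```

The real implementation additionally tracks plan memory sizes and uses weakrefs to break circular references, per the later commits.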
17 changes: 15 additions & 2 deletions cupy/cuda/cufft.pyx
@@ -278,6 +278,17 @@ class Plan1d(object):
             else:
                 self._multi_gpu_get_plan(
                     plan, nx, fft_type, batch, devices, out)
+        else:
+            if use_multi_gpus:
+                # multi-GPU FFT cannot transform 0-size arrays, and attempting
+                # to create such a plan will error out, but we still need this
+                # for bookkeeping
+                if isinstance(devices, (tuple, list)):
+                    self.gpus = list(devices)
+                elif isinstance(devices, int) and devices > 0:
+                    self.gpus = [i for i in range(devices)]
+                else:
+                    raise ValueError

         self.nx = nx
         self.fft_type = fft_type
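The device-list bookkeeping added in this hunk can be sketched as a standalone Python helper (the name `normalize_devices` is made up here for illustration):

```python
def normalize_devices(devices):
    """Normalize ``devices`` into an explicit list of device ids.

    Accepts a tuple/list of ids, or a positive int n meaning the
    first n devices, mirroring the check in the hunk above.
    """
    if isinstance(devices, (tuple, list)):
        return list(devices)
    elif isinstance(devices, int) and devices > 0:
        return list(range(devices))
    else:
        raise ValueError('"devices" should be a positive int or an '
                         'iterable of int.')
```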
@@ -327,7 +338,7 @@ class Plan1d(object):
         if fft_type != CUFFT_C2C and fft_type != CUFFT_Z2Z:
             raise ValueError('Currently for multiple GPUs only C2C and Z2Z are'
                              ' supported.')
-        if isinstance(devices, list):
+        if isinstance(devices, (tuple, list)):
Review comment (Contributor):
This could be collections.abc.Iterable to match the ValueError message below, but I think it is probably best to keep it as-is so we don't allow iterables like cupy.ndarray.

             nGPUs = len(devices)
             for i in range(nGPUs):
                 gpus.push_back(devices[i])
@@ -336,7 +347,8 @@ class Plan1d(object):
             for i in range(nGPUs):
                 gpus.push_back(i)
         else:
-            raise ValueError('\"devices\" should be an int or a list of int.')
+            raise ValueError('\"devices\" should be an int or an iterable '
+                             'of int.')
         if batch == 1:
             if (nx & (nx - 1)) != 0:
                 raise ValueError('For multi-GPU FFT with batch = 1, the array '
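The review comment's point — that checking against `collections.abc.Iterable` would also admit objects such as `cupy.ndarray` or one-shot generators — can be seen with a plain generator:

```python
from collections.abc import Iterable

devices = (i for i in range(2))   # a generator: iterable, but not a tuple/list
print(isinstance(devices, Iterable))       # True
print(isinstance(devices, (tuple, list)))  # False

# A generator has no len() and is consumed on first iteration, so the
# stricter (tuple, list) check avoids accepting such objects by accident.
```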
@@ -724,6 +736,7 @@ class PlanNd(object):
         result = cufftSetAutoAllocation(plan, 0)
         check_result(result)
         self.plan = plan
+        self.gpus = None  # TODO(leofang): support multi-GPU PlanNd

         if batch == 0:
             work_size = 0