
Cache numba.cuda functions on repeated deserialization #4590

Open
mrocklin opened this issue Sep 19, 2019 · 2 comments
Labels: CUDA, feature_request, performance - run time

Comments

mrocklin commented Sep 19, 2019

As of #3026 , Numba kindly returns the same function when an equivalent bytestring is deserialized many times. This is great for systems like Dask, which may send around the same numba function many times.

Currently, it looks like this isn't being done for numba.cuda functions, which ends up being a bottleneck in Dask + Numba GPU workloads.
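
For reference, a minimal sketch of the CPU-side behavior #3026 added (assuming a post-#3026 Numba install; the assert shows the memoization being requested here for CUDA):

```python
import pickle

from numba import njit


@njit
def add(x, y):
    return x + y


add(1, 2)  # trigger compilation

payload = pickle.dumps(add)
# Repeated deserialization of an equivalent bytestring returns the
# same live dispatcher object instead of rebuilding/recompiling it.
assert pickle.loads(payload) is pickle.loads(payload)
```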

cc @seibert

seibert commented Sep 19, 2019

Notes to myself: looking more closely, CUDA functions are serialized differently from the CPU Dispatcher objects. The function cache needs to hang off of numba.cuda.compiler.CUDAKernel and related classes. (Grep for __reduce__ to find them all.)
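
To illustrate the gap (all names below are hypothetical stand-ins, not Numba's actual internals): a __reduce__ whose rebuild function never consults a cache hands back a distinct object on every loads() call, which is where the repeated-deserialization cost comes from:

```python
import pickle


class FakeCUDAKernel:
    """Hypothetical stand-in for a CUDA-side class; the real classes
    carry much more state, but the serialization hook is the same."""

    def __init__(self, name, argtypes):
        self.name = name
        self.argtypes = argtypes

    def __reduce__(self):
        # No memo is consulted on rebuild, so every loads() constructs
        # (and, for a real kernel, recompiles) a fresh object.
        return _rebuild, (self.name, self.argtypes)


def _rebuild(name, argtypes):
    return FakeCUDAKernel(name, argtypes)


k = FakeCUDAKernel("axpy", ("float32[:]", "float32[:]"))
payload = pickle.dumps(k)
assert pickle.loads(payload) is not pickle.loads(payload)  # distinct objects
```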

seibert commented Oct 8, 2019

@sklam: There are 5 different kinds of CUDA objects which can be serialized (all in numba.cuda.compiler):

  • DeviceFunctionTemplate
  • DeviceFunction
  • CUDAKernelBase
  • CachedCUFunction
  • CUDAKernel

Given the range here, I'm wondering if I need to figure out some kind of metaclass that provides the common behavior of:

  • generating a new UUID when the object is initialized;
  • creating a class-level _memo (a weak dict mapping UUID -> object) and _recent (a fixed-size deque that keeps recently deserialized objects alive in _memo);
  • providing a generic implementation of __reduce__ and rebuild that prefers the cached version of the function.

Given the above, I think we'll need to use this metaclass (or mixin; not sure how meta we need to get) with seven different classes: two on the CPU and five on the GPU.
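
A rough sketch of what that shared behavior could look like as a plain mixin, following the three bullets above; every name here (SerialCacheMixin, _register, _states, _from_states, the deque size) is illustrative rather than Numba's actual API:

```python
import pickle
import uuid
import weakref
from collections import deque


def _rebuild(cls, uid, states):
    """Module-level so pickle can reference it by name."""
    obj = cls._memo.get(uid)
    if obj is None:                    # miss: reconstruct from state
        obj = cls._from_states(states)
        obj._uuid = uid                # (tolerates the extra uuid _register made)
        cls._memo[uid] = obj
    cls._recent.append(obj)            # strong ref keeps the hit alive
    return obj


class SerialCacheMixin:
    """Each concrete subclass gets its own memo: a weak dict keyed by
    UUID, plus a fixed-size deque of strong references that keeps
    recently deserialized objects alive in the weak dict."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._memo = weakref.WeakValueDictionary()  # uuid -> live object
        cls._recent = deque(maxlen=32)             # pins recent memo entries

    def _register(self):
        # Generate a fresh UUID at construction time and enter the memo.
        self._uuid = str(uuid.uuid4())
        type(self)._memo[self._uuid] = self

    def __reduce__(self):
        return _rebuild, (type(self), self._uuid, self._states())


class Kernel(SerialCacheMixin):
    def __init__(self, name):
        self.name = name
        self._register()

    def _states(self):
        return {"name": self.name}

    @classmethod
    def _from_states(cls, states):
        return cls(states["name"])


k = Kernel("axpy")
payload = pickle.dumps(k)
assert pickle.loads(payload) is pickle.loads(payload)  # memo hit
```

In this sketch, __init_subclass__ gives each subclass its own _memo/_recent, so a plain mixin covers the per-class bookkeeping without metaclass machinery.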

@stuartarchibald added the feature_request and performance - run time labels and removed the needtriage label · Sep 11, 2020