
Enable setting of kernel attributes via cuFuncSetAttribute #7897

Open
gmarkall opened this issue Mar 9, 2022 · 0 comments

Labels: CUDA (CUDA related issue/PR), feature_request

gmarkall commented Mar 9, 2022

Presently we support getting some attributes of a kernel, e.g. via Function.read_func_attr_all:

    def read_func_attr_all(self):
        nregs = self.read_func_attr(enums.CU_FUNC_ATTRIBUTE_NUM_REGS)
        cmem = self.read_func_attr(enums.CU_FUNC_ATTRIBUTE_CONST_SIZE_BYTES)
        lmem = self.read_func_attr(enums.CU_FUNC_ATTRIBUTE_LOCAL_SIZE_BYTES)
        smem = self.read_func_attr(enums.CU_FUNC_ATTRIBUTE_SHARED_SIZE_BYTES)
        maxtpb = self.read_func_attr(
            enums.CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK)
        return FuncAttr(regs=nregs, const=cmem, local=lmem, shared=smem,
                        maxthreads=maxtpb)

However, we don't provide a way to set function attributes. This would be useful, for example, for setting the shared memory carveout (an evolution of the functionality provided by cache_config() that allows finer-grained control over the level of carveout; a sketch of a possible setter follows the quote below). From the Ampere Tuning Guide:

In the NVIDIA Ampere GPU architecture, the portion of the L1 cache dedicated to shared memory (known as the carveout) can be selected at runtime as in previous architectures such as Volta, using cudaFuncSetAttribute() with the attribute cudaFuncAttributePreferredSharedMemoryCarveout. The NVIDIA A100 GPU supports shared memory capacity of 0, 8, 16, 32, 64, 100, 132 or 164 KB per SM. GPUs with compute capability 8.6 support shared memory capacity of 0, 8, 16, 32, 64 or 100 KB per SM.
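A minimal sketch of what the setter could look like, mirroring read_func_attr above. The name set_func_attr is hypothetical; it assumes the driver binding exposes cuFuncSetAttribute (available since CUDA 9.0), which takes the function handle, the attribute enum, and an int value:

    def set_func_attr(self, attrid, value):
        # Hypothetical counterpart to read_func_attr. Note that only a
        # few attributes are settable via cuFuncSetAttribute, notably
        # CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES and
        # CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT.
        driver.cuFuncSetAttribute(self.handle, attrid, value)

With that in place, requesting a carveout might look like the following (assuming the relevant enum values are exposed in enums; note the value is a percentage of the maximum shared memory capacity, not a size in KB):

    # Hint that the L1/shared split should maximally favour shared memory;
    # CU_SHAREDMEM_CARVEOUT_DEFAULT (-1) would restore the default split.
    func.set_func_attr(
        enums.CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT, 100)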

(cc @gavinpotter)
