Presently we support getting some attributes of a kernel, e.g. via Function.read_func_attr_all:

numba/numba/cuda/cudadrv/driver.py, lines 2482 to 2490 in 59fa2e3

However, we don't provide a way to set function attributes, which would be useful, for example, for setting the shared memory carveout (this is an evolution of the functionality provided by cache_config(), which allows for finer-grained control over the level of carveout). From the Ampere Tuning Guide:

In the NVIDIA Ampere GPU architecture, the portion of the L1 cache dedicated to shared memory (known as the carveout) can be selected at runtime as in previous architectures such as Volta, using cudaFuncSetAttribute() with the attribute cudaFuncAttributePreferredSharedMemoryCarveout. The NVIDIA A100 GPU supports shared memory capacity of 0, 8, 16, 32, 64, 100, 132 or 164 KB per SM. GPUs with compute capability 8.6 support shared memory capacity of 0, 8, 16, 32, 64 or 100 KB per SM.

(cc @gavinpotter)
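A setter could mirror the shape of the existing getter. The sketch below is purely hypothetical (the set_func_attr method and FakeFunction class are illustrative names, not Numba's actual API); it shows where a carveout value would be validated before being handed to the driver. In the CUDA driver API, CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT is expressed as an integer percentage of the maximum shared memory (0 to 100), or -1 for the driver default, so validation happens in those terms rather than in KB.

```python
# Hypothetical sketch of a function-attribute setter mirroring the existing
# read_func_attr getter. All names here are illustrative, not Numba's API.
# The preferred shared memory carveout is a percentage of the maximum
# shared memory (0-100), or -1 to defer to the driver default.

CU_SHAREDMEM_CARVEOUT_DEFAULT = -1  # let the driver choose


def validate_carveout(percent: int) -> int:
    """Check a preferred shared-memory carveout value before it would be
    passed to the driver via cuFuncSetAttribute."""
    if percent == CU_SHAREDMEM_CARVEOUT_DEFAULT:
        return percent
    if not 0 <= percent <= 100:
        raise ValueError(
            f"carveout must be -1 (default) or 0..100 percent, got {percent}"
        )
    return percent


class FakeFunction:
    """Stand-in for a compiled kernel's driver function, for illustration
    only; a real implementation would wrap a CUfunction handle."""

    def __init__(self):
        self.attrs = {}

    def set_func_attr(self, attr: str, value: int) -> None:
        # A real implementation would call
        # driver.cuFuncSetAttribute(self.handle, attr_enum, value) here.
        self.attrs[attr] = validate_carveout(value)


func = FakeFunction()
func.set_func_attr("PREFERRED_SHARED_MEMORY_CARVEOUT", 50)
print(func.attrs["PREFERRED_SHARED_MEMORY_CARVEOUT"])  # 50
```

Keeping validation separate from the driver call keeps the error message in Python terms (percent, not an opaque CUDA_ERROR_INVALID_VALUE), which matches how cache_config() already surfaces misuse.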