Version 0.6: PTX compilation library support
Changes since v0.5.6:
PTX Compilation library
This version introduces a single major change:
- #385 : Support for NVIDIA's PTX compilation library.
Note: The CUDA driver already supports compilation of PTX code, but it has limited support for various compilation options. It also requires a driver to be loaded, i.e. it requires kernel involvement and a GPU on your system. This library does not.
Value-vs-reference issues
- #430 : Now passing kernel-like objects by reference rather than by value where relevant in the kernel launch wrapper functions.
- #433 : Now passing program name by value rather than by reference.
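To illustrate the kind of difference #430 is about, here is a minimal sketch in plain C++. It is not the library's actual launch code: kernel_like, launch_by_value and launch_by_reference are hypothetical names made up for this example, and the "kernel" is just a callable object that counts how often it gets copied.

```cpp
#include <cassert>

// Hypothetical stand-in for a kernel-like object; it counts copy constructions.
struct kernel_like {
    static int copies;
    kernel_like() = default;
    kernel_like(const kernel_like&) { ++copies; }
    void operator()(int) const { /* a real wrapper would launch something here */ }
};
int kernel_like::copies = 0;

// Hypothetical launch wrapper taking the kernel-like object by value:
// every call copy-constructs the object into the parameter.
template <typename Kernel>
void launch_by_value(Kernel kernel, int arg) { kernel(arg); }

// Hypothetical launch wrapper taking the kernel-like object by const reference:
// the caller's object is used directly, with no copy.
template <typename Kernel>
void launch_by_reference(const Kernel& kernel, int arg) { kernel(arg); }

// Returns the number of copies made by one by-value launch.
int copies_made_by_value() {
    kernel_like k;
    kernel_like::copies = 0;
    launch_by_value(k, 1);
    return kernel_like::copies;
}

// Returns the number of copies made by one by-reference launch.
int copies_made_by_reference() {
    kernel_like k;
    kernel_like::copies = 0;
    launch_by_reference(k, 1);
    return kernel_like::copies;
}

// Checks the expected copy counts for both variants.
void check_copy_counts() {
    assert(copies_made_by_value() == 1);
    assert(copies_made_by_reference() == 0);
}
```

Calling check_copy_counts() from your own main() exercises both variants. For objects which are expensive to copy, or not copyable at all, only the by-reference form is usable.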
Other changes
- #431 : The NVTX wrappers no longer depend on a thread support library.
- #436 : The wrapper library now respects CUDA_NO_HALF, for when you want to avoid CUDA defining the half type.
- #432 : Removed some std:: rather than ::std:: namespace qualifications which had snuck into the codebase recently (which cause trouble with NVIDIA's cuda::std namespace).
- #435 : Updated static data tables for the Ampere/Lovelace (8.x) and Hopper architectures.
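The namespace-qualification issue in #432 can be reproduced in miniature without any CUDA headers. The sketch below is an illustration under stated assumptions rather than code from the wrappers: it declares its own toy cuda::std namespace, with a deliberately odd size_t, to stand in for the one NVIDIA's headers provide. The aliases inner_size_t and outer_size_t are hypothetical names used only here.

```cpp
#include <cstddef>
#include <type_traits>

namespace cuda {

// Toy stand-in for a nested `std` namespace inside `cuda`.
namespace std {
using size_t = unsigned char; // deliberately differs from ::std::size_t
} // namespace std

// Inside `cuda`, a plain `std::` is looked up from the innermost scope outwards,
// so it finds cuda::std before the standard library's namespace.
using inner_size_t = std::size_t;

// A leading `::` anchors the lookup at the global namespace,
// so this always names the standard library's size_t.
using outer_size_t = ::std::size_t;

} // namespace cuda

// Compile-time confirmation of which type each alias resolved to.
static_assert(::std::is_same<cuda::inner_size_t, unsigned char>::value,
    "std:: inside namespace cuda resolved to cuda::std");
static_assert(::std::is_same<cuda::outer_size_t, ::std::size_t>::value,
    "::std:: resolved to the standard library");
```

This is why code placed inside the cuda namespace is safer with the fully qualified ::std:: form: its meaning does not depend on which other headers happen to have been included.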