This release is aligned with the CUDA Tile IR specification included in
CUDA Toolkit 13.2.
New in CUDA Tile 13.2 open-source:
* Extended architecture support: CUDA Tile now supports compute capability 8.X (Ampere, Ada) in addition to 10.X, 11.X, and 12.X (Blackwell) architectures.
* New atan2 math operation.
* Added overflow attribute to cuda_tile.negi to control integer overflow behavior.
* Added rounding_mode attribute to cuda_tile.tanh to control floating-point rounding behavior.
* Added token result to cuda_tile.print_tko for memory ordering support.
* Added unsignedCmp flag to cuda_tile.for to support unsigned integer comparison for loop termination.
* Renamed cuda_tile.print to cuda_tile.print_tko in the textual format. Bytecode encoding is unchanged and remains backward compatible.
* Bytecode version 13.2 with explicit type tag versioning for improved forward and backward compatibility.
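
To illustrate why an unsigned comparison flag on a loop matters, here is a minimal Python sketch of the general difference between signed and unsigned comparison of the same bit pattern. It is not CUDA Tile IR and does not reproduce the exact semantics of cuda_tile.for, which are defined by the spec; the helpers `as_signed`, `as_unsigned`, and `trip_count` are hypothetical names used only for this illustration, and the loop shape `for (i = lower; i < upper; i += step)` is an assumption.

```python
def as_unsigned(value: int, bits: int = 32) -> int:
    """Interpret the low `bits` bits of `value` as an unsigned integer."""
    return value & ((1 << bits) - 1)


def as_signed(value: int, bits: int = 32) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value = as_unsigned(value, bits)
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def trip_count(lower: int, upper: int, step: int,
               unsigned_cmp: bool, bits: int = 32) -> int:
    """Count iterations of an assumed `for (i = lower; i < upper; i += step)`
    loop, comparing `i` and `upper` as unsigned or signed values.

    Hypothetical model for illustration only, not the cuda_tile.for definition.
    """
    view = as_unsigned if unsigned_cmp else as_signed
    mask = (1 << bits) - 1
    count = 0
    i = lower & mask
    while view(i, bits) < view(upper, bits):
        count += 1
        i = (i + step) & mask  # wrap like a fixed-width register
    return count


# 0xC8 is 200 when read as unsigned 8-bit, but -56 when read as signed 8-bit.
unsigned_trips = trip_count(0x00, 0xC8, 50, unsigned_cmp=True, bits=8)
signed_trips = trip_count(0x00, 0xC8, 50, unsigned_cmp=False, bits=8)
print(unsigned_trips, signed_trips)
```

With an 8-bit upper bound of 0xC8, the unsigned interpretation runs the loop for i = 0, 50, 100, 150, while the signed interpretation sees an upper bound of -56 and never enters the loop. This is the kind of divergence a comparison flag lets the IR express explicitly.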
For more information:
* CUDA Tile IR Spec 13.2: https://docs.nvidia.com/cuda/tile-ir/13.2/index.html
* CUDA 13.2 Blog: https://developer.nvidia.com/blog/cuda-13-2-introduces-enhanced-cuda-tile-support-and-new-python-features