v13.2.0: [Release] CUDA Tile IR 13.2.0

@dcaballe released this 24 Mar 01:32
This release is aligned with the CUDA Tile IR specification included in
CUDA Toolkit 13.2.

New in the CUDA Tile 13.2 open-source release:
  * Extended architecture support: CUDA Tile now supports compute capability 8.X (Ampere, Ada) in addition to 10.X, 11.X, and 12.X (Blackwell) architectures.
  * New atan2 math operation.
  * Added overflow attribute to cuda_tile.negi to control integer overflow behavior.
  * Added rounding_mode attribute to cuda_tile.tanh to control floating-point rounding behavior.
  * Added token result to cuda_tile.print_tko for memory ordering support.
  * Added unsignedCmp flag to cuda_tile.for to support unsigned integer comparison for loop termination.
  * Renamed cuda_tile.print to cuda_tile.print_tko in the textual format. Bytecode encoding is unchanged and remains backward compatible.
  * Bytecode version 13.2 with explicit type tag versioning for improved forward and backward compatibility.
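
To illustrate why an unsigned comparison flag on a loop matters, here is a minimal Python sketch. It is not Tile IR syntax and does not use any CUDA Tile API; the helper names (`to_signed`, `to_unsigned`, `loop_continues`) and the 32-bit width are assumptions made for illustration only. It shows how the same bit pattern used as a loop upper bound yields a different termination decision depending on whether the comparison is signed or unsigned.

```python
# Illustrative model only: hypothetical helpers, not part of CUDA Tile IR.

def to_unsigned(value: int, bits: int = 32) -> int:
    """Keep the low `bits` bits and read them as an unsigned integer."""
    return value & ((1 << bits) - 1)


def to_signed(value: int, bits: int = 32) -> int:
    """Read the low `bits` bits as a two's-complement signed integer."""
    value = to_unsigned(value, bits)
    sign_bit_set = (value >> (bits - 1)) & 1
    return value - (1 << bits) if sign_bit_set else value


def loop_continues(iv: int, upper: int, unsigned_cmp: bool, bits: int = 32) -> bool:
    """Return True if `iv < upper` under the chosen interpretation."""
    if unsigned_cmp:
        return to_unsigned(iv, bits) < to_unsigned(upper, bits)
    return to_signed(iv, bits) < to_signed(upper, bits)


# The bit pattern 0xFFFFFFF0 is a large positive bound when read as
# unsigned, but a negative value when read as signed 32-bit.
upper = 0xFFFFFFF0
signed_result = loop_continues(0, upper, unsigned_cmp=False)
unsigned_result = loop_continues(0, upper, unsigned_cmp=True)
print(signed_result, unsigned_result)
```

With a signed comparison the loop in this model would not execute at all for that bound, while an unsigned comparison lets it run, which is the kind of case an unsigned comparison option is meant to cover.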

For more information:
  * CUDA Tile IR Spec 13.2: https://docs.nvidia.com/cuda/tile-ir/13.2/index.html
  * CUDA 13.2 Blog: https://developer.nvidia.com/blog/cuda-13-2-introduces-enhanced-cuda-tile-support-and-new-python-features