rocWMMA 1.1.0 for ROCm 5.6.0

rocm-ci released this 28 Jun 23:17
29d5571

Added

  • Added cross-lane operation backends (Blend, Permute, Swizzle and Dpp)
  • Added GPU kernels for rocWMMA unit test pre-process and post-process operations (fill, validation)
  • Added performance GEMM samples for half, single and double precision
  • Added rocWMMA CMake versioning
  • Added vectorized support in coordinate transforms
  • Included ROCm SMI for runtime clock rate detection
  • Added fragment transforms for transposing and changing data layout

Changed

  • Changed the default validation of rocWMMA results to GPU rocBLAS
  • Re-enabled int8 GEMM tests on gfx9
  • Upgraded to C++17
  • Restructured unit test folder for consistency
  • Consolidated rocWMMA samples common code