Releases: ROCm/rocWMMA

rocWMMA 1.5.0 for ROCm 6.2.2

27 Sep 16:01
677b441

rocWMMA code for ROCm 6.2.2 did not change. The library was rebuilt for the updated ROCm 6.2.2 stack.

rocWMMA 1.5.0 for ROCm 6.2.1

20 Sep 19:58
677b441

rocWMMA code for ROCm 6.2.1 did not change. The library was rebuilt for the updated ROCm 6.2.1 stack.

rocWMMA 1.5.0 for ROCm 6.2.0

02 Aug 16:15
677b441

Additions

  • Added internal utilities for element-wise vector transforms
  • Added internal utilities for cross-lane vector transforms
  • Implemented internal aos<->soa transforms for block sizes of 16, 32, 64, 128 and 256 and vector widths of 2, 4, 8 and 16
  • Added tests for new internal transforms
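
To illustrate what an aos<->soa transform does, here is a minimal host-side sketch in plain C++. It is not rocWMMA's internal implementation, which operates on GPU vector registers and cross-lane operations. The function names `aosToSoa` and `soaToAos` and the flat `std::vector` storage are assumptions made for this example only. With `blockSize` items of `vectorWidth` elements each, the AOS layout keeps the elements of one item contiguous, while the SOA layout keeps the same element of every item contiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only; these are NOT rocWMMA API.
// AOS layout: element j of item i is stored at aos[i * vectorWidth + j].
// SOA layout: element j of item i is stored at soa[j * blockSize + i].
template <typename T>
std::vector<T> aosToSoa(const std::vector<T>& aos,
                        std::size_t           blockSize,
                        std::size_t           vectorWidth)
{
    // The buffer must hold exactly blockSize items of vectorWidth elements.
    assert(aos.size() == blockSize * vectorWidth);

    std::vector<T> soa(aos.size());
    for(std::size_t i = 0; i < blockSize; ++i)
    {
        for(std::size_t j = 0; j < vectorWidth; ++j)
        {
            soa[j * blockSize + i] = aos[i * vectorWidth + j];
        }
    }
    return soa;
}

// Inverse of aosToSoa: gathers each item's elements back together.
template <typename T>
std::vector<T> soaToAos(const std::vector<T>& soa,
                        std::size_t           blockSize,
                        std::size_t           vectorWidth)
{
    assert(soa.size() == blockSize * vectorWidth);

    std::vector<T> aos(soa.size());
    for(std::size_t i = 0; i < blockSize; ++i)
    {
        for(std::size_t j = 0; j < vectorWidth; ++j)
        {
            aos[i * vectorWidth + j] = soa[j * blockSize + i];
        }
    }
    return aos;
}
```

For example, with a block size of 4 and a vector width of 2, the AOS buffer {0, 1, 2, 3, 4, 5, 6, 7} becomes the SOA buffer {0, 2, 4, 6, 1, 3, 5, 7}, and applying `soaToAos` to that result restores the original order.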

Changes

  • Improved loading layouts by increasing vector width for fragments with blockDim > 32
  • API applyDataLayout transform now accepts WaveCount template argument for cooperative fragments
  • API applyDataLayout transform now physically applies aos<->soa transform as necessary
  • Refactored entry-point of std library usage to improve hipRTC support
  • Documentation updates for installation, programmer's guide and API reference

Fixes

  • Fixed the ordering of some header includes to improve portability

rocWMMA 1.4.0 for ROCm 6.1.2

04 Jun 16:53
7dbd524

rocWMMA code for ROCm 6.1.2 did not change. The library was rebuilt for the updated ROCm 6.1.2 stack.

rocWMMA 1.4.0 for ROCm 6.1.1

08 May 18:00
7dbd524

rocWMMA code for ROCm 6.1.1 did not change. The library was rebuilt for the updated ROCm 6.1.1 stack.

rocWMMA 1.4.0 for ROCm 6.1.0

16 Apr 19:11
7dbd524

Additions

  • Added bf16 support for hipRTC sample

Changes

  • Changed Clang C++ version to C++17
  • Updated rocwmma_coop API
  • Linked rocWMMA to hipRTC

Fixes

  • Fixed compile/runtime arch checks
  • Built all tests in large code model
  • Removed inefficient branching in layout loop unrolling

rocWMMA 1.3.0 for ROCm 6.0.2

31 Jan 20:13
4b10c7e

rocWMMA code for ROCm 6.0.2 did not change. The library was rebuilt for the updated ROCm 6.0.2 stack.

rocWMMA 1.3.0 for ROCm 6.0.0

15 Dec 18:31
4b10c7e

Added

  • Added support for gfx940, gfx941 and gfx942 targets
  • Added support for f8, bf8 and xfloat32 datatypes
  • Added support for HIP_NO_HALF, __HIP_NO_HALF_CONVERSIONS__ and __HIP_NO_HALF_OPERATORS__ (e.g. PyTorch environment)

Changed

  • rocWMMA with hipRTC now supports bfloat16_t datatype
  • gfx11 wmma now uses lane swap instead of broadcast for layout adjustment
  • Updated samples GEMM parameter validation on host arch

Fixed

  • Disabled gtest static library deployment
  • Extended tests now build in large code model

rocWMMA 1.2.0 for ROCm 5.7.1

13 Oct 19:00
b5a884d

rocWMMA code for ROCm 5.7.1 did not change. The library was rebuilt for the updated ROCm 5.7.1 stack.

rocWMMA 1.2.0 for ROCm 5.7.0

15 Sep 17:29

Changed

  • Fixed a bug with synchronization
  • Updated rocWMMA cmake versioning