rocBLAS-2.1.0

@bragadeesh bragadeesh released this 01 Feb 02:27
· 3432 commits to master since this release

Changelist:

  • Refactored the rocBLAS test framework
  • Improved performance of i8_r/i32_r rocblas_gemm_ex on gfx906
  • Added a simple trsv implementation built on trsm
  • Improved performance of trsm
  • Tuning improvements for resnet50 problems
  • Updated tuning to use the new Tensile solution selection logic
  • Improved rocblas_gemm_ex performance when ldd == ldc and strideD == strideC
  • Bug fixes for IAMIN and TRSV
  • Added Sphinx-based Read the Docs documentation
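
To illustrate the trsv item above: a triangular vector solve can be treated as a triangular matrix solve with a single right-hand-side column. The following is a minimal CPU sketch of that idea, not the rocBLAS implementation. The names `trsm_lower_ref` and `trsv_lower_ref` are hypothetical, and the sketch assumes a lower-triangular, non-unit-diagonal, column-major matrix and a unit-stride vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference routine (not rocBLAS code): solves L * X = B in place.
// L is an m x m lower-triangular matrix with a non-unit diagonal, B is m x n.
// Both are stored column-major with leading dimensions lda and ldb.
void trsm_lower_ref(std::size_t m, std::size_t n,
                    const std::vector<double>& a, std::size_t lda,
                    std::vector<double>& b, std::size_t ldb)
{
    for (std::size_t col = 0; col < n; ++col) {
        // Forward substitution on one right-hand-side column.
        for (std::size_t i = 0; i < m; ++i) {
            double sum = b[i + col * ldb];
            for (std::size_t k = 0; k < i; ++k)
                sum -= a[i + k * lda] * b[k + col * ldb];
            b[i + col * ldb] = sum / a[i + i * lda];
        }
    }
}

// Hypothetical wrapper: a vector solve is a matrix solve with n == 1.
// Assumes x has unit stride, so its leading dimension is simply m.
void trsv_lower_ref(std::size_t m,
                    const std::vector<double>& a, std::size_t lda,
                    std::vector<double>& x)
{
    trsm_lower_ref(m, 1, a, lda, x, m);
}
```

For example, with L = [[2, 0], [1, 3]] stored column-major as {2, 1, 0, 3} and right-hand side {4, 11}, calling `trsv_lower_ref(2, a, 2, x)` overwrites x with the solution {2, 3}.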