Releases: ROCm/MIOpen
rocm-5.3.0
ROCm release v5.3.0
MIOpen v2.16.0
Notes
- This release includes enhanced support for MI210 and MI250 and various other improvements.
Changes
- This release consists of various bug fixes and performance improvements
- Improved support for Navi21
- Performance improvements via performance database updates
- Fix various issues in convolution kernels specific to certain ASICs
- Fix an accuracy issue in reduction kernels
- Fix an accuracy issue in batch normalization kernels
MIOpen v2.14.0
Notes
- This release consists of various bug fixes and performance improvements
Changes
- Improved support for Navi21
- Performance improvements via performance database updates
- Fix various issues in convolution kernels specific to certain ASICs
- Fix an accuracy issue in reduction kernels
- Fix an accuracy issue in batch normalization kernels
MIOpen v2.12.0
Notes
- This release includes support for Navi21 and various other bug fixes and performance improvements
Changes
- MIOpen now supports Navi21!! (via MIOpen PRs 973, 780, 764, 740, 739, 677, 660, 653, 493, 498)
- Fixed a correctness issue with ImplicitGemm algorithm
- Updated the performance data for new kernel versions
- Improved MIOpen build time by splitting large kernel header files
- Fixed an issue in reduction kernels for padded tensors
- Various other bug fixes and performance improvements
MIOpen v2.11.0
Notes
- This release contains various bug fixes and performance improvements.
Changes
- Updates for Target ID features in ROCm stack
- Correctness fix in Batchnorm kernels
- Various bug fixes for MIOpenGEMM on the OpenCL backend
- Various bug fixes in 3x3 assembly kernels
MIOpen v2.10.0
Notes
- This release contains new reduction operations, Winograd algorithm performance improvements as well as bug fixes. Various host side performance improvements have been added as well.
Changes
- Added a GPU reference kernel implementation for faster testing.
- Added TargetID support for new AMD GPU architectures.
- Implemented four additional generic tensor reduction operations (AVG, AMAX, NORM1, NORM2).
- Fixed a bug where Batchnorm would give incorrect results when the product of image height and image width is not a multiple of four.
- Various host side improvements for better find and tuning performance.
- Added support for AMD Code Object V4.
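For readers unfamiliar with the four reduction operations named above (AVG, AMAX, NORM1, NORM2), the following is a minimal pure-Python sketch of what each one conventionally computes over a list of values. It is an illustration of the math only, not MIOpen's API or kernel code; the function names and the `reduce_rows` helper are hypothetical.

```python
import math

def reduce_avg(values):
    """AVG: arithmetic mean of the elements."""
    return sum(values) / len(values)

def reduce_amax(values):
    """AMAX: largest absolute value among the elements."""
    return max(abs(v) for v in values)

def reduce_norm1(values):
    """NORM1: sum of absolute values (L1 norm)."""
    return sum(abs(v) for v in values)

def reduce_norm2(values):
    """NORM2: square root of the sum of squares (L2 norm)."""
    return math.sqrt(sum(v * v for v in values))

# Hypothetical lookup table keyed by the operation names used in the notes.
REDUCTIONS = {
    "AVG": reduce_avg,
    "AMAX": reduce_amax,
    "NORM1": reduce_norm1,
    "NORM2": reduce_norm2,
}

def reduce_rows(matrix, op):
    """Reduce each row of a 2-D list to a single value (reduction over the last axis)."""
    return [REDUCTIONS[op](row) for row in matrix]
```

For example, `reduce_rows([[3.0, -4.0]], "NORM2")` reduces the single row to its Euclidean length.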
MIOpen v2.9.0
Notes:
- This release contains implicit GEMM algorithm performance updates and bug fixes. Additional performance improvements have been implemented for batch normalization.
Changes:
- Added new assembly implicit GEMM kernels
- Added batch normalization optimizations
- Fixed issue where miopen-hip backend install would not search for rocBLAS dependency
- Removed missing tunings from previous release cycle
- Removed deprecated implicit GEMM xDLOPs solvers
- Removed incorrect error messages from implicit GEMM solvers
- Disabled ConvAsmBwdWrW3x3 solver for stride > 1 cases
- Disabled bidirectional multi-pass kernels due to stability issues
MIOpen v2.8.0
Notes:
- This release provides additional bug fixes and support for embedded builds using MIOpen as a static library.
Changes:
- Fixed workspace size calculation for GEMM group convolutions
- Fixed performance regression for M/N
- Fixed issue with faulty compiler option
- Fixed typo in components dependency variable in CMakeLists.txt
- Fixed issues with COMgr backed online compilation for HIP kernels
- Added cmake flag for embedding system databases when building a static library
- Added a way to disable building MIOpenDriver when building a static library
- Added CC compiler detection in ROCm environment
- Known issue: This release may show warnings for "obsolete configs" in the performance database. This can be fixed by rerunning tuning on a specific network; see tuning documentation
MIOpen v2.7.0
Notes:
- This release contains a new reduction API; see API documentation for more information. Additional features for embedded builds have been added, along with further support for 3D convolutional networks.
Changes:
- Added additional tunings into performance database
- Added general reduction API
- Added cmake flag for embedding binary database into a static MIOpen build
- Added cmake flag for embedding system find-db text files into static MIOpen build
- Fixed issue with GEMM workspace size calculation for backwards data convolutions #381
- Fixed issue with 3D pooling indexing #365
MIOpen v2.6.0
Notes:
- This release contains convolution performance improvements, improved multi-threading behavior, and improved stability for half precision convolutions. Initial iteration time has been reduced with the introduction of hybrid find mode. Builds for a static library have been refined for this release.
Changes:
- Added MIOPEN_FIND_MODE=3 (hybrid find mode) as the new default convolution Find mode; see the Find mode documentation for details
- Added a more runtime-parameterized version of pooling to reduce the number of online compilations
- Improved the performance of backwards spatial batch normalization for small images
- Fixed issue with std::logic_error in SQLite deleter #306
- Fixed issues with half precision stability for convolutions
- Fixed issues with multi-threaded SQLite database accesses
- Fixed issues with 3-D convolutions and incorrect parameters
- Fixed various issues with implicit GEMM static assert failures
- Removed inactive implicit GEMM convolution solvers
- Removed SCGEMM convolutional algorithm from MIOpen
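Since the v2.6.0 notes name `MIOPEN_FIND_MODE` as the setting that selects the convolution Find mode, here is a small sketch of how a launcher script might pin that value for a child process. The helper name `env_with_find_mode` and the application path in the comment are hypothetical; only the environment variable name and the value 3 come from the notes above.

```python
import os

def env_with_find_mode(mode, base_env=None):
    """Return a copy of the environment with MIOPEN_FIND_MODE set.

    The base environment is copied, so the caller's mapping (or
    os.environ) is left unchanged.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MIOPEN_FIND_MODE"] = str(mode)
    return env

# Hypothetical usage: launch an application with the hybrid find mode pinned.
#   import subprocess
#   subprocess.run(["./my_miopen_app"], env=env_with_find_mode(3), check=True)
```

Passing the variable through the child's environment rather than mutating `os.environ` keeps the setting scoped to that one process, which is convenient when comparing Find modes side by side.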