50 changes: 50 additions & 0 deletions Changelog.txt
@@ -1,4 +1,54 @@
OpenBLAS ChangeLog
====================================================================
Version 0.3.33
23-Apr-2026

general:
- fixed an incorrect cast in the SBGEMM test case that could lead to spurious test failures
- fixed an invalid memory access in the converted C version of the CBLAS tests
- made the BIGNUMA setting automatic when the number of cores exceeds 256
- imported recent updates from Reference-LAPACK to realign with its upcoming 3.13.0 release:
  - Implement ?LARF1F and ?ORM2R (Reference-LAPACK PRs 1019,1020,1196,1257)
  - Change loop order in ?GETC2 to improve performance (Reference-LAPACK PR 1023)
  - Change WORK array dimension in ?GELQS/?GEQRS (Reference-LAPACK PR 1094)
  - Add NaN checks for input matrix A in ?GEEV (Reference-LAPACK PR 1136)
  - Fix support for jobu/v in LAPACKE_?GESVDQ_WORK (Reference-LAPACK PRs 1146,1221)
  - Fix display of version number in LAPACK testsuite (Reference-LAPACK PR 1149)
  - Fix DGGES test seed to avoid bad matrix cases (Reference-LAPACK PR 1187)
  - Fix truncation of large WORK array sizes in ZHE (Reference-LAPACK PR 1195)
  - Fix overwriting of LDSWORK parameter in ?TRSYL3 (Reference-LAPACK PR 1206)
  - Fix overwriting of error states in some EIG tests (Reference-LAPACK PR 1207)
  - Remove unused parameter in DORBDB3/ZUNBDB3 (Reference-LAPACK PR 1209)
  - Re-enable testing of ?BB and ?GG driver functions (Reference-LAPACK PR 1211)
  - Fix workspace size calculation in ?TGSEN (Reference-LAPACK PR 774)
  - Fix typos in the EIG DMD tests and initialize the cutoff variable (PR 1212,1228)
  - Optimize looping in ?LACPY/?LASCL/?LANTR for fat matrices with UPLO=L (PR 1251)
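
To illustrate the kind of input validation that the ?GEEV entry above refers to, here is a minimal sketch in C of a NaN scan over a column-major matrix. The helper name matrix_has_nan is hypothetical and is not part of OpenBLAS or Reference-LAPACK; how the actual routines perform the check is not stated in this changelog.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenBLAS or LAPACK routine):
 * returns true if any entry of the n-by-n column-major matrix a,
 * stored with leading dimension lda >= n, is NaN.
 * Padding rows between n and lda are deliberately not inspected. */
static bool matrix_has_nan(size_t n, const double *a, size_t lda)
{
    for (size_t j = 0; j < n; ++j) {        /* outer loop over columns */
        for (size_t i = 0; i < n; ++i) {    /* inner loop is contiguous in memory */
            if (isnan(a[i + j * lda])) {
                return true;
            }
        }
    }
    return false;
}
```

A caller could run such a scan before invoking an eigenvalue driver and report an error early instead of passing NaN input on to the solver.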

arm64:
- worked around a serious miscompilation of the DDOT kernel by GCC15, affecting
most non-SVE targets (and SVE targets in the case of non-unit array stride)
- fixed an accuracy issue in the GEMV kernel for Neoverse V1 and other SVE targets
- fixed broken STRMM and SSYMM in DYNAMIC_ARCH builds when running on non-SME hardware
- added an optimized SHGEMM kernel for Neoverse N2
- fixed DYNAMIC_ARCH builds under Windows on Arm
- added autodetection of Cortex A75/A76 in DYNAMIC_ARCH builds
- added autodetection of Neoverse V3, currently supported through V2 kernels
- re-added support for the "VORTEX" target in DYNAMIC_ARCH builds with DYNAMIC_LIST
- fixed CMake-based builds that use the "Ninja" generator
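
Since the DDOT miscompilation noted above depends on the array stride, a plain reference implementation can help when cross-checking results from a given build. The following is a minimal sketch in C, assuming the usual BLAS convention for negative increments; ref_ddot is a hypothetical name and is not the OpenBLAS kernel.

```c
#include <stddef.h>

/* Hypothetical reference dot product (not the OpenBLAS kernel).
 * Computes sum of x[k*incx] * y[k*incy] for k = 0..n-1. For a negative
 * increment the vector is traversed backwards, starting at offset
 * (1 - n) * inc, which is the standard BLAS convention. */
static double ref_ddot(ptrdiff_t n, const double *x, ptrdiff_t incx,
                       const double *y, ptrdiff_t incy)
{
    if (n <= 0) {
        return 0.0;
    }
    ptrdiff_t ix = (incx < 0) ? (1 - n) * incx : 0;
    ptrdiff_t iy = (incy < 0) ? (1 - n) * incy : 0;
    double sum = 0.0;
    for (ptrdiff_t k = 0; k < n; ++k) {
        sum += x[ix] * y[iy];
        ix += incx;
        iy += incy;
    }
    return sum;
}
```

Comparing its result with cblas_ddot(n, x, incx, y, incy) for both unit and non-unit strides, within a small floating-point tolerance, is one way to see whether a particular compiler and target combination is affected.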

loongarch64:
- fixed a build failure due to missing support for the new half-precision float type
- fixed a long-standing bug in asserting 64bit capability in the c_check helper script

x86_64:
- added a workaround for miscompilation of the AVX512 GEMM kernels by LLVM on Windows
- fixed a build failure in the LAED3 code when compiling with MinGW on Windows
- fixed CMake-based compilation with the NVIDIA HPC compiler
- fixed CMake-based builds that use the "Ninja" generator

wasm:
- added optimized kernels for STRSM and DTRSM

====================================================================
Version 0.3.32
23-Mar-2026