HPL-AI Version 2.3c
Kan Wu fixed a bug that occurred when n > 65536.
-- [M] src/pgesv/HPLAI_pdgesv.cc
Kan Wu added an option to use the half-precision computeType
in cublasGemmEx (requires cuda@11:, i.e. CUDA 11 or later):
-- [M] src/blas/HPLAI_blas.cc
Kan Wu renamed the executable "xhplai" to "xhpl_ai" to match
HPL-AI-NVIDIA v1.0.0 in nvidia:hpc-benchmarks:
-- [M] src/Makefile.am
-- [M] testing/Makefile.am
Kan Wu released the software at
https://github.com/SYSU-SCC/sysu-scc-spack-repo:
-- [M] testing/ptest/HPLAI_pdinfo.cc