For an (admittedly corner-case) simple 8x8 matrix inversion problem, reproduced by the following code:
#include <limits>

// Fortran LAPACK entry points as exported by OpenBLAS
extern "C"
{
  void dgetrf_ (const int *m, const int *n, double *A,
                const int *lda, int *ipiv, int *info);
  void dgetri_ (const int *n, double *A, const int *lda,
                int *ipiv, double *inv_work, const int *lwork, int *info);
}

int main()
{
  const int N = 8;
  const int lwork = 2*N;
  double *mat = new double[N*N];
  int *ipiv = new int[N];
  double *work = new double[lwork];
  int info = 0;

  // Fill the whole matrix with NaN
  for (int i=0; i<N; ++i)
    for (int j=0; j<N; ++j)
      mat[i*N+j] = -std::numeric_limits<double>::quiet_NaN();

  dgetrf_ (&N, &N, mat, &N, ipiv, &info);            // LU factorization
  dgetri_ (&N, mat, &N, ipiv, work, &lwork, &info);  // inversion from LU factors
  return info;
}
I get memory access errors in both the factorization phase and the inversion phase:
mklap4:openblas_bug$ g++ -L/home/kronbichler/sw/lib/ -lopenblas -Wl,-rpath=/home/kronbichler/sw/lib/ -lopenblas test.cc
mklap4:openblas_bug$ valgrind ./a.out
==8649== Memcheck, a memory error detector
==8649== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==8649== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright info
==8649== Command: ./a.out
==8649==
==8649== Invalid read of size 8
==8649== at 0x4D0DF3A: dgetf2_k (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4D0CD75: dgetrf_single (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4AAEA64: dgetrf_ (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4008D2: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649== Address 0x67a61e0 is 0 bytes after a block of size 512 alloc'd
==8649== at 0x402D81C: operator new[](unsigned long) (vg_replace_malloc.c:422)
==8649== by 0x400825: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649==
==8649== Invalid write of size 8
==8649== at 0x4D0DF44: dgetf2_k (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4D0CD75: dgetrf_single (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4AAEA64: dgetrf_ (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4008D2: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649== Address 0x67a61e0 is 0 bytes after a block of size 512 alloc'd
==8649== at 0x402D81C: operator new[](unsigned long) (vg_replace_malloc.c:422)
==8649== by 0x400825: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649==
==8649== Invalid read of size 16
==8649== at 0x4BEF93D: dswap_k (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4AA4B86: dswap_ (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4E56F20: dgetri_ (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4008FB: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649== Address 0x67a61e0 is 0 bytes after a block of size 512 alloc'd
==8649== at 0x402D81C: operator new[](unsigned long) (vg_replace_malloc.c:422)
==8649== by 0x400825: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649==
==8649== Invalid write of size 8
==8649== at 0x4BEF942: dswap_k (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4AA4B86: dswap_ (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4E56F20: dgetri_ (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4008FB: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649== Address 0x67a61e0 is 0 bytes after a block of size 512 alloc'd
==8649== at 0x402D81C: operator new[](unsigned long) (vg_replace_malloc.c:422)
==8649== by 0x400825: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649==
==8649== Invalid read of size 16
==8649== at 0x4BEF94F: dswap_k (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4AA4B86: dswap_ (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4E56F20: dgetri_ (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4008FB: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649== Address 0x67a61f0 is 16 bytes after a block of size 512 alloc'd
==8649== at 0x402D81C: operator new[](unsigned long) (vg_replace_malloc.c:422)
==8649== by 0x400825: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649==
==8649== Invalid write of size 8
==8649== at 0x4BEF954: dswap_k (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4AA4B86: dswap_ (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4E56F20: dgetri_ (in /home/kronbichler/sw/lib/libopenblas_haswell-r0.2.14.so)
==8649== by 0x4008FB: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649== Address 0x67a61f0 is 16 bytes after a block of size 512 alloc'd
==8649== at 0x402D81C: operator new[](unsigned long) (vg_replace_malloc.c:422)
==8649== by 0x400825: main (in /home/kronbichler/Work/deal_tests/trilinos_tests/openblas_bug/a.out)
==8649==
valgrind: m_mallocfree.c:303 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
valgrind: Heap block lo/hi size mismatch: lo = 576, hi = 18444492273895866368.
This is probably caused by your program erroneously writing past the
end of a heap block and corrupting heap metadata. If you fix any
invalid writes reported by Memcheck, this assertion failure will
probably go away. Please try that before reporting this as a bug.
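The errors seem to come from the partial pivoting code: dgetf2_k during the factorization and the dswap routines called from dgetri_ during the inversion. Every invalid access lies at or shortly past the end of a 512-byte block, which matches the size of the 8x8 matrix of doubles (8*8*8 bytes), so the routines appear to index beyond the end of mat. The matrix contains only NaN, so inverting it makes no sense, but OpenBLAS should still not produce invalid memory accesses.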
I compiled OpenBLAS from the latest git source and also checked release 0.2.14. The problem appears with both the Haswell build (see above) and the Penryn build. Compilers: gcc/gfortran 5.2, with no other special options in the OpenBLAS build process.
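As a possible caller-side stopgap, the input could be screened before the LAPACK calls and the pivot vector checked between them. The sketch below is only that: all_finite and pivots_in_range are hypothetical helper names, not part of OpenBLAS or LAPACK, and it assumes (without confirmation from the OpenBLAS source) that the invalid accesses are triggered by NaN entries leading to an out-of-range pivot index.

```cpp
#include <cmath>

// Hypothetical helper (not an OpenBLAS/LAPACK function): returns true if
// all n entries of a are finite, i.e. neither NaN nor +/-Inf.
bool all_finite (const double *a, int n)
{
  for (int i=0; i<n; ++i)
    if (!std::isfinite(a[i]))
      return false;
  return true;
}

// Hypothetical helper: returns true if every pivot index is a valid
// 1-based row index for an n x n matrix, i.e. 1 <= ipiv[i] <= n.
// Assumption: an index outside this range is what sends the swap
// routines past the end of the matrix.
bool pivots_in_range (const int *ipiv, int n)
{
  for (int i=0; i<n; ++i)
    if (ipiv[i] < 1 || ipiv[i] > n)
      return false;
  return true;
}
```

In the test program above, one would call all_finite(mat, N*N) before dgetrf_ and pivots_in_range(ipiv, N) before dgetri_, and skip the respective call if the check fails. This only sidesteps the problem in user code; the library should not read or write out of bounds regardless of the input.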