SGESDD throws a floating point exception #348

Closed
Fulguritus opened this issue Feb 26, 2014 · 11 comments

Comments

@Fulguritus

Calling SGESDD with a non-square matrix throws a floating-point exception:

Program received signal SIGFPE, Arithmetic exception.
sgemv_t_NEHALEM () at ../kernel/x86_64/sgemv_t.S:1716
1716        haddps  %xmm11, %xmm10

I'm using the current development version:

git describe HEAD
v0.2.9.rc1-21-g322a178

Compiled with

make USE_OPENMP=1 COMMON_OPT="-g -fbacktrace" DYNAMIC_ARCH=1

Here is a minimal (non-)working example. Since the matrix is filled with random numbers, the program only runs to completion roughly seven out of ten times. DGESDD, CGESDD, and ZGESDD seem to be unaffected.

program svd_test
  use, intrinsic :: ISO_Fortran_env
  implicit none
  integer,parameter         :: m=1500, n=750
  real(REAL32)              :: A(m,n)
  real(REAL32)              :: S(min(m,n))
  real(REAL32)              :: U(m,m), Vt(n,n)
  ! Local variables 
  real(REAL32),allocatable  :: work(:)
    !< Workspace for LAPACK - SVD
  integer,allocatable       :: iwork(:)
    !< Workspace for LAPACK - Divide and Conquer SVD
  integer                   :: lwork
    !< size of WORK(:), optimal size obtained by LAPACK
  integer                   :: lda, ldu, ldvt
  character(len=1),parameter:: jobz='A'
  integer                   :: stat

  lda = m
  ldu = m
  ldvt = n

  call init_random_seed()
  call random_number( A )

  ! Get the optimal size for the work array
  lwork = -1
  allocate( iwork(8*min(M,N)), work(1), stat=stat )
  if ( stat /= 0 ) stop 'Cannot allocate memory! '

  call SGESDD( jobz, m, n, A, lda, S, U, ldu, VT, ldvt, work, lwork, &
               iwork, stat )
  if ( stat /= 0 ) stop 'Obtaining the optimal work array size failed! '

  ! The first element contains the optimal work array size
  lwork = nint( work(1) )
  deallocate( work ) ; allocate( work(lwork), stat=stat )
  if ( stat /= 0 ) stop 'Cannot allocate memory! '

  ! Now the work array is of optimal size we can actually compute the SVD
  call SGESDD( jobz, m, n, A, lda, S, U, ldu, VT, ldvt, work, lwork, &
               iwork, stat )
  if ( stat /= 0 ) stop 'Error while computing the SVD! '

  deallocate(iwork, work)

end program

!> Taken from http://gcc.gnu.org/onlinedocs/gcc-4.3.5/gfortran/RANDOM_005fSEED.html
SUBROUTINE init_random_seed()
  INTEGER :: i, n, clock
  INTEGER, DIMENSION(:), ALLOCATABLE :: seed

  CALL RANDOM_SEED(size = n)
  ALLOCATE(seed(n))

  CALL SYSTEM_CLOCK(COUNT=clock)

  seed = clock + 37 * (/ (i - 1, i = 1, n) /)
  CALL RANDOM_SEED(PUT = seed)

  DEALLOCATE(seed)
END SUBROUTINE
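
Because the clock-based seed makes each run use a different matrix, a failing case cannot be replayed. If the failure depends on the matrix values (which this thread does not establish), a fixed seed would make it repeatable. The following is a minimal sketch; the names `init_fixed_seed` and `base` and the value 12345 are hypothetical and not part of the original report.

```fortran
program fixed_seed_demo
  implicit none
  real :: x(3)

  ! Same seed on every run, so random_number yields the same sequence
  call init_fixed_seed( 12345 )
  call random_number( x )
  print *, x

contains

  ! Hypothetical replacement for init_random_seed(): derives the seed
  ! array from a caller-supplied constant instead of the system clock.
  subroutine init_fixed_seed( base )
    integer, intent(in)  :: base
    integer              :: i, n
    integer, allocatable :: seed(:)

    call random_seed( size = n )
    allocate( seed(n) )
    seed = base + 37 * [ (i - 1, i = 1, n) ]
    call random_seed( put = seed )
    deallocate( seed )
  end subroutine init_fixed_seed

end program fixed_seed_demo
```

Trying a handful of `base` values until one triggers the exception would then give a seed that can be passed along with the bug report.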

Compiled with

gfortran -L${HOME}/openblas -lopenblas -Wall -Wextra -g -fbacktrace \
              -Wuninitialized -O -ffpe-trap=invalid,zero,overflow  svd_test.F90
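
The `-ffpe-trap=invalid,zero,overflow` option turns the listed IEEE exceptions into a SIGFPE, which is why the program aborts instead of continuing. To inspect an exception flag without aborting, the intrinsic IEEE modules can be used. This is only a sketch of that mechanism and does not reproduce the OpenBLAS problem; it assumes a compiler that ships the IEEE modules (gfortran 5 or later), so it would not build with the gfortran 4.8.2 used in this report.

```fortran
program fpe_flag_demo
  use, intrinsic :: ieee_arithmetic
  implicit none
  real, volatile :: zero      ! volatile keeps the division from being folded at compile time
  real           :: r
  logical        :: raised

  ! Make an invalid operation set a flag instead of raising SIGFPE,
  ! even if the program was built with -ffpe-trap=invalid
  call ieee_set_halting_mode( ieee_invalid, .false. )
  call ieee_set_flag( ieee_invalid, .false. )

  zero = 0.0
  r = zero / zero             ! 0/0 is an IEEE "invalid" operation

  call ieee_get_flag( ieee_invalid, raised )
  print *, 'invalid flag raised: ', raised
  print *, 'result is NaN:       ', ieee_is_nan( r )
end program fpe_flag_demo
```

The same pattern (clear the flag, call the routine, query the flag) could be wrapped around the SGESDD call to see whether an invalid operation occurs while the returned singular values are still finite.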

My version of gfortran is

gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.2/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-isl=/builddir/build/BUILD/gcc-4.8.2-20131212/obj-x86_64-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.2-20131212/obj-x86_64-redhat-linux/cloog-install --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) 

Using ifort and MKL works all the time.

@martin-frbg
Collaborator

Bisected down to the earliest alphas without finding a "good" version. In fact, the "invalid operation" trap is reproducible with libgoto-1.13 as well, while netlib lapack appears to be unaffected.

@Fulguritus
Author

I also get this error for a QR decomposition using SGEQRF and SORGQR... So far, SGEQP3 is working fine.
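
No code was posted for the QR case. A reproducer in the style of the SVD test above might look like the following sketch; the matrix size, the program name `qr_test`, and the single shared work array are assumptions and not the reporter's actual code.

```fortran
program qr_test
  use, intrinsic :: ISO_Fortran_env
  implicit none
  integer, parameter        :: m=1500, n=750   ! assumed sizes, copied from svd_test
  real(REAL32), allocatable :: A(:,:), tau(:), work(:)
  real(REAL32)              :: query(1)
  integer                   :: lwork, info

  allocate( A(m,n), tau(min(m,n)) )
  call random_number( A )

  ! Workspace queries: lwork = -1 returns the optimal size in query(1)
  call SGEQRF( m, n, A, m, tau, query, -1, info )
  if ( info /= 0 ) stop 'SGEQRF workspace query failed! '
  lwork = nint( query(1) )

  call SORGQR( m, n, min(m,n), A, m, tau, query, -1, info )
  if ( info /= 0 ) stop 'SORGQR workspace query failed! '
  lwork = max( lwork, nint( query(1) ) )

  allocate( work(lwork) )

  ! Factorize A = Q*R; R ends up in the upper triangle of A
  call SGEQRF( m, n, A, m, tau, work, lwork, info )
  if ( info /= 0 ) stop 'Error in SGEQRF! '

  ! Overwrite A with the first n columns of Q
  call SORGQR( m, n, min(m,n), A, m, tau, work, lwork, info )
  if ( info /= 0 ) stop 'Error in SORGQR! '

  print *, 'QR completed, info = ', info
  deallocate( A, tau, work )
end program qr_test
```

Building it with the same gfortran options as `svd_test.F90`, including `-ffpe-trap=invalid,zero,overflow`, would show whether the trap fires in the same kernel.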

@wernsaar
Contributor

On 27.02.2014 18:02, Alexander Vogt wrote:

I also get this error for a QR decomposition using SGEQRF and SORGQR...


Hi,

I did some tests with lapack-3.5.0 and saw that some functions in
OpenBLAS, such as sdsdot, srotmg, and drotmg, are buggy. Many other
functions return the wrong error exit for invalid parameters.

I will run more tests over the next few days and will provide
bug fixes for the BLAS functions.

Please be patient,

Werner

@xianyi
Collaborator

xianyi commented Mar 1, 2014

Hi all,
Sorry for the late reply. I just met a project deadline a few days ago.
I think srotmg and drotmg are very similar to the netlib Fortran versions.

Xianyi

@wernsaar
Contributor

wernsaar commented Mar 1, 2014

On 01.03.2014 16:15, Zhang Xianyi wrote:

Hi all,
Sorry for the late reply. I just met a project deadline a few days ago.
I think srotmg and drotmg are very similar to the netlib Fortran versions.

Xianyi


Hi,

I rewrote srotmg and drotmg and pushed the new source code
to the GitHub repository.

Best regards
Werner

@Fulguritus
Author

I still get the exception in sgemv_t_NEHALEM with the latest version :(

@Fulguritus
Author

@wernsaar Any updates on this issue? At the moment OpenBLAS is segfaulting almost every time I use single-precision operations.

@wernsaar
Contributor

wernsaar commented May 6, 2014

On 06.05.2014 11:19, Alexander Vogt wrote:

@wernsaar Any updates on this issue? At the moment OpenBLAS is segfaulting almost every time I use single-precision operations.


Hi,

I will look at the code and try to fix the bugs.
I think this will take 2 or 3 days.

Best regards,

Werner

@Fulguritus
Author

Thanks a lot!

Regards,
Alex

@wernsaar
Contributor

wernsaar commented May 6, 2014

On 06.05.2014 11:55, Alexander Vogt wrote:

Thanks a lot!

Regards,
Alex


Hi,

I pushed bug fixes to the git develop branch.
Your Fortran program svd_test now runs fine.
Please test it again and report the result.

There is still a bug affecting QR and Cholesky decomposition,
but only for large matrices. I will try to track it down as well.

Best regards

Werner

@Fulguritus
Author

Hi Werner,

your fixes work for me - Thanks!

Regards,
Alex
