Skip to content

Floating point exception encountered in svdvals(A) originating from a call to stdlib_ieeeck #932

@loiseaujc

Description

@loiseaujc

Description

I've been playing around with the new matrix norm function mnorm and found that a floating point exception (a division by zero in particular) is raised if I compile the run code with

fpm run --compiler "gfortran" --flag "-Og -g3 -fbacktrace -ffpe-trap=zero"

After some quick investigation, it seems that the culprit is somewhere in the calling tree of svdvals. Here is a MWE.

program main
  use stdlib_linalg_constants, only: dp
  use stdlib_linalg, only: mnorm, svdvals, svd
  implicit none
  integer, parameter :: n = 3
  real(dp) :: A(n, n)
  real(dp) :: S(n)
  !> Initialize random matrix.
  call random_number(A)
  !> Compute its singular values.
  S = svdvals(A)
end program main

When compiling with the command above, I get

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7f907d4b851f in ???
	at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#1  0x555ee55a277a in __stdlib_linalg_lapack_aux_MOD_stdlib_ieeeck
	at build/dependencies/stdlib/src/stdlib_linalg_lapack_aux.F90:136
#2  0x555ee55a3a01 in __stdlib_linalg_lapack_aux_MOD_stdlib_ilaenv
	at build/dependencies/stdlib/src/stdlib_linalg_lapack_aux.F90:1369
#3  0x555ee5788b76 in __stdlib_lapack_eig_svd_lsq_MOD_stdlib_dlasq2
	at build/dependencies/stdlib/src/stdlib_lapack_svd_bidiag_qr.f90:725
#4  0x555ee578ac7e in __stdlib_lapack_eig_svd_lsq_MOD_stdlib_dlasq1
	at build/dependencies/stdlib/src/stdlib_lapack_svd_bidiag_qr.f90:169
#5  0x555ee56e08e5 in __stdlib_lapack_eig_svd_lsq_MOD_stdlib_dbdsqr
	at build/dependencies/stdlib/src/stdlib_lapack_eigv_svd_drivers3.f90:561
#6  0x555ee57d5860 in __stdlib_lapack_eig_svd_lsq_MOD_stdlib_dlasdq
	at build/dependencies/stdlib/src/stdlib_lapack_eigv_svd_bidiag_dc.f90:3504
#7  0x555ee56da988 in __stdlib_lapack_eig_svd_lsq_MOD_stdlib_dbdsdc
	at build/dependencies/stdlib/src/stdlib_lapack_eigv_svd_drivers3.f90:2239
#8  0x555ee559ab05 in __stdlib_lapack_eig_svd_lsq_MOD_stdlib_dgesdd
	at build/dependencies/stdlib/src/stdlib_lapack_eigv_svd_drivers2.f90:1517
#9  0x555ee554c060 in __stdlib_linalg_MOD_stdlib_linalg_svd_d
	at build/dependencies/stdlib/src/stdlib_linalg_svd.f90:489
#10  0x555ee554dbff in __stdlib_linalg_MOD_stdlib_linalg_svdvals_d
	at build/dependencies/stdlib/src/stdlib_linalg_svd.f90:320
#11  0x555ee55446c4 in MAIN__
	at app/main.f90:11
#12  0x555ee554472d in main
	at app/main.f90:2
Floating point exception (core dumped)
<ERROR> Execution for object " MWE_stdlib_mnorm " returned exit code  136
<ERROR> *cmd_run*:stopping due to failed executions
STOP 136

I get something very similar if I compile with ifx instead

<INFO> BUILD_NAME: build/ifx
 <INFO> COMPILER:  ifx
 <INFO> C COMPILER:  icx
 <INFO> CXX COMPILER: icpx
 <INFO> COMPILER OPTIONS:  -fpp -O0 -g -traceback -check all -check bounds -check uninit, -ftrapuv -debug all
 <INFO> C COMPILER OPTIONS:  
 <INFO> CXX COMPILER OPTIONS: 
 <INFO> LINKER OPTIONS:  
 <INFO> INCLUDE DIRECTORIES:  []
[100%] Project compiled successfully.
 + build/ifx_A1EB479C3608C637/app/MWE_stdlib_mnorm 
forrtl: error (73): floating divide by zero
Image              PC                Routine            Line        Source             
MWE_stdlib_mnorm   000000000046E723  Unknown               Unknown  Unknown
libc.so.6          00007F70F79C1520  Unknown               Unknown  Unknown
MWE_stdlib_mnorm   0000000000BF00BC  stdlib_ieeeck             136  stdlib_linalg_lapack_aux.F90
MWE_stdlib_mnorm   0000000000C22DDF  stdlib_ilaenv            1369  stdlib_linalg_lapack_aux.F90
MWE_stdlib_mnorm   00000000040422C5  stdlib_dlasq2             725  stdlib_lapack_svd_bidiag_qr.f90
MWE_stdlib_mnorm   00000000040168FD  stdlib_dlasq1             170  stdlib_lapack_svd_bidiag_qr.f90
MWE_stdlib_mnorm   0000000002E9A13B  stdlib_dbdsqr             562  stdlib_lapack_eigv_svd_drivers3.f90
MWE_stdlib_mnorm   0000000004B24626  stdlib_dlasdq            3547  stdlib_lapack_eigv_svd_bidiag_dc.f90
MWE_stdlib_mnorm   0000000002F4C703  stdlib_dbdsdc            2238  stdlib_lapack_eigv_svd_drivers3.f90
MWE_stdlib_mnorm   00000000005F3BCE  stdlib_dgesdd            1516  stdlib_lapack_eigv_svd_drivers2.f90
MWE_stdlib_mnorm   00000000004E11A5  stdlib_linalg_svd         488  stdlib_linalg_svd.f90
MWE_stdlib_mnorm   00000000004C4A91  stdlib_linalg_svd         320  stdlib_linalg_svd.f90
MWE_stdlib_mnorm   000000000049A926  main                       11  main.f90
MWE_stdlib_mnorm   000000000040E339  Unknown               Unknown  Unknown
libc.so.6          00007F70F79A8D90  Unknown               Unknown  Unknown
libc.so.6          00007F70F79A8E40  __libc_start_main     Unknown  Unknown
MWE_stdlib_mnorm   000000000040E236  Unknown               Unknown  Unknown
Aborted
<ERROR> Execution for object " MWE_stdlib_mnorm " returned exit code  134
<ERROR> *cmd_run*:stopping due to failed executions
STOP 134

Here are the first few lines of stdlib_ieeeck:

 pure integer(ilp) function stdlib_ieeeck( ispec, zero, one )
 !! IEEECK is called from the ILAENV to verify that Infinity and
 !! possibly NaN arithmetic is safe (i.e. will not trap).
    ! -- lapack auxiliary routine --
    ! -- lapack is a software package provided by univ. of tennessee,    --
    ! -- univ. of california berkeley, univ. of colorado denver and nag ltd..--
       ! Scalar Arguments 
       integer(ilp), intent(in) :: ispec
       real(sp), intent(in) :: one, zero
    ! =====================================================================
       ! Local Scalars 
       real(sp) :: nan1, nan2, nan3, nan4, nan5, nan6, neginf, negzro, newzro, posinf
       ! Executable Statements 
       stdlib_ieeeck = 1
       posinf = one / zero

The error seems to originate from this last line with the division by zero being trapped when compiling with -ffpe-trap=zero. This seems to be a recuring cause of concerns when working in debug mode (see here or here for instance). It probably is more of a lapack issue than stdlib per say but even so, if any one knows a work-around that'd be great. Maybe a known issues page could be added to the stdlib website where this error is being referenced for everyone to be aware.

Expected Behaviour

It should just work I guess.

Version of stdlib

0.7

Platform and Architecture

Ubuntu 22.04.5 LTS

Additional Information

Compilers tested:

  • gfortran 14.2.0
  • ifx 2025.0.4 (20241205)

Some additional observations:

  • If instead of a random matrix, I instantiate the matrix as A = eye(n, mold=1.0_dp), the whole thing works correctly.
  • If I do call svd(A, S, U, Vt), the whole thing works correctly.

Ping @perazz, @jvdp1, @jalvesz

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions