Try improve convergence dlaed4 #655

weslleyspereira · 2022-03-15T21:56:08Z

Description, and suggested fix from Professor Ren-Cang Li, University of Texas at Arlington

Consider the case of computing the last updated eigenvalue using DLAED4. If C=0, ETA is currently set to

DLTUB (upper bound) - TAU

Chances exist that TAU is close to the desired value DLAM - D(N). In such a case, the current initial guess for ETA would be very close to the upper bound (either RHO or RHO/2). As a result, it may take a lot more iterations to get to DLAM - D(N) eventually.

One idea is to use a Newton step instead, i.e.,

ETA = -W / ( DPSI+DPHI )

This is the same initial step adopted when W*ETA.GT.ZERO.

langou · 2022-10-12T01:39:15Z

We are merging this issue. Quick update before the merge. This was a bug report from MathWorks, Intel looked at it, then Prof. Ren-Cang Li proposed a fix, then Intel confirmed that the fix fixed the issue. So we consider this as a legit fix, and we are merging. I hope we can write more on this thread to give more details. Stay tuned.

langou · 2022-10-16T16:54:25Z

For the record, I am adding an explanation of the fix from Prof Ren-Cang Li. This was done over email, so I am copy-pasting his email.

Hi All,

I looked at the data. The trouble happens at computing the last eigenvalue which is usually the easiest. I cannot repeat the problem because I have no access to the machine and the compiler as mentioned in one of the previous emails.

In what follows, I will offer a rough analysis and propose a quick fix.

First D = O(1), Z is normal between 1 and 1e-16, Rho=O(1). So scaling is not necessary. What is special about the example is
D(N-1) approx D(N) to 14 sig. decimal digits
Z(N-1) = O(10^-14), Z(N)=O(10^-15)
All suggest that the sought Lambda(N) approx D(N), and both
Delta(N-1)=D(N-1)- Lambda(N),   Delta(N)=D(N)- Lambda(N)
are tiny at the end. Initially Delta(N-1:N) are even tinier. The question is whether they are so tiny that could cause underflows in
     C = W - DELTA( N-1 )*DPSI - DELTA( N )*DPHI 
     A = ( DELTA( N-1 )+DELTA( N ) )*W -
 $       DELTA( N-1 )*DELTA( N )*( DPSI+DPHI )
     B = DELTA( N-1 )*DELTA( N )*W
to yield 0.0 for all computed A, B, C. The chance is so rare. From these expressions,
there is a chance for B to underflow to 0 because both Delta(N-1:N) are tiny, but it would require roughly both about O(sqrt(10^-308)). My calculation gives me initially

Delta(N-1:N), W : -2.2204e-14, -3.0408e-28, -3.0943e-01

These numbers won't underflow B to 0. As to why A, C are computed to 0, my guess is due to accidental cancellations. As Jim pointed out, the situation is "unstable". It would be interesting to see what these numbers (in the expressions for A, B,C) are before the problem happens.

Now a quick fix. Currently, if C=0, ETA is set to
    UB (upper bound) - Tau (current approx to D(N)- Lambda(N))
which essentially sets the initial approximation to the UB which is far from D(N)- Lambda(N). As a result, it takes a lot more iterations to get to D(N)- Lambda(N) eventually. My proposal is to use a NEWTON step instead, i.e.,
ETA = -W / ( DPSI+DPHI )
(See line 266 of attached delad4.f). It should fix the problem. Let us know how it goes.

best, rencang

weslleyspereira marked this pull request as draft March 15, 2022 22:02

Improve dlaed4 thanks to Li, Ren-Cang

db0abe5

weslleyspereira force-pushed the try-improve-convergence-dlaed4 branch from e252100 to db0abe5 Compare March 15, 2022 22:05

weslleyspereira marked this pull request as ready for review March 15, 2022 22:05

Improve slaed4 following the same strategy fromdlaed4

5a392d9

langou approved these changes Oct 12, 2022

View reviewed changes

langou merged commit 96bd210 into Reference-LAPACK:master Oct 12, 2022

julielangou added this to the LAPACK 3.10.1 milestone Nov 12, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Try improve convergence dlaed4 #655

Try improve convergence dlaed4 #655

weslleyspereira commented Mar 15, 2022 •

edited by langou

Loading

langou commented Oct 12, 2022

langou commented Oct 16, 2022

Try improve convergence dlaed4 #655

Try improve convergence dlaed4 #655

Conversation

weslleyspereira commented Mar 15, 2022 • edited by langou Loading

langou commented Oct 12, 2022

langou commented Oct 16, 2022

weslleyspereira commented Mar 15, 2022 •

edited by langou

Loading