Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try improve convergence dlaed4 #655

Merged

Conversation

weslleyspereira
Copy link
Collaborator

@weslleyspereira weslleyspereira commented Mar 15, 2022

Description, and suggested fix from Professor Ren-Cang Li, University of Texas at Arlington

Consider the case of computing the last updated eigenvalue using DLAED4. If C=0, ETA is currently set to

DLTUB (upper bound) - TAU

Chances exist that TAU is close to the desired value DLAM - D(N). In such a case, the current initial guess for ETA would be very close to the upper bound (either RHO or RHO/2). As a result, it may take a lot more iterations to get to DLAM - D(N) eventually.

One idea is to use a Newton step instead, i.e.,

ETA = -W / ( DPSI+DPHI )

This is the same initial step adopted when W*ETA.GT.ZERO.

@weslleyspereira weslleyspereira marked this pull request as draft March 15, 2022 22:02
@weslleyspereira weslleyspereira marked this pull request as ready for review March 15, 2022 22:05
@langou langou merged commit 96bd210 into Reference-LAPACK:master Oct 12, 2022
@langou
Copy link
Contributor

langou commented Oct 12, 2022

We are merging this issue. Quick update before the merge. This was a bug report from MathWorks, Intel looked at it, then Prof. Ren-Cang Li proposed a fix, then Intel confirmed that the fix fixed the issue. So we consider this as a legit fix, and we are merging. I hope we can write more on this thread to give more details. Stay tuned.

@langou
Copy link
Contributor

langou commented Oct 16, 2022

For the record, I am adding an explanation of the fix from Prof Ren-Cang Li. This was done over email, so I am copy-pasting his email.

Hi All,

I looked at the data. The trouble happens at computing the last eigenvalue which is usually the easiest. I cannot repeat the problem because I have no access to the machine and the compiler as mentioned in one of the previous emails.

In what follows, I will offer a rough analysis and propose a quick fix.

First D = O(1), Z is normal between 1 and 1e-16, Rho=O(1). So scaling is not necessary. What is special about the example is

D(N-1) approx D(N) to 14 sig. decimal digits
Z(N-1) = O(10^-14), Z(N)=O(10^-15)

All suggest that the sought Lambda(N) approx D(N), and both

Delta(N-1)=D(N-1)- Lambda(N),   Delta(N)=D(N)- Lambda(N)

are tiny at the end. Initially Delta(N-1:N) are even tinier. The question is whether they are so tiny that could cause underflows in

     C = W - DELTA( N-1 )*DPSI - DELTA( N )*DPHI 
     A = ( DELTA( N-1 )+DELTA( N ) )*W -
 $       DELTA( N-1 )*DELTA( N )*( DPSI+DPHI )
     B = DELTA( N-1 )*DELTA( N )*W

to yield 0.0 for all computed A, B, C. The chance is so rare. From these expressions,
there is a chance for B to underflow to 0 because both Delta(N-1:N) are tiny, but it would require roughly both about O(sqrt(10^-308)). My calculation gives me initially

Delta(N-1:N), W : -2.2204e-14, -3.0408e-28, -3.0943e-01

These numbers won't underflow B to 0. As to why A, C are computed to 0, my guess is due to accidental cancellations. As Jim pointed out, the situation is "unstable". It would be interesting to see what these numbers (in the expressions for A, B,C) are before the problem happens.

Now a quick fix. Currently, if C=0, ETA is set to

    UB (upper bound) - Tau (current approx to D(N)- Lambda(N))

which essentially sets the initial approximation to the UB which is far from D(N)- Lambda(N). As a result, it takes a lot more iterations to get to D(N)- Lambda(N) eventually. My proposal is to use a NEWTON step instead, i.e.,

ETA = -W / ( DPSI+DPHI )

(See line 266 of attached delad4.f). It should fix the problem. Let us know how it goes.

best, rencang

@julielangou julielangou added this to the LAPACK 3.10.1 milestone Nov 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants