
Adaptive-depth anderson #939

Open · mfherbst wants to merge 9 commits into master
Conversation

@mfherbst (Member) commented Dec 27, 2023

Supersedes #719.

Todo:

  • Some testing
  • Unit tests

@mfherbst mentioned this pull request Dec 27, 2023
@mfherbst marked this pull request as ready for review December 27, 2023 19:06
@mfherbst (Member Author)

From a few experiments, just limiting the window size via adaptive-depth Anderson is rarely a good idea, as the window sizes obtained can become too large to be practical. So we definitely need a maximal m in any case; the adaptive-depth algorithm is only a bonus when smaller window sizes also make sense. The condition-number check is in principle obsolete (adaptive depth achieves the same thing), but it probably does not hurt either.

For now I've left m=10 to upper-bound the memory required for handling the history, but we might want to increase that in the future.
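To make the combination concrete, here is a minimal, hypothetical sketch (not the PR's actual code; the names `prune_history`, `mmax` and `droptol` are made up, and the exact adaptive-depth criterion used in the PR may differ) of a hard cap on the history size combined with dropping entries whose residuals are comparatively large:

```julia
using LinearAlgebra

# Hypothetical sketch: hard cap mmax plus an adaptive-depth style criterion.
# xs are the stored iterates, fs the corresponding residuals (newest last).
function prune_history(xs::Vector, fs::Vector; mmax=10, droptol=1e4)
    # Hard cap: keep at most the mmax most recent entries.
    if length(xs) > mmax
        xs = xs[end-mmax+1:end]
        fs = fs[end-mmax+1:end]
    end

    # Adaptive depth: additionally drop entries whose residual norm is much
    # larger than the current one (one common formulation of the criterion).
    fnorm = norm(fs[end])
    keep  = [norm(f) ≤ droptol * fnorm for f in fs]
    keep[end] = true  # always keep the most recent entry
    xs[keep], fs[keep]
end
```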

@antoine-levitt (Member)

Yeah, I think mmax is always going to make sense. I have to reread the paper, but I think I remember that it's just an alternative way to monitor the condition number (both essentially measure the size of the extrapolation coefficients). I think the version of the code in master is basically sound. @msdupuy might know better.

One thing I wanted to explore was to annotate each residual vector with an error bar, coming additively both from the approximate eigensolve (which we can try to estimate from the eigenvalue residuals) and from some estimate of the nonlinear effects (which we could perhaps model as an isotropic quadratic), and to take that into account in the least squares. I never got around to it, but do ping me if you're interested in exploring this kind of thing further.
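As a purely illustrative sketch of one way such error bars could enter the least squares (not necessarily what is meant above; the function name and the Tikhonov-style weighting are assumptions), one could penalise the coefficients attached to history entries with large estimated errors:

```julia
using LinearAlgebra

# Hypothetical sketch: solve  min ‖fn + M*β‖² + Σᵢ σᵢ² βᵢ²  where M holds the
# residual differences as columns, fn is the current residual and σ[i] is an
# estimated error bar for column i. Columns with large σᵢ get downweighted.
function weighted_anderson_coefficients(M::AbstractMatrix, fn::AbstractVector,
                                        σ::AbstractVector)
    A = [M; diagm(σ)]              # append regularisation rows
    b = [-fn; zeros(length(σ))]
    A \ b                          # least-squares solve for β
end
```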

@mfherbst (Member Author)

> I think the version of the code in master is basically sound.

So you suggest we don't do this PR at all? What I like about it is that it catches the issues before computing the considerably more costly dot products, and overall it seems to do a better job at shrinking the history when needed (from the few experiments I did).

> One thing I wanted to explore was to annotate each residual vector with an error bar, coming additively both from the approximate eigensolve (which we can try to estimate from the eigenvalue residuals) and from some estimate of the nonlinear effects (which we could perhaps model as an isotropic quadratic), and to take that into account in the least squares. I never got around to it, but do ping me if you're interested in exploring this kind of thing further.

Hmm. Are you sure that would be worth it, given that it will probably turn out to be quite crude (especially closer to convergence)? Or was your thinking to employ this only for the initial phase, to avoid Anderson putting too much trust in a potentially sketchy result? For that it could be useful, given that the bounds are not completely random. How do you want to get the insight on the isotropic quadratic, just using the points the SCF follows anyway? Happy to discuss at some point.

@mfherbst (Member Author) commented Jan 10, 2024

@antoine-levitt Thoughts on the above?

# We need to solve 0 = M' Pfxₙ + M'M βs <=> βs = - (M'M)⁻¹ M' Pfxₙ

# Ensure the condition number of M stays below maxcond, else prune the history
# TODO This is too be tested, but in theory the adaptive-depth DIIS mechanism
Review comment (Member): typo

# we implement, should ensure the condition number to stay bounded as well.
Review comment (Member): no comma

Mfac = qr(M)
while size(M, 2) > 1 && cond(Mfac.R) > anderson.maxcond
M = M[:, 2:end] # Drop oldest entry in history
Review comment (Member): view?
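As an illustration of the "view?" suggestion (hypothetical, standalone code rather than the PR's actual implementation), the pruning loop could track the first retained column and work on a view instead of copying M on every drop:

```julia
using LinearAlgebra

# Hypothetical sketch of the condition-number pruning using a view.
function prune_by_condition(M::AbstractMatrix, maxcond)
    first_kept = 1
    Mv = @view M[:, first_kept:end]
    while size(Mv, 2) > 1 && cond(qr(Mv).R) > maxcond
        first_kept += 1                     # drop oldest entry in history
        Mv = @view M[:, first_kept:end]
    end
    Mv
end
```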

@antoine-levitt (Member)

Merge the two mechanisms? I.e. drop the vectors with the biggest errors (instead of the oldest ones) until the conditioning decreases below the acceptable threshold. For safety, don't drop, say, the last iterate and the one before.
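A hypothetical sketch of that merged mechanism (made-up names; `errors[i]` is assumed to be the residual norm attached to column i of M, newest last):

```julia
using LinearAlgebra

# Hypothetical sketch: drop the columns with the largest error estimates until
# the conditioning is acceptable, but never drop the nprotect newest columns.
function prune_by_error(M::AbstractMatrix, errors::AbstractVector, maxcond;
                        nprotect=2)
    keep = collect(1:size(M, 2))
    while length(keep) > nprotect && cond(qr(M[:, keep]).R) > maxcond
        droppable = keep[1:end-nprotect]            # protect the newest columns
        worst     = droppable[argmax(errors[droppable])]
        keep      = filter(!=(worst), keep)
    end
    keep   # indices of the retained history entries
end
```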

@antoine-levitt (Member)

> What I like about it is that it catches the issues before computing the considerably more costly dot products

Are they really expensive?

What I liked about the current version is its simplicity, but if dropping based on residual norm works better, then sure, go ahead.

Also, it might be better to monitor the norm of the extrapolation coefficients rather than the conditioning. I don't know, there are so many possibilities that it scares me.
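For concreteness, a hypothetical sketch of that alternative criterion (names made up; β is obtained from a least-squares solve, matching the relation in the code comment earlier in the diff):

```julia
using LinearAlgebra

# Hypothetical sketch: prune oldest entries while the extrapolation
# coefficients β (least-squares solution of M*β ≈ -fn) remain too large.
function prune_by_coefficient_norm(M::AbstractMatrix, fn::AbstractVector,
                                   maxnorm=1e3)
    β = -(M \ fn)
    while size(M, 2) > 1 && norm(β) > maxnorm
        M = M[:, 2:end]        # drop oldest entry in history
        β = -(M \ fn)
    end
    M, β
end
```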

@antoine-levitt (Member)

> Hmm. Are you sure that would be worth it, given that it will probably turn out to be quite crude (especially closer to convergence)? Or was your thinking to employ this only for the initial phase, to avoid Anderson putting too much trust in a potentially sketchy result? For that it could be useful, given that the bounds are not completely random. How do you want to get the insight on the isotropic quadratic, just using the points the SCF follows anyway? Happy to discuss at some point.

The point is to have something a bit more principled than the binary keep/discard mechanism; instead of having a weight of 1 (keep) or 0 (discard), have something in between. But it's pretty hazy how to do this properly, so more research is needed (tm).
