Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document that it is not save to reuse SCF callbacks #504

Open
umbriquse opened this issue Jul 26, 2021 · 20 comments
Open

Document that it is not save to reuse SCF callbacks #504

umbriquse opened this issue Jul 26, 2021 · 20 comments
Labels
documentation Correcting or expanding documentation good first issue Good for newcomers hacktoberfest A good beginners issue for the hacktoberfest

Comments

@umbriquse
Copy link

I wrote an optimizer to predict bonding energies, but I am currently running into issues with the Eigensolver not converging in the self_consistent_field function. I'm curious what could be causing this issue.

@antoine-levitt
Copy link
Member

Did you request super small tolerances? Can you post a MWE?

@umbriquse
Copy link
Author

No, the tol is at it's default value. I might be able to, but the program is quite large and fractured. Would that mean that the issue would most likely lie in the setup of the scf? Or could it be in the PlaneWaveBasis?

@umbriquse
Copy link
Author

I used to have this issue when I was recycling the electron density information for the next calculation. The fix for that was to stop recycling the electron density.

@umbriquse
Copy link
Author

The first couple solved scf's do not experience this issue. Only after a few solved scf iterations does this start to happen.

@mfherbst
Copy link
Member

Could it be that you are running a very large calculation? How many electrons and bands do you have?

It could be that our diagtol heuristics is simply very bad for large problems and requests a tolerance that cannot be delivered by the LOBPCG in 200 iterations. We probably need to adapt the heuristics for such cases ... I remember looking at that in QE and finding out that they do some more tricks. We should probably test our heuristics some more on these cases.

@umbriquse
Copy link
Author

Number of electrons: 34
Number of bands = 41

@umbriquse
Copy link
Author

issue.txt

The above file is a MWE. The file has 1281 lines, so if you have any questions just ask.
The functions in the file are a part of a package I am developing called Dino.

You should be able to copy and paste this code into a julia file and run it.

It takes a few iterations in either the lattice optimization, or the atom optimization for the Eigensolver to not converge.

NOTE: This file does create files for saved data.

@umbriquse
Copy link
Author

umbriquse commented Jul 27, 2021

I found was was happening. The tolerance on diagonalize_all_kblocks kept shrinking after each completed scf in the optimizer. Any recommendations for a minimum tolerance for ScfDiagtol()?
I believe the issue lies in the line
info.n_iter == 2 && (diagtol_max /= 5) # Enforce more accurate Bloch wave line in scf_callbacks.jl.
The tolerance will continue to drop by a factor of 5 even past the 100eps(real(eltype(info.ρin))) value.

EDITS: Added more necessary information

@umbriquse
Copy link
Author

After some more tests I found that the clamp function isn't working as expected

@umbriquse
Copy link
Author

I'm not sure how, but after a hacky fix the first iteration could still drop below the allowed minimum value. This suggests that some information is being passed from one self_consistent_field call to the next.

@antoine-levitt
Copy link
Member

It looks like you are calling ScfDiagtol, saving it and then reusing it for several calls to self_consistent_field(). That's indeed going to do bad things, and not expected at all by our code (and I'm not sure we could solve that elegantly). Just use a new ScfDiagtol struct for each call to self_consistent_field. Looking at your code, I see that you specify explicitly the kwargs to self_consistent_field, why are you doing this?

@antoine-levitt
Copy link
Member

To be clear: ScfDiagtol is a closure, ie it stores some information implicitly by capturing its kwarg in the inner function

@mfherbst
Copy link
Member

mfherbst commented Jul 28, 2021

Actually not only ScfDiagtol is a closure, also some of the other callbacks are. It is generally not save to reuse them. Rather you should reuse the arguments to construct them.

One way we could avoid this problem is by converting them to proper structs and implementing a reset mechanism, but I'm not too sure if that not overcomplicates things.

edit: In your example this concerns in particular the callback, determine_diagtol and is_converged fields of the arguments datastructure.

@mfherbst
Copy link
Member

I think we should add a note to the documentation that makes this statefulness of the callbacks more clear.

@mfherbst mfherbst added the documentation Correcting or expanding documentation label Jul 28, 2021
@mfherbst mfherbst changed the title Eigensolver Not Converged Document that it is not save to reuse SCF callbacks Jul 28, 2021
@mfherbst mfherbst added the good first issue Good for newcomers label Jul 28, 2021
@umbriquse
Copy link
Author

The reason as to why I specify the kwargs explicitly is because the end user of this package wouldn't have direct access to the self_consistent_field function.
I didn't know about closure functions, but they make sense. Sorry about the confusion.
I've since made the necessary changes and it works properly now.

@antoine-levitt
Copy link
Member

I don't know what you want to do but you can do eg f(scf_args) = (...; self_consistent_field(basis; scf_args...)); f((;tol=1e-4))

@antoine-levitt
Copy link
Member

Also the merge function for named tuples is super useful for this kind of things

@mfherbst
Copy link
Member

But that causes trouble for the diagtol mechanism for example. There you need to go to the next level of indirection (i.e. cache the arguments passed to diagtol rather than the diagtol kwarg). I've fallen into that trap as well.

@umbriquse
Copy link
Author

Yeah, @mfherbst . I've gotten to that point as well.

@mfherbst
Copy link
Member

Yeah two people falling into that trap sounds like we should do something about it, so good you raised the issue @umbriquse .

@mfherbst mfherbst added the hacktoberfest A good beginners issue for the hacktoberfest label Oct 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Correcting or expanding documentation good first issue Good for newcomers hacktoberfest A good beginners issue for the hacktoberfest
Projects
None yet
Development

No branches or pull requests

3 participants