Fix occasional optimizer failure via improved filter to accept more eigenvalues #4917

markdewing · 2024-02-09T06:07:16Z

Sometimes no eigenvalue is found from the mapping (filter and sort), which leads the code to select the first eigenvalue returned from the solver, which often sends the optimizer parameters in a very undesirable direction.

The results of this situation look something like this qmca output:

                            LocalEnergy               Variance           ratio 
opt1_j2_7_6  series 0  -74.870455 +/- 0.024794   5.933186 +/- 0.535541   0.0792 
opt1_j2_7_6  series 1  -75.001842 +/- 0.037365   7.256994 +/- 2.932719   0.0968 
opt1_j2_7_6  series 2  -74.929538 +/- 0.016134   4.463142 +/- 0.154065   0.0596 
opt1_j2_7_6  series 3  -38087874.983748 +/- 13522761.916354   52176250689832632320.000000 +/- 20073852128999710720.000000   1369891355506.0911 
opt1_j2_7_6  series 4  -35506733.280610 +/- 11037598.694781   51117489304034312192.000000 +/- 17345565406866282496.000000   1439656216753.3770 
opt1_j2_7_6  series 5  -9073700.394155 +/- 3700257.273053   11376395351848120320.000000 +/- 5387829541237716992.000000   1253776833889.7781

The workaround is to detect the situation where no acceptable eigenvalue is found, and run a second filter which accepts eigenvalues above the current VMC energy ( E_0 = H(0,0) ). It cuts off at E_0 + 100, to mirror the lower cutoff.

The solution could be simpler, but the fix was chosen to keep the existing eigenvalue selection unchanged as much as possible.

Some notes on the details of eigenvalue selection:

Why 2.0? I don't know. It was introduced in 4b5b0a3 (2011). Prior to that change, the code searched for the eigenvalue closest to E_0 that falls between E_0 and E_0 - 1e10. After that change, the range was narrowed to E_0 and E_0 - 100, and the code searched for the eigenvalue closest to E_0 - 2.0. My guess is that a slightly lower eigenvalue might lead to faster convergence if there are a number of eigenvalues close to E_0 (??) but that really ought to be tested.

Why E_0 - 100 for the cutoff? I also don't know. My guess is it's to cut off obviously spurious eigenvalues which could arise from noise. Since this is a fixed value, it could cause problems for systems with very large total energies.

What type(s) of changes does this code introduce?

Delete the items that do not apply

Bugfix

Does this introduce a breaking change?

No

What systems has this change been tested on?

desktop

Checklist

Update the following with a yes where the items apply. If you're unsure about any of them, don't hesitate to ask. This is
simply a reminder of what we are going to look for before merging your code.

Yes. This PR is up to date with current the current state of 'develop'
Yes. Code added or changed in the PR has been clang-formatted
No. This PR adds tests to cover any new code, or to catch a bug that is being fixed
No. Documentation has been added (if appropriate)

Sometimes no eigenvalue is found from the mapping (filter and sort), which leads the code to select the first eigenvalue returned from the solver, which often sends the optimizer parameters in a very undesirable direction. The workaround is to detect the situation where no acceptable eigenvalue is found, and run a second filter which accepts eigenvalues above the current VMC energy ( H(0,0) ). The solution could be simpler, but the fix was chosen to keep the existing eigenvalue selection unchanged as much as possible.

prckent · 2024-02-09T23:55:11Z

Test this please

prckent · 2024-02-09T23:55:49Z

Very interesting! Thanks Mark. Can you describe the situation that triggered this?

markdewing · 2024-02-12T16:09:42Z

The context is testing for #4901. The cusp correction depends on a discrete parameter (locations where the second derivative of the correction matches the second derivative the basis function). I had set up some scripts to scan across that parameter, and this problem with the optimizer would show up in a few percent of the cases.

I had seen it before with some orbital rotation optimization, but hadn't tracked down the issue (It doesn't require orbital rotation to be involved - the problem could show up just optimizing J2 and J1, though it was more frequent when also optimizing J3)

ye-luo · 2024-02-12T18:55:34Z

It doesn't seem the case that E_0 = H(0,0) but E_0 = (H * Inv)[0,0] where inv is the inverse of overlap matrix.

markdewing · 2024-02-12T20:55:46Z

For the overlap matrix, the first row and column are the identity matrix, so the inverse also ends up as the identity matrix in the first row and column, and (H * Inv)[0, 0] ends up being the same as H[0,0].

prckent

Thanks Mark.

No issues with this PR since this doesn't change the relevant logic, but I do wonder about handling of excited states in future when more of the wavefunction is optimized.

prckent · 2024-04-29T17:57:17Z

Test this please

prckent · 2024-04-29T20:26:19Z

Test this please

prckent mentioned this pull request Mar 24, 2024

NonLocal ECP NaN with Batched Code #4941

Open

markdewing mentioned this pull request Apr 4, 2024

Linear method eigensolver improvements #3788

Open

Typo + improve message

07799a1

prckent approved these changes Apr 29, 2024

View reviewed changes

prckent changed the title ~~Fix occasional optimizer failure.~~ Fix occasional optimizer failure via improved filter to accept more eigenvalues Apr 29, 2024

Merge branch 'develop' into opt_eigen_filter

7e690c7

prckent enabled auto-merge April 29, 2024 17:57

Merge branch 'develop' into opt_eigen_filter

640bc00

prckent approved these changes Apr 29, 2024

View reviewed changes

prckent merged commit efdf45b into QMCPACK:develop Apr 29, 2024
38 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix occasional optimizer failure via improved filter to accept more eigenvalues #4917

Fix occasional optimizer failure via improved filter to accept more eigenvalues #4917

markdewing commented Feb 9, 2024

prckent commented Feb 9, 2024

prckent commented Feb 9, 2024

markdewing commented Feb 12, 2024

ye-luo commented Feb 12, 2024

markdewing commented Feb 12, 2024

prckent left a comment

prckent commented Apr 29, 2024

prckent commented Apr 29, 2024

Fix occasional optimizer failure via improved filter to accept more eigenvalues #4917

Fix occasional optimizer failure via improved filter to accept more eigenvalues #4917

Conversation

markdewing commented Feb 9, 2024

Some notes on the details of eigenvalue selection:

What type(s) of changes does this code introduce?

Does this introduce a breaking change?

What systems has this change been tested on?

Checklist

prckent commented Feb 9, 2024

prckent commented Feb 9, 2024

markdewing commented Feb 12, 2024

ye-luo commented Feb 12, 2024

markdewing commented Feb 12, 2024

prckent left a comment

Choose a reason for hiding this comment

prckent commented Apr 29, 2024

prckent commented Apr 29, 2024