Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Freezing two-body Jastrow parameters does not work #2814

Closed
shivupa opened this issue Dec 15, 2020 · 12 comments · Fixed by #3052
Closed

Freezing two-body Jastrow parameters does not work #2814

shivupa opened this issue Dec 15, 2020 · 12 comments · Fixed by #3052
Labels
Milestone

Comments

@shivupa
Copy link
Contributor

shivupa commented Dec 15, 2020

At @anbenali's request, I am opening an issue. Many thanks to @markdewing for helping with the troubleshooting.

Describe the bug
When a optimizing a two-body Jastrow, setting optimize="no" does not work.
When optimize is yes for both parameters, the optimization works.
If optimize="no" comes before optimize="yes", the optimization fails.
If optimize="no" comes after optimize="yes", the optimization works.

To Reproduce
Steps to reproduce the behavior:

  1. release version or git commit hash being built
    a6aaf95d3461e13751f7d38b5dc1ed52ca3618e6-dirty
  2. cmake command
    cmake -DENABLE_MKL=1 ..
  3. full program/test invocation command
    qmcpack Default.qmc.in-wfj.xml
  4. additional steps
    Full input files here

An H2O example with a two-body Jastrow u-u, u-d, d-d = u-u with two parameters each (starting at 0.0, but this doesn't matter):

    <jastrow name="J2" type="Two-Body" function="Bspline" print="yes">
      <correlation rcut="10" size="2" speciesA="u" speciesB="u">
        <coefficients id="uu" type="Array" optimize="no" >0.0 0.0 </coefficients>
      </correlation>
      <correlation rcut="10" size="2" speciesA="u" speciesB="d">
        <coefficients id="ud" type="Array" optimize="yes" >0.0 0.0 </coefficients>
      </correlation>
    </jastrow>

Results in (after several opt cycles s014.opt.xml):

    <jastrow name="J2" type="Two-Body" function="Bspline" print="yes">
      <correlation rcut="10" size="2" speciesA="u" speciesB="u">
        <coefficients id="uu" type="Array" optimize="no">0.0 0.0 </coefficients>
      </correlation>
      <correlation rcut="10" size="2" speciesA="u" speciesB="d">
        <coefficients id="ud" type="Array" optimize="yes"> 2.251239143 0</coefficients>
      </correlation>
    </jastrow>

Expected behavior
I would expect both the parameters of the second set to be updated. I have tested with more parameters and only the first parameter is only updated.

System:
Laptop Intel i7-7700HQ

Additional context
Following Mark's suggestions, I think I have some idea of where the bug is happening so I'll post what I have found so far.

I think the problem is in src/QMCWaveFunctions/Jastrow/DiffTwoBodyJastrowOrbital.h at line 149, the OffSet variable is pointing to the wrong indices:

    optimize? (uu/ud):     OffSet:
    ---------------------------------------------------------------
    yes/yes                i 0 OffSet[i].first 0 OffSet[i].second 2
                           i 1 OffSet[i].first 2 OffSet[i].second 4
                           i 2 OffSet[i].first 2 OffSet[i].second 4
                           i 3 OffSet[i].first 0 OffSet[i].second 2
    ---------------------------------------------------------------
    no/yes                 i 0 OffSet[i].first 0 OffSet[i].second 2
                           i 1 OffSet[i].first 1 OffSet[i].second 3
                           i 2 OffSet[i].first 1 OffSet[i].second 3
                           i 3 OffSet[i].first 0 OffSet[i].second 2
    ---------------------------------------------------------------
    yes/no                 i 0 OffSet[i].first 0 OffSet[i].second2
                           i 1 OffSet[i].first -1 OffSet[i].second1
                           i 2 OffSet[i].first -1 OffSet[i].second1
                           i 3 OffSet[i].first 0 OffSet[i].second2

The first and third case optimize correctly, but the second case does not. Even for the 1 parameter that is being updated I am not sure if the correct derivatives are being used to update that parameter.

@prckent prckent added the bug label Dec 15, 2020
@prckent
Copy link
Contributor

prckent commented Dec 15, 2020

Thanks for reporting what is clearly a bug, and likely a very longstanding one. If you have noticed issues with any other Jastrow component that would be good to know. When we know the error in this 2-body term we will have to check all the other Jastrow components in case the same error pattern is repeated.

@shivupa
Copy link
Contributor Author

shivupa commented Dec 15, 2020

Yeah I tried some things with the 1-body Jastrow, which is summarized below. I haven't tried anything with a 3-body Jastrow

    <jastrow name="J1" type="One-Body" function="Bspline" source="ion0" print="yes">
      <correlation rcut="10" size="2" cusp="0" elementType="H">
          <coefficients id="eH" type="Array" optimize="no" >0.0 0.0 </coefficients>
      </correlation>
      <correlation rcut="10" size="2" cusp="0" elementType="O">
        <coefficients id="eO" type="Array" optimize="yes" >0.0 0.0 </coefficients>
      </correlation>
    </jastrow>

This segfaults.


      <correlation rcut="10" size="2" cusp="0" spin="yes" speciesA="H" speciesB="u">
          <coefficients id="uH" type="Array" optimize="no">0.0 0.0 </coefficients>
      </correlation>
      <correlation rcut="10" size="2" cusp="0" spin="yes" speciesA="H" speciesB="d">
          <coefficients id="dH" type="Array" optimize="yes"> 0.0 0.0 </coefficients>
      </correlation>
      <correlation rcut="10" size="2" cusp="0" spin="yes" speciesA="O" speciesB="u">
        <coefficients id="uO" type="Array" optimize="down">0.0 0.0 </coefficients>
      </correlation>
      <correlation rcut="10" size="2" cusp="0" spin="yes" speciesA="O" speciesB="d">
        <coefficients id="dO" type="Array" optimize="yes"> 0.0 0.0 </coefficients>
      </correlation>
    </jastrow>

This seems to work:

      <correlation rcut="10" size="2" cusp="0" spin="yes" speciesA="H" speciesB="u">
          <coefficients id="uH" type="Array" optimize="no">0.0 0.0 </coefficients>
      </correlation>
      <correlation rcut="10" size="2" cusp="0" spin="yes" speciesA="H" speciesB="d">
          <coefficients id="dH" type="Array" optimize="yes"> 0.006273424616 0.04145355415</coefficients>
      </correlation>
      <correlation rcut="10" size="2" cusp="0" spin="yes" speciesA="O" speciesB="u">
        <coefficients id="uO" type="Array" optimize="down">0.0 0.0 </coefficients>
      </correlation>
      <correlation rcut="10" size="2" cusp="0" spin="yes" speciesA="O" speciesB="d">
        <coefficients id="dO" type="Array" optimize="yes"> 0.221016958 0.2694222086</coefficients>
      </correlation>
    </jastrow>

@markdewing
Copy link
Contributor

I looked into the code in DiffTwoBodyJastrow a little bit.
The vector F contains pointers to the pair functions. For up/down there are four combinations, with two unique ones (up/down, down/up), (up/up, down/down). The size of F is four.

The DiffTwoBodyJastrow class has it's set of variational parameters (myVars), and each pair function has a subset for the parameters affecting that pair (F[i]->myVars).

The OffSet variable is the same size as F and contains start/end indices of the parameters for each pair for myVars (the DiffTwoBodyJastrow one). It is used in two ways: to determine whether the pair function contains any active (optimizable) parameters, and to index into the output (dLogPsi, gradLogPsi, etc).

For the first part (determining active variables), the vector rcsingles is set per-parameter (true=recompute). Later, this is aggregated into RecalcSwitch, which determines if any variables for pair function are active, and therefore the function needs to be recomputed.

The OffSet (and rcsingles) variables are set in checkOutVariables. The loop to set OffSet starts with a base offset, varoffset, which is set to the index of the first variational parameter (myVars.Index[0]). Now this is where a (the?) problem occurs. If the first variable is not optimizable, the index is -1. So all the OffSet values get shifted by -1, which they should not be. In the no/yes case of the OffSet values (see table in original report), they are 1 and 3. They should be 2 and 4 (I think the OffSet ranges should be non-overlapping. There are 2 parameters per-pair, so the index ranges should be 0-2 and 2-4).

My guess is the initial offset should be set to the value of the first active variable. As in the following code

int varoffset = -1;
for (int i = 0; i < myVars.size(); i++) {
  varoffset = myVars.Index[i];
  if (varoffset != -1) break;
}

shivupa added a commit to shivupa/qmcpack that referenced this issue Dec 21, 2020
@shivupa
Copy link
Contributor Author

shivupa commented Dec 21, 2020

Thanks for the details. There were some other things that needed updated alongside that change. I think this branch has a fix to the issue.

For the example above:


yes/yes

    <jastrow name="J2" type="Two-Body" function="Bspline" print="yes">
      <correlation rcut="10" size="2" speciesA="u" speciesB="u">
        <coefficients id="uu" type="Array" optimize="yes"> -0.1936784994 0.7153175412</coefficients>
      </correlation>
      <correlation rcut="10" size="2" speciesA="u" speciesB="d">
        <coefficients id="ud" type="Array" optimize="yes"> 0.1122678027 1.323002376</coefficients>
      </correlation>
    </jastrow>

yes/no

    <jastrow name="J2" type="Two-Body" function="Bspline" print="yes">
      <correlation rcut="10" size="2" speciesA="u" speciesB="u">
        <coefficients id="uu" type="Array" optimize="yes"> -0.3580566016 0.7863662663</coefficients>
      </correlation>
      <correlation rcut="10" size="2" speciesA="u" speciesB="d">
        <coefficients id="ud" type="Array" optimize="no">0.0 0.0 </coefficients>
      </correlation>
    </jastrow>

no/yes

    <jastrow name="J2" type="Two-Body" function="Bspline" print="yes">
      <correlation rcut="10" size="2" speciesA="u" speciesB="u">
        <coefficients id="uu" type="Array" optimize="no">0.0 0.0 </coefficients>
      </correlation>
      <correlation rcut="10" size="2" speciesA="u" speciesB="d">
        <coefficients id="ud" type="Array" optimize="yes"> 0.06077836959 1.434312003</coefficients>
      </correlation>
    </jastrow>

I'm not sure what should be done about expanding tests (I also haven't ran the tests for these changes to check if anything breaks), but I wanted to put the changes on a branch in case it helps.

@prckent
Copy link
Contributor

prckent commented Jan 1, 2021

@shivupa Thanks for working on the fix. If you are happy with your changes please can you make a PR with them?. We can discuss testing in the PR, but since adding tests to catch this problem will take some non-trivial code, we might merge the fix ahead of any tests. We really need tests that try a bunch of combinations and verify that parameter sets either change or are held fixed as expected. This will take new code, and my preference is that we don't hold up a fix that is easily to understand.

shivupa added a commit to shivupa/qmcpack that referenced this issue Jan 1, 2021
shivupa added a commit to shivupa/qmcpack that referenced this issue Jan 1, 2021
@anbenali
Copy link
Contributor

anbenali commented Mar 19, 2021

I am bringing this back again as we really need to work for a project with Ken Jordan and we cannot use Jastrows due to the bug from the builder.
@prckent @markdewing @shivupa : Shiv has offered to help debug this but significant guidance will be needed.

@prckent prckent added this to the v3.11.0 milestone Mar 22, 2021
@prckent
Copy link
Contributor

prckent commented Mar 22, 2021

Adding to the milestone for v3.11.0 release. This is probably a >10 year old bug existing since the first implementation in QMCPACK, but if not too diabolical we should get it fixed since it is holding up science (it should be simple, famous last words).

@markdewing
Copy link
Contributor

Some first steps:

  1. Document some of the code better - what it should do, and how the pieces fit into that
  2. Write a unit test that exposes the failure
  3. Fix it
  4. See if that fix works in examples

markdewing added a commit to markdewing/qmcpack that referenced this issue Mar 23, 2021
Addresses QMCPACK#2814 (at least partially)

Add a unit test that will fail in one case where the offsets are not
computed correctly.

Fix the offset calculation in the case the first function does not have
any optimizable parameters.
markdewing added a commit to markdewing/qmcpack that referenced this issue Mar 24, 2021
Second part for fix of QMCPACK#2814

The myVars in this object contains all the variational parameters
aggregated from all the radial functions (in their myVars), including
the non-active ones.

The OffSet variable maps the indices from the radial function myVars to
myVars in this object.

The output for parameter derivatives should use a contiguous index (i.e.
all the non-active parameters removed).  This is what the Index value in
VariableSet maps (returned by myVars.where).
In the code, index k covers all variables in myVars.  The index kk is
the contiguous index for output.

Some of the arrays are indexed with k, but should be indexed with kk
instead.
@markdewing
Copy link
Contributor

Some updates on how to think about this. There are three levels of variational parameters (called 'variables' in the rest of the description).

  1. Global-level variables. These are all the active variables in the wavefunction. This list gets passed in via the active parameter. The Index value refers to the index into this level. Output values like dlogpsi and dhpsioverpsi use this index.
  2. Component-level variables. These are all the variables (active and inactive) in the component (in myVars). Intermediate values like dLogPsi, gradLogPsi, and lapLogPsi use this index.
  3. Subcomponent-level variables. These are the variables in each subcomponent. Contained in F[i]->myVars. The temporary array derivs passed to F[i]->evaluateDerivatives uses this index.

There needs to be a map of indices from subcomponent-level variables to component-level variables, and a map of indices from component-level variables to global-level variables. Currently the OffSet array maps blocks of indices from the subcomponent level to the component level. The mapping from the component level to the global level is performed through the indices of myVars. However, since myVars contains both active and inactive variables, the mapping is not correct.

One solution is to create another block mapping, (OffSet2? I'll find a better name) for the component to global mapping.

markdewing added a commit to markdewing/qmcpack that referenced this issue Mar 26, 2021
Second part of fix for QMCPACK#2814

Construct a second mapping for component-level variational parameters to
global-level variational parameters.

One potentially signficant change is that this maps the parameters as a block
from the component level to the global level.  The previous code could be more
fine-grained.
If there are cases where the subcomponent variables could be a mix of
active and inactive, then this code will need updating.
@ye-luo
Copy link
Contributor

ye-luo commented Mar 26, 2021

I assume you refer Component to WaveFunctionComponent.
There are actually no Component-level variables in J2. So the component level myVar can be removed. The component-level API can directly connect global-level variables to sub-component level ones. Does it make sense?

@markdewing
Copy link
Contributor

Yes, by Component I am referring to the WaveFunctionComponent.

By component-level variables I mean the aggregation of all the sub-component level variables.
Yes, I think you're right in that the myVars member variable could be removed. It might need to be used locally in checkOutVariables, or some other logic used to create the mappings.

@markdewing
Copy link
Contributor

I think removing the inactive variables from the component-level myVars will fix the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment