
Use new QCFractal #280

Merged: 62 commits from pydantic-1-2 into main on Mar 27, 2024
Conversation

mattwthompson (Member)

No description provided.

codecov bot commented Aug 4, 2023

Codecov Report

Merging #280 (328c7da) into main (1f0e549) will decrease coverage by 0.74%.
The diff coverage is 95.04%.


(Three earlier comments from @mattwthompson were marked as outdated and hidden.)

@mattwthompson changed the title from "Test on Pydantic v1 and v2" to "Use new QCFractal" on Jan 25, 2024
@mattwthompson force-pushed the pydantic-1-2 branch 2 times, most recently from d795a80 to 5f60029, on January 26, 2024 at 19:23
@mattwthompson marked this pull request as ready for review on March 22, 2024 at 21:20
@j-wags (Member) left a comment

An update, since this is taking a little while: I did a once-over on the code and it looked fine. However, I ran the quick-start calc 4 times using the old and new stacks and am seeing consistent differences in the resulting parameters. So this post is just an update, and I'm still looking into this.

Here's where I've run them: 2024_03_25_quickstart_test.zip

Here's my analysis result (copied from the notebook in that zip):

import glob

import numpy as np
from matplotlib import pyplot
from openff.toolkit import ForceField
from openff.units import unit

# Load the fitted force field from each run directory.
folders = sorted(glob.glob("./*data*"))
ffs = {folder: ForceField(f"{folder}/acetaminophen.offxml") for folder in folders}

# Pairwise sum of |delta k| over the six bespoke proper torsions.
diffs = np.zeros((len(folders), len(folders))) * unit.kilocalorie_per_mole
for idx1, ff1 in enumerate(ffs.values()):
    for idx2, ff2 in enumerate(ffs.values()):
        tot_diff = 0
        for param1 in ff1["ProperTorsions"][-6:]:
            param2 = ff2["ProperTorsions"][param1.smirks]
            for k1, k2 in zip(param1.k, param2.k):
                tot_diff += abs(k1 - k2)
        diffs[idx1, idx2] = tot_diff

# Heatmap of the pairwise differences, labeled by run directory.
pyplot.imshow(diffs)
pyplot.xticks(range(len(folders)), labels=folders, rotation=75)
pyplot.yticks(range(len(folders)), labels=folders)
pyplot.colorbar()
[Screenshot: heatmap of pairwise torsion k differences across runs, 2024-03-26 12:45 PM]

Review comment on docs/getting-started/installation.md (outdated, resolved)
@j-wags (Member) commented Mar 26, 2024

Ahh, chalk it up to environment contamination. It turns out my new env had OpenEye and the old one didn't. Adding a run where I've removed OpenEye from the new env now gives results that match the old ones.

[Screenshot: updated heatmap including the run without OpenEye, 2024-03-26 1:57 PM]
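For anyone reproducing this, a minimal way to check which toolkits (and hence which charge-assignment backends) an env will actually use; this assumes openff-toolkit's GLOBAL_TOOLKIT_REGISTRY and is just a debugging aid, not part of this PR:

from openff.toolkit.utils.toolkits import GLOBAL_TOOLKIT_REGISTRY

# Toolkits are consulted in registration order, so an env with OpenEye present
# takes a different charge-assignment path than one falling back to RDKit/AmberTools.
print([toolkit.toolkit_name for toolkit in GLOBAL_TOOLKIT_REGISTRY.registered_toolkits])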

@j-wags (Member) left a comment
I'm going to see if I can get codecov happy again before I approve.

Review comment on .github/workflows/CI.yaml (outdated, resolved)
Review comment on docs/getting-started/installation.md (outdated, resolved)
Review comment on docs/getting-started/installation.md (outdated, resolved)
@@ -118,9 +127,9 @@ def from_remote_records(cls, qc_records):
         assert len(record_types) == 1, "records must be the same type"

         conversion_functions = {
-            ResultRecord: cls._result_record_to_atomic_result,
+            BaseRecord: cls._result_record_to_atomic_result,  # maybe should be SinglepointRecord?
Member:
(not blocking) Good question. BespokeFit should only really be dealing with torsiondrives and optimizations so I don't know when a bug here would reach a user. But let's leave the comment for help in future debugging.

mattwthompson (Member, Author):
The corresponding method is not implemented, so the user would have a hard time hitting a bug here at all.
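For context, a minimal sketch of the dispatch being discussed; this is a hypothetical stand-in, not the actual BespokeFit implementation, and conversion_functions mirrors the dict in the diff above:

# Hypothetical sketch of type-based record dispatch; not the real code.
def convert_records(qc_records, conversion_functions):
    record_types = {type(record) for record in qc_records}
    assert len(record_types) == 1, "records must be the same type"

    try:
        convert = conversion_functions[record_types.pop()]
    except KeyError:
        # An unmapped record type fails here, consistent with the point above
        # that this path is effectively unreachable for users.
        raise NotImplementedError("no converter for this record type")

    return [convert(record) for record in qc_records]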

Review comment on openff/bespokefit/tests/conftest.py (outdated, resolved)
Review comment on openff/bespokefit/tests/conftest.py (outdated, resolved)
@jthorton (Contributor) commented Mar 27, 2024

@j-wags thanks for doing that careful testing; it seems like nothing is broken when running the full workflow! What does new_data_w_temp refer to? Its parameters seem slightly different from the new_data runs, which are very consistent.

Edit:
I think we actually need to run a test using the qc-cache, though, as that is the part which interfaces with QCArchive.
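A minimal sketch of what such a test would exercise, assuming the new qcportal PortalClient API that this PR migrates to; the record ID is a placeholder:

from qcportal import PortalClient  # client for the new QCFractal stack

client = PortalClient("https://api.qcarchive.molssi.org")

# Fetch a torsiondrive record by ID; 12345 is a placeholder, not a real record.
record = client.get_torsiondrives(12345)
print(record.status)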

@j-wags (Member) left a comment

Codecov looks good. Merging!

@j-wags (Member) commented Mar 27, 2024

Oh, and @jthorton (sorry to miss your comment, the page didn't cleanly refresh when I opened this today) - new_data_w_temp is the new stack run with BEFLOW_KEEP_TMP_FILES=True so that I could start debugging using the intermediate files. Keeping the intermediate files shouldn't introduce any numerical differences in the fitting process, so I don't think that's the significant differentiator for that run. My guess is that it just hit some variability in charge assignment/the fitting process (as a breadcrumb: I did notice that in some runs the new parameters in the OFFXML are written in different orders - this may affect the order of iteration during FB fits and therefore the optimization trajectory?)

We see a little bit of variation between the old_data*/ runs using the legacy stack as well, and the new_data_no_openeye run closely matches a group of the old... outcomes.

I don't think this is a showstopper for release, since some variability was always there. It would be better if this weren't the case, but since this PR has been open for 7 months now, I'd prefer to handle the variability problem in a separate issue/PR.
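To chase that ordering breadcrumb, a minimal check reusing the ForceField API from the analysis snippet above; the directory names are placeholders for the actual run folders:

from openff.toolkit import ForceField

for path in ("new_data_1/acetaminophen.offxml", "new_data_2/acetaminophen.offxml"):
    ff = ForceField(path)
    # The last six ProperTorsions are the bespoke parameters, as in the analysis above;
    # if their file order differs between runs, FB fits may iterate in a different order.
    print(path, [parameter.smirks for parameter in ff["ProperTorsions"][-6:]])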

@j-wags merged commit d6f9b74 into main Mar 27, 2024
7 checks passed
@j-wags (Member) commented Mar 27, 2024

@jthorton - Since you're currently listed as owner for BespokeFit, and since your previous comments mentioned other questions - would you prefer those be addressed before a new release is cut? My team is over budget for BespokeFit effort this month (and behind on other goals), so we can't put in significant time until later in April. So either:

  • We could cut the release now and handle other stuff in future releases
  • We could hold off on the release and you could take the lead on investigating differences/adding tests
  • We could hold off on the release and my team could take the lead on the other questions in mid/late April

@jthorton (Contributor)

Hey @j-wags, I think we should be fine to cut a release now, as this update should not change the fitting outcomes, and we can look into the differences in April; I'll be able to help with this but not take it on solo.

Your graphs above got me thinking about the possible fitting regression reported in #304, which is proving hard to track down and is the main thing I want to keep looking into.
