Fix create_endstates_from_real_systems #1050

zhang-ivy · 2022-06-20T16:00:01Z

Description

This PR fixes create_endstates_from_real_systems(), which was not creating the unsampled endstates properly. The global parameters in the valence forces (of the unsampled endstate systems generated by create_endstates_from_real_systems()) were not being set according to the appropriate endstate. They were always defaulting to the lambda = 0 endstate, meaning that the unsampled endstate at lambda = 1 had a hybrid system with valence terms set for the lambda = 0 endstate. I've introduced a fix that sets the default values for global parameters in the valence forces according to the right endstate. The fix is based on what is done in create_endstates() here.

This bug was likely not caught by the existing tests because the valence terms are pretty similar for the lambda = 0 and 1 systems in ala dipeptide and barstar. Therefore, I've included one of the tyk2 transformations with large error bars (from which this bug was originally discovered) as a test -- this additional test case should cover transformations where the valence terms are quite different at the lambda = 0 vs 1 endstates.

Motivation and context

Resolves #1041

How has this been tested?

Added tyk2 test in test_relative.py -- the transformation tested here is one that had large error bars as noted in #1041

Change log

Bug fix in `create_endstates_from_real_systems()` -- fixed by setting the global parameters for valence forces to the appropriate endstate. Also added a tyk2 transformation test.

zhang-ivy · 2022-06-20T16:03:36Z

@ijpulidos : Could you try re-running a couple of the problematic tyk2 benchmark transformations (with large error bars) with this branch? I just want to confirm that the error bars are no longer large before we merge this and get the bug fix release out.

codecov · 2022-06-20T16:19:45Z

Codecov Report

Merging #1050 (c7deb59) into main (182bf16) will decrease coverage by 1.31%.
The diff coverage is 33.33%.

zhang-ivy · 2022-06-20T17:04:53Z

@ijpulidos @mikemhenry : The openmm nightly tests are failing with this error:

Error: Codecov failed with the following error: The process '/usr/bin/bash' failed with exit code 1

Any idea why? They're not failing in #1046

I already tried re-running one of the failed jobs, and it's still failing.

ijpulidos · 2022-06-21T14:52:01Z

@zhang-ivy That bash error is just codecov failing (it does that sometimes). But there's another error when testing energies for REST system here https://github.com/choderalab/perses/runs/6971007912?check_suite_focus=true#step:9:440

zhang-ivy · 2022-06-21T14:56:23Z

@ijpulidos : That error does pop up quite a bit. I have it on my to-do list to figure out if there's a way to prevent that test from failing so often. For now, I just re-ran both of the failed tests again.

ijpulidos · 2022-06-21T20:04:35Z

Yes, we do need to find a better way to avoid these spurious test failures. But for now this seems okay.

ijpulidos

Looks good, just some comments but nothing that's blocking. Great work! I'll run the tyk2 benchmarks using this branch to see if we can recover the low error bars behavior.

ijpulidos · 2022-06-21T20:08:31Z

perses/tests/test_relative.py

+        def concatenate_files(input_files, output_file):
+            """
+            Concatenate files given in input_files iterator into output_file.
+            """
+            with open(output_file, 'w') as outfile:
+                for filename in input_files:
+                    with open(filename) as infile:
+                        for line in infile:
+                            outfile.write(line)


I see this is the same concatenate_files from the run_benchmarks.py code. Seems like we want this to be in perses.utils.concatenate_files and avoid redundant code between both. We would just import it both here and in the benchmarks script.

Actually, now that I've read the whole test, I think what we really want is to turn the benchmarks code into an actual module in perses.benchmarks or similar. This can be left for a future release (the next 0.11.0 release sounds like a good option for it). I'll raise an issue for this under the 0.11.0 milestone.

Yes, I agree, raising an issue sounds good -- thanks!

ijpulidos · 2022-06-21T20:21:46Z

perses/tests/test_relative.py

+
+    # Tyk2 -- Run point and MD energy validation tests
+    run_unsampled_endstate_energies('tyk2', use_point_energies=True, use_md_energies=True)


Just a comment, nothing blocking. Sounds like a good test to parametrize in the future, I'll add that to our test cleanup issue.
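For the test cleanup issue, a minimal sketch of what the parametrized version could look like with `pytest.mark.parametrize`. Assumptions: `run_unsampled_endstate_energies` below is a hypothetical stand-in for the real helper in test_relative.py (it only records its arguments), the name of its first argument is a guess, and the `'ala-dipeptide'` and `'barstar'` identifiers are illustrative placeholders; only `'tyk2'` is taken from this PR.

```python
import pytest

# Records invocations so the sketch can run without perses installed.
CALLS = []


def run_unsampled_endstate_energies(test_name, use_point_energies=True, use_md_energies=True):
    """Hypothetical stand-in for the real helper in test_relative.py."""
    CALLS.append((test_name, use_point_energies, use_md_energies))


# One test case per system; the first two names are placeholders (assumption).
@pytest.mark.parametrize("test_name", ["ala-dipeptide", "barstar", "tyk2"])
def test_unsampled_endstate_energies(test_name):
    # Run point and MD energy validation tests for the given system
    run_unsampled_endstate_energies(test_name, use_point_energies=True, use_md_energies=True)
```

With this layout each system would show up as its own test case in the report, so a failure in one transformation would not hide the results of the others.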

ijpulidos · 2022-06-21T20:27:13Z

perses/dispersed/utils.py

+            # Set defaults for global parameters depending on the factory
+            htf_class = htf.__class__.__name__
+            for force_index, force in enumerate(list(hybrid_system.getForces())):
+                if hasattr(force, 'getNumGlobalParameters'): # Only custom forces will have global parameters to set
+                    for parameter_index in range(force.getNumGlobalParameters()):
+                        global_parameter_name = force.getGlobalParameterName(parameter_index)
+                        if global_parameter_name[0:7] == 'lambda_':
+                            if htf_class == 'HybridTopologyFactory':
+                                force.setGlobalParameterDefaultValue(parameter_index, lambda_val)
+                            elif htf_class == 'RESTCapableHybridTopologyFactory':
+                                if 'old' in global_parameter_name:
+                                    force.setGlobalParameterDefaultValue(parameter_index, 1 - lambda_val)
+                                elif 'new' in global_parameter_name:
+                                    force.setGlobalParameterDefaultValue(parameter_index, lambda_val)


Just wondering if this would be something that the actual HTF objects should do on their side, rather than having all the logic and conditionals here. Or is this something that's only accessible or only makes sense when creating the endstates? (Non-blocking)

Yes, setGlobalParameterDefaultValue() can be called on the HTF side.

Let me walk through what the code above does and try to write a helper function that would simplify the code. We are creating two hybrid systems: one for the lambda = 0 endstate and one for the lambda = 1 endstate. Therefore, the hybrid systems should have different default values (for lambda = 0, the default values will be 0 and for lambda = 1, the default values will be 1). However, the challenge is that the global parameters for the different factories work differently.

For the HybridTopologyFactory, we want lambda_* to be 0 for lambda = 0 and 1 for lambda = 1.

However, for RESTCapableHybridTopologyFactory, we decided that:

for lambda = 0, lambda_*_old should be 1 and lambda_*_new should be 0.

for lambda = 1, lambda_*_old should be 0 and lambda_*_new should be 1.
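To make the rules above concrete, here is a minimal sketch of the mapping as a plain Python function, with no OpenMM dependency. The function name `endstate_default` and the example parameter names are hypothetical and only for illustration; the branching mirrors the conditionals in the diff.

```python
def endstate_default(parameter_name, htf_class, lambda_val):
    """Return the default value a global parameter should take at the
    endstate `lambda_val` (0 or 1), or None if it should be left untouched.

    Hypothetical helper for illustration; not part of perses.
    """
    # Only parameters prefixed with 'lambda_' are endstate-dependent
    if not parameter_name.startswith("lambda_"):
        return None
    if htf_class == "HybridTopologyFactory":
        # lambda_* is 0 at lambda = 0 and 1 at lambda = 1
        return lambda_val
    if htf_class == "RESTCapableHybridTopologyFactory":
        if "old" in parameter_name:
            # lambda_*_old is 1 at lambda = 0 and 0 at lambda = 1
            return 1 - lambda_val
        if "new" in parameter_name:
            # lambda_*_new is 0 at lambda = 0 and 1 at lambda = 1
            return lambda_val
    return None


if __name__ == "__main__":
    # Example parameter names are illustrative assumptions
    for name in ("lambda_alchemical_bonds_old", "lambda_alchemical_bonds_new"):
        for lambda_val in (0, 1):
            value = endstate_default(name, "RESTCapableHybridTopologyFactory", lambda_val)
            print(name, lambda_val, value)
```
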

Here are suggested helper functions that we could add to the factories to simplify the logic above.

    # Helper function for HybridTopologyFactory
    def set_lambdas_for_endstate(self, endstate):
        for force_index, force in enumerate(list(self.hybrid_system.getForces())):
            if hasattr(force, 'getNumGlobalParameters'):
                for parameter_index in range(force.getNumGlobalParameters()):
                    global_parameter_name = force.getGlobalParameterName(parameter_index)
                    if global_parameter_name[0:7] == 'lambda_':
                        force.setGlobalParameterDefaultValue(parameter_index, endstate)

    # Helper function for RESTCapableHybridTopologyFactory
    def set_lambdas_for_endstate(self, endstate):
        for force_index, force in enumerate(list(self.hybrid_system.getForces())):
            if hasattr(force, 'getNumGlobalParameters'):
                for parameter_index in range(force.getNumGlobalParameters()):
                    global_parameter_name = force.getGlobalParameterName(parameter_index)
                    if global_parameter_name[0:7] == 'lambda_':
                        if 'old' in global_parameter_name:
                            force.setGlobalParameterDefaultValue(parameter_index, 1 - endstate)
                        elif 'new' in global_parameter_name:
                            force.setGlobalParameterDefaultValue(parameter_index, endstate)

which would simplify the above code to:

    # Set defaults for global parameters depending on the factory
    htf_class = htf.__class__.__name__
    htf.set_lambdas_for_endstate(lambda_val)

@ijpulidos : Let me know if you think this would be better. It would certainly simplify the code in the test, but it would mean that we are adding functions that may never be used again, so I'm not sure if it's worth it. I'm on the fence, so we can just go with whatever you think is best. If you think it's best to refactor with the helper functions, could you add a commit to this PR with these changes?

I have been thinking about this and I think the best approach is probably just to leave it like this and refactor to a more factory-like pattern for the next 0.11.0 release, hopefully without breaking any API compatibility. I can already see many things that can be improved in the whole structure of the classes and methods, but I don't want to make such big changes right now. This is good enough for the current state.

ijpulidos · 2022-06-22T17:19:53Z

I ran the benchmarks again with the changes in this branch and we have now recovered the low error bars. Funnily enough, that also means we have "recovered" that outlier on the right for the relative FE calculations.

zhang-ivy added 3 commits June 20, 2022 11:46

fix creation of unsampled endstates

acae605

update tests

264bb75

remove comments

c7deb59

zhang-ivy requested a review from ijpulidos June 20, 2022 16:00

zhang-ivy added this to the 0.10.1 - Bugfix release milestone Jun 20, 2022

ijpulidos approved these changes Jun 21, 2022


ijpulidos mentioned this pull request Jun 21, 2022

Improvements for automated benchmarking #927

Open

ijpulidos mentioned this pull request Jun 22, 2022

Factory-like pattern for Hybrid Topology Factory (HTF) classes #1051

Open

ijpulidos merged commit 28885ab into main Jun 22, 2022

ijpulidos deleted the fix-create-endstates branch June 22, 2022 17:27
