Bug hunt fixes #225

IAlibay · 2022-11-11T17:45:20Z

Work done in this PR:

- Up to 0b71e23: Fixes the constraint demapping call (a typo in the assigned order of the map items 😓)
  - If we can follow this up with some actual tests that'd be grand. For now this fully replicates perses.
- 35f37ee: Add virtual bond for imaging, which was missed from the vendored HTF code
  - Removed the "center in box" code, which was both problematic and unnecessary.
- Added minimize call on creation of thermodynamic states
- Added experimental code for CPU fallback when GPU minimisation fails (possibly remove).

To-do:

- XML diff tests of p38 edge and benzene/phenol edge
- Coordinate diff tests

codecov · 2022-11-11T17:59:45Z

Codecov Report

Base: 92.44% // Head: 92.52% // Increases project coverage by +0.08% 🎉

Coverage data is based on head (cd36a0e) compared to base (f1140ea).
Patch coverage: 99.21% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #225      +/-   ##
==========================================
+ Coverage   92.44%   92.52%   +0.08%     
==========================================
  Files          57       57              
  Lines        4170     4270     +100     
==========================================
+ Hits         3855     3951      +96     
- Misses        315      319       +4

| Impacted Files | Coverage Δ | |
|---|---|---|
| openfe/protocols/openmm_rbfe/equil_rbfe_methods.py | `86.47% <85.71%> (-0.33%)` | ⬇️ |
| ...fe/protocols/openmm_rbfe/_rbfe_utils/multistate.py | `60.57% <100.00%> (+4.66%)` | ⬆️ |
| ...otocols/openmm_rbfe/_rbfe_utils/topologyhelpers.py | `94.48% <100.00%> (+0.73%)` | ⬆️ |
| openfe/tests/setup/test_openmm_equil_protocols.py | `100.00% <100.00%> (ø)` | |
| ...enfe/protocols/openmm_rbfe/_rbfe_utils/relative.py | `80.61% <0.00%> (-0.38%)` | ⬇️ |


IAlibay · 2022-11-13T01:14:28Z

openfe/protocols/openmm_rbfe/_rbfe_utils/multistate.py

@@ -115,8 +116,7 @@ class creation of LambdaProtocol.
            context, context_integrator = context_cache.get_context(
                                             compound_thermostate_copy)
            # TODO: move to our own MD tasks
-            # feptasks.minimize(compound_thermodynamic_state_copy,
-            #                   sampler_state)
+            #minimize(compound_thermostate_copy, sampler_state)


The original perses code does the minimization here; however, in my opinion it's a bit problematic to have it hidden in this spot. It would be better for this to be a) controlled, b) easy to spot in the protocol.

IAlibay · 2022-11-14T09:09:32Z

openfe/protocols/openmm_rbfe/_rbfe_utils/multistate.py

+    def _minimize_replica(self, replica_id, tolerance, max_iterations):
+        """Minimize the specified replica.
+        """
+
+        # Retrieve thermodynamic and sampler states.
+        thermodynamic_state_id = self._replica_thermodynamic_states[replica_id]
+        thermodynamic_state = self._thermodynamic_states[thermodynamic_state_id]
+        sampler_state = self._sampler_states[replica_id]
+
+        # Use the FIRE minimizer
+        integrator = FIREMinimizationIntegrator(tolerance=tolerance)
+
+        # Get context and bound integrator from energy_context_cache
+        context, integrator = self.energy_context_cache.get_context(thermodynamic_state, integrator)
+        # inform of platform used in current context
+        logger.debug(f"{type(integrator).__name__}: Minimize using {context.getPlatform().getName()} platform.")
+
+        # Set initial positions and box vectors.
+        sampler_state.apply_to_context(context)
+
+        # Compute the initial energy of the system for logging.
+        initial_energy = thermodynamic_state.reduced_potential(context)
+        logger.debug('Replica {}/{}: initial energy {:8.3f}kT'.format(
+            replica_id + 1, self.n_replicas, initial_energy))
+
+        # Minimize energy.
+        try:
+            if max_iterations == 0:
+                logger.debug('Using FIRE: tolerance {} minimizing to convergence'.format(tolerance))
+                while integrator.getGlobalVariableByName('converged') < 1:
+                    integrator.step(50)
+            else:
+                logger.debug('Using FIRE: tolerance {} max_iterations {}'.format(tolerance, max_iterations))
+                integrator.step(max_iterations)
+        except Exception as e:
+            if str(e) == 'Particle coordinate is nan':
+                logger.debug('NaN encountered in FIRE minimizer; falling back to L-BFGS after resetting positions')
+                sampler_state.apply_to_context(context)
+                openmm.LocalEnergyMinimizer.minimize(context, tolerance, max_iterations)
+            else:
+                raise e
+
+        # Get the minimized positions.
+        sampler_state.update_from_context(context)
+
+        # Compute the final energy of the system for logging.
+        final_energy = thermodynamic_state.reduced_potential(sampler_state)
+
+        # If energy > 0 kT and on a GPU device attempt to use CPU L-BFGS minimizer
+        if final_energy > 0 and context.getPlatform().getName() in ['CUDA', 'OpenCL']:
+            logger.debug(f'Positive final FIRE minimizer energy {final_energy}; falling back to CPU L-BFGS')
+            sampler_state.apply_to_context(context)
+            integrator = openmm.VerletIntegrator(1.0)
+            cpu_platform = openmm.Platform.getPlatformByName('CPU')
+            context = thermodynamic_state.create_context(integrator, cpu_platform)
+            sampler_state.apply_to_context(context, ignore_velocities=True)
+            openmm.LocalEnergyMinimizer.minimize(context, tolerance, max_iterations)
+
+            # Get the minimized positions
+            sampler_state.update_from_context(context)
+
+            # Get the final energy
+            final_energy = thermodynamic_state.reduced_potential(sampler_state)
+
+        logger.debug('Replica {}/{}: final energy {:8.3f}kT'.format(
+            replica_id + 1, self.n_replicas, final_energy))
+
+        # Clean up the integrator
+        del context
+
+        # Return minimized positions.
+        return sampler_state.positions


This probably can go away for now, but we should try to push something like this to openmmtools. Ideally we need a) optional platform choice for minimisation (that isn't in the context cache), and b) fallback to CPU L-BFGS when GPU optimization doesn't get anywhere.

Also - I think here we want the positions to be reset to their pre-GPU-minimisation values before the CPU L-BFGS fallback. I had some interesting failures yesterday that indicate that the failed GPU minimisation might take things to a bad configurational state.
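A minimal sketch of the "reset before fallback" idea, written in plain Python rather than against OpenMM or openmmtools. The names `minimize_with_fallback`, `primary`, `fallback` and `energy` are hypothetical and not part of this PR; the failure test (NaN or positive final energy) mirrors the check in the diff above, and catching every exception is a simplification of the NaN-only handling there.

```python
import math
from typing import Callable, List, Sequence

Positions = List[float]


def minimize_with_fallback(
    positions: Sequence[float],
    primary: Callable[[Positions], Positions],
    fallback: Callable[[Positions], Positions],
    energy: Callable[[Positions], float],
) -> Positions:
    """Run `primary`; if it fails, run `fallback` from the ORIGINAL positions."""
    # Snapshot the starting coordinates before any minimizer touches them.
    start = list(positions)

    try:
        # Hand the primary minimizer its own copy so `start` stays pristine.
        result = primary(list(start))
        final_energy = energy(result)
        # Same criterion as the diff: a positive (or NaN) energy counts as failure.
        failed = math.isnan(final_energy) or final_energy > 0
    except Exception:
        # Simplification: the PR code only special-cases the NaN error message.
        failed = True

    if failed:
        # Restart from the snapshot, not from whatever the failed run produced.
        result = fallback(list(start))
    return result
```

In the real code the snapshot would presumably be a copy of the sampler state's positions taken before the FIRE step, and the fallback would apply that copy to the CPU context instead of the post-GPU positions.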

IAlibay · 2022-11-14T09:12:40Z

openfe/protocols/openmm_rbfe/_rbfe_utils/relative.py

@@ -2179,6 +2187,86 @@ def _handle_old_new_exceptions(self):
                         sigma_new, epsilon_new, 0, 1]
                    )

+    def _impose_virtual_bonds(self):


This should be reasonably easy to test. One option is to make this a static method and get it to return the bond force?
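A rough sketch of what the static-method suggestion could look like. Everything here is a hypothetical stand-in: `BondForce` only records atom pairs in place of a real OpenMM force, `Factory` is not the actual hybrid topology factory, and the rule for which atoms get linked (first atom of each chain to the first atom of the first chain) is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class BondForce:
    """Hypothetical stand-in for an OpenMM bond force; it only records atom pairs."""
    bonds: List[Tuple[int, int]] = field(default_factory=list)

    def add_bond(self, i: int, j: int) -> None:
        self.bonds.append((i, j))


class Factory:
    """Hypothetical stand-in for the hybrid topology factory."""

    def __init__(self, chains: Sequence[Sequence[int]]):
        self.forces: List[BondForce] = []
        # The instance only stores whatever the static helper returns.
        self.forces.append(self._impose_virtual_bonds(chains))

    @staticmethod
    def _impose_virtual_bonds(chains: Sequence[Sequence[int]]) -> BondForce:
        """Build and return the virtual bond force without touching instance state."""
        force = BondForce()
        # Assumed rule: skip empty chains, anchor every other chain to the first one.
        non_empty = [chain for chain in chains if chain]
        for chain in non_empty[1:]:
            force.add_bond(non_empty[0][0], chain[0])
        return force
```

Because the helper takes its inputs as arguments and returns the force, a test can call `Factory._impose_virtual_bonds(...)` directly and assert on the returned bonds without building a full hybrid system.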

openfe/protocols/openmm_rbfe/_rbfe_utils/topologyhelpers.py

richardjgowers · 2022-11-15T16:39:14Z

test for constraint detection added

IAlibay · 2022-11-21T14:14:56Z

@RiesBen can this gathering stuff not be in a separate PR? It doesn't seem like part of the original bug issue?

IAlibay · 2022-11-30T18:02:10Z

Virtual bond addition doesn't seem to be working for inter-chain things; for some reason, in one of my systems I seem to be getting a bond between a monomeric protein and a water O.o

IAlibay · 2022-12-01T16:46:30Z

openfe/protocols/openmm_rbfe/_rbfe_utils/multistate.py

@@ -143,6 +148,79 @@ class creation of LambdaProtocol.
            self.create(thermodynamic_states=thermodynamic_state_list,
                        sampler_states=sampler_state_list, storage=reporter)

+    def _minimize_replica(self, replica_id, tolerance, max_iterations):


As discussed, let's remove this for now and move it to a separate PR


IAlibay changed the title ~~Fix demapping~~ [wip] Bug hunt fixes Nov 13, 2022


richardjgowers force-pushed the fix-htf-torsions branch from b629f87 to 6b3092e Compare November 14, 2022 14:17

richardjgowers approved these changes Dec 1, 2022


richardjgowers changed the title ~~[wip] Bug hunt fixes~~ Bug hunt fixes Dec 2, 2022

IAlibay and others added 19 commits December 2, 2022 15:02

add set comparisons

01445cb

o dear

50bee0a

remove set() comparisons

fb606d3

add virtual bonds for imaging, start fixing GPU minimizing problem

32f3345

add back original canonicalization of systemA positions

95492cf

Override minimizer for multistatesampling

2d09bf5

custom CPU fallback for minimize

5f2cd85

remove extra print statement

e19b546

Add vendored / adapted minimize call

774db25

Remove virtual bonds bits (not necessary this point)

991580f

add set comparisons

917d60c

remove set() comparisons

cc2a6c7

add virtual bonds for imaging, start fixing GPU minimizing problem

dddab81

Override minimizer for multistatesampling

7b0c3a9

custom CPU fallback for minimize

16430d3

remove extra print statement

3497dcd

TST: openmm rbfe: Add test for constraint detection

aa4e005

making gathering fail safer

9e06af6

slight update to constraint test

80dda35

richardjgowers and others added 8 commits December 2, 2022 15:09

constraint finder now detects constraint to harmonic bonds

648ab75

added toluene to benzonitrile constraint test

2aedad9

iterate over mapping not constraints

ce9af27

still only works for H constraints mind

parametrize constraint tests to try both directions

71b6c5e

iterate constraints not mapping again...

26df3b4

making room for non-H constraint checking

improved error message on constraint resolution fail

e7e8503

added test for constraint resolution failure

fix .gather for gufe 0.4

8265fd7

no protocol_unit_success yet

Remove extra minimizer code

1d8ea72

richardjgowers force-pushed the fix-htf-torsions branch from e643a3b to 1d8ea72 Compare December 2, 2022 15:10

openmm_rbfe: remove unused argument for shift_insert

cd36a0e

richardjgowers merged commit 78166a7 into main Dec 2, 2022

richardjgowers deleted the fix-htf-torsions branch December 2, 2022 15:54
