NCMC Code Refactoring #44

nathanmlim · 2017-03-28T19:50:55Z

This [WIP] is meant to refactor the NCMC core code. I've placed the new refactor code into a separate directory blues_refactor so that I could test them side-by-side to make sure I haven't introduced any new bugs during refactoring. Briefly, I've basically gone in and tried to reduce large code blocks in ncmc.py and moved them into their own functions. I've added 3 classes to handle some basic routines:

SimulationFactory handles generating a set of 3 OpenMM Simulations objects for the MD, NCMC, Alchemical parts of the BLUES engine.
LigandModeller intended to handle calculating any properties relating to the ligand/object we're moving around. As of now, we're just calculating particle masses but I think down the line we may want to calculate other properties on like protein residues or water densities etc.
ProposeMove intended to be a class where we basically define our move sets.
Much of the simplifications from introducing these classes can be seen in the new example.py script

TODO:

Add Docstrings
Add unit tests for new classes
Finish up simplifying/moving the ncmc.Simulation.run() code blocks.

sgill2 · 2017-03-28T22:28:51Z

So these are just my initial thoughts/questions on the code.

I'm not too clear on the sims object that you use as the SImulation object argument. It looks like it's supposed to be an object that holds the openmm simulation, alch_sim and ncmc contexts. If that's the case what benefit is there doing it this way as specifying them via separate arguments?
Could you elaborate on what the model parameter is supposed to be?

Overall though, it seems more nicely structured than the previous code, and is heading in a good direction.

davidlmobley · 2017-03-28T23:41:52Z

README.md

@@ -12,6 +12,7 @@ This also provides a prototype and validation of the SMIRFF SMIRKS-based force f
 * `run_scripts/` - example scripts to run blues
 * `systems/` - some example systems to run blues on.

+<<<<<<< HEAD


Do you have a merge conflict here?

Oops. Will fix that.

davidlmobley · 2017-03-28T23:42:13Z

blues/example.py

+
+* Benchmarking related code adapted from:
+https://github.com/pandegroup/openmm/blob/master/examples/benchmark.py
+(Authors: Peter Eastmen)


davidlmobley · 2017-03-28T23:42:33Z

blues/example.py

 def runNCMC(platform_name):
+=======
+def runNCMC(options):
+>>>>>>> 63cd22390b2e6595bd1defa81aa758e62ae31fc2:examples/example.py


Merge conflicts?

davidlmobley · 2017-03-28T23:43:57Z

README.md

+Install the required packages into a new environment
+
+```
+conda create -c omnia -c omnia/label/dev -n blues python=3.5 openmm==7.0.1 openmmtools alchemy mdtraj


Wait, are we in omnia now?

And, openmm 7.0.1 not 7.1?

(Oh, I see, this is creating a blues environment. Sorry, my bad. But the question on OpenMM still stands.)

It seems I forgot to commit the conflicts in the README.md. It seems this is using a much older version. I'll fix this.

davidlmobley · 2017-03-28T23:46:05Z

You have some extra .pyc files here that you should be able to (a) remove, and (b) get ignored by adding/modifying the .gitignore file.

davidlmobley · 2017-03-28T23:47:20Z

blues_refactor/example.py

+            'verbose' : True }
+
+    #Defines ncmc move eqns for lambda peturbation of sterics/electrostatics
+    opt['functions'] = { 'lambda_sterics' : 'step(0.199999-lambda) + step(lambda-0.2)*step(0.8-lambda)*abs(lambda-0.5)*1/0.3 + step(lambda-0.800001)',


What does this mean? (OK, I can work through it, but it's a bit opaque). What does it control?

(Obviously, it controls lambda_sterics... But what exactly is going on here? )

I'm still in the process of forming all my opinions, so I'll keep writing them down as I go.
I think passing an opt dictionary for run() that handles runoptions makes things unclear to the user what actually can be done with it. I'd much rather have too many parameters, which users at least know exist, than having a mysterious, but important parameter.

@sgill2 - having a dictionary of options can be helpful as it can allow for extension/modification of the options without modifying a list of required arguments. I'm less clear as to whether this specific implementation is the best way to do it or not, but that's the rationale I believe (and to a large extent I buy into it).

@nathanmlim - further comments?

All I've done here is take this line and chucked it into the opt dictionary. In retrospect, I should move this into the SimulationFactory as it only appears to be needed during the nc_integrator creation.

davidlmobley · 2017-03-28T23:50:21Z

blues_refactor/example.py

+    sims.createSimulationSet(opt)
+
+    # Calculate particle masses of object to be moved
+    from blues.modeller import LigandModeller


I wonder if "LigandModeller" is really the best name for this thing if you're expecting perhaps generalizing to calculation of other properties later. What if we wanted to apply BLUES to a protein sidechain (rotamer sampling) or to swapping a ligand with water? (Or to jumping a water, rather than a ligand, between sites?)

I'm not really attached to naming and couldn't think of a good name on the fly. So if you have suggestions I'm open. I don't even think Modeller was a good choice either since OpenMM uses this to manipulate objects rather than get properties on them.

@sgill2 - suggestions? What is the general purpose of this thing? ObjectProperties? MoveProperties? MoveObject? MovingObject? The last one seems to appeal to me most, as it might or might not be a ligand, and I also don't like the modeller terminology because of (a) UCSF Modeller, and (b) Modeller within OpenMM, as you note, @nathanmlim .

I think ObjectProperties seems like the best bet here.

davidlmobley · 2017-03-28T23:52:14Z

blues_refactor/example.py

+    model.calculateCOM()
+
+    # Propse some move
+    rot_move = ncmc.ProposeMove(sims.nc, model)


Should ProposeMove be MoveProposal or similar? Right now it reads like a function name (action implied!) to me, so I was confused when I read it -- I thought "wait, you can't be proposing a move yet, as you haven't said what kind of move you want."

Okay, I can change this to MoveProposal

davidlmobley · 2017-03-28T23:53:00Z

blues_refactor/example.py

+    # Propse some move
+    rot_move = ncmc.ProposeMove(sims.nc, model)
+    rot_step = (opt['nstepsNC']  / 2) - 1
+    nc_move = { 'type' : 'rotation',


@sgill2 - should we be more specific than "rotation", do you think? Since we might imagine other types of rotations (e.g. rotation matrix rather than random quaternion) maybe this should be "random_rotation"?

More specific is probably better, especially since we don't know what other kinds of moves we're adding; more specificity can help avoid name overlaps, so "random_rotation", would be better.

Works for me.

davidlmobley · 2017-03-28T23:58:01Z

blues_refactor/ncmc.py

+
+        if log_ncmc > randnum:
+            self.accept += 1
+            print('NCMC MOVE ACCETPED: {} > {}'.format(log_ncmc, randnum) )


davidlmobley · 2017-03-28T23:58:31Z

blues_refactor/ncmc.py

+        randnum =  math.log(np.random.random())
+
+        ### Compute Alchemical Correction Term
+        if not np.isnan(log_ncmc):


@sgill2 - what happens if it IS a NAN? Not totally obvious to me it doesn't do something strange (does it accept since log_ncmc > randnum below?

I seem to remember something weird happening with NANs in the past, but trying some logical operations with NANs doesn't seem to suggest they would be accepted. I'll make a separate issue to see what happens with the NAN checking removed.

This originally had the if alchemical_correction==True statement with checking if log_ncmc is NaN. So I think that if you were to add the alchemical correction term, it won't let you add it to NaN. Since I've removed the alchemical_correction switch, should we turn NaN into 0 so that we can still add the alchemical correction term?

davidlmobley · 2017-03-29T00:03:31Z

blues_refactor/tests/test_mass.py

+        npos[i] = rotated_particles[index]
+    r_com_coord = test_class.calculate_com(npos)
+    for i in range(3):
+        np.testing.assert_almost_equal(np.asarray(r_com_coord._value)[i], np.asarray(com_coord._value)[i], decimal=4)


Odd character on this line? (Or missing newline?)

nathanmlim · 2017-03-29T16:13:06Z

So these are just my initial thoughts/questions on the code.

I'm not too clear on the sims object that you use as the SImulation object argument. It looks like it's supposed to be an object that holds the openmm simulation, alch_sim and ncmc contexts. If that's the case what benefit is there doing it this way as specifying them via separate arguments?
Could you elaborate on what the model parameter is supposed to be?
Overall though, it seems more nicely structured than the previous code, and is heading in a good direction.

The sims object is just a dictionary containing the 3 simulations (md, alch, ncmc) that is generated out of the SimulationFactory class. Really you can name it whatever you want since it a SimulationFactory object. We have to lug these around together anyways and pass it into the BLUES engine itself to do the run, so for simplicity I just put them into a dictionary to pass as a single object.

model is also an object which, in this example, is referencing the ligand. The properties of the model object can be accessed like model.atom_indices , model.totalmass, model.masses etc.

nathanmlim · 2017-04-21T05:29:50Z

Walked through this with @sgill2 and we have reviewed the refactored code and are certain it does the some process. We did identify some potential bugs and made some other changes that may require another look @davidlmobley. Changes are noted below:

Changed the name from ModelProperties to Model
Changed the lambda functions from this to

functions = { 'lambda_sterics' : 'min(1, (1/0.3)*abs(lambda-0.5))',
 'lambda_electrostatics' : 'step(0.2-lambda) - 1/0.2*lambda*step(0.2-lambda) + 1/0.2*(lambda-0.8)*step(lambda-0.8)' }

Added the negative sign back to the alchemical correction factor in this line

Note: Since the smartdart.py has not been refactored into the new framework and inherits from the old ncmc.py it's tests have been disabled.

davidlmobley · 2017-04-21T20:31:40Z

@nathanmlim - it would be very helpful to have a comment before the functions line relating to lambda which explains a little more about what these lambda_sterics, lambda_electrostatics, etc. are doing and how they are being written. The factors 1/0.3 and 1/0.2 are very non-obvious to me, as are the use of step (which is not defined here); understanding this would presumably require reading the documentation of the relevant code, but this is not linked to. So I'd have to do some archaeology to figure out what this means.

nathanmlim · 2017-04-21T20:42:57Z

@davidlmobley: I think that would be a better question for @sgill2 since I just copied over what he had suggested.

davidlmobley

Do you want to include some of your info on code architecture/organization in a "developer documentation" README or something similar? (Note you can have additional .md files in the main directory which could be linked from the main README.md). I'm just thinking, @nathanmlim , that this could be helpful for people who are getting involved with developing so you don't always have to show them your slides/explain.

davidlmobley · 2017-04-21T20:40:25Z

README.md

@@ -41,9 +41,6 @@ This package takes advantage of non-candidate equilibrium monte carlo moves (NCM
 The heart of the package is found in `blues/ncmc_switching.py`. This holds the framework for NCMC. Particularly, it contains the integrator class that calculates the work done during the NCMC move. It also controls the lambda scaling of parameters. Currently the alchemy package is used to generate the lambda parameters for the ligand, which can potentially modify 5 parameters–sterics, electrostatics, bonds, angles, and torsions.
 The class SimNCMC in `blues/ncmc.py` serves as a wrapper for running ncmc simulations. SimNCMC.runSim() actually runs the simulation.

-## Example Use


Wait, we have no example anymore? What are the plans on this?

Oh, wait, I still see blues/example.py here -- why no mention in README now?

I'm not sure why the entire block got removed. It might have gotten accidently deleted when I was trying to resolve conflicts. It should have just been changed to reference the blues/example.py location.

davidlmobley · 2017-04-21T20:53:34Z

I agree with the sign of the alchemical correction factor, though I've challenged @sgill2 to think about how to check it. But it seems correct to me.

davidlmobley · 2017-04-21T20:55:18Z

blues/example.py

+    simulations.createSimulationSet()
+
+    blues = Simulation(simulations, ligand, ligand_mover, **opt)
+    blues.run()


The example sure becomes nice and concise now. Very nice. :)

davidlmobley · 2017-04-21T20:56:15Z

blues/ncmc.py

+    ligand.masses : list of particle masses of the ligand with units.
+    ligand.totalmass : integer of the total mass of the ligand.
+
+    #Dynamic attribtues that must be updated with each iteration


"attributes"

nathanmlim · 2017-04-21T21:50:01Z

@davidlmobley

Do you want to include some of your info on code architecture/organization in a "developer documentation" README or something similar? (Note you can have additional .md files in the main directory which could be linked from the main README.md). I'm just thinking, @nathanmlim , that this could be helpful for people who are getting involved with developing so you don't always have to show them your slides/explain.

This is a good idea, I'll make the diagram a bit more detailed so it can serve as a quick overview image of the engine.

davidlmobley

This looks good to merge once minor typos and clarification requests are addressed. Let's get this in! Nice work, @nathanmlim .

davidlmobley · 2017-04-21T20:57:54Z

blues/ncmc.py

-            specified uses the residueList in self.residueList
+class MoveProposal(object):
+    """MoveProposal provides perturbation functions for the model during the NCMC
+    simulation. Current supported methods: random rotation.


"random_rotation" probably better.

davidlmobley · 2017-04-21T21:42:28Z

blues/ncmc.py

+        """
+        if ncmc:
+            #Defines ncmc move eqns for lambda peturbation of sterics/electrostatics
+            functions = { 'lambda_sterics' : 'min(1, (1/0.3)*abs(lambda-0.5))',


As noted, it still bothers me I can't understand this with the info in the comments.

davidlmobley · 2017-04-21T21:43:12Z

blues/ncmc.py

+        self.md = self.generateSimFromStruct(self.structure, self.system, **self.opt)
+        self.alch = self.generateSimFromStruct(self.structure, self.system,  **self.opt)
+        self.nc = self.generateSimFromStruct(self.structure, self.alch_system,
+                                            ncmc=True, **self.opt)


Odd character at end of line 332?

davidlmobley · 2017-04-21T21:43:59Z

blues/ncmc_switching.py

 #https://github.com/choderalab/perses/blob/master/perses/annihilation/ncmc_switching.py
+
+Authors: Patrick B. Grinaway, Julie M. Behr, and John D. Chodera


@sgill2 - add note on outline of paper to see if Julie should be an author.

davidlmobley · 2017-04-21T21:46:33Z

blues/simulation.py

+
+    def getStateInfo(self, context, parameters):
+        """Function that gets the State information from the given context and
+        list of parameters to queuey it with.


davidlmobley · 2017-04-21T21:48:00Z

blues/simulation.py

+        if log_ncmc > randnum:
+            self.accept += 1
+            print('NCMC MOVE ACCEPTED: log_ncmc {} > randnum {}'.format(log_ncmc, randnum) )
+            #print('accCounter', float(self.accept)/float(stepsdone+1), self.accept)


Remove old comments? Otherwise use a debug/logging option so they can be used if needed?

davidlmobley · 2017-04-21T21:49:19Z

blues/simulation.py

+        self.nc_context.setVelocities(md_state0['velocities'])
+
+    def run(self):
+        """Function that runs the BLUES engine that iterates of the actions:


"that iterates of the actions" -> "to iterate over the actions"...?

davidlmobley · 2017-04-21T21:53:13Z

@sgill2 - can you get Nathan info to include on how the lambda vectors work? either an explanation or a documentation link to include in the comments or something similar?

sgill2 · 2017-04-21T21:57:45Z

@davidlmobley Could you clarify what you mean by the lambda vector? Do you mean the equations for the functions that control the lambda sterics and electrostatics ?

…ary lines

… in docstring

davidlmobley · 2017-04-21T23:24:16Z

@davidlmobley Could you clarify what you mean by the lambda vector? Do you mean the equations for the functions that control the lambda sterics and electrostatics ?

Yes, @sgill2 - see my comments above to Nathan about that. Basically from the code/comments in the code it's impossible for me to know what the lambda sterics/lambda electrostatics stuff means. I'd like this to be better explained in comments, and/or there to be links to documentation. I don't get the factor 1/3, or what step does, etc. Presumably I could figure it out with some archeology/Googling but since it's an obvious question it makes sense to have SOME info in comments or in the code.

davidlmobley · 2017-04-21T23:25:04Z

Good to merge once tests pass/you're done, I think, @nathanmlim .

nathanmlim · 2017-04-21T23:31:21Z

Added in some comments from Sam for the lambda functions. Also added in devdocs/ folder which has a somewhat incomplete class diagram (haven't gone in and added in the full list of attributes to the other classes) but I'll do that in a separate PR so we can get this merged.

nathanmlim · 2017-04-21T23:32:13Z

Merged.

davidlmobley · 2017-04-21T23:34:15Z

Yay, congrats!

Nathan Lim and others added 8 commits January 27, 2017 16:20

Optimize helper functions

89b7508

Added section to provide credit/acknowledgments

45f82a1

Updates docs, add progress reporter, and remove benchmarking functions

ec4a634

Fix formatting

df659b1

Fix formatting

3772659

Upload notebook for code refactoring

63cd223

Resolve conflicts

04cb986

Initial commit on code refactor

110979b

davidlmobley reviewed Mar 28, 2017

View reviewed changes

davidlmobley reviewed Mar 29, 2017

View reviewed changes

Nathan M. Lim added 8 commits March 29, 2017 10:55

Fix conflicts in README.md

89255a1

Removed .pyc files

b82797b

Move lambda functions into SimulationFactory

a01374f

Change class names and change move dictionary structure

b46f103

Change LigandModeller to ModelProperties

c9655fd

Simplified NCMC code block in ncmc.py

a1f7691

Moved MD block

315428a

Change NCMC print statement

92830ba

Nathan Lim and others added 3 commits April 21, 2017 09:13

Merge branch 'master' into refactor

a9f1012

Update travis.yml to run blues tests

df1cb27

Merge branch 'refactor' of github.com:MobleyLab/blues into refactor

9dd3cff

sgill2 mentioned this pull request Apr 21, 2017

Refactor Smart Darting #47

Closed

davidlmobley reviewed Apr 21, 2017

View reviewed changes

davidlmobley requested changes Apr 21, 2017

View reviewed changes

Nathan Lim added 4 commits April 21, 2017 16:01

Add DLM suggested changes, mainly fixing typos, removing some unecess…

4576e41

…ary lines

Update docstrings at top of each python script, removed version field…

f1570a3

… in docstring

Add comments to clarify lambda functions

5844055

Add devdocs/ to contain class diagram, to be updated later

97d777a

davidlmobley approved these changes Apr 21, 2017

View reviewed changes

Nathan Lim added 2 commits April 21, 2017 16:27

Updated version.py and README to show class diagram

3a03d7a

Fix file pathing for image

36dd6d7

nathanmlim merged commit 6020ae2 into master Apr 21, 2017

nathanmlim deleted the refactor branch April 21, 2017 23:32

davidlmobley mentioned this pull request Apr 21, 2017

Do new point release #49

Closed

		#https://github.com/choderalab/perses/blob/master/perses/annihilation/ncmc_switching.py

		Authors: Patrick B. Grinaway, Julie M. Behr, and John D. Chodera

NCMC Code Refactoring #44

NCMC Code Refactoring #44

Conversation

nathanmlim commented Mar 28, 2017 • edited Loading

sgill2 commented Mar 28, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nathanmlim Mar 29, 2017 • edited Loading

Choose a reason for hiding this comment

davidlmobley commented Mar 28, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nathanmlim commented Mar 29, 2017 • edited Loading

nathanmlim commented Apr 21, 2017

davidlmobley commented Apr 21, 2017

nathanmlim commented Apr 21, 2017

davidlmobley left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davidlmobley commented Apr 21, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nathanmlim commented Apr 21, 2017

davidlmobley left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davidlmobley commented Apr 21, 2017

sgill2 commented Apr 21, 2017

davidlmobley commented Apr 21, 2017

davidlmobley commented Apr 21, 2017

nathanmlim commented Apr 21, 2017

nathanmlim commented Apr 21, 2017

davidlmobley commented Apr 21, 2017

nathanmlim commented Mar 28, 2017 •

edited

Loading

nathanmlim Mar 29, 2017 •

edited

Loading

nathanmlim commented Mar 29, 2017 •

edited

Loading