Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposed conversion from u_kln->u_kn formalism #83

Merged
merged 5 commits into from Jul 16, 2014
Merged

Conversation

mrshirts
Copy link
Collaborator

@mrshirts mrshirts commented Jul 9, 2014

includes a slightly modified harmonic_oscillator_update.py, which gives identical behavior with the new code to pymbar-examples/harmonic_oscillator.py run with the previous code up to numerical precision.

This new code automatically converts from old-style u_kln matrices internally.

It supports all existing functionality, including updating the weave code and the _pymbar.c code.

The one major difference is in handling computeExpectations. This change hopefully provides a much more general solution which can possibly reduce the total amount of repetitive code as well.

Previously, computeExpectations recognized whether observables were state_dependent or state_independent by whether the observable array was KxN_max, or KxLxN_max. Now, this is done by giving an explicit state_dependent flag. If state_dependent is False, computeExpectations handles the code like the 2D version before (although now the array is 1D, the observable at each state).
The big difference is if state_dependent is True. In this case, computeExpectations expects K different observables (the K different observables at each state - for example, the potential energies in each of the K states), as well as K different energies (which defaults to just the u_kn matrix MBAR was generated with). It then calculates the expectations of the first observable and the first state, 2nd observable and the 2nd state, and so forth, which is what the previous code did.

In order to make this happen, I've created a computeGeneralExpectations routine, that takes in I observable arrays and K state observables (see see #89 ).

I'm up for lots of other improvements; but I though it would be easier to start by solving the biggest problem first independently of other issues.

@kyleabeauchamp
Copy link
Collaborator

Do you know why this can't be automatically merged? It seems like your branch is messed up somehow.

@kyleabeauchamp
Copy link
Collaborator

Also, eventually we should probably get all tests running via nose, but we can help you migrate your tests later.

@mrshirts
Copy link
Collaborator Author

mrshirts commented Jul 9, 2014

No idea why it can't be merged automatically. I can try to find a way to commit it some other way as a patch if you want -- it's two clean commits, plus manually adding the master patch which was on top of it.

Nose tests can be done after in the polishing if we like what's here.

@kyleabeauchamp
Copy link
Collaborator

Are you basing this on a branch that doesn't include your recently merged pull request? I think that could be the issue.

@mrshirts
Copy link
Collaborator Author

mrshirts commented Jul 9, 2014

Yes. It's a trivial change to do, but perhaps GitHub can't handle. I supposed I should have rebased onto that version first! If you can't think of an easy way, I can try to just create a new branch with the changes.

@mrshirts
Copy link
Collaborator Author

mrshirts commented Jul 9, 2014

Resolving the conflicts will be trivial, I just need to figure out how to do the right resolves . . .

@kyleabeauchamp
Copy link
Collaborator

@mrshirts
Copy link
Collaborator Author

OK, looks better now. Travis CI will probably fail, I'm guessing (since it wasn't designed to get it right). Might be best to check it out independently and look at it directly for the logic. Not that important to get it perfect, since it's in its own branch.

@mrshirts
Copy link
Collaborator Author

Looks like it failed on the imports? I'm not really sure why, that didn't change at all.

@kyleabeauchamp
Copy link
Collaborator

That's the version issue which I will look into independently.

@kyleabeauchamp
Copy link
Collaborator

I guess you've made lots of changes to pymbar.py? It seems like GitHub can't even show the diff because there are so many changes. Does that sound right?

@mrshirts
Copy link
Collaborator Author

Yes, mbar.py has a lot of changes. All pretty straightforward, but a lot. Lots of overhaul required to change the entire formalism.

@kyleabeauchamp
Copy link
Collaborator

So I've gone through most of the code (halfway through mbar.py, through the rest). Overall things are looking good. Below is what I think is the to-do list. For clarity, we can make some of these changes in later pull requests where a single pull request is a single "transformation" of the code. However, things that are obvious bugs should probably get fixed before merging IMHO.

  1. Run pyflakes to fix various possible syntax errors (see below for log)
  2. Move MBAR._kln_to_kn(u_kn) and MBAR._kn_to_n() to separate functions in utils.py. These function are generally useful beyond just MBAR and should be exposed to users looking to migrate their workflows to the new style.
  3. "# take the square root of the matrix" we should probably change this comment, as it's not the "square root of a matrix"
  4. Pull the various #todo comments into separate issues
  5. Sort out useGeneral stuff--right now, we've doubled the number of branches in computeExpectations(). I would prefer reducing the branching and having separate member functions (private?) for the different ways of calculating things. IMHO having product-spaces of code branches inevitably leads to corner case errors.
  6. Next time we make a giant change like this, we should separate the numpy -> np conversion to a separate pull request that only contains that change. That will make our bookkeeping and code review much simpler and hopefully make it easier to identify bugs. This will also make our lives easier if we ever need to go back and look at the log of code changes.
pymbar/mbar.py:71: 'ConvergenceError' imported but unused
pymbar/mbar.py:71: 'BoundsError' imported but unused
pymbar/mbar.py:72: 'pdb' imported but unused
pymbar/mbar.py:197: '_pymbar' imported but unused
pymbar/mbar.py:441: undefined name 'numpy'
pymbar/mbar.py:524: local variable 'L' is assigned to but never used
pymbar/mbar.py:736: undefined name 'general_returns'
pymbar/mbar.py:818: undefined name 'dot'
pymbar/mbar.py:818: local variable 'T_kk' is assigned to but never used
pymbar/mbar.py:1230: undefined name 'numpy'
pymbar/mbar.py:1361: undefined name 'numpy'
pymbar/mbar.py:1633: undefined name 'bin_k'
pymbar/mbar.py:1659: undefined name 'bin_kn'
pymbar/mbar.py:2134: undefined name 'x_kindices'
pymbar/mbar.py:2134: undefined name 'x_kindices'
pymbar/mbar.py:2138: undefined name 'x_kindices'
pymbar/mbar.py:2138: undefined name 'x_kindices'
pymbar/mbar.py:2187: local variable 'z' is assigned to but never used
pymbar/mbar.py:2193: local variable 'K' is assigned to but never used
pymbar/mbar.py:2194: local variable 'f_k' is assigned to but never used
pymbar/mbar.py:2195: local variable 'N' is assigned to but never used
pymbar/mbar.py:2196: local variable 'N_k' is assigned to but never used
pymbar/mbar.py:2197: local variable 'u_kn' is assigned to but never used
pymbar/mbar.py:2226: local variable 'info' is assigned to but never used
pymbar/mbar.py:2532: undefined name 'N_k'

@kyleabeauchamp
Copy link
Collaborator

I've also run the test script on my machine--it seemed to be working OK.

@kyleabeauchamp
Copy link
Collaborator

Possible typo in docstring of computePMF(): "bin_n[n]" (later called "bin_kn")

Also in computePMF_states()

@kyleabeauchamp
Copy link
Collaborator

OK I made a complete pass through the code and have no additional complaints at this time...

@mrshirts
Copy link
Collaborator Author

Commenting one at a time;

Next time we make a giant change like this, we should separate the numpy -> np conversion to a
separate pull request that only contains that change. That will make our bookkeeping and code
review much simpler and hopefully make it easier to identify bugs. This will also make our lives easier > if we ever need to go back and look at the log of code changes.

Ah, my bad. I was getting carried away with editing and getting tired of writing numpy instead of np. Should I revert that change in the pull request, and then we put it back in in separate commit?

@mrshirts
Copy link
Collaborator Author

"# take the square root of the matrix" we should probably change this comment, as it's not the "square root of a matrix"

Fixed (will update the pull request after number of these).

@mrshirts
Copy link
Collaborator Author

Run pyflakes to fix various possible syntax errors (see below for log)

Done, except for ones that are caused by weave or importing the helper functions (pyflakes doesn't seem to know about the syntax there).

@jchodera
Copy link
Member

We seem to be having the import version error we've had with other projects here now too:
https://travis-ci.org/choderalab/pymbar/builds/29558424#L518

@jchodera
Copy link
Member

@mrshirts: @kyleabeauchamp is out for a week, so it's up to us to proceed, I think.

Do all the nosetests work on your local version? If so, I wonder if we should merge this and debug the from pymbar import version issue separately?

@mrshirts
Copy link
Collaborator Author

@kyleabeauchamp is out for a week, so it's up to us to proceed, I think.

Ah, OK. I'll probably get the quick fixes in. The one remaining item is point 5, Expectations vs. General Expectations. Let's have that conversation when Kyle comes back.

I'll check the nosetests again. If it passes those, and when everything except point 5 is resolved, yes, I think we should go ahead and merge it in to the kn branch. Maybe hold off merging the branches.

@jchodera
Copy link
Member

What is point 5?

@jchodera
Copy link
Member

Most importantly, did you fix the undefined variable names? Those are actual bugs which apparently don't have corresponding tests to catch them.

@mrshirts
Copy link
Collaborator Author

What is point 5?

[5] Sort out useGeneral stuff

from Kyle's big comment. I'll file a separate issue on this -- it deserves some thinking. The current state of computeExpectations is not what we want, that's just to demonstrate the potential new approach (And that it can be made equivalent to what we want to do next).

Current approach - writing very similar functions 10x
Potential new approach- Writing one somewhat more complex general function with 10 wrappers.

@mrshirts
Copy link
Collaborator Author

Most importantly, did you fix the undefined variable names? Those are actual bugs which apparently > don't have corresponding tests to catch them.

Yes, those are fixed, mod ones that are actually weird syntax due to C++ imports or weave.

@jchodera
Copy link
Member

If the current approach works, then we can work on simplifying the codebase in subsequent PRs. If the current approach is actually "everything is broken", we've got a bit of a problem...

@mrshirts
Copy link
Collaborator Author

Current approach works. this is a question of simplifying the codebase (and in parallel, the theory itself).

@jchodera
Copy link
Member

Great. Go ahead and merge, and we'll simplify in subsequent PRs.

Please make small, individual changes in future! And as soon as you start working on a feature, open a new PR and title it "My new feature [WIP]" to denote work in progress...

@mrshirts
Copy link
Collaborator Author

Please make small, individual changes in future! And as soon as you start working on a feature,
open a new PR and title it "My new feature [WIP]" to denote work in progress...

I think in this case, most of it all had to go in together so that the u_kn infrastructure could be supported. I agree smaller changes are usually better.

@jchodera
Copy link
Member

I think in this case, most of it all had to go in together so that the u_kn infrastructure could be supported. I agree smaller changes are usually better.

Logically grouped changes should occur together, but as Kyle suggested, this shouldn't include things like changing numpy. to np. at the same time, since it completely eliminates our ability to do meaningful code review by looking at the autogenerated diffs. That's all that was meant!

@mrshirts
Copy link
Collaborator Author

Sounds good. Also, the latest pull request reverted the numpy change.

@jchodera
Copy link
Member

Thanks! OK, I'll merge this, and we can work out the import version issues separately!

jchodera added a commit that referenced this pull request Jul 16, 2014
Extensive conversion from u_kln->u_kn formalism.  Still need to debug `import version` issues with travis-ci.
@jchodera jchodera merged commit 7041caa into choderalab:nk Jul 16, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants