Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a explicit iteration scheme #2269

Open
goanpeca opened this issue May 29, 2019 · 6 comments

Comments

@goanpeca
Copy link

commented May 29, 2019

First of all thanks for this amazing library!

I am not working in this field directly, but some friends are and asked me for some help in solving some issues. I have been working with python for quite some time and the first snippet they showed me from the examples:

import MDAnalysis
from MDAnalysis.tests.datafiles import PSF, DCD   # test trajectory
import numpy.linalg

u = MDAnalysis.Universe(PSF,DCD)  # always start with a Universe
# can access via segid (4AKE) and atom name
# we take the first atom named N and the last atom named C
nterm = u.select_atoms('segid 4AKE and name N')[0]
cterm = u.select_atoms('segid 4AKE and name C')[-1]

bb = u.select_atoms('protein and backbone')  # a selection (AtomGroup)

for ts in u.trajectory:     # iterate through all frames
    r = cterm.position - nterm.position # end-to-end vector from atom positions
    d = numpy.linalg.norm(r)  # end-to-end distance
    rgyr = bb.radius_of_gyration()  # method of AtomGroup
    print("frame = {0}: d = {1} A, Rgyr = {2} A".format(
          ts.frame, d, rgyr))

What stroke me at first is the implicit iteration scheme where the value of cterm, bb and nterm is implicitly changing instead of (what I was expecting) cterm[ts].position or cterm.position[ts] or via a helper method cterm.position.frame[ts] or cterm.frame[ts].position (not suggesting any of those in particular, but looking at that code as a seasoned Python developer was very confusing).

My first question to them was Why are you not using the ts for anything?

This feels odd and is not pythonic. (explicit is better than implicit)

Also this does not seem to work as expected on some cases where the selection from the universe had to be made again inside the for loop or the value was not actually changing for some parameter (I cannot recall which one at the moment but I will come back to update this).

Are there any plans to using a pythonic and explicit approach in a future version so that users can select the actual time step using the time step value?

I would be happy to help :-)


On another note the conda packages for python3 seem to not work. I have some experience with conda and would be happy to look into it as well

@orbeckst

This comment has been minimized.

Copy link
Member

commented May 29, 2019

In what way do the Python 3 packages not work?

@goanpeca

This comment has been minimized.

Copy link
Author

commented May 29, 2019

In what way do the Python 3 packages not work?

Importing the package was resulting in a package not found error
Like this #1446

But for this issue I was actually more interested in the explicit vs implicit thing

@orbeckst

This comment has been minimized.

Copy link
Member

commented May 29, 2019

Short answer to your question: Object orientation. The basic philosophy is that each object knows their own data, even if the data can change with time. An AtomGroup knows what atoms it contains and the positions (and velocities and forces). Conceptually, there's state associated with a Universe, in particular the time step at which the trajectory currently is (think of a cursor or a read head of a tape). This state determines some of the data that the objects that represent constituents of the simulation system (such as atoms or groups of atoms) currently hold.

I agree that users tend to have to get used to the idea that their AtomGroups can contain changing data and people do sometimes get confused. It is something that we need to do a better job explaining. Changing to an "apply functions to Timestep" API would be a bit more intuitive at first.

On the other hand, "everyone" who's been using MDAnalysis for a while would find the explicit approach clunky and un-necessarily complicated. (Ok, obviously I am wildly extrapolating from a few people to everyone...). However, the point is that the user interface of MDAnalysis has been worked on for many years and incorporates a lot of experience of what seems to work well in the context of working with MD trajectories. We used to have AtomGroup.coordinates(ts) but this was never used by anyone so it was turned into a (managed) attribute AtomGroup.positions that just contains the current coordinates. (Internally, we do take the Timestep object and effectively apply a function to it.)

Maybe @richardjgowers @dotsdl @kain88-de remember some of the discussion around coordinates(ts) vs positions...

@orbeckst orbeckst added the question label May 29, 2019

@orbeckst

This comment has been minimized.

Copy link
Member

commented May 29, 2019

some friends are and asked me for some help in solving some issues.

Please encourage them to ask for help on the user mailing list https://groups.google.com/group/mdnalysis-discussion

@goanpeca

This comment has been minimized.

Copy link
Author

commented May 29, 2019

Thanks for the explanation @orbeckst

I understand the reasons you expose (and object orientation ;-) ) and if this has been the case for many years, well it is easy to make that the norm.

However, the point is that the user interface of MDAnalysis has been worked on for many years and incorporates a lot of experience of what seems to work well in the context of working with MD trajectories.

Sure, that makes sense!

We used to have AtomGroup.coordinates(ts) but this was never used by anyone so it was turned into a (managed) attribute AtomGroup.positions

I am probably missing a lot of history here as well :-)

My "concern" is not really about having more documentation to explain why this is the way it is or why it needs to be used the way it is used. This Is more that by using the python language, a lot of things are expected from it, and this behavior deviates from that expectation by "hiding" things from the user as you point out with:

(Internally, we do take the Timestep object and effectively apply a function to it.)

If this was a explicit parameter, no extra documentation would be needed to explain what I believe is "an odd/non standard behavior". Yes it would probably require a few more characters (approach clunky and un-necessarily complicated) to achieve the same thing that users have now become accustomed after years of following the same pattern. Somehow it feels like the api goes against the language it is implemented in.

Thanks again for the explanations


Please encourage them to ask for help on the user mailing list

Will do!

@orbeckst

This comment has been minimized.

Copy link
Member

commented May 29, 2019

Thanks for your comments!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.