Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sim should take a real MDAnalysis.Universe instance, too #25

Closed
orbeckst opened this issue May 26, 2015 · 16 comments
Closed

Sim should take a real MDAnalysis.Universe instance, too #25

orbeckst opened this issue May 26, 2015 · 16 comments
Assignees
Milestone

Comments

@orbeckst
Copy link
Contributor

It would be convenient (and more object oriented) if I could say

u = MDAnalysis.Universe(TPR, XTC)
s = Sim(name)
s.universes.add('anyname', u)

instead of s.universes.add('anyname', universe=[TPR, XTC, ...]).

Although I appreciate that this will make it more difficult to recreate the universe. Perhaps should be tackled together with MDAnalysis/mdanalysis#173 , which could supply the necessary state information to reconstitute the universe.

@orbeckst
Copy link
Contributor Author

By the way, Sim.universes.add(name, topol, *trajectories) is inconsistent with the __init__ based universe loading in that it won't take keywords. (Actually, from the docs it is not even clear if kwargs are allowed with the universe kw of the Sim constructor... they should be!)

@dotsdl
Copy link
Member

dotsdl commented May 26, 2015

Thanks for the suggestion. Expanding the universe keyword support is easy enough to add, but as you suggest, won't necessarily re-create the universe since the only thing the Sim stores is topology/trajectory paths, and perhaps resnums (which could be included with this usage with no problems I could see).

I don't know an easy solution to this since universes are not serializable (MDAnalysis/mdanalysis#173) at the moment, and even if they were, I can quickly imagine having serialized universes with so much state built into them that it becomes difficult to tell how they are defined anymore. MDS already sacrifices statelessness at the Container level, but would it make sense at the Universe level, too? Would it create more problems than it solves? I don't know.

@dotsdl
Copy link
Member

dotsdl commented May 26, 2015

By the way, Sim.universes.add(name, topol, *trajectories) is inconsistent with the init based universe loading in that it won't take keywords. (Actually, from the docs it is not even clear if kwargs are allowed with the universe kw of the Sim constructor... they should be!)

Ah...currently Sims only store filenames to construct an MDAnalysis Universe, relying on the MDAnalysis.Universe constructor to interpret what to do with these files. I think it makes sense to be able to (and perhaps even necessary in some cases) to include keywords as part of a universe definition. Storing these would only mean another table in the universe tree inside the Sim's state file, and these could be read and passed to the Universe constructor whenever the definition is instantiated.

@dotsdl dotsdl added this to the 0.6.0 milestone May 26, 2015
@richardjgowers
Copy link
Contributor

I think one stumbling block here is Universe init doesn't remember the kwargs used, so you'll likely want to add a line like

self._kwargs = kwargs.copy()

To Universe __init__, so that MDS can correctly "copy" the Universe. I can't see a problem with a MDA.Universe storing a dict of kwargs?

@orbeckst
Copy link
Contributor Author

On 26 May, 2015, at 00:41, Richard Gowers wrote:

I think one stumbling block here is Universe init doesn't remember the kwargs used, so you'll likely want to add a line like

self._kwargs = kwargs.copy()
To Universe init, so that MDS can correctly "copy" the Universe. I can't see a problem with a MDA.Universe storing a dict of kwargs?

We can definitely add this to Universe as a stop-gap measure until we have something better for persistence.

(One advantage of being involved in multiple projects is that you can easily add hooks for one to the other...)

@richardjgowers
Copy link
Contributor

With this, if we make Universes picklable, then would it be better to store the Universe objects as a pickle of their state rather than their filepaths? This would then mean that universes given as filenames would be loaded, and the pickle of that Universe saved. So the problem gets turned inside out, and we don't have to worry about how Universes were init'ed.

@dotsdl
Copy link
Member

dotsdl commented Jun 26, 2015

That does change the options. I think some experimenting will be in order once Universes become pickleable. Depending on the limitations and caveats with pickling and unpickling universes, this may give cause for rethinking how MDS handles universe definitions.

One advantage of storing the universe definitions as arguments/keywords to the Universe constructor is that the state of the Universe on recall is unambiguous: it is entirely clear what the Universe should look like and how it should behave based on these inputs. Pulling from a binary blob like a pickle, however, means that you get whatever is there, and that could be completely opaque even after loading the Universe.

In other words: I think statelessness is important for doing science in practice. MDSynthesis sacrifices some statelessness for convenience, but it should try to make what state it maintains as transparent as possible. I don't know where the sweet spot is for this philosophy with respect to pickled universes.

@orbeckst
Copy link
Contributor Author

On 26 Jun, 2015, at 12:41, David Dotson wrote:

One advantage of storing the universe definitions as arguments/keywords to the Universe constructor is that the state of the Universe on recall is unambiguous: it is entirely clear what the Universe should look like and how it should behave based on these inputs. Pulling from a binary blob like a pickle, however, means that you get whatever is there, and that could be completely opaque even after loading the Universe.

In other words: I think statelessness is important for doing science in practice. MDSynthesis sacrifices some statelessness for convenience, but it should try to make what state it maintains as transparent as possible. I don't know where the sweet spot is for this philosophy with respect to pickled universes.

I think there's some wisdom in the approach of starting from a clean slate. You might not want to start from a garbled universe. Or we need a "global reset" method on universes ("big bang"...) to start from the initial state. Honestly, I don't know how important that's going to be in practice but I can see pros and cons.

@richardjgowers
Copy link
Contributor

To play devil's advocate, if I'd loaded a Universe and renamed all the atoms to my liking, I'd be pretty dismayed if I'd lost all that when I reloaded the Universe with my favourite persistence engine :)

@orbeckst
Copy link
Contributor Author

On 26 Jun, 2015, at 13:07, Richard Gowers wrote:

To play devil's advocate, if I'd loaded a Universe and renamed all the atoms to my liking, I'd be pretty dismayed if I'd lost all that when I reloaded the Universe with my favourite persistence engine :)

So what you're really saying is, "please @dotsdl, give me a choice" ;-)

@dotsdl
Copy link
Member

dotsdl commented Jun 26, 2015

To play devil's advocate, if I'd loaded a Universe and renamed all the atoms to my liking, I'd be pretty dismayed if I'd lost all that when I reloaded the Universe with my favourite persistence engine :)

I think we can have our cake and eat it too, so long as there is some kind of Sim.universes.reset() method (as @orbeckst suggests) that perhaps takes a universe name as input and removes the pickled version of that universe (or replaces it with a fresh one based on its stored arguments/keywords). This would allow a user to nuke the universe's stored state without having to mess with re-adding the universe definition entirely.

Sound good? I'm not up on where we're at with Universe pickles; is it possible to begin prototyping this functionality now?

@richardjgowers
Copy link
Contributor

I think it's just the file handles open on Readers that stops them being pickled, I've got a first draft of a fix for that sort of working:

MDAnalysis/mdanalysis@d72ff7b

Once that's working, the getstate/setstate for Reader become easy, and I think that's everything, so soon(ish)

@dotsdl
Copy link
Member

dotsdl commented Jun 26, 2015

Awesome. In the meantime I'll work on the issue of handling and storing keywords. I think this hybrid approach of storing inputs for Universe.__init__ and pickling might work rather well without being too confusing. The details will require some thought, though.

@dotsdl dotsdl modified the milestones: 0.5.1, 0.6.0 Aug 25, 2015
dotsdl added a commit that referenced this issue Apr 1, 2016
Added test to make sure this keeps working; caveat: does not
automatically pull keywords used to initialize the universe, so not
perfect. See #25.
@dotsdl
Copy link
Member

dotsdl commented Apr 1, 2016

It's now possible to set a Sim's universe directly:

import mdsynthesis as mds
import MDAnalysis as mda
from MDAnalysisTests.datafiles import GRO, XTC

u = mda.Universe(GRO, XTC)

s = mds.Sim('adk')
s.universe = u

Note that at the moment this only extracts the topology and trajectory paths and stores these; it does not extract any keyword arguments used to initialize the Universe as this requires the Universe to know these (MDAnalysis/mdanalysis#292).

@richardjgowers
Copy link
Contributor

So once mda can pickle the topology, we can then make this work 'perfectly', right?

@dotsdl
Copy link
Member

dotsdl commented Apr 1, 2016

Basically, yeah. The long game I'm thinking is that once mda can JSON serialize the topology (MDAnalysis/mdanalysis#643), then perhaps we make a setting for the Universe such that it serializes the topology to a file on every change. MDS can then take advantage of this. There are certainly details to think about, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants