Adding auxiliary information to trajectory #785
Comments
From what I understand of your examples, you see the auxiliaries as objects that implement a Also, how do you see the assignment of the auxiliaries to an AtomGroup? In the case of |
Answering myself. It has to be a custom object; a simple iterator won't do. We will have to handle random seeking in the trajectory, so we cannot only read values from the auxiliaries sequentially. We will also have to handle alignment of the auxiliaries: an auxiliary may be saved at a different frequency than the coordinates. |
What I was thinking is that if the Ah yeah, sparse attributes aren't really handled currently. We'd need something like how pandas does missing information. And yes, meshing different frequencies is a lovely headache for someone. |
OK. So this is very similar to something I had in mind for coordinate transformations. I'll tidy up my proposal so it can be discussed in parallel, then. |
@richardjgowers I thought the
import numpy as np

class Reader(object):
    def add_auxillary(self, name, val):
        # Store the auxiliary directly in the instance namespace
        self.__dict__[name] = val

read = Reader()
read.add_auxillary('pulling_force', np.arange(5))
read.pulling_force
This makes it very easy to add new values. We wouldn't need to write special classes that can be added as auxiliaries as in your last example. This way the information is easily accessible like |
@kain88-de yeah, hacking it into the namespace is cool. But there still needs to be some sort of updating of the auxiliary data every time a frame is read? |
Yes. I've just given a simple example of the
Of course each needs to be handled specially, maybe with a class. And we need to do more than just add to the |
I would prefer the auxiliary data to be in a separate namespace. I think it would be cleaner to call
@kain88-de I am not sure I see how what you suggest differs from @richardjgowers' initial proposal. Could you elaborate? |
It doesn't differ that much; rather, it extends and generalizes it. With @richardjgowers' example we would need to write a new class for every new auxiliary that we want to add. I propose instead that we simply write classes to read the different file formats. xvg and other formats can contain many different observables, so this leaves the user more freedom.
# Richard
u.trajectory.add_auxiliary(PullForce('pull.xvg'))
# Me
u.trajectory.add_auxiliary('pulling_force', XVGReader('pull.xvg'))
## And an alias
u.trajectory.add_pulling('pull.xvg') |
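A minimal sketch of the name-plus-reader variant from the comment above, kept in a separate lookup rather than directly in the instance namespace. The `Trajectory` and `ListReader` classes are toy stand-ins invented for illustration, not MDAnalysis API; a real format reader such as the proposed `XVGReader` would take the place of `ListReader`.

```python
class ListReader:
    """Hypothetical stand-in for a format reader such as the proposed XVGReader."""

    def __init__(self, values):
        self.values = list(values)

    def __getitem__(self, frame):
        return self.values[frame]


class Trajectory:
    """Toy trajectory; not the MDAnalysis Reader."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self.frame = 0
        self._auxs = {}  # auxiliary name -> reader

    def add_auxiliary(self, name, reader):
        # One generic reader per file format; the name says what the data means.
        self._auxs[name] = reader

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        auxs = self.__dict__.get("_auxs", {})
        if name in auxs:
            return auxs[name][self.frame]
        raise AttributeError(name)


traj = Trajectory(n_frames=3)
traj.add_auxiliary("pulling_force", ListReader([0.5, 0.7, 0.9]))
traj.frame = 2
print(traj.pulling_force)
```

Because the value is looked up through the current frame, the attribute stays in step with the trajectory without any copying on each read.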
On 21/03/16 23:02, kain88-de wrote:
|
Yes, that way the headache @richardjgowers talked about would only need to be solved once. |
Is this what @fiona-naughton is going to do for GSoC? If so, we should restart the discussion here, especially with Fiona included. |
Adding to dict directly indeed sounds good. On the time sync front: when the auxiliary/trajectory are in phase (as it were), I guess it's just a matter of pulling up the matching timepoint, with some form of 'missing' label when the auxiliary data is less frequent. Otherwise we could find the closest time instead, but should probably flag that it doesn't exactly correspond to that point in the trajectory (the missing values in the less-frequent case could also be filled with the closest value); it may be worth also offering some sort of interpolation option to guess the value instead? |
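The interpolation option mentioned in the comment above could look roughly like the following. This is only a sketch using plain linear interpolation from the standard library; the function name `interpolate_aux` and the sample numbers are made up for illustration.

```python
from bisect import bisect_left


def interpolate_aux(aux_times, aux_values, t):
    """Linearly interpolate the auxiliary value at time t.

    aux_times must be sorted in ascending order. Returns None when t lies
    outside the recorded range, rather than extrapolating.
    """
    if not aux_times or t < aux_times[0] or t > aux_times[-1]:
        return None
    i = bisect_left(aux_times, t)
    if aux_times[i] == t:
        return aux_values[i]  # exact match, no interpolation needed
    t0, t1 = aux_times[i - 1], aux_times[i]
    v0, v1 = aux_values[i - 1], aux_values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


aux_times = [0.0, 2.0, 4.0]      # auxiliary saved every 2 time units
aux_values = [10.0, 20.0, 40.0]  # made-up values
for frame_time in [0.0, 1.0, 2.0, 3.0]:  # trajectory saved every 1 time unit
    print(frame_time, interpolate_aux(aux_times, aux_values, frame_time))
```

Returning `None` outside the recorded range is one possible way to express the "missing" label; a NaN or a masked value would serve the same purpose.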
Having finally had a proper look through the existing timestep/reader stuff, if I understand correctly this will work something like this: As above, given an aux name and a filename Similar to the existing readers, the Then in addition to a
Then call So, ultimately: auxiliary data is added by: trajectory.add_auxiliary(auxname, filename, **kwargs) with kwargs optionally including the auxiliary reader, dt, offset, or rep_method. The data can then be accessed either in full through Hopefully that sounds like it would work? I'm not sure where and how best to get all the |
From what I understand,
As I wrote earlier in the thread, I think it would be better to access the auxiliary data through
Going for the closest time looks to me like the right thing to do, assuming we have a user-defined tolerance. That tolerance can be low if we want an exact match between the trajectory and auxiliary times (these are floating-point numbers, so "exact" should be understood as "close enough"); it can be higher if the sampling rates do not exactly match and we are OK with this. The pandas library deals with aligning time series; it might be worth having a look at what they do. In addition to what you propose, I would suggest having the option of raising an exception if a time is missing in the auxiliary data, and the option to iterate over the frames at the pace of the auxiliary data if the auxiliary data are sparser. |
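As a rough illustration of the closest-time lookup with a user-defined tolerance and an optional exception, here is a standard-library sketch. The function name `closest_aux_index` and the `strict` flag are assumptions made for this example, not an existing API.

```python
from bisect import bisect_left


def closest_aux_index(aux_times, t, tolerance, strict=False):
    """Return the index of the auxiliary time closest to t.

    aux_times must be sorted in ascending order. If no auxiliary time lies
    within `tolerance` of t, return None, or raise ValueError when strict=True.
    """
    i = bisect_left(aux_times, t)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(aux_times)]
    best = min(candidates, key=lambda j: abs(aux_times[j] - t), default=None)
    if best is not None and abs(aux_times[best] - t) <= tolerance:
        return best
    if strict:
        raise ValueError("no auxiliary value within %g of time %g" % (tolerance, t))
    return None


aux_times = [0.0, 2.0, 4.0]
print(closest_aux_index(aux_times, 2.0000001, tolerance=1e-3))  # "exact" match
print(closest_aux_index(aux_times, 1.0, tolerance=0.1))         # nothing close enough
print(closest_aux_index(aux_times, 1.0, tolerance=1.0))         # looser tolerance
```

A small tolerance behaves like an exact match on floating-point times, while a larger one accepts mismatched sampling rates.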
Yeah, data has to be stored in the Timestep object; it'd be annoying if I had to one day rummage around in the Reader to find things. So wrt getting Reader to read aux things, we could add some things to base.Reader like
# in base.Reader
def next(self):
    ts = self._read_next_frame()  # Regular timestep read
    for aux in self._auxs:
        aux.read_next(ts)  # AuxReader examines ts to figure out what to put inside ts
    return ts
Where |
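To make the hook idea concrete, here is a self-contained toy version in which the same hook runs for both sequential iteration and random access, since seeking was raised earlier in the thread. All class names below (`Timestep`, `ArrayAux`, `Reader`) are simplified stand-ins written for this sketch and do not reproduce the MDAnalysis classes.

```python
class Timestep:
    """Toy timestep holding the frame index, time and auxiliary values."""

    def __init__(self, frame, time):
        self.frame = frame
        self.time = time
        self.aux = {}  # auxiliary values live on the Timestep


class ArrayAux:
    """Hypothetical auxiliary 'mini reader' with one value per frame."""

    def __init__(self, name, values):
        self.name = name
        self.values = values

    def read_next(self, ts):
        # Look at ts to decide which value belongs to this frame.
        ts.aux[self.name] = self.values[ts.frame]


class Reader:
    """Toy trajectory reader; dt is the time between frames."""

    def __init__(self, n_frames, dt=1.0):
        self.n_frames = n_frames
        self.dt = dt
        self._auxs = []

    def add_auxiliary(self, aux):
        self._auxs.append(aux)

    def _read_frame(self, frame):
        ts = Timestep(frame, frame * self.dt)
        for aux in self._auxs:  # same hook for sequential and random access
            aux.read_next(ts)
        return ts

    def __getitem__(self, frame):
        return self._read_frame(frame)

    def __iter__(self):
        for frame in range(self.n_frames):
            yield self._read_frame(frame)


reader = Reader(n_frames=3)
reader.add_auxiliary(ArrayAux("pull_force", [1.0, 2.0, 3.0]))
for ts in reader:
    print(ts.frame, ts.aux["pull_force"])
print(reader[1].aux["pull_force"])
```

Routing every read through one internal method is one way to keep auxiliaries in sync regardless of how the frame was reached.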
Ok, thanks. So we end up with the readers being stored in a list as |
On 22/05/16 04:08, Fiona Naughton wrote:
|
You should be able to break up this task by just writing a standalone XVGReader at first, so the following code should work:
from MDAnalysis.trajectory.auxreaders import XVGReader
myreader = XVGReader('pull.xvg')
for value in myreader:
    print(value)
print(myreader.get_value(4))  # not sure if this should pass in a Timestep or frame number... can fix later
Then all trajectory Readers are just "users" of this code. |
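A standalone reader of this kind might start out as small as the sketch below. It assumes the usual XVG layout (lines beginning with `#` or `@` are comments or plot directives, the rest are whitespace-separated numbers with time in the first column); the class name `SimpleXVGReader` and the sample data are invented for illustration and are not the reader that was eventually merged.

```python
import io


class SimpleXVGReader:
    """Hypothetical minimal reader for XVG-style text.

    Lines starting with '#' or '@' are skipped; every other non-empty line
    is parsed as whitespace-separated floats, time in the first column.
    """

    def __init__(self, fileobj):
        self.rows = []
        for line in fileobj:
            line = line.strip()
            if not line or line[0] in "#@":
                continue
            self.rows.append([float(x) for x in line.split()])

    def __len__(self):
        return len(self.rows)

    def __iter__(self):
        return iter(self.rows)

    def get_value(self, step):
        # Indexed by auxiliary step here; a Timestep-based lookup is left open.
        return self.rows[step]


xvg_text = """# pull force output (made-up numbers)
@    title "Pull force"
0.0  1.5
2.0  1.7
4.0  2.1
"""
myreader = SimpleXVGReader(io.StringIO(xvg_text))
for value in myreader:
    print(value)
print(myreader.get_value(1))
```

This version reads everything into memory up front, which is simple but is exactly the design choice debated later in the thread.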
There is an XVGReader in GromacsWrapper. You can use that code as a starting point if you like. Oliver Beckstein On May 24, 2016, at 3:00, Richard Gowers notifications@github.com wrote:
|
While it's relatively straightforward to in general get the auxiliary to update with the trajectory when iterating (since (I'm also unsure on the best place to first create the |
You can add a layer to
This might also be useful for You can add an |
Reading through the WIP PR (#868) I came across @richardjgowers' comment that we could read in xvg data all at once. I second that idea. However, I think the same will be applicable to any aux data format. Besides, holding all aux data in memory will certainly speed things up when loading new timesteps. I think it's important to decide on this right away, because the nascent object model will have to change accordingly. I'm bringing this up here, instead of in the PR, to keep the discussion together.
|
On 06/06/16 11:30, mnmelo wrote:
In the shorter term, I would go for the second option as it is what the Anyway, AuxReader should not care about how the data is obtained. It is |
Aux data could easily be something like 3 floats for every atom every step (maybe some derived quantity you don't want to recalculate) which would then make the data comparable in size to the trajectory. I'd rather let individual implementations ( |
Yeah, this was something I was unsure about; I went with reading in a step at a time largely because I wanted to try to get that figured out first. I agree it's probably safe to assume the xvg files won't be too large, but can I leave it for now and come back to it later? |
Seems a good and consensual plan. Still, I'd make sure to clearly define the roles of |
With value interpolations, there's already an interpolation class in scipy which does quadratic and cubic interpolations too. Scipy isn't a strict dependency, but it does seem daft to reinvent this wheel... |
Closed by #868 |
Originally started on the discussion board
It would be nice if we could add arbitrary data to run alongside a trajectory, so..
I think the best route might be adding in the hooks after a Timestep has been read, so
So each aux object is like a mini Reader object, which augments a trajectory Reader.
In the longer term, it would be cool if u.trajectory.add_auxiliary(PullForce('pull.xvg')) then made AtomGroup.pullforce a viable attribute; this should be possible with the new dynamic classes + transplant system.