Timestep API interface discussion (was: should Timestep.time start with 0.0?) #252

dotsdl · 2015-04-14T01:49:10Z

The coordinates.Reader.time property gives the frame number (1-based) multiplied by the timestep (dt) between frames of the trajectory (determined from the first two frames of the trajectory for xdrfiles). However, the result of this is that the time of the first frame is shifted by dt from what is commonly found in trajectory files, where the first frame (t = 0) is usually written.

In other words, where universe.trajectory.time would be more accurate to give 0.0, with a dt = 1000.0 ps it would instead give 1000.0 for the time of the first frame.

It might be a bit disruptive to existing code, but should this be changed? It becomes less relevant if time becomes a common attribute for all Readers (#250), but then again, it also offers a point for confusion.

The text was updated successfully, but these errors were encountered:

richardjgowers · 2015-04-14T09:21:47Z

I think this makes sense. The obvious exception should be that if a Timestep has its own record of time (eg the simulation didn't start at t=0) then this should override everything.

The only thing I can think of is that this is technically something that should "belong" to the Timestep, a Reader doesn't have an inherent time, but a Timestep does. So all the handling should be done there... so probably all Reader should do is access Timestep.time... ie as a Timestep, I try and return my record of time, but failing that I fudge one for the Reader.

So something like:

class Reader
    @property
    def time(self):
        return self.ts.time

class Timestep
    @property
    def time(self):
        try:
            return self.ts.time
        except AttributeError:
            return self.ts.frame * self.ts.dt

I'd consider the current implementation of Reader.time a bug worth changing.

dotsdl · 2015-04-14T20:25:36Z

So all the handling should be done there... so probably all Reader should do is access Timestep.time... ie as a Timestep, I try and return my record of time, but failing that I fudge one for the Reader.

I agree that Reader.time should just return whatever Reader.ts.time gives. Since ts.time should be present across all Timestep objects (#250), is there ever an instance where we would ever be compelled to fudge a time? The current implementation assumes a constant dt, which will be completely off if a frame is missing or if the timestep varies across the trajectory (perhaps rare, but still).

In short, are there trajectory formats that don't include time information? If so, how to handle them cleanly?

richardjgowers · 2015-04-14T21:56:55Z

Yeah I think I was half asleep when I wrote that, if .time is missing then I'm not sure how .dt would be populated...

I think it'd just be single frame stuff that doesn't have time.

mnmelo · 2015-04-15T12:35:25Z

I'm sure we'll run into exceptional cases. For instance, one can load multiple single-frame .pdb's or .gro's using the Chain Reader.
What time should be then made up?

Also, with the Chain Reader, will we reset the time back to 0 when multiple trajectories are loaded and the iteration goes from one to the next?

orbeckst · 2015-04-16T00:07:19Z

IIRC, the original idea for having Reader.time in the reader was to make clear that this is something that the reader guesses and not something read from the trajectory. However, this seems pretty confusing and we should rather strive for a consistent interface and simply doing the best we can under the hood.

There are a bunch of formats like XYZ or multi-PDB, which are common lowest common denominator — and in fact not even DCD stores the time in a frame. You have to compute it from the header pretty much as frame_number * dt.

orbeckst · 2015-04-16T00:10:08Z

It sounds like a good idea to me to make Timestep responsible for the time, similar to what it already does for dimensions.

We could get rid of Reader.time for the 1.0 release but in the meantime deprecate trajectory.time (while just returning whatever Timestep.time says).

orbeckst · 2015-04-21T06:28:25Z

I propose to

introduce universally Timestep.time (as part of the API):
- If the trajectory records time, it will contain the current frame's time (in MDA units)
- If the trajectory records time between frames dt and frame numbers N, it will calculate the time as N * dt.
  - Note: we need to decide if frame numbers should start at 0 or 1!
  - We could also introduce a new optional trajectory keyword time_offset (default 0) which could be used to start a trajectory at a later time, i.e. Timestep.time = N * dt + Reader.time_offset.
- If the trajectory does not contain time information then log a warning and pretend that dt = 1 and use the N * dt calculated time.
deprecate Reader.time
let the ChainReader try to use the trajectory time or the calculated time but if this fails (e.g. time for step N+1 is less than time for step N) use time_offset with some heuristic guesses to appropriately glue trajectories together.

The original question ("should trajectory time start at 0?") really boils down to "should frame numbering be 0 or 1 based". Currently it is supposed to be 1-based but this always creates confusion with the trajectory indexing and slicing notation so I would be in favor of harmonizing frame numbers and slicing/indexing and making it 0-based. I am not really sure, though, if there are deeper/annoying consequences to that decision.

Comments?

Once we have a sensible plan, we should open a few separate issues on sub-tasks. For instance, something like time_offset could be added later and the ChainReader can be handled separately once all other parts are in place. Just changing frame numbers from 1 to 0-based deserves its own issue.

richardjgowers · 2015-04-21T09:21:16Z

I think all the above sounds good. Changing frame numbering to 0 based makes sense in that trajectory[0] gives frame 0.

So if we're moving time into Timestep, then ChainReader will likely need to pass some sort of time_offset to Timestep.. (another reason why having a better constructor would be useful #250)

class Timestep
    def __init__():
        self.time_offset = kwargs.pop('time_offset', 0)

    @property
    def time(self):
        return self.frame * self.dt + self.time_offset

Then it's confusing if you need a frame offset or a time offset (ie are all the dts constants across your trajectories?).. but that's a headache for another Issue.

I'm putting this into 0.11 (the big API break).

dotsdl · 2015-04-26T07:16:49Z

If the trajectory records time between frames dt and frame numbers N, it will calculate the time as N * dt.

Are there trajectory formats that do this? That is, that assume a constant dt between each frame stored?

richardjgowers · 2015-04-26T07:17:42Z

Dcd I think
On 26 Apr 2015 08:16, "David Dotson" notifications@github.com wrote:

If the trajectory records time between frames dt and frame numbers N, it
will calculate the time as N * dt.

Are there trajectory formats that do this? That is, that assume a constant
dt between each frame stored?

—
Reply to this email directly or view it on GitHub
#252 (comment)
.

orbeckst · 2015-05-20T08:54:18Z

Yes, DCD does not store time stamps and instead you need to calculate time = frame * dt where dt is stored in the DCD header.

As I haven't heard more on this, I have summarized the conclusions and we should go ahead with the following changes:

Frame numbers

Frame numbers will start at 0 (i.e. zero-based instead of the current 0.9.2 practice of being 1-based).

This will harmonize slice notation with the frame numbering.

We'll need to be sure to have the tests working properly and go through the code with a fine comb because there might be various places where 1 is substracted/added....

Handling of time in `Timestep`

We will introduce a new optional trajectory keyword time_offset (default 0) to each Reader and the Timestep. I is used to start a trajectory at a later time. The Reader needs to pass the keyword's value to its Timestep, e.g.
```
class Timestep
def __init__():
    self.time_offset = kwargs.pop('time_offset', 0)

@property
def time(self):
    return self.frame * self.dt + self.time_offset
```
Note that in the above example the time property calculates the time from the frame number and the dt (time per frame) but in other cases it might take the time directly from a time stamp in the trajectory. In any case, time_offset is always added.
introduce a new optional trajectory keyword dt (default None) to Timestep so that the Timestep may compute a time as time = frame * dt (see next point).
introduce universally Timestep.time (as part of the API):
- If the trajectory records time, it will contain the current frame's time (in MDA units)
- If the trajectory records time between frames dt and frame numbers N, it will calculate the time as N * dt.
- If the trajectory does not contain time information then log a warning and pretend that dt = 1 and use the N * dt calculated time.
deprecate Reader.time (for 1.0)
update ChainReader
- Let the ChainReader try to use the trajectory time or the calculated time.
- But if this fails (e.g. time for step N+1 is less than time for step N) use time_offset with some heuristic guesses to appropriately glue trajectories together.

Further actions

Please open issue for relevant sub-tasks such as

implement time_offset
implement Timestep.time
change frame numbering from 1 to 0 based (test cases!!)
make the ChainReader work with the new system

Add relevant issues number to this comment or mention issue #252.
Close this issue once all sub-issues have been raised.

Updates to this comment

2015-05-20: added keyword dt for Timestep (obviously needed)

orbeckst · 2015-05-24T17:58:06Z

@dotsdl – can I please leave it to you to follow-up on this issue (raise new ones and monitor implementation)? I suggest that you copy the API proposal from my last comment to the wiki, raise the sub-issues as needed and reference the new wiki page. Once the issues are raised, close this one.

Thanks,
Oliver

dotsdl · 2015-05-24T20:38:14Z

Yes. Setting aside tomorrow (Memorial Day) to hammer on this, and probably the related issue #250.

orbeckst · 2015-06-24T22:22:46Z

I think I can close this issue because @dotsdl split it nicely into sub-issues #306, #307, #308, #309, and #310.

All these issues link back here so we can always come back to the discussion.

dotsdl added question usability labels Apr 14, 2015

orbeckst added the API label Apr 16, 2015

orbeckst removed the question label Apr 21, 2015

richardjgowers added this to the 0.11 milestone Apr 21, 2015

dotsdl mentioned this issue Apr 26, 2015

Trajectory reader Timestep objects give inconsistent interfaces #250

Closed

orbeckst assigned orbeckst and dotsdl and unassigned orbeckst May 21, 2015

richardjgowers added the Component-Readers label Jun 6, 2015

orbeckst added the Priority-Medium label Jun 20, 2015

orbeckst closed this as completed Jun 24, 2015

orbeckst mentioned this issue Jul 15, 2015

Reader API cleanup #350

Closed

orbeckst changed the title ~~Trajectory.time property: should it give 0.0 ps for first frame?~~ Timestep API interface discussion (was: should Timestep.time start with 0.0?) Jul 15, 2015

orbeckst mentioned this issue Aug 4, 2015

helper script for upgrading from 0.10 to 0.11 (ten2eleven.py) #377

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Timestep API interface discussion (was: should Timestep.time start with 0.0?) #252

Timestep API interface discussion (was: should Timestep.time start with 0.0?) #252

dotsdl commented Apr 14, 2015

richardjgowers commented Apr 14, 2015

dotsdl commented Apr 14, 2015

richardjgowers commented Apr 14, 2015

mnmelo commented Apr 15, 2015

orbeckst commented Apr 16, 2015

orbeckst commented Apr 16, 2015

orbeckst commented Apr 21, 2015

richardjgowers commented Apr 21, 2015

dotsdl commented Apr 26, 2015

richardjgowers commented Apr 26, 2015

orbeckst commented May 20, 2015

orbeckst commented May 24, 2015

dotsdl commented May 24, 2015

orbeckst commented Jun 24, 2015

Timestep API interface discussion (was: should Timestep.time start with 0.0?) #252

Timestep API interface discussion (was: should Timestep.time start with 0.0?) #252

Comments

dotsdl commented Apr 14, 2015

richardjgowers commented Apr 14, 2015

dotsdl commented Apr 14, 2015

richardjgowers commented Apr 14, 2015

mnmelo commented Apr 15, 2015

orbeckst commented Apr 16, 2015

orbeckst commented Apr 16, 2015

orbeckst commented Apr 21, 2015

richardjgowers commented Apr 21, 2015

dotsdl commented Apr 26, 2015

richardjgowers commented Apr 26, 2015

orbeckst commented May 20, 2015

Frame numbers

Handling of time in Timestep

Further actions

Updates to this comment

orbeckst commented May 24, 2015

dotsdl commented May 24, 2015

orbeckst commented Jun 24, 2015

Handling of time in `Timestep`