trajectory writer interface is inconsistent #206

GoogleCodeExporter · 2015-04-04T00:44:18Z

The interface to our trajectory writers is inconsistent and should be fixed.

Not all of them accept the same options (or fail if one provides an unrecognized one), which is bad if one wants to write general code that can write out any format understood by MDAnalysis.
The docs for MDAnalysis.coordinates.core. writer() (the dispatch function that selects a writer based on file extension and user input) says, for instance, that "some writers will have special arguments that have to be looked up" but also that there a number of common arguments are common with specific universal meanings for all writers.
Specifically, "delta is length of time between two frames, in ps [1.0]". The DCDWriter, for instance, breaks this promise and instead expects delta to be the integrator timestep in AKMA units. The TRZWriter takes a clue from DCDWriter and also uses delta for integrator timestep. The Gromacs XTC/TRR writer on the other hand use delta as the time between saved frames in ps.

The Trajectory API is actually vague on what Writers have to accept. This should be tightened and synced with the docs to writer().

At a minimum

all writers need to treat start, stop, step, delta in the same manner
all writers should gracefully deal with unknown parameters, i.e. implement catch-all **kwargs

Opinions?

Original issue reported on code.google.com by orbeckst on 20 Jan 2015 at 7:02

The text was updated successfully, but these errors were encountered:

GoogleCodeExporter · 2015-04-04T00:44:18Z

Original comment by orbeckst on 20 Jan 2015 at 7:03

Added labels: Maintainability, Milestone-Release1.0, Usability

GoogleCodeExporter · 2015-04-04T00:44:19Z

The testing for Writers is pretty haphazard right now, which is arguably how they've ended up diverging.

After the next release I can write some tests like I did for Timesteps where there's a mixin test for Writer and then each format.

Timestep tests:
https://code.google.com/p/mdanalysis/source/browse/testsuite/MDAnalysisTests/test_coordinates.py?name=develop#2778

Original comment by richardjgowers on 22 Jan 2015 at 4:00

richardjgowers · 2015-04-25T10:08:30Z

So I've bumped into this a little when playing with the ncdf Writer (Issue #257). This was previously not deciding on whether to write velocities/forces on creation, but instead deciding when passed its first Timestep to write. So previously:

W = mda.Writer('out.ncdf', numatoms=100)
# at this point, W hasn't created the header of the file, and isn't commited to writing velocities

W.write(ts)
# once passed a Timestep, it looked if _velocities were present, then wrote the header with velocities and forces fields

And now:

W = mda.Writer('out.ncdf', numatoms=100, velocities=True, forces=True)

# W is now commited to writing velocities and forces

W.write(ts)  # this will now expect Velocities and Forces

Which I think gives much better control, eg you can choose to write a new trajectory without vels and forces if you want. Treatment of velocities and forces is something that needs defining in the API however.

orbeckst · 2015-05-02T00:55:26Z

For formats that either contain velocities forces or not I agree with @richardjgowers 's approach to decide on what to do when creating the Writer instance.
For formats like TRR where each Timestep can decide on its own what it wants to contain, we cannot do that. It has to depend on what is passed to the Timestep – or @mnmelo might correct me here?

Perhaps we can combine this by allowing

positions=True|False|None
velocities=True|False|None
forces=True|False|None

with the following meanings:

True: "you must provide this for each Timestep or I raise a ValueError"
False: "you must not provide this for each Timestep or I raise a ValueError"
None: "you may provide this for each Timestep and I will save it"

Writers that do not allow the per-Timestep decision should raise an exception (ValueError) on init if they get fed None instead of either True or False. The defaults would be set to something sensible...

Would this make sense?

mnmelo · 2015-05-02T07:42:20Z

Yes, you're right @orbeckst, TRR writers can assume nothing from the start (see #162). I didn't realize earlier I this discussion this could also be a problem.

The following is a common setup for GROMACS run outputs:
Positions every x frames;
Velocities every v frames;
Forces every f frames;
Where typically v and f are the same but larger than x (and a multiple of it).

In this case there'll be frames only with positions, and others with all three properties. The TRR format only writes what's there, i.e., if there's no velocities/forces only positions get written to file; no space is reserved for the missing properties.

A somewhat more uncommon, but equally possible and valid, situation is to have all three x v f steps different and not multiple of each other. In this case there might be frames with any combination of positions velocities and forces.

If the user wants to replicate such a behavior either:
The writer is initialized with the step number for each property;
Or it must be flexible enough to let arbitrary properties be set on the fly.

The second option is also the most faithful to the format, which doesn't even require any sort of regular spacing, and even allows for frames without any positions/velocities/forces.

richardjgowers · 2015-07-19T13:46:55Z

So this has cropped up with #350 and #308 with delta and dt being used interchangeably in different modules.

The official API (which is based on DCD) defines delta as the MD integrator step (~1fs) and dt as the time step between saved frames (~1ps).
DCD & TRZ Writer used "dt" like this

TRR/XTC/NCDF used "delta" for time between frames.

I'm changing the "delta" writers to use "dt" instead.

fixed, skip_timestep, periodic, skip and delta no longer required for Readers (Issue #350) All writers now refer to time between steps as "dt" (was previously delta in XTC, TRR and NCDF Writers) (Issue #206) Timesteps now have a default dt of 1.0 ps (and issue warning when using this) Timestep attribute time now tries to return data['time'] first, then dt * frame, using the default value of 1.0 ps if necessary. (Issue #308)

orbeckst · 2015-07-19T16:08:28Z

I think somewhere in these long discussions we decided to use "dt" for the time between frames.

delta can then be one the data variables because it's not provided by all readers.

On 19 Jul, 2015, at 06:46, Richard Gowers wrote:

So this has cropped up with #350 and #308 with delta and dt being used interchangeably in different modules.

The official API (which is based on DCD) defines delta as the MD integrator step (~1fs) and dt as the time step between saved frames (~1ps).

DCD & TRZ Writer used "dt" like this

TRR/XTC/NCDF used "delta" for time between frames.

I'm changing the "delta" writers to use "dt" instead.

Oliver Beckstein * orbeckst@gmx.net
skype: orbeckst * orbeckst@gmail.com

richardjgowers · 2015-07-19T16:11:56Z

Yeah the problem is the Writers wanted differently named kwargs for this

richardjgowers · 2018-07-28T17:50:41Z

Ok, I'm raising this issue from the dead, we're (@micaela-matta) trying to write a TXYZ Writer... So currently Writer.write() expects either a Timestep, AtomGroup or Universe. For ascii formats especially, sticking to a Timestep is pretty lame for data, and also there's some weird hacks for Writers which need this.

Is it OK for Writers to just throw a failure if they get a Timestep and really can't make a sensible file from one? (ie a pdb file from just a timestep will be ugly)

kain88-de · 2018-07-28T17:53:30Z

Can't we just deprecate the Timestep? I find it cleaner to give a atomgroup or universe anyway.

richardjgowers · 2018-07-28T17:56:51Z

@kain88-de I'd prefer that yeah. Timestep is in AtomGroup, but not vice versa

orbeckst · 2018-08-12T16:57:34Z

ok

GoogleCodeExporter added Priority-Medium auto-migrated defect usability maintainability labels Apr 4, 2015

orbeckst removed the Milestone-Release1.0 label Apr 9, 2015

orbeckst modified the milestone: 1.0 Apr 9, 2015

orbeckst added the API label Apr 13, 2015

orbeckst self-assigned this May 2, 2015

richardjgowers added the Component-Readers label Jun 6, 2015

orbeckst added Priority-High and removed Priority-Medium labels Jun 20, 2015

richardjgowers mentioned this issue Jul 19, 2015

Feature readercleanup #357

Merged

mnmelo mentioned this issue Nov 8, 2015

Cleanup 'delta' and 'dt' consistency #522

Closed

richardjgowers removed the auto-migrated label Nov 28, 2017

orbeckst mentioned this issue Aug 12, 2018

Deprecate Timestep as argument for writers #2043

Closed

5 tasks

richardjgowers mentioned this issue Oct 16, 2018

WIP Issue 2043 deprecate ts write #2110

Merged

4 tasks

lilyminium mentioned this issue Jun 5, 2020

Release 1.0 #2443

Closed

6 tasks

richardjgowers modified the milestones: 1.0, 2.0 Jun 6, 2020

richardjgowers removed this from the 2.0 milestone May 4, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

trajectory writer interface is inconsistent #206

trajectory writer interface is inconsistent #206

GoogleCodeExporter commented Apr 4, 2015

GoogleCodeExporter commented Apr 4, 2015

GoogleCodeExporter commented Apr 4, 2015

richardjgowers commented Apr 25, 2015

orbeckst commented May 2, 2015

mnmelo commented May 2, 2015

richardjgowers commented Jul 19, 2015

orbeckst commented Jul 19, 2015

richardjgowers commented Jul 19, 2015

richardjgowers commented Jul 28, 2018

kain88-de commented Jul 28, 2018

richardjgowers commented Jul 28, 2018

orbeckst commented Aug 12, 2018

trajectory writer interface is inconsistent #206

trajectory writer interface is inconsistent #206

Comments

GoogleCodeExporter commented Apr 4, 2015

GoogleCodeExporter commented Apr 4, 2015

GoogleCodeExporter commented Apr 4, 2015

richardjgowers commented Apr 25, 2015

orbeckst commented May 2, 2015

mnmelo commented May 2, 2015

richardjgowers commented Jul 19, 2015

orbeckst commented Jul 19, 2015

richardjgowers commented Jul 19, 2015

richardjgowers commented Jul 28, 2018

kain88-de commented Jul 28, 2018

richardjgowers commented Jul 28, 2018

orbeckst commented Aug 12, 2018