reading Gromacs tpr files instead of psf #2

Closed
GoogleCodeExporter opened this issue Apr 4, 2015 · 33 comments

Comments

@GoogleCodeExporter

In order to work seamlessly with Gromacs it should be possible to use the
Gromacs binary [http://wiki.gromacs.org/index.php/.tpr_File tpr] file as
the basis of the topology.

Original issue reported on code.google.com by orbeckst on 29 Jan 2008 at 2:08

@GoogleCodeExporter
Author

If there were documentation on the format (besides the gromacs source), this
would be pretty easy to add (using the python struct module for reading binary
data files). Unfortunately, I could never find said documentation.
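
For readers unfamiliar with the struct module, here is a minimal sketch of reading a fixed-size binary record. The three-field header layout (`HEADER`) and the field names are invented for illustration; they are not the tpr layout, which is exactly the undocumented part.

```python
import io
import struct

# Hypothetical fixed-size header: big-endian int32, float32, int32.
# This layout is invented for illustration; it is NOT the tpr format.
HEADER = struct.Struct(">ifi")


def read_header(stream):
    """Read one HEADER record from a binary stream and return its fields."""
    raw = stream.read(HEADER.size)
    if len(raw) != HEADER.size:
        raise EOFError("file too short for header")
    return HEADER.unpack(raw)


# Round trip through an in-memory file instead of a real tpr file.
fake_file = io.BytesIO(HEADER.pack(58, 1.5, 1000))
version, scale, natoms = read_header(fake_file)
```

With a documented layout, the same pattern would extend field by field; without one, the format string has to be inferred from the Gromacs source.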

Original comment by naveen.m...@gmail.com on 11 Feb 2008 at 5:45

@GoogleCodeExporter
Author

Maybe we settle for a "poor man's" topology by simply reading a pdb. We don't
use connectivity at the moment anyway, and that should give us the basics to
work with.

Original comment by orbeckst on 5 Feb 2009 at 2:13

@GoogleCodeExporter
Author

We'll use a pdb (or maybe gro) file and build a simple topology; re-open this
Issue if you need/want a tpr reader.

Original comment by orbeckst on 2 Apr 2010 at 1:00

  • Changed state: WontFix

@GoogleCodeExporter
Author

Actually, the information needed to make the psf file is in the .top/.itp files 
which are plain text files. There's a perl script that can be used to convert a 
topology file to psf at 
http://www.ks.uiuc.edu/Research/vmd/script_library/scripts/top2psf/ 

Original comment by jmvanega...@gtempaccount.com on 15 Jun 2010 at 4:37

@GoogleCodeExporter
Author

Ah, good point: I forgot about Marc Baaden's script. Thanks.
(Although it would still be convenient if one could simply give analysis the 
"canonical" input that one would also feed to the g_* tools.)

Original comment by orbeckst on 15 Jun 2010 at 10:51

@GoogleCodeExporter
Author

It seems this issue has been open for a couple of years. How far are we from
analyzing the tpr file using python?

Original comment by alfred53...@gmail.com on 30 Nov 2011 at 8:00

@GoogleCodeExporter
Author

Hi Oliver, in response to your post of April 2, 2010, I'd like to reopen this
issue and request a tpr reader.  Actually, I wouldn't mind contributing to this
feature.  What can I do to get started?

Original comment by fullofgr...@gmail.com on 30 Nov 2011 at 8:24

@GoogleCodeExporter
Author

There hasn't been any work on a tpr reader but I'll reopen it, given that 
people are interested in it and are willing to work on it. 

I'd start by analyzing the gmxdump tool: It contains the functionality to turn 
a tpr into human-readable output. A range of possible approaches:

1) It would be possible to simply call 'gmxdump -s TPR' and parse the output 
although that's a pretty ugly solution as this means that you always need 
gmxdump on the system. 

2) It would be better to analyze the gmxdump (and related library functions) 
source code and write a fully self-contained gmxtprdump C program that easily 
compiles on its own. Then we could use that as the basis for a library to link 
into MDAnalysis. The only problem is that the TPR file format seems to change 
frequently and one would need to keep updating the code constantly.

3) Or analyze the layout of the tpr file from the gmxdump code and then use 
Python to read the file. I suspect that it is an XDR file, which can be 
accessed with the standard xdrlib http://docs.python.org/library/xdrlib.html.
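
As a rough illustration of approach 3, here is a sketch of decoding a few XDR primitives by hand. It uses struct rather than xdrlib, because xdrlib was deprecated in Python 3.11 and removed in 3.13. `XDRReader` and the sample bytes are made up for this example; whether a tpr file really is plain XDR, and what its field order is, remains the assumption stated above.

```python
import struct


class XDRReader:
    """Tiny decoder for a few XDR primitives (big-endian, 4-byte aligned)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def _take(self, n):
        """Return the next n bytes and advance the read position."""
        chunk = self.data[self.pos:self.pos + n]
        if len(chunk) != n:
            raise EOFError("unexpected end of XDR data")
        self.pos += n
        return chunk

    def unpack_int(self):
        return struct.unpack(">i", self._take(4))[0]

    def unpack_float(self):
        return struct.unpack(">f", self._take(4))[0]

    def unpack_string(self):
        # XDR strings are a 4-byte length followed by the bytes,
        # zero-padded to a multiple of 4.
        length = struct.unpack(">I", self._take(4))[0]
        text = self._take(length)
        self._take((4 - length % 4) % 4)  # skip the padding
        return text


# Hand-built sample: the string b"VERSION" (7 bytes + 1 pad byte), then int 58.
sample = struct.pack(">I", 7) + b"VERSION\x00" + struct.pack(">i", 58)
reader = XDRReader(sample)
tag = reader.unpack_string()
number = reader.unpack_int()
```

The design point is that the decoder itself is trivial; the work lies in knowing which primitive comes next, which has to be read off the Gromacs source.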


Regarding the older comment on ITPs: I wrote a simple ITP parser 
https://github.com/orbeckst/GromacsWrapper/blob/master/gromacs/fileformats/itp.py
for GromacsWrapper, which we could easily add to MDAnalysis if we needed to. 
We could get a full topology from a processed topology that 'grompp -pp' can 
write. The downside is that one must produce this topology, and most people 
don't do this when they create a tpr.

If anyone wants to get started then go ahead! You can either send patches 
(attach to the Issue) or clone the git repository and work on your clone. Once 
you have something ready to go, a developer can pull the changes from your 
repository.

Whoever wants to coordinate work on this issue should please reply in a 
comment. Then I will add them to the contributors and assign the issue to them.




Original comment by orbeckst on 30 Nov 2011 at 9:06

  • Changed state: New

@GoogleCodeExporter
Author

Original comment by orbeckst on 30 Nov 2011 at 9:09

@GoogleCodeExporter
Author

Hi Oliver, I can investigate this issue.  I'd like to go with the git repo 
cloning route.  Once I've made the changes,  will I be able to merge changes by 
pushing to the master repo?  Thanks a lot for your help!

Original comment by fullofgr...@gmail.com on 30 Nov 2011 at 11:54

@GoogleCodeExporter
Author

Hi Oliver, I've created a code.google.com private clone under my account.  I'll 
be making the changes to that repo.  I didn't fully understand what you meant 
by 'clone the git repository and work on your clone' earlier, but I'm guessing 
this was what you meant as the alternative route to uploading patches.

Original comment by fullofgr...@gmail.com on 30 Nov 2011 at 11:59

@GoogleCodeExporter
Author

[deleted comment]

@GoogleCodeExporter
Author

Hi fullofgrace88,

I assigned the issue to you and you can coordinate work on it. You will be able 
to change status of the issue and create wiki pages. 

With git we can now go to a more distributed development model (something we 
wanted to do for a long time): You've already cloned into 
http://code.google.com/r/fullofgrace88-mdanalysis-tpr-reader/ --- good! Work on 
your clone. Once you have a working implementation you give me a "pull request" 
(i.e. add a comment to the issue or email me). Then I can review and "git pull" 
the changes into the main line MDAnalysis, where they will appear as if you had 
added them directly.

Similarly, I'd recommend that anyone else who wants to contribute (e.g. 
alfred532008) also clones MDAnalysis and does pulls (or fullofgrace88 makes 
them committers on his clone). 

In any case, in my experience communication between developers is important. 
You could

* use the mailing list (if need be, we can create a "mdnalysis-developers" list)
* use the comments here
* create a wiki page (e.g. "TPR Reader Development")

But it is up to you how you organise the work. I'll be happy to answer 
questions but I won't have time to be directly working on this (I have lots of 
other things to do!) so I am very happy that you stepped up.

I'd start with thinking about what we need from the TPR. At the moment, the 
"topology" in MDAnalysis consists of

1. A list of Atom instances 
http://code.google.com/p/mdanalysis/source/browse/MDAnalysis/core/AtomGroup.py#83 .
At a minimum, each Atom needs to know its name (e.g. CA), number, type, 
resname, resid, segid, mass, charge. 

2. A list of bonds (right now a list of atomnumber pairs).

(We can also store angles, dihedrals, etc but we don't use this at the moment.)

If you can (in principle) get this information from a TPR then you can continue 
:-) (maybe check with gmxdump).

Have a look at the way the topology.PSFParser.__parseatoms() 
http://code.google.com/p/mdanalysis/source/browse/MDAnalysis/topology/PSFParser.py#94
generates the list of atoms[]. This is where we want to get.
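
To make the target concrete, the two pieces listed above (per-atom records and a list of atom-number pairs) can be sketched as follows. `AtomRecord` and `check_bonds` are hypothetical stand-ins, not the MDAnalysis Atom class, and the atom values are made-up sample data.

```python
from dataclasses import dataclass


@dataclass
class AtomRecord:
    """Hypothetical stand-in holding the minimum per-atom data."""
    number: int
    name: str
    type: str
    resname: str
    resid: int
    segid: str
    mass: float
    charge: float


# Made-up sample data, only to show the shape of the topology.
atoms = [
    AtomRecord(0, "N", "NH3", "ALA", 1, "PROT", 14.007, -0.30),
    AtomRecord(1, "CA", "CT1", "ALA", 1, "PROT", 12.011, 0.21),
    AtomRecord(2, "C", "C", "ALA", 1, "PROT", 12.011, 0.51),
]
bonds = [(0, 1), (1, 2)]  # pairs of atom numbers


def check_bonds(atoms, bonds):
    """Return True if every bond refers to atom numbers that exist."""
    numbers = {atom.number for atom in atoms}
    return all(i in numbers and j in numbers for i, j in bonds)
```

If gmxdump output shows all eight fields plus the bonded pairs for a given tpr, then the file carries enough information to fill structures of this shape.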



If you are new to git then perhaps the git docs at 
http://git-scm.com/documentation will help. (Also, as you know, MDAnalysis only 
switched to git recently so we're still trying to figure out how to make best 
use of git for our development process and any feedback will be highly 
appreciated!)

Oliver



Original comment by orbeckst on 1 Dec 2011 at 4:13

@GoogleCodeExporter
Author

Hi Zhuyi/alfred532008, 

of course there are lots of ways to participate even if you're not coding right 
away. The good thing about "open source" is that the code is in the open so you 
can read that code and learn from it. On googlecode you can even read code and 
make comments on the code (googlecode has this nice feature of "code review" 
and MDAnalysis allows code reviews from everyone). You can also use the 
comments in the Issue tracker or on the wiki to communicate your ideas. And 
then there's the mailing list for discussion. Other important things one can 
do are to submit patches, run tests, and write documentation.

Any of this is a valuable contribution to an open source project!

If you specifically want to learn more C I'd suggest reading code and then 
trying to modify existing code and make it work. It's hard in the beginning but 
at least you'll be working on something that is going to be useful right away.

For instance, you could try to take gmxdump and make it process TPRs only. The 
new 'tprdump' program should only contain code necessary for that purpose. It 
should be stripped down to a minimum, in the sense that removing anything else 
from it would break it. In this way we would learn what code functionality is 
required at a minimum to process TPRs.

Otherwise: make your ideas and suggestions known here! 

Oliver




Original comment by orbeckst on 1 Dec 2011 at 4:29

@GoogleCodeExporter
Author

Hi Oliver,

Zhuyi and I are from the same lab (Pomes lab), so we can communicate and 
coordinate fairly easily.  We would really like to contribute to mdanalysis 
and make it interface well with Gromacs.  Thanks for your guidance in getting 
us started and for all your other contributions to the MD open source software 
community; we really appreciate all your efforts over here.

Grace

Original comment by fullofgr...@gmail.com on 1 Dec 2011 at 6:47

@GoogleCodeExporter
Author

Hey Oliver,  It doesn't seem like I can add alfred532008 as a committer to my 
clone using code.google.com.  If we're both doing part of the work on the same 
issue,  what do you think the best method would be to contribute to the source 
repository?  Having two independent clones seems a little redundant. Thanks.

Original comment by fullofgr...@gmail.com on 1 Dec 2011 at 7:35

@GoogleCodeExporter
Author

First, I am happy that you're both working on this task. Once this is working 
it will be an extremely useful addition. In fact, I don't know of any other 
"general purpose" analysis package outside of Gromacs that reads tprs.

Secondly, I didn't realise that gcode handles clones very differently from full 
repositories – sorry about that. Nevertheless, it is kind of in line with the 
typical distributed development model where there are few "pushes" but many 
"pulls". For the moment I can think of a range of reasonable options:

1) Zhuyi simply clones your clone and you pull from each other's repo. In 
practice that's not a problem in my experience (I've done this with other 
projects) and it actually makes for a very good workflow. You can work on 
exactly your problem for a while and when you're ready you 'git pull' what 
everyone else has been up to.

2) Given that you're in the same lab, you set up an internal "bare repository" 
that you both can push into. Then Grace pushes that repo to gcode.

3) You file an enhancement request with google code to implement committers for 
clones. No idea if this will ever be implemented.

You could also create a full-fledged git repo (either on gcode or e.g. 
http://github.com or http://gitorious.org) – git does not care where it pulls 
changes from and the only advantage of the gcode clone setup is that it is 
shown under "clones". 

All things considered, (1) seems to be reasonable, e.g. having a clone named 
alfred532008-mdanalysis-tpr-reader would be a sensible thing.

I added some notes on DistributedDevelopment which might be of interest.

Original comment by orbeckst on 1 Dec 2011 at 9:13

@GoogleCodeExporter
Author

Any updates here? I'm considering giving this a look and would happily refrain 
if the solution is around the corner (or harness any leftovers ;) )

Original comment by jan...@gmail.com on 3 Aug 2012 at 3:32

@GoogleCodeExporter
Author

Here is a wiki article on the TPR Reader development, together with a few lines 
of code to extract the basic header information from a file

http://code.google.com/p/mdanalysis/wiki/TPRReaderDevelopment

Any hands on board with this work are very welcome! The wiki article above is a 
good starting point and shows that the task at hand is very easy. I'll bake an 
e-cake for any brave participants ;)

More broadly, this reader should probably be advertised and expedited outside 
of MDAnalysis since it might just be helpful to the MD community. 

Original comment by jan...@gmail.com on 7 Aug 2012 at 2:53

@GoogleCodeExporter
Author

Hi, Oliver,

I picked up this issue again. Continuing from Jan's header parser 
(http://code.google.com/p/mdanalysis/wiki/TPRReaderDevelopment), I have 
completed parsing the tpr for my particular version, which is 58. Next, I will 
try to integrate it into the MDAnalysis package. Do you have any guide that 
would help me understand how the MDAnalysis package is organized? Once I finish 
that, I may go back to the source code and add support for other versions of 
tpr files. You are quite right: the format seems to change very frequently, and 
the source code (mainly gromacs/gmxlib/tpxio.c) contains a lot of if/else 
statements for handling different versions.
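
To illustrate what those version checks amount to on the Python side, here is a minimal sketch of a reader whose record layout depends on the file version. The field names and the version threshold (60) are invented for illustration and do not come from tpxio.c.

```python
import io
import struct


def read_int(stream):
    """Read one big-endian 32-bit integer from a binary stream."""
    return struct.unpack(">i", stream.read(4))[0]


def read_record(stream, file_version):
    """Read a hypothetical record whose layout depends on the file version."""
    record = {"natoms": read_int(stream)}
    # Invented rule for illustration: an extra field exists from version 60 on.
    if file_version >= 60:
        record["ngroups"] = read_int(stream)
    else:
        record["ngroups"] = None
    return record


# The same reader handles an "old" and a "new" in-memory file.
old = read_record(io.BytesIO(struct.pack(">i", 10)), file_version=58)
new = read_record(io.BytesIO(struct.pack(">ii", 10, 3)), file_version=73)
```

Each such branch has to be mirrored from the C source, which is why supporting one more tpr version means re-reading every conditional that mentions it.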

Original comment by alfred53...@gmail.com on 14 Nov 2012 at 8:00

@GoogleCodeExporter
Author

Hi alfred532008,

The first step is to clone the MDAnalysis repository and create a feature 
branch off the devel branch (see 
https://code.google.com/p/mdanalysis/wiki/DevelopmentWorkflow ). Then work on 
your new branch (e.g. named "TPRReader" or "Issue/2"). This will make merging 
with the main code a lot easier.

I should also say that andy.somogyi has been working on a major rework of the 
topology reader, including using the Gromacs libraries directly. However, his 
code isn't quite ready yet so I'd be interested to see a Python-only solution 
like yours, too. 

In order to integrate a new topology reader you will need to add code to

1. MDAnalysis/topology, e.g. MDAnalysis/topology/TPRParser.py

   - look at PSFParser.py, GROParsers.py  for an example: you essentially have one function parse(filename) that does all the work

   - your code needs to read the topology file and build a universe from atoms, which is represented as a dict structure[] with keys _atoms, _bonds, _angles, _dihe, _impr. Only _atoms are absolutely required.

2. MDAnalysis/topology/__init__.py

   Register your parser by adding it to __all__, the import statement, and add the parse() function to the topology_parsers[] dict for key 'TPR'.

3. Write a test case and add a test tpr file, see 
https://code.google.com/p/mdanalysis/wiki/UnitTests 
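
The first two steps above can be sketched roughly as follows. This is a self-contained illustration, not the MDAnalysis code: `parse_tpr`, `load_topology` and the plain-dict atoms are hypothetical, and `topology_parsers` here is a local stand-in for the dict of the same name in MDAnalysis/topology/__init__.py.

```python
import os


def parse_tpr(filename):
    """Hypothetical parser: return the topology as a 'structure' dict."""
    # A real parser would read `filename`; this sketch returns fixed data.
    structure = {
        "_atoms": [{"number": 0, "name": "OW"}, {"number": 1, "name": "HW1"}],
        "_bonds": [(0, 1)],
    }
    return structure


# Local stand-in for the registry: file-format key -> parse function.
topology_parsers = {}
topology_parsers["TPR"] = parse_tpr


def load_topology(filename):
    """Pick a parser by file extension and check the required key."""
    key = os.path.splitext(filename)[1].lstrip(".").upper()
    if key not in topology_parsers:
        raise ValueError("no topology parser registered for %r" % key)
    structure = topology_parsers[key](filename)
    if "_atoms" not in structure:
        raise ValueError("parser did not return the required '_atoms' entry")
    return structure
```

Keeping the parser behind a single parse(filename) function means the rest of the package only ever sees the dict, so a pure-Python reader and a library-backed one could be swapped without touching the callers.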


If you have questions ask on the developer mailing list.

Oliver



Original comment by orbeckst on 14 Nov 2012 at 8:26

@GoogleCodeExporter
Author

alfred532008's TPR reader  
https://code.google.com/r/alfred532008-mdanalysis-tprreader/source/list?name=tprparser
is available as the feature branch 'pyTPRparser' 
https://code.google.com/p/mdanalysis/source/browse/package?name=pyTPRparser in 
the repository.

Anyone who wants to play with the code should be able to do so with commands 
such as

git fetch origin
git checkout pyTPRparser

Current limitations:
* Only reads TPR file version 58 (Gromacs around release 4.5.3)

For more information see the developer thread 
https://groups.google.com/d/topic/mdnalysis-devel/nMwUjAZR-iQ/discussion


Original comment by orbeckst on 24 Nov 2012 at 10:25

@GoogleCodeExporter
Author

Hi alfred532008, I'm still looking at this thread (monitoring only) and your 
work here is absolutely excellent!

The version issue is a serious problem. It may be possible to use CTYPES 
bindings here, to bind directly to gromacs/gmxlib/tpxio.c and use that c code 
to read the tprs and then access the c data structure from python. For an 
example, see 'grompy' by (Martin Hoefling, Roland Schulz) 
http://orbeckst.github.com/GromacsWrapper/alternatives.html

If this C binding works the way I think it does (see grompy/tpxio.py), it will 
make a lot of the code you wrote unnecessary, which is very sad, but we would 
steer clear of the version issues by relying on gromacs itself. 

It is clear at this stage that the absence of a python interface to gromacs (or 
any MD code out there) is a major hurdle to some users. 

Original comment by jan...@gmail.com on 2 Dec 2012 at 2:23

@GoogleCodeExporter
Author

Now I'm seriously reading the source, alfred532008, and this code is great. I 
take my point back: we should keep your pure-python version no matter what, 
because it gives us independence from the gromacs libraries (use case: a user 
without gromacs who wants to access a tpr file sent by a collaborator). 

Original comment by jan...@gmail.com on 2 Dec 2012 at 2:30

@GoogleCodeExporter
Author

Hi, Jan,

I have some updates. I am not familiar with the CTYPES you mentioned, but I 
feel it is worth looking into when I have time.

The good news is that I just found out that gromacs-4.0.x all have the same tpx 
version, which is 58, and similarly gromacs-4.5.x all have tpx version 73. This 
is quite unexpected; I don't know why there is such a big jump.
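
A small sketch of how a parser might use this mapping to tell users what it can read. The two entries in `TPX_TO_GROMACS` are the ones reported above; `SUPPORTED_TPX` and `describe_tpx` are hypothetical names for illustration.

```python
# Mapping reported above: tpx version -> Gromacs release series.
TPX_TO_GROMACS = {58: "4.0.x", 73: "4.5.x"}

# Hypothetical: the set of tpx versions a given parser claims to handle.
SUPPORTED_TPX = {58}


def describe_tpx(version):
    """Return a one-line description of a tpx version and its support status."""
    series = TPX_TO_GROMACS.get(version, "unknown release")
    status = "supported" if version in SUPPORTED_TPX else "not supported"
    return "tpx version %d (Gromacs %s): %s" % (version, series, status)
```

Checking the version up front lets the parser fail with a clear message instead of misreading a file written by a newer Gromacs.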

I am almost done with version 73, and will probably push it next week.  
Assuming very few people are using gromacs-3.x or even earlier versions now, I 
think we are pretty much done with the tpr parsing, unless a lot of people 
request support for earlier versions. Of course, there is no guarantee that my 
code is bug-free, even if it passes these simple unit tests.

Zhuyi

Original comment by alfred53...@gmail.com on 2 Dec 2012 at 6:22

@GoogleCodeExporter
Author

Nothing concrete yet but I've summarized some ideas about wrapping gromacs 
libraries in python 
[https://code.google.com/p/mdanalysis/wiki/WrappingGromacsInPython]

Original comment by jan...@gmail.com on 2 Dec 2012 at 11:15

@GoogleCodeExporter
Author

Hi Zhuyi,

Have you got updated code to read Gromacs 4.6.1 TPR files? 

I would like to integrate your TPRParser into the upcoming 0.8 release. It 
works pretty well, you have test cases, and there's no other Python code 
available at the moment that can easily read TPR files. However, reading modern 
TPR files would be rather important.

I did a little bit of work on your TPRParser, including adding docs 
http://mdanalysis.googlecode.com/git-history/pyTPRparser/package/doc/html/documentation_pages/topology/TPRParser.html
and fixing a small bug (related to parsing of angles), and I reorganized the 
files (I put your tpr_*.py files into a separate sub-module 
MDAnalysis.topology.tpr). If you pull the changes from the 'pyTPRparser' branch 
you will get all changes including the history (or see 
https://code.google.com/p/mdanalysis/source/list?name=pyTPRparser ).

Oliver


Original comment by orbeckst on 17 Apr 2013 at 5:17

  • Changed state: Started

@GoogleCodeExporter
Author

Hi, Oliver, 

I took a look at the changes. Thanks for adding all the docs.

No, I haven't updated it for Gromacs 4.6.x yet. I am still interested in 
writing the newer parser for modern tpr files, and would like to do it in April 
or May. When do you plan to release 0.8?

Zhuyi

Original comment by alfred53...@gmail.com on 18 Apr 2013 at 12:42

@GoogleCodeExporter
Author

This issue was closed by revision 7131d6e8fe1b.

Original comment by orbeckst on 29 May 2013 at 5:14

  • Changed state: Fixed

@GoogleCodeExporter
Author

Hi Zhuyi,

excellent work on the TPRParser. I am currently merging it into the develop 
branch for inclusion in 0.8. There are some issues to iron out with how bonds 
are treated (introduced by Jan in relation to Issue 23) when the Universe is 
built but it hopefully won't take too long.

Oliver

Original comment by orbeckst on 29 May 2013 at 5:27

@GoogleCodeExporter
Author

Hi Zhuyi and everyone else,

I merged the TPRparser into develop. I only had to make sure that bonds, 
angles, etc are stored as a list of tuples. UnitTests pass.

I also updated the docs for topology parsers and outlined the definition of the 
"structure" dict --- the data structure that we are currently using to capture 
the topology. It is stored as Universe._psf. Note that this data structure is 
subject to change and users should NOT rely on it. 

Congratulations to Zhuyi on closing an Issue that was more than 5 years old — 
this is the record for MDAnalysis.

Oliver


Original comment by orbeckst on 30 May 2013 at 12:06

  • Changed state: Verified

orbeckst pushed a commit that referenced this issue Jun 25, 2015
@dprada

dprada commented Aug 11, 2016

Hi all,

Following the instructions in the TPRParser documentation, I am reporting here that support for TPX version 110 from Gromacs 2016 is needed.

Thanks.

@orbeckst
Member

Hi @dprada,

Many thanks for reporting the issue and following the instructions.

However, these instructions are ancient and nowadays we'd much prefer if you just opened a new issue (e.g., "TPRParser does not support TPX version 110 from Gromacs 2016") so that all the discussion remains focused on the issue at hand. Could you please do this?

I will update the old instructions in the docs. — Apologies!

Thanks,
Oliver

abiedermann pushed a commit to abiedermann/mdanalysis that referenced this issue Jan 5, 2017

fixed docs for TPRParser: submit NEW issue for missing formats

See MDAnalysis#2 (comment)

[ci skip]