Make the TPR parser a bit faster (#2804)
* Make the TPR parser a bit faster

A function in the TPR parser calls list.pop thousands of times, which is
slow. This commit avoids that expensive call.

Taking the TPR from
https://github.com/bioexcel/covid_modelling_simulation_data/tree/master/spike_protein/full_spike/trimer
the parsing time on my computer goes from 25.6s to 9.39s. On a more
pathological TPR file, it goes from 3 minutes to about 6s.
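
As a rough illustration of why repeated list.pop(0) calls hurt: in CPython, removing the first element of a list shifts every remaining element, so draining a list from the front costs quadratic time overall. The sketch below is not part of the commit; the function names drain_with_pop and walk_with_index are hypothetical, and the timings depend on the machine.

```python
import timeit


def drain_with_pop(values):
    """Consume a copy of the list from the front, one item at a time."""
    values = list(values)  # copy so the caller's list is left untouched
    out = []
    while values:
        # Each pop(0) shifts all remaining items, so this loop is O(n**2).
        out.append(values.pop(0))
    return out


def walk_with_index(values):
    """Read the same items by index, without modifying the list."""
    out = []
    for i in range(len(values)):
        out.append(values[i])  # O(1) per access, O(n) overall
    return out


if __name__ == "__main__":
    # Doubling n should roughly quadruple the pop(0) time
    # but only double the indexed time.
    for n in (10_000, 20_000, 40_000):
        data = list(range(n))
        t_pop = timeit.timeit(lambda: drain_with_pop(data), number=3)
        t_idx = timeit.timeit(lambda: walk_with_index(data), number=3)
        print(f"n={n}: pop(0) {t_pop:.3f}s, index {t_idx:.3f}s")
```

Both helpers return the same items; only the cost of getting them differs.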

* Update changelog for #2804

Co-authored-by: Richard Gowers <richardjgowers@gmail.com>
2 people authored and orbeckst committed Jul 3, 2020
1 parent fe65603 commit 2e2672c
Showing 2 changed files with 8 additions and 9 deletions.
6 changes: 4 additions & 2 deletions package/CHANGELOG
@@ -13,7 +13,7 @@ The rules for this file:
* release numbers follow "Semantic Versioning" http://semver.org

------------------------------------------------------------------------------
-??/??/?? richardjgowers, IAlibay, orbeckst, tylerjereddy
+??/??/?? richardjgowers, IAlibay, orbeckst, tylerjereddy, jbarnoud

* 1.0.1

@@ -22,6 +22,8 @@ Fixes
* pip installation only requests Python 2.7-compatible packages (#2736)
* Testsuite does not use any more matplotlib.use('agg') (#2191)

+Enhancements
+* Improved performances when parsing TPR files (PR #2804)


06/09/20 richardjgowers, kain88-de, lilyminium, p-j-smith, bdice, joaomcteixeira,
@@ -229,7 +231,7 @@ Deprecations
* Writer.write_next_timestep is deprecated, use write() instead (remove in 2.0)
* Writer.write(Timestep) is deprecated, use either a Universe or AtomGroup

->>>>>>> develop
+
09/05/19 IAlibay, richardjgowers

* 0.20.1
11 changes: 4 additions & 7 deletions package/MDAnalysis/topology/tpr/obj.py
@@ -129,10 +129,7 @@ def __init__(self, name, long_name, natoms):
         self.natoms = natoms

     def process(self, atom_ndx):
-        while atom_ndx:
-            # format for all info: (type, [atom1, atom2, ...])
-            # yield atom_ndx.pop(0), [atom_ndx.pop(0) for i in range(self.natoms)]
-
-            # but currently only [atom1, atom2, ...] is interested
-            atom_ndx.pop(0)
-            yield [atom_ndx.pop(0) for i in range(self.natoms)]
+        # The format for all record is (type, atom1, atom2, ...)
+        # but we are only interested in the atoms.
+        for cursor in range(0, len(atom_ndx), self.natoms + 1):
+            yield atom_ndx[cursor + 1: cursor + 1 + self.natoms]
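
To check that the two versions of process agree, the sketch below restates them as standalone functions and compares their output on a small, made-up index list. The function names, the natoms parameter, and the sample data are illustrative assumptions rather than MDAnalysis API, and the input is assumed to be well formed (its length is a multiple of natoms + 1).

```python
def process_with_pop(atom_ndx, natoms):
    """Old approach: consume the flat list from the front (empties atom_ndx)."""
    while atom_ndx:
        atom_ndx.pop(0)  # discard the interaction type
        yield [atom_ndx.pop(0) for _ in range(natoms)]


def process_with_slices(atom_ndx, natoms):
    """New approach: step from record to record and slice out the atoms."""
    for cursor in range(0, len(atom_ndx), natoms + 1):
        yield atom_ndx[cursor + 1: cursor + 1 + natoms]


# Three hypothetical two-atom records laid out as (type, atom1, atom2).
flat = [7, 0, 1, 7, 1, 2, 9, 2, 3]

new_result = list(process_with_slices(flat, natoms=2))
# Pass a copy to the old version because it destroys its input.
old_result = list(process_with_pop(list(flat), natoms=2))

assert old_result == new_result == [[0, 1], [1, 2], [2, 3]]
```

One side effect differs: the pop-based version leaves the list it was given empty, while the slicing version does not modify it.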
