Make the TPR parser a bit faster #2804

jbarnoud · 2020-06-29T16:14:18Z

A function in the TPR parser calls list.pop thousands of times, which is
slow. This commit avoids that expensive call.

Taking the TPR from
https://github.com/bioexcel/covid_modelling_simulation_data/tree/master/spike_protein/full_spike/trimer
the parsing time on my computer goes from 25.6s to 9.39s. On a more
pathological TPR file, it goes from 3 minutes to about 6s.
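
The diff itself is not reproduced in this conversation, so the following is only a minimal sketch of the kind of change described, not the actual code in `tpr/obj.py`. It assumes the slow pattern is popping from the front of a flat list (`list.pop(0)`, which shifts every remaining element on each call) and that the data is laid out as repeated `(type, atom1, ..., atomN)` records. The function names and the record layout are hypothetical.

```python
def records_with_pop(flat, natoms):
    """Hypothetical slow version: consume records by popping from the front.

    Each pop(0) shifts all remaining elements, so the total cost grows
    roughly quadratically with the length of the list.
    """
    flat = list(flat)  # work on a copy so the caller's list is untouched
    while flat:
        flat.pop(0)  # discard the record type
        yield [flat.pop(0) for _ in range(natoms)]


def records_with_slices(flat, natoms):
    """Hypothetical fast version: walk the list with a cursor and slice.

    No element is ever shifted, so the total cost is linear.
    """
    stride = natoms + 1  # one type entry followed by natoms atom indices
    for cursor in range(0, len(flat), stride):
        yield flat[cursor + 1:cursor + stride]


# Three two-atom records, each prefixed by a type entry.
flat = [0, 10, 11, 0, 11, 12, 1, 12, 13]
assert list(records_with_slices(flat, 2)) == list(records_with_pop(flat, 2))
```

Both generators yield the same atom groups. Only the second avoids the repeated front-of-list removal, which is the sort of saving that would matter most on files with very long interaction lists.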

Changes made in this Pull Request:

PR Checklist

~~- [ ] Tests?~~
~~- [ ] Docs?~~

CHANGELOG updated?
Issue raised/referenced?

richardjgowers

@jbarnoud looks solid, thanks!

codecov · 2020-06-29T20:36:29Z

Codecov Report

Merging #2804 into develop will decrease coverage by 0.13%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #2804      +/-   ##
===========================================
- Coverage    92.22%   92.08%   -0.14%     
===========================================
  Files          184      183       -1     
  Lines        24141    23670     -471     
  Branches      3123     3083      -40     
===========================================
- Hits         22263    21797     -466     
+ Misses        1813     1808       -5     
  Partials        65       65

| Impacted Files | Coverage Δ | |
|---|---|---|
| package/MDAnalysis/topology/tpr/obj.py | `96.72% <100.00%> (-0.06%)` | ⬇️ |
| package/MDAnalysis/auxiliary/base.py | `90.75% <0.00%> (-0.57%)` | ⬇️ |
| package/MDAnalysis/coordinates/chain.py | `91.94% <0.00%> (-0.37%)` | ⬇️ |
| package/MDAnalysis/coordinates/base.py | `93.88% <0.00%> (-0.33%)` | ⬇️ |
| package/MDAnalysis/coordinates/TRZ.py | `88.16% <0.00%> (-0.27%)` | ⬇️ |
| package/MDAnalysis/coordinates/GSD.py | `88.63% <0.00%> (-0.26%)` | ⬇️ |
| package/MDAnalysis/coordinates/chemfiles.py | `88.12% <0.00%> (-0.22%)` | ⬇️ |
| package/MDAnalysis/coordinates/INPCRD.py | `93.33% <0.00%> (-0.22%)` | ⬇️ |
| package/MDAnalysis/coordinates/GMS.py | `92.30% <0.00%> (-0.16%)` | ⬇️ |
| package/MDAnalysis/coordinates/TXYZ.py | `93.18% <0.00%> (-0.16%)` | ⬇️ |
| ... and 38 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Powered by Codecov. Last update 8314f7d...2fc42b3.

orbeckst · 2020-06-29T20:54:18Z

Could be backported to 1.0.1 via PR #2798.

* Make the TPR parser a bit faster

  A function in the TPR parser calls list.pop thousands of times, which is slow. This commit avoids that expensive call.

  Taking the TPR from https://github.com/bioexcel/covid_modelling_simulation_data/tree/master/spike_protein/full_spike/trimer the parsing time on my computer goes from 25.6s to 9.39s. On a more pathological TPR file, it goes from 3 minutes to about 6s.

* Update changelog for #2804

Co-authored-by: Richard Gowers <richardjgowers@gmail.com>

jbarnoud added 2 commits June 29, 2020 17:05

Update changelog for #2804

174dd40

richardjgowers approved these changes Jun 29, 2020

orbeckst assigned richardjgowers Jun 29, 2020

Merge branch 'develop' into slightly-faster-tpr

2fc42b3

richardjgowers merged commit 61e236d into develop Jul 2, 2020

richardjgowers deleted the slightly-faster-tpr branch July 2, 2020 17:16

fiona-naughton added enhancement Component-Readers labels Sep 25, 2023
