Paper: PMDA – Parallel Molecular Dynamics Analysis #476

Merged
merged 116 commits into from Jul 3, 2019
9b3524f
generate folder and rst, bib files
VOD555 May 17, 2019
a3bced1
add abstract
VOD555 May 17, 2019
e166e7b
skeleton for PMDA paper
orbeckst May 20, 2019
b46f485
add brief description of pmda.parallel.ParallelAnalysisBase
VOD555 May 20, 2019
210309a
new pmda.bib
VOD555 May 20, 2019
ef27ec2
correct pmda.bib
VOD555 May 20, 2019
3cbbcce
change abstract
VOD555 May 20, 2019
6a74028
change sign
VOD555 May 20, 2019
7e653d4
reduced size of bib file
orbeckst May 20, 2019
74a1de2
updated title and citations
orbeckst May 20, 2019
df1cf48
minor edits and comments on workflow
orbeckst May 20, 2019
4f29605
equal contributions: Max and Shujie
orbeckst May 20, 2019
d4ddcda
Merge pull request #16 from VOD555/2019
orbeckst May 20, 2019
b758b1b
manually merged shujie_fan.rst into fan.rst
orbeckst May 20, 2019
7e47bf8
renamed paper to pmda.rst
orbeckst May 20, 2019
c74f224
remove time record part in method section
VOD555 May 21, 2019
36f4020
change name
VOD555 May 21, 2019
0af8f45
remove Timing
VOD555 May 21, 2019
b0816db
fix title overline
VOD555 May 21, 2019
58f5d55
add user-defined parallel task 1
VOD555 May 21, 2019
df6fa01
add self-defined analysis task 2
VOD555 May 21, 2019
f1871e4
Rearrange With pmda.parallel.ParallelAnalysisBase
VOD555 May 21, 2019
f237713
references for MD applications
orbeckst May 21, 2019
7b6710e
first two introductory paragraphs + XSEDE ack
orbeckst May 21, 2019
1e5fe9d
ref prev work
orbeckst May 21, 2019
a324a01
introduction v1
orbeckst May 21, 2019
b1b046e
Merge pull request #21 from Becksteinlab/introduction
orbeckst May 21, 2019
ec2ebab
add to intro: contains library of analysis classes
orbeckst May 21, 2019
156bd06
moved code availability to end
orbeckst May 21, 2019
edac61c
Methods: reference numpy
orbeckst May 21, 2019
1ec5e5d
consistent adornments for headings
orbeckst May 21, 2019
c04b36b
more methods defs
orbeckst May 22, 2019
b5526ed
methods (#23): time series vs reduction
orbeckst May 22, 2019
23bf0d6
add benchmark introduction
VOD555 May 22, 2019
49bcbe3
merge changes
VOD555 May 22, 2019
43d8b80
remove timeit part
VOD555 May 22, 2019
1669922
methods: PMDA schema
orbeckst May 22, 2019
862e7ef
Merge branch '2019' of github.com:Becksteinlab/scipy_proceedings into…
orbeckst May 22, 2019
e09ad83
methods: implementation
orbeckst May 22, 2019
78961f1
better line breaking of code
orbeckst May 22, 2019
dbccb05
Update pmda.rst
richardjgowers May 22, 2019
ff7412d
methods: performance evaluation
orbeckst May 22, 2019
c07f8aa
updated examples and usage section
orbeckst May 22, 2019
a3cbbdf
code fix for Rgyr
orbeckst May 22, 2019
4d8b8b1
add efficiency and speedup plot for rdf and rms
VOD555 May 22, 2019
acf619b
add total time comparison
VOD555 May 22, 2019
d994849
add links to data repo to Methods
orbeckst May 22, 2019
0875aac
Merge branch '2019' of github.com:Becksteinlab/scipy_proceedings into…
orbeckst May 22, 2019
8dbc6da
combine total time, efficiency, speedup for rdf
VOD555 May 22, 2019
302a9a3
Merge branch '2019' of github.com:Becksteinlab/scipy_proceedings into…
VOD555 May 22, 2019
c29a2b7
combine total, efficiency, speed up for rms
VOD555 May 22, 2019
818833c
update acknowledgements
kain88-de May 22, 2019
9ca29f6
add fig for wait, compute, io times of rms
VOD555 May 22, 2019
71c2a42
add fig for wait compute io rdf, remove the unfixed wait compute io f…
VOD555 May 22, 2019
01fef5c
add Table with benchmark environments
orbeckst May 22, 2019
18f13c9
Merge branch '2019' of github.com:Becksteinlab/scipy_proceedings into…
orbeckst May 22, 2019
24e7a8f
fix wait compute io plot for rms
VOD555 May 22, 2019
19111b5
Merge branch '2019' of github.com:Becksteinlab/scipy_proceedings into…
orbeckst May 22, 2019
5bdbb88
add figures to Results
orbeckst May 22, 2019
1009f73
fix new fig captions
orbeckst May 22, 2019
c2a9eef
add graph for rdf's prepare, conclude universe time
VOD555 May 22, 2019
4af4bc0
add graph for rms' prepare, conclude and universe time
VOD555 May 22, 2019
e79a444
corrected how detailed timing information was obtained
orbeckst May 22, 2019
d858536
Merge branch '2019' of github.com:Becksteinlab/scipy_proceedings into…
orbeckst May 22, 2019
98531c9
Results: completed RMSD section
orbeckst May 22, 2019
0c2bee0
corrected water g(r): OO
orbeckst May 22, 2019
e8d45d0
results: finished RDF
orbeckst May 23, 2019
13557fd
wrote conclusions
orbeckst May 23, 2019
43cec73
final spell check
orbeckst May 23, 2019
5ad7e8e
small improvements
orbeckst May 23, 2019
3e2953d
abstract fix
orbeckst May 23, 2019
1ba6eac
more abstract fix
orbeckst May 23, 2019
3701f39
tense fix: use past tense for result
orbeckst May 23, 2019
b85f750
made optional/required methods clearer
orbeckst May 23, 2019
d0c8eba
maded AnalysisFromFunction() a bit clearer
orbeckst May 23, 2019
2b9ef60
more tense fixes (RMSD results)
orbeckst May 23, 2019
8198feb
Merge branch '2019' into patch-1
orbeckst May 23, 2019
731c99d
Merge pull request #25 from richardjgowers/patch-1
orbeckst May 23, 2019
b4f1b62
Merge pull request #26 from kain88-de/patch-1
orbeckst May 23, 2019
606ed6d
load booktabs package explicitly
orbeckst May 23, 2019
3746e85
consistently italicized multiprocessing and distributed
orbeckst May 23, 2019
069cfe6
more conservative description of RMSD speed-up
orbeckst May 23, 2019
24f8444
add DOI for test trajectories
orbeckst May 30, 2019
7f3c52f
break code inside column
orbeckst May 30, 2019
f775cff
improved AnalysisFromFunction example
orbeckst May 31, 2019
050947c
fix number of cores and nodes for ssd distributed
VOD555 Jun 7, 2019
2cf16d0
fixed: Table: only 1 node for distributed/SSD
orbeckst Jun 7, 2019
b282591
Merge branch '2019' of github.com:Becksteinlab/scipy_proceedings into…
orbeckst Jun 7, 2019
ace8281
fixed definition of speed-up S(M)
orbeckst Jun 13, 2019
cb46d91
add zenodo DOI for data/script repository
orbeckst Jun 13, 2019
4220250
fixed typo found by reviewer @cyrush
orbeckst Jun 13, 2019
76416eb
data is a plural noun: fixed
orbeckst Jun 13, 2019
97a90d1
added particle type indices to make g(r) equation clearer
orbeckst Jun 13, 2019
783dda4
methods updates
orbeckst Jun 13, 2019
4af3190
installation details
orbeckst Jun 14, 2019
f72c0b6
re-arranged performance evaluation section
orbeckst Jun 14, 2019
bf5ee59
float juggling to make figures appear sooner
orbeckst Jun 14, 2019
d9e1a11
cleaned up bib file
orbeckst Jun 14, 2019
07def91
updated software versions
orbeckst Jun 14, 2019
67b7afb
add detail for RMSD calculation
orbeckst Jun 14, 2019
c31013d
text improvements in RMSD Task results section
orbeckst Jun 14, 2019
0fe1c37
add errorbars to wait compute io plot
VOD555 Jun 19, 2019
f7c2003
Merge branch '2019' of github.com:Becksteinlab/scipy_proceedings into…
VOD555 Jun 19, 2019
a978b2d
add errorbars to pre_con_uni plots
VOD555 Jun 19, 2019
f44abfe
add error bars total efficiency speedup plots
VOD555 Jun 19, 2019
28cbff8
add stacked graphs with percentage times
VOD555 Jun 19, 2019
13f4579
modify color of lines on graphs
VOD555 Jun 19, 2019
d78a7e9
updated text for plots with error bars
orbeckst Jun 19, 2019
108ecf5
Merge branch '2019' of github.com:Becksteinlab/scipy_proceedings into…
orbeckst Jun 19, 2019
2da50d9
integrate stacked fraction of time plots
orbeckst Jun 20, 2019
615ae24
add reviewer @cyrush to Acknowledgements for the idea of the stacked …
orbeckst Jun 20, 2019
a32203d
updated computational details for RDF calculation
orbeckst Jun 20, 2019
b82547f
fix some typo
VOD555 Jun 20, 2019
aa19461
updated text for errors of speedup and efficiency
VOD555 Jun 20, 2019
a18166d
fix typo
VOD555 Jul 2, 2019
ade200a
fix typo
VOD555 Jul 2, 2019

introduction v1

- code availability and development process (can be moved elsewhere)
- key idea of PMDA
orbeckst committed May 21, 2019
commit a324a01fdaa57c066ff1eb515f8558e44057fa32
@@ -61,9 +61,9 @@
.. |avg_tIO| replace:: :math:`\langle t_\text{I/O} \rangle`
.. |Ncores| replace:: :math:`N`

----------------------------------------------
-PMDA - Parallel Molecular Dynamics Analysis
----------------------------------------------
+-------------------------------------------
+PMDA - Parallel Molecular Dynamics Analysis
+-------------------------------------------

.. class:: abstract

@@ -89,6 +89,7 @@




============
Introduction
============
@@ -101,10 +102,23 @@ Currently simulated systems may contain millions of atoms and the trajectories c
Processing and analyzing these trajectories is increasingly becoming a rate-limiting step in computational workflows :cite:`Cheatham:2015qf, Beckstein:2018aa`.
Modern MD packages are highly optimized to perform well on current HPC clusters with hundreds of cores, such as the XSEDE supercomputers :cite:`XSEDE`, but current general-purpose trajectory analysis packages :cite:`Giorgino:2019aa` were not designed with HPC in mind.

-In order to scale up trajectory analysis from workstations to HPC clusters with the MDAnalysis_ Python library :cite:`Michaud-Agrawal:2011fu,Gowers:2016aa` we leveraged Dask_ :cite:`Rocklin:2015aa, Dask:2016aa`, a task-graph parallel framework, together with the distributed scheduler, and created the *Parallel MDAnalysis* (PMDA_) library.
+In order to scale up trajectory analysis from workstations to HPC clusters with the MDAnalysis_ Python library :cite:`Michaud-Agrawal:2011fu,Gowers:2016aa` we leveraged Dask_ :cite:`Rocklin:2015aa, Dask:2016aa`, a task-graph parallel framework, together with Dask's distributed scheduler, and created the *Parallel MDAnalysis* (PMDA_) library.
By default, PMDA follows a simple split-apply-combine :cite:`Wickham:2011aa` approach for trajectory analysis, whereby each task analyzes a single trajectory segment and reports back the individual results that are then combined into the final result :cite:`Khoshlessan:2017ab`.
Our previous work established that Dask worked well with MDAnalysis :cite:`Khoshlessan:2017ab` and that this approach was competitive with other task-parallel approaches :cite:`Paraskevakos:2018aa`.
However, we did not provide a general-purpose framework to write parallel analysis tools with MDAnalysis.
Here we show how the split-apply-combine approach lends itself to a generalizable Python implementation that makes it straightforward for users to implement their own parallel analysis tools.
At the heart of PMDA is the idea that the user only needs to provide a function that analyzes a single trajectory frame.
PMDA provides the remaining framework via the ``ParallelAnalysisBase`` class to split the trajectory, apply the user's function to trajectory frames, run the analysis in parallel via Dask/distributed, and combine the data.
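
The split-apply-combine idea described above can be illustrated with a minimal, self-contained sketch. This is not PMDA's implementation or API: the names ``split_frames``, ``analyze_block``, ``run_parallel`` and ``mean_coordinate`` are hypothetical, the "trajectory" is a plain list of frames, and a thread pool from the Python standard library stands in for Dask/distributed. The only part a user would write is the single-frame function.

```python
from concurrent.futures import ThreadPoolExecutor


def split_frames(n_frames, n_blocks):
    """Split: divide frame indices 0..n_frames-1 into contiguous segments."""
    base, extra = divmod(n_frames, n_blocks)
    blocks, start = [], 0
    for i in range(n_blocks):
        # The first `extra` blocks get one additional frame.
        stop = start + base + (1 if i < extra else 0)
        blocks.append(range(start, stop))
        start = stop
    return blocks


def analyze_block(trajectory, frames, single_frame):
    """Apply: run the user's per-frame function on one trajectory segment."""
    return [single_frame(trajectory[i]) for i in frames]


def run_parallel(trajectory, single_frame, n_blocks=4):
    """Run all segments concurrently and combine the per-segment results."""
    blocks = split_frames(len(trajectory), n_blocks)
    with ThreadPoolExecutor(max_workers=n_blocks) as pool:
        # pool.map preserves the order of `blocks`, so results stay in
        # trajectory order (assumption: a thread pool replaces Dask here).
        partial = pool.map(
            lambda frames: analyze_block(trajectory, frames, single_frame),
            blocks,
        )
        # Combine: concatenate the per-segment lists into one time series.
        return [value for block in partial for value in block]


def mean_coordinate(frame):
    """Hypothetical user-supplied analysis of a single frame."""
    return sum(frame) / len(frame)


# Toy "trajectory": 10 frames, each holding two coordinates.
trajectory = [[float(i), float(i) + 1.0] for i in range(10)]
timeseries = run_parallel(trajectory, mean_coordinate, n_blocks=3)
```

In this sketch the combine step simply concatenates the per-segment results into a time series; an analysis that reduces over frames (for example a histogram) would instead sum or otherwise merge the partial results.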


Code availability and development process
=========================================

PMDA is available in source form under the GNU General Public License v2 from the GitHub repository `MDAnalysis/pmda`_, and as a `PyPI package`_ and `conda package`_ (via the conda-forge channel).
Python 2.7 and Python :math:`\ge` 3.5 are fully supported and tested.
The package uses `semantic versioning`_ to make it easy for users to judge the impact of upgrading.
The development process uses continuous integration (`Travis CI`_): extensive tests are run on all commits and pull requests via pytest_, resulting in a current code coverage of 97\%, and the documentation_ is automatically generated by `Sphinx`_ and published as GitHub Pages.
Users are supported through the `community mailinglist`_ (Google group) and the GitHub `issue tracker`_.



@@ -364,3 +378,13 @@ References
.. _PMDA: https://www.mdanalysis.org/pmda/
.. _MDAnalysis: https://www.mdanalysis.org
.. _Dask: https://dask.org
.. _`MDAnalysis/pmda`: https://github.com/MDAnalysis/pmda
.. _`PyPi package`: https://pypi.org/project/pmda/
.. _`conda package`: https://anaconda.org/conda-forge/pmda
.. _`semantic versioning`: https://semver.org/
.. _documentation: https://www.mdanalysis.org/pmda/
.. _pytest: https://pytest.org
.. _Sphinx: https://www.sphinx-doc.org/
.. _`Travis CI`: https://travis-ci.com/
.. _`community mailinglist`: https://groups.google.com/forum/#!forum/mdnalysis-discussion
.. _`issue tracker`: https://github.com/MDAnalysis/pmda/issues