Merge pull request #10 from wehs7661/enable_gmx_mpi
Enable `ensemble_md` to work with `gmx_mpi`
wehs7661 committed Jun 2, 2023
2 parents 90cbb1c + 11abe50 commit c661fb8
Showing 12 changed files with 300 additions and 265 deletions.
4 changes: 2 additions & 2 deletions .circleci/config.yml
@@ -39,10 +39,10 @@ jobs:
cd $HOME && mkdir pkgs
git clone https://gitlab.com/gromacs/gromacs.git
cd gromacs && mkdir build && cd build
cmake .. -DCMAKE_CXX_COMPILER=$CXX -DCMAKE_C_COMPILER=$CC -DCMAKE_INSTALL_PREFIX=$HOME/pkgs -DGMX_ENABLE_CCACHE=ON
cmake .. -DCMAKE_CXX_COMPILER=$CXX -DCMAKE_C_COMPILER=$CC -DCMAKE_INSTALL_PREFIX=$HOME/pkgs -DGMX_ENABLE_CCACHE=ON -DGMX_MPI=on
make install
source $HOME/pkgs/bin/GMXRC
gmx --version
gmx_mpi --version
ccache -s
- run:
2 changes: 1 addition & 1 deletion docs/getting_started.rst
@@ -16,7 +16,7 @@ in the future when possible.
===============
2.1. Requirements
-----------------
Before installing :code:`ensemble_md`, one should have working versions of `GROMACS`_. Please refer to the linked documentations for full installation instructions.
Importantly, :code:`ensemble_md` only works with MPI-enabled `GROMACS`_. Please refer to the linked documentation for full installation instructions.
All the other pip-installable dependencies of :code:`ensemble_md` (specified in :code:`setup.py` of the package)
will be automatically installed during the installation of the package.

65 changes: 42 additions & 23 deletions docs/simulations.rst
@@ -6,8 +6,8 @@
:code:`explore_EEXE` helps the user to figure out possible combinations of EEXE parameters, while :code:`run_EEXE` and :code:`analyze_EEXE`
can be used to perform and analyze EEXE simulations, respectively. Below we provide more details about each of these CLIs.

1.1. CLI `explore_EEXE`
-----------------------
1.1. CLI :code:`explore_EEXE`
-----------------------------
Here is the help message of :code:`explore_EEXE`:

::
@@ -29,8 +29,8 @@ Here is the help message of :code:`explore_EEXE`:
replicas.


1.2. CLI `run_EEXE`
-------------------
1.2. CLI :code:`run_EEXE`
-------------------------
Here is the help message of :code:`run_EEXE`:

::
@@ -58,18 +58,24 @@ Here is the help message of :code:`run_EEXE`:
The maximum number of warnings in parameter specification to be
ignored.

As with any other replica-exchange method, EEXE only works with MPI-enabled GROMACS. To leverage
the allocated computational resources, one can specify the number of MPI processes via the parameter
:code:`n_proc`, with the MPI launcher CLI (e.g. :code:`mpirun` or :code:`mpiexec`) specified via the
parameter :code:`mpi_cli`. In addition, one can specify the number of OpenMP threads per MPI process (i.e.,
:code:`-ntomp` used with the GROMACS :code:`mdrun` command) via the parameter :code:`runtime_args`, e.g.
:code:`runtime_args = {'-ntomp': '16'}`.
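
For a rough illustration (a hypothetical sketch, not :code:`ensemble_md`'s actual implementation; the helper name :code:`build_mdrun_command` is invented for this example), the parameters above combine into a launch command along these lines::

    def build_mdrun_command(gmx_executable='gmx_mpi', mpi_cli='mpirun',
                            n_proc=4, runtime_args=None):
        # Hypothetical helper: assemble an MPI-launched GROMACS mdrun command.
        cmd = [mpi_cli, '-np', str(n_proc), gmx_executable, 'mdrun']
        for flag, value in (runtime_args or {}).items():
            cmd += [flag, str(value)]  # e.g. {'-ntomp': '16'} -> '-ntomp 16'
        return ' '.join(cmd)

    print(build_mdrun_command(runtime_args={'-ntomp': '16'}))
    # prints: mpirun -np 4 gmx_mpi mdrun -ntomp 16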



In our current implementation, it is assumed that all replicas of an EEXE simulation are performed in
parallel using MPI. Naturally, performing an EEXE simulation using :code:`run_EEXE` requires a command-line interface
to launch MPI processes, such as :code:`mpirun` or :code:`mpiexec`. For example, on a 128-core node
in a cluster, one may use :code:`mpirun -np 4 run_EEXE` (or :code:`mpiexec -n 4 run_EEXE`) to run an EEXE simulation composed of 4
replicas with 4 MPI processes. Note that in this case, it is often recommended to explicitly specify
more details about the resources allocated to each replica. For example, one can specify :code:`{'-nt': 32}`
for the EEXE parameter :code:`runtime_args` (specified in the input YAML file, see :ref:`doc_EEXE_parameters`),
so that each of the 4 replicas will use 32 threads (assuming thread-MPI GROMACS), taking full advantage
of the 128 cores.

1.3. CLI `analyze_EEXE`
-----------------------
to launch MPI processes, such as :code:`mpirun` or :code:`mpiexec`. For example, to run an EEXE simulation
composed of 4 replicas, one must use :code:`mpirun -np 4 run_EEXE` (or :code:`mpiexec -n 4 run_EEXE`),
where the value of :code:`-np` should be the same as the number of replicas.
For more information about the parameter :code:`runtime_args` in the input YAML file, see :ref:`doc_EEXE_parameters`.

1.3. CLI :code:`analyze_EEXE`
-----------------------------
Finally, here is the help message of :code:`analyze_EEXE`:

::
@@ -210,13 +216,20 @@ In the current implementation of the algorithm, 22 parameters can be specified in
Note that the two CLIs :code:`run_EEXE` and :code:`analyze_EEXE` share the same input YAML file, so we also
include parameters for data analysis here.

3.1. GROMACS executable
-----------------------
3.1. Runtime configuration
--------------------------

- :code:`gmx_executable`: (Required)
- :code:`gmx_executable`: (Optional, Default: :code:`gmx_mpi`)
The GROMACS executable to be used to run the EEXE simulation. The value could be as simple as :code:`gmx`
or :code:`gmx_mpi` if the exeutable has be sourced. Otherwise, the full path of the exetuable (e.g.
:code:`/usr/local/gromacs/bin/gmx`, the path returned by the command :code:`which gmx`).
or :code:`gmx_mpi` if the executable has been sourced. Otherwise, the full path of the executable (e.g.
:code:`/usr/local/gromacs/bin/gmx`, the path returned by the command :code:`which gmx`) should be used.
Note that EEXE only works with MPI-enabled GROMACS.
- :code:`mpi_cli`: (Optional, Default: :code:`mpirun`)
The CLI for launching MPI processes, e.g. :code:`mpirun` or :code:`mpiexec`. Note that this parameter
is only relevant when MPI-enabled GROMACS is used; if a tMPI-enabled GROMACS executable is specified
in :code:`gmx_executable`, the value of :code:`mpi_cli` will be ignored.
- :code:`n_proc`: (Optional, Default: the number of replicas, i.e. the value of :code:`n_sim`)
The number of MPI processes to run the EEXE simulation.

3.2. Simulation inputs
----------------------
@@ -261,9 +274,11 @@ include parameters for data analysis here.
- :code:`grompp_args`: (Optional, Default: :code:`None`)
Additional arguments to be appended to the GROMACS :code:`grompp` command, provided as a dictionary.
For example, one could have :code:`{'-maxwarn': '1'}` to specify the :code:`-maxwarn` argument for the :code:`grompp` command.
- :code:`runtime_args`: (Optional, Default: :code:`None`)
- :code:`runtime_args`: (Optional, Default: :code:`{}`)
Additional runtime arguments to be appended to the GROMACS :code:`mdrun` command, provided as a dictionary.
For example, one could have :code:`{'-nt': 16}` to run the simulation using 16 threads.
For example, one could have :code:`{'-nt': 16}` to run the simulation using tMPI-enabled GROMACS with 16 threads.
Notably, if MPI-enabled GROMACS is used, one should specify :code:`-np` to make better use of the resources. If it is
not specified, it will default to the number of simulations and a warning will be issued.

3.4. Output settings
--------------------
@@ -300,9 +315,13 @@ include parameters for data analysis here.
For convenience, here is a template of the input YAML file, with each optional parameter set to its default value and each required
parameter left blank. Note that specifying :code:`null` is the same as leaving the parameter unspecified (i.e. :code:`None`).

::
# Section 1: GROMACS executable

.. code-block:: yaml
# Section 1: Runtime configuration
gmx_executable:
mpi_cli: 'mpirun'
n_proc: null
# Section 2: Simulation inputs
gro:
24 changes: 13 additions & 11 deletions ensemble_md/analysis/analyze_matrix.py
@@ -33,11 +33,11 @@ def parse_transmtx(log_file, expanded_ensemble=True):
Returns
-------
empirical : numpy.ndarray
empirical : np.ndarray
The final empirical state transition matrix.
theoretical : None or numpy.ndarray
theoretical : None or np.ndarray
The final theoretical state transition matrix.
diff_matrix : None or numpy.ndarray
diff_matrix : None or np.ndarray
The difference between the theoretical and empirical state transition matrices (empirical - theoretical).
"""
f = open(log_file, "r")
@@ -92,12 +92,12 @@ def calc_equil_prob(trans_mtx):
Parameters
----------
trans_mtx : numpy.ndarray
trans_mtx : np.ndarray
The input state transition matrix
Returns
-------
equil_prob : numpy.ndarray
equil_prob : np.ndarray
"""
check_row = sum([np.isclose(np.sum(trans_mtx[i]), 1) for i in range(len(trans_mtx))])
check_col = sum([np.isclose(np.sum(trans_mtx[:, i]), 1) for i in range(len(trans_mtx))])
@@ -122,14 +122,16 @@ def calc_equil_prob(trans_mtx):
return equil_prob
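
For reference, here is a minimal NumPy sketch (independent of the implementation above) of how the equilibrium probabilities of a row-stochastic transition matrix follow from the eigenvector associated with eigenvalue 1::

    import numpy as np

    mtx = np.array([[0.9, 0.1],
                    [0.2, 0.8]])             # row-stochastic: each row sums to 1

    eig_vals, eig_vecs = np.linalg.eig(mtx.T)    # left eigenvectors of mtx
    idx = np.argmin(np.abs(eig_vals - 1))        # eigenvalue closest to 1
    equil_prob = np.real(eig_vecs[:, idx])
    equil_prob /= equil_prob.sum()               # normalize to probabilities
    print(equil_prob)                            # approximately [0.667, 0.333]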


def calc_spectral_gap(trans_mtx):
def calc_spectral_gap(trans_mtx, atol=1e-8):
"""
Calculates the spectral gap of the input transition matrix.
Parameters
----------
trans_mtx : numpy.ndarray
trans_mtx : np.ndarray
The input state transition matrix
atol : float
The absolute tolerance for checking the sum of columns and rows.
Returns
-------
Expand All @@ -138,8 +140,8 @@ def calc_spectral_gap(trans_mtx):
eig_vals : list
The list of eigenvalues
"""
check_row = sum([np.isclose(np.sum(trans_mtx[i]), 1) for i in range(len(trans_mtx))])
check_col = sum([np.isclose(np.sum(trans_mtx[:, i]), 1) for i in range(len(trans_mtx))])
check_row = sum([np.isclose(np.sum(trans_mtx[i]), 1, atol=atol) for i in range(len(trans_mtx))])
check_col = sum([np.isclose(np.sum(trans_mtx[:, i]), 1, atol=atol) for i in range(len(trans_mtx))])

if check_row == len(trans_mtx):
eig_vals, eig_vecs = np.linalg.eig(trans_mtx.T)
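
As a side note, the spectral gap of a small row-stochastic matrix can be sketched with plain NumPy as follows (for illustration only; the sorting convention here is an assumption and may differ from the function above)::

    import numpy as np

    mtx = np.array([[0.80, 0.15, 0.05],
                    [0.10, 0.80, 0.10],
                    [0.05, 0.15, 0.80]])         # row-stochastic

    eig_vals = np.linalg.eig(mtx.T)[0]
    eig_vals = np.sort(np.abs(eig_vals))[::-1]   # magnitudes, descending; the largest is 1
    spectral_gap = eig_vals[0] - eig_vals[1]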
@@ -170,7 +172,7 @@ def split_transmtx(trans_mtx, n_sim, n_sub):
Parameters
----------
trans_mtx : numpy.ndarray
trans_mtx : np.ndarray
The input state transition matrix to split
n_sim : int
The number of replicas in EEXE.
@@ -200,7 +202,7 @@ def plot_matrix(matrix, png_name, title=None, start_idx=0):
Parameters
----------
matrix : numpy.ndarray
matrix : np.ndarray
The matrix to be visualized
png_name : str
The file name of the output PNG file (including the extension).
3 changes: 2 additions & 1 deletion ensemble_md/analysis/analyze_traj.py
@@ -128,7 +128,7 @@ def traj2transmtx(traj, N, normalize=True):
Returns
-------
transmtx : numpy.ndarray
transmtx : np.ndarray
The transition matrix computed from the trajectory
"""
transmtx = np.zeros([N, N])
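
For context, counting transitions from a discrete-state trajectory can be sketched as follows (a minimal example assuming a plain list of state indices; not necessarily identical to the function above)::

    import numpy as np

    traj = [0, 0, 1, 2, 1, 0, 2, 2]          # a toy trajectory visiting 3 states
    N = 3

    transmtx = np.zeros([N, N])
    for i, j in zip(traj[:-1], traj[1:]):    # consecutive frames
        transmtx[i, j] += 1

    # Row-normalize to get transition probabilities (the normalize=True case).
    row_sums = transmtx.sum(axis=1, keepdims=True)
    transmtx = np.divide(transmtx, row_sums,
                         out=np.zeros_like(transmtx), where=row_sums != 0)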
@@ -462,6 +462,7 @@ def plot_transit_time(trajs, N, fig_prefix=None, dt=None, folder='.'):
plt.figure()
for i in range(len(t_list)): # t_list[i] is the list for configuration i
plt.plot(np.arange(len(t_list[i])) + 1, t_list[i], label=f'Configuration {i}', marker=marker)

if max(max((t_list))) >= 10000:
plt.ticklabel_format(style='sci', axis='y', scilimits=(0, 0))
plt.xlabel('Event index')
2 changes: 1 addition & 1 deletion ensemble_md/analysis/msm_analysis.py
@@ -64,7 +64,7 @@ def plot_its(trajs, lags, fig_name, dt=1, units='step'):
Returns
-------
ts_list : list
An list of instances of the ImpliedTimescales class in PyEMMA.
A list of instances of the :code:`ImpliedTimescales` class in PyEMMA.
"""
ts_list = []
n_rows, n_cols = utils.get_subplot_dimension(len(trajs))
