# Tutorial 1: Standard REXEE simulations

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/wehs7661/ensemble_md/c0f6d48ce3fe746e349e4a4a9610f935cca8b0b5?urlpath=lab%2Ftree%2Fdocs%2Fexamples%2Ftutorial_1%2Frun_REXEE.ipynb)

In this tutorial, we demonstrate how to use the command-line interfaces (CLIs) implemented in the Python package `ensemble_md` to prepare, perform, and analyze a fixed-weight REXEE simulation that estimates the solvation free energy of a toy molecule composed of 4 interaction sites. For a more comprehensive understanding, we strongly recommend reading our [documentation](https://ensemble-md.readthedocs.io/en/latest/simulations.html) on launching REXEE simulations before starting this tutorial. With MPI (`mpirun` or `mpiexec`), GROMACS, and `ensemble_md` all installed, you should be able to run this tutorial either locally or on an HPC cluster, as long as you have at least four CPU cores. Alternatively, you can click the badge above to run this tutorial on Binder without installing anything. Note that this tutorial assumes that you understand the basics of the expanded ensemble (EE) method and the relevant simulation parameters in GROMACS. If not, we recommend reading the [GROMACS documentation]() and/or [this tutorial]().
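
If you are running locally rather than on Binder, a quick check of the prerequisites listed above can save a failed launch later. The following is a minimal sketch, not part of `ensemble_md`: the helper `missing_tools` is hypothetical, and the executable names are simply those used in this tutorial, so adjust them if your installation differs (for example, an MPI-enabled GROMACS build is often named `gmx_mpi`).

```python
import shutil


def missing_tools(required, which=shutil.which):
    """Return the entries of `required` for which no executable is found on PATH.

    Each entry is a tuple of alternative names; finding any one of them is enough.
    The lookup function is injectable so the logic can be tested without PATH.
    """
    return [names for names in required if not any(which(n) for n in names)]


# Names assumed from this tutorial: an MPI launcher, GROMACS, and the two CLIs used below.
REQUIRED = [("mpirun", "mpiexec"), ("gmx",), ("run_REXEE",), ("explore_REXEE",)]

if __name__ == "__main__":
    for names in missing_tools(REQUIRED):
        print("Not found on PATH:", " or ".join(names))
```

If nothing is printed, all four groups of executables were found on your `PATH`.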

## 1. Preparing simulation inputs for a REXEE simulation

As mentioned in our documentation, running a REXEE simulation requires at least the following four input files:

- One YAML file that specifies REXEE parameters.
- One GROMACS GRO file of the system of interest.
- One GROMACS TOP file of the system of interest.
- One GROMACS MDP template for customizing MDP files for different replicas during the simulation.

In the folder where this tutorial resides (`docs/examples/tutorial_1` in the repository), you should find all of these input files: `params.yaml`, `sys.gro`, `sys.top`, and `expanded.mdp`.

In [1]:
!ls

expanded.mdp    params.yaml     run_REXEE.ipynb sys.gro         sys.top


While you could use multiple GRO and TOP files, one for each replica in a REXEE simulation, in this tutorial we will use the same GRO and TOP files for all replicas to keep the demonstration straightforward. Inspecting the MDP template `expanded.mdp` shows that it adopts typical settings for a fixed-weight EE simulation, with 9 alchemical intermediate states defined to decouple van der Waals and Coulombic interactions. During the REXEE simulation, customized MDP files will be generated for different replicas such that the MDP file for each replica only considers the set of states that the replica is constrained to. Notably, the weights specified via the parameter `init_lambda_weights` were obtained from a weight-updating EE simulation. If you would like to run a weight-updating REXEE simulation, simply use an MDP file for a weight-updating EE simulation, i.e., an MDP file without `init_lambda_weights` specified and with options such as `wl_scale`, `wl_ratio`, `init_wl_delta`, `lmc_stats`, `lmc_weights_equil`, and `weight_equil_wl_delta` specified.
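
To make the idea of a per-replica MDP file more concrete, the sketch below restricts the per-state parameters of a template to the states assigned to one replica. It is only an illustration, not the code `ensemble_md` uses: the helper `slice_lambda_params` is hypothetical, the template is a plain dictionary rather than a parsed MDP file, and the lambda values and weights are placeholders that do not reproduce `expanded.mdp`.

```python
def slice_lambda_params(params, states):
    """Keep only the entries of per-state parameters that belong to `states`.

    Hypothetical helper: a parameter is treated as per-state if it is a list
    with one entry per alchemical state; everything else is copied unchanged.
    """
    states = list(states)
    n_total = len(params["init_lambda_weights"])
    sliced = {}
    for key, value in params.items():
        if isinstance(value, list) and len(value) == n_total:
            sliced[key] = [value[i] for i in states]  # per-state vector
        else:
            sliced[key] = value  # scalar setting, left as is
    return sliced


# Placeholder template with 9 states (NOT the values in expanded.mdp).
template = {
    "nsteps": 500,
    "coul_lambdas": [0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0, 1.0],
    "vdw_lambdas": [0.0, 0.0, 0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0],
    "init_lambda_weights": [0.0] * 9,
}

# A replica constrained to states 1-6 keeps six entries of each per-state vector.
replica = slice_lambda_params(template, range(1, 7))
print(replica["coul_lambdas"])  # [0.25, 0.5, 0.75, 1.0, 1.0, 1.0]
print(replica["vdw_lambdas"])   # [0.0, 0.0, 0.0, 0.0, 0.25, 0.5]
```

Each replica then runs what looks like an ordinary EE simulation over its own, shorter list of states.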


Now, with these three GROMACS inputs prepared, let's briefly review the input YAML file that specifies REXEE parameters.

In [1]:
!cat params.yaml

# User-defined parameters
gmx_executable: 'gmx'               
gro: sys.gro
top: sys.top
mdp: expanded.mdp             
n_sim: 4
n_iter: 5         
s: 1                         
nst_sim: 500
runtime_args: {'-nt': '1', '-ntmpi': '1'}


As shown above, this YAML file only specifies the required REXEE parameters and adopts default values for all optional parameters, except that `runtime_args` is specified to run each EE replica using one thread (`-nt 1`) so that the simulation does not take too long to complete. Here, we will run a REXEE simulation composed of 4 replicas (`n_sim: 4`), each performed with one thread. With 9 states in total and a state shift of 1 (`s: 1`), replicas 0, 1, 2, and 3 are constrained to sampling states 0-5, 1-6, 2-7, and 3-8, respectively. The simulation will attempt to swap coordinates between replicas every 500 integration steps (`nst_sim: 500`) and perform 5 iterations in total (`n_iter: 5`). (Note that a much larger number of iterations is generally needed. Here we use a very small value just for demonstration purposes.)
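
The state ranges quoted above follow from simple arithmetic. The sketch below assumes the relation N = n + (r - 1) * s between the total number of states N, the number of replicas r, the number of states per replica n, and the state shift s, which is consistent with the ranges stated in this tutorial. The helper `replica_states` is hypothetical and is not an `ensemble_md` function.

```python
def replica_states(n_states, n_replicas, shift):
    """Return the list of state indices sampled by each replica.

    Assumes N = n + (r - 1) * s, i.e., replica i samples the n consecutive
    states starting at index i * s.
    """
    n_sub = n_states - (n_replicas - 1) * shift
    if n_sub < 1:
        raise ValueError("No valid state set for these parameters.")
    return [list(range(i * shift, i * shift + n_sub)) for i in range(n_replicas)]


# Values from params.yaml (n_sim = 4, s = 1) and the 9 states in expanded.mdp.
for i, states in enumerate(replica_states(9, 4, 1)):
    print(f"Replica {i}: states {states[0]}-{states[-1]}")

# Each replica runs n_iter * nst_sim MD steps in total.
print("Steps per replica:", 5 * 500)  # 2500
```

This prints the ranges 0-5, 1-6, 2-7, and 3-8 for replicas 0 through 3, matching the description above.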

If you are interested in trying out other numbers of replicas or state shifts given 9 alchemical intermediate states, you can use the CLI `explore_REXEE` to enumerate all possible REXEE configurations.

In [2]:
!explore_REXEE -N 9

Exploration of the REXEE parameter space
[ REXEE parameters of interest ]
- N: The total number of states
- r: The number of replicas
- n: The number of states for each replica
- s: The state shift between adjacent replicas

[ Solutions ]
- Solution 1: (N, r, n, s) = (9, 2, 5, 4)
  - Replica 0: [0, 1, 2, 3, 4]
  - Replica 1: [4, 5, 6, 7, 8]

- Solution 2: (N, r, n, s) = (9, 2, 6, 3)
  - Replica 0: [0, 1, 2, 3, 4, 5]
  - Replica 1: [3, 4, 5, 6, 7, 8]

- Solution 3: (N, r, n, s) = (9, 2, 7, 2)
  - Replica 0: [0, 1, 2, 3, 4, 5, 6]
  - Replica 1: [2, 3, 4, 5, 6, 7, 8]

- Solution 4: (N, r, n, s) = (9, 2, 8, 1)
  - Replica 0: [0, 1, 2, 3, 4, 5, 6, 7]
  - Replica 1: [1, 2, 3, 4, 5, 6, 7, 8]

- Solution 5: (N, r, n, s) = (9, 3, 5, 2)
  - Replica 0: [0, 1, 2, 3, 4]
  - Replica 1: [2, 3, 4, 5, 6]
  - Replica 2: [4, 5, 6, 7, 8]

- Solution 6: (N, r, n, s) = (9, 3, 7, 1)
  - Replica 0: [0, 1, 2, 3, 4, 5, 6]
  - Replica 1: [1, 2, 3, 4, 5, 6, 7]
  - Replica 2: [2, 3, 4, 5, 6, 7, 8]

- Solution 7: (

If you want to list all possible configurations that use 4 replicas for sampling 9 alchemical states, you can further specify the number of replicas via the flag `-r`:

In [2]:
!explore_REXEE -N 9 -r 4

Exploration of the REXEE parameter space
[ REXEE parameters of interest ]
- N: The total number of states
- r: The number of replicas
- n: The number of states for each replica
- s: The state shift between adjacent replicas

[ Solutions ]
- Solution 1: (N, r, n, s) = (9, 4, 3, 2)
  - Replica 0: [0, 1, 2]
  - Replica 1: [2, 3, 4]
  - Replica 2: [4, 5, 6]
  - Replica 3: [6, 7, 8]

- Solution 2: (N, r, n, s) = (9, 4, 6, 1)
  - Replica 0: [0, 1, 2, 3, 4, 5]
  - Replica 1: [1, 2, 3, 4, 5, 6]
  - Replica 2: [2, 3, 4, 5, 6, 7]
  - Replica 3: [3, 4, 5, 6, 7, 8]
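
The solutions listed above can be reproduced with a few lines of Python. This is a sketch under two assumptions inferred from the output shown here, not taken from the `explore_REXEE` source: that N = n + (r - 1) * s, and that adjacent replicas must share at least one state (n > s). The function `enumerate_configs` is hypothetical.

```python
def enumerate_configs(n_states, n_replicas):
    """List (N, r, n, s) tuples for a fixed number of states and replicas.

    Assumptions: N = n + (r - 1) * s, and n > s so that adjacent replicas
    overlap by at least one state.
    """
    if n_replicas < 2:
        raise ValueError("At least two replicas are needed.")
    solutions = []
    for n_sub in range(1, n_states):
        remainder = n_states - n_sub
        if remainder % (n_replicas - 1) != 0:
            continue  # the shift would not be an integer
        shift = remainder // (n_replicas - 1)
        if 1 <= shift < n_sub:
            solutions.append((n_states, n_replicas, n_sub, shift))
    return solutions


for solution in enumerate_configs(9, 4):
    print(solution)
# (9, 4, 3, 2)
# (9, 4, 6, 1)
```

Under these assumptions, calling `enumerate_configs(9, 2)` and `enumerate_configs(9, 3)` also yields the two-replica and three-replica solutions printed by `explore_REXEE -N 9`.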



For rules of thumb about specifying the configurational parameters (including the number of states, number of replicas, state shift and swapping frequency), please refer to the [documentation]().

## 2. Performing a REXEE simulation
With the input files above, we can now run the REXEE simulation using the CLI `run_REXEE` with the following command:

In [5]:
!mpirun -np 4 run_REXEE

Current time: 17/05/2024 14:30:29
Command line: /Users/Wei-TseHsu/miniconda3/bin/run_REXEE

Important parameters of REXEE
Python version: 3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:36:06) 
[Clang 11.1.0 ]
GROMACS executable: /usr/local/bin/gmx
GROMACS version: 2022.3-dev-20230426-4eabaf582b
ensemble_md version: 0.9.0+140.gc0f6d48.dirty
Simulation inputs: sys.gro, sys.top, expanded.mdp
Verbose log file: True
Proposal scheme: exhaustive
Whether to perform weight combination: False
Type of means for weight combination: simple
Whether to perform histogram correction: False
Histogram cutoff for weight correction: -1
Number of replicas: 4
Number of iterations: 5
Length of each replica: 1.0 ps
Frequency for checkpointing: 100 iterations
Total number of states: 9
Additionally defined swappable states: None
Additional grompp arguments: None
Additional runtime arguments: {'-nt': '1', '-ntmpi': '1'}
External modules for coordinate manipulation: None
MDP parameters differing acro

## 3. Analyzing a REXEE simulation