In [None]:
# Import required libraries
import matplotlib.pyplot as plt
import numpy as np

<div class="alert alert-block alert-success">

# EPFL Course: CH-630 Drug Discovery

## Doctoral School EDCH

## Week 4: Exercises

</div>

<h2 style="color:green;"> Lesson 4.3: Molecular Dynamics </h2>

As you learned in Lesson 2.3.1, **molecular dynamics (MD)** is a computational technique that simulates the time evolution of particles by numerically solving Newton's equations of motion.

MD is widely used to explore how molecular systems behave at the microscopic, atomistic level, providing insight into their structure, flexibility, and interactions.

In drug discovery, MD plays a crucial role: it is used to investigate the dynamic behavior of proteins, understand mechanisms of folding and conformational changes, and study the interactions and binding affinity of potential drugs with a protein receptor.

In this exercise:

1. We will begin by investigating a simple toy model: a particle moving in a double-well potential.

2. We will implement basic Molecular Dynamics algorithms to understand how Newton's equations of motion are numerically integrated.

3. We will investigate the crucial role of thermostats in MD simulations.

After this, you will finish with an **exercise** on a more realistic system, using a common MD engine, GROMACS, to simulate a solvated protein.<br><br>

> 💡 This lesson will help you understand how Molecular Dynamics algorithms work from first principles and how to run a realistic biomolecular simulation.

## Building a Simple Molecular Dynamics Simulation

When studying the time evolution of proteins or other biomolecular systems, researchers typically rely on powerful MD engines such as GROMACS, NAMD, or OpenMM.

To understand how these programs work, we will first analyze the fundamental algorithms that underpin them.

For simplicity, we will focus on a 1D toy model: a particle moving in a double-well potential.

We will start by implementing an algorithm to numerically integrate Newton's equations of motion over time.
This initial simulation will be run in the microcanonical NVE ensemble, where the total energy of the system is conserved.

Then, we will extend the model by implementing a thermostat (e.g., Andersen thermostat) to run simulations in the canonical NVT ensemble. In this ensemble, the temperature is maintained constant, which is particularly relevant for biomolecular systems, as most biologically relevant processes occur at constant temperature.

<h2 style="color:green;"> Step 1: A particle moving in a double-well potential </h2>

We will consider the simple example of a particle moving in a 1D double-well potential.<br><br>

We will define the double-well potential as:

$V(x) = a \, (x^2 - b^2)^2$

and the corresponding force as:

$F(x) = -\frac{dV}{dx} = -4 a x \, (x^2 - b^2)$
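A quick way to gain confidence in an analytic force is to compare it with a finite-difference derivative of the potential. The sketch below is only an illustration: the function names `potential`, `analytic_force`, and `numerical_force`, as well as the step size `h`, are hypothetical choices and not part of the exercise code.

```python
import numpy as np

def potential(x, a=1.0, b=1.0):
    """Double-well potential V(x) = a * (x^2 - b^2)^2."""
    return a * (x**2 - b**2) ** 2

def analytic_force(x, a=1.0, b=1.0):
    """Analytic force F(x) = -dV/dx = -4 a x (x^2 - b^2)."""
    return -4.0 * a * x * (x**2 - b**2)

def numerical_force(x, a=1.0, b=1.0, h=1e-5):
    """Central finite-difference estimate of -dV/dx (h is an illustrative step size)."""
    return -(potential(x + h, a, b) - potential(x - h, a, b)) / (2.0 * h)

# Compare the two expressions on a few points across the plotted interval
x_check = np.linspace(-2.0, 2.0, 9)
max_error = np.max(np.abs(analytic_force(x_check) - numerical_force(x_check)))
print(f"Largest difference between analytic and numerical force: {max_error:.2e}")
```

The force vanishes at $x = 0$ and $x = \pm b$, which are the stationary points of the potential.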

In [None]:
# Define the parameters of the potential
a = 1.0
b = 1.0

# Define the double-well potential
def double_well_potential(x, a=1, b=1):
    """ V(x) = a * (x^2 - b^2)^2 """
    return a * (x**2 - b**2)**2

# Define the corresponding force
def double_well_force(x, a=1, b=1):
    """ F(x) = -dV/dx """
    return -4*a*x*(x**2 - b**2)

# Plot the double-well potential for an interval of coordinates
x_vals = np.linspace(-2, 2, 200)
V_vals = double_well_potential(x_vals)

plt.figure(figsize=(8,6))
plt.plot(x_vals, V_vals, 'k-', label="Double-well potential")
plt.xlabel("x")
plt.ylabel("V(x)")
plt.title("1D double-well potential")
plt.show()


The two wells represent stable states (minima), and the barrier between them represents an energy barrier the particle must overcome to move from one well to the other.

Now, try to change the parameters of the potential to increase the barrier height between the two wells and plot the different potentials.

In [None]:
### CREATE A LIST OF a VALUES TO MODIFY THE BARRIER HEIGHT ###
### START MODIFICATION ###
a_list = 
### END MODIFICATION ###

plt.figure(figsize=(8,6))
for a in a_list:
    V_vals = double_well_potential(x_vals, a, b)
    plt.plot(x_vals, V_vals, label=f"a = {a}")

plt.xlabel("x")
plt.ylabel("V(x)")
plt.title("1D double-well potential")
plt.legend()
plt.show()

Increasing $a$ raises the barrier height, $V(0) - V(\pm b) = a b^4$, so transitions between the two wells become less frequent.
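The barrier height follows directly from the definition of the potential: the minima sit at $x = \pm b$ with $V = 0$, and the top of the barrier sits at $x = 0$ with $V(0) = a b^4$. As a minimal sketch (the helper `barrier_height` and the example values are illustrative, not part of the exercise):

```python
def barrier_height(a, b=1.0):
    """Height of the barrier at x = 0 relative to the minima at x = +/- b."""
    v_top = a * (0.0**2 - b**2) ** 2   # V(0) = a * b^4
    v_min = a * (b**2 - b**2) ** 2     # V(+/- b) = 0
    return v_top - v_min

# Illustrative values only; any positive values of a can be used
example_a_values = [0.5, 1.0, 2.0, 4.0]
for a_value in example_a_values:
    print(f"a = {a_value}: barrier height = {barrier_height(a_value)}")
```

Note that the barrier scales linearly with $a$ but with the fourth power of $b$, so changing $b$ also moves the minima and changes the barrier much more strongly.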

<h2 style="color:green;"> Step 2: Integrating Newton's equation of motion with Velocity Verlet </h2>

To study the time evolution of a system and obtain its trajectory, we need to numerically solve Newton's equations of motion.

Several algorithms exist to perform this numerical integration.
In this exercise, we will use the Velocity Verlet algorithm, commonly used in MD thanks to its simplicity, stability, and time-reversibility.

The Velocity Verlet algorithm updates positions and velocities according to the following equations:

![Alt text](velocityverlet.png)
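For reference, and assuming the figure shows the standard form of the algorithm, one Velocity Verlet step of size $\Delta t$ (`dt` in the code) for a particle of mass $m$ can be written as:

$$x(t + \Delta t) = x(t) + v(t)\,\Delta t + \frac{F(t)}{2m}\,\Delta t^2$$

$$v(t + \Delta t) = v(t) + \frac{F(t) + F(t + \Delta t)}{2m}\,\Delta t$$

Here $F(t)$ is the force evaluated at $x(t)$ and $F(t + \Delta t)$ is the force evaluated at the updated position $x(t + \Delta t)$. The position update needs only the current force, while the velocity update averages the old and the new force, which is why the force is recomputed between the two updates.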

Complete the following function (where indicated) by writing the Velocity Verlet updates for the position and the velocity.

In [None]:
# --- Velocity Verlet integrator ---

def velocity_verlet(x0, v0, m, dt, n_steps):
    """
    Integrate the motion of a particle using Velocity Verlet.
    
    Parameters:
        x0, v0 : initial position and velocity
        m      : mass
        dt     : time step
        n_steps: number of integration steps
    Returns:
        t_traj, x_traj, v_traj  : arrays of times, positions, and velocities
    """
    x = x0
    v = v0
    F = double_well_force(x)
    
    t_traj = np.zeros(n_steps)
    x_traj = np.zeros(n_steps)
    v_traj = np.zeros(n_steps)
    
    
    for i in range(n_steps):
        # --- Velocity Verlet steps ---
        # 1. Update position
        ### START MODIFICATION FOR POSITION ###
        x += 
        ### END MODIFICATION FOR POSITION ###
        # 2. Compute new force
        F_new = double_well_force(x) 
        # 3. Update velocity
        ### START MODIFICATION FOR VELOCITY ###
        v += 
        ### END MODIFICATION FOR VELOCITY ###
        # 4. Update force for next step
        F = F_new

        # Save trajectory
        t_traj[i] = (i + 1) * dt  # the state stored after step i corresponds to time (i+1)*dt
        x_traj[i] = x
        v_traj[i] = v
        
    return t_traj, x_traj, v_traj

In [None]:
# --- Simulation parameters ---

''' Parameters:
x0     : initial position 
v0     : initial velocity
m      : mass
dt     : time step
n_steps: number of integration steps '''

x0 = -1.0         # start in left well
v0 = 0.0
m = 1.0
dt = 0.01
n_steps = 5000

# --- Run NVE simulation ---
t_nve, x_nve, v_nve = velocity_verlet(x0, v0, m, dt, n_steps)

In [None]:
# Plot trajectory
plt.figure(figsize=(8,6))
plt.plot(t_nve, x_nve, 'b-', label="Position x(t)")
plt.xlabel("Time")
plt.ylabel("x")
plt.title("Particle Trajectory in NVE Ensemble")
plt.legend()
plt.show()

# Plot total energy (kinetic + potential)
total_energy = 0.5 * m * v_nve**2 + double_well_potential(x_nve)
plt.figure(figsize=(8,6))
plt.plot(t_nve, total_energy, 'r-', label="Total Energy")
plt.xlabel("Time")
plt.ylabel("Energy")
plt.title("Energy Conservation in NVE Ensemble")
plt.ylim(total_energy.min()-0.1, total_energy.max()+0.1)
plt.legend()
plt.show()

This is a simulation in the NVE ensemble, where there is no energy exchange with the environment.<br>
The total energy is conserved throughout the simulation. <br><br>

With an initial velocity of 0 (`v0 = 0`), the particle stays at rest at the minimum $x = -1$, where the force is zero.
Try increasing the kinetic energy by setting a non-zero initial velocity.

In [None]:
# --- Simulation parameters ---

''' Parameters:
x0     : initial position 
v0     : initial velocity
m      : mass
dt     : time step
n_steps: number of integration steps '''

x0 = -1.0
v0 =           # change the initial velocity
m = 1.0
dt = 0.01
n_steps = 5000

# --- Run NVE simulation ---
t_nve, x_nve, v_nve = velocity_verlet(x0, v0, m, dt, n_steps)

In [None]:
# Plot trajectory
plt.figure(figsize=(8,6))
plt.plot(t_nve, x_nve, 'b-', label="Position x(t)")
plt.xlabel("Time")
plt.ylabel("x")
plt.title("Particle Trajectory in NVE Ensemble")
plt.legend()
plt.show()

# Plot total energy (kinetic + potential)
total_energy = 0.5 * m * v_nve**2 + double_well_potential(x_nve)
plt.figure(figsize=(8,6))
plt.plot(t_nve, total_energy, 'r-', label="Total Energy")
plt.xlabel("Time")
plt.ylabel("Energy")
plt.title("Energy Conservation in NVE Ensemble")
plt.legend()
plt.ylim(total_energy.min()-0.1, total_energy.max()+0.1)
plt.show()

The total energy should still be conserved. Small fluctuations around the mean value are normal: they come from the finite time step `dt` and shrink as `dt` is reduced.
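To put a number on "small fluctuations", one option is to compare the standard deviation of the total energy with its mean. The helper below is a hypothetical sketch (the name `relative_energy_fluctuation` is not part of the exercise); it is demonstrated on an exact harmonic-oscillator trajectory rather than on the double well, so that it runs without the integrator.

```python
import numpy as np

def relative_energy_fluctuation(x_traj, v_traj, m, potential):
    """Return std(E) / |mean(E)| for E = 0.5 * m * v^2 + potential(x)."""
    energy = 0.5 * m * np.asarray(v_traj) ** 2 + potential(np.asarray(x_traj))
    return np.std(energy) / np.abs(np.mean(energy))

# Illustration on an exact harmonic-oscillator trajectory (not the double well):
# x(t) = cos(t), v(t) = -sin(t), V(x) = 0.5 * x^2, so E = 0.5 at all times.
t_demo = np.linspace(0.0, 10.0, 1001)
fluct = relative_energy_fluctuation(
    np.cos(t_demo), -np.sin(t_demo), 1.0, lambda x: 0.5 * x**2
)
print(f"Relative energy fluctuation: {fluct:.2e}")
```

With the arrays from this notebook, the call would be `relative_energy_fluctuation(x_nve, v_nve, m, double_well_potential)`. The ratio is undefined when the mean energy is zero, as in the run with `v0 = 0` starting exactly at the minimum.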

With a small initial velocity, the particle oscillates around the minimum at $x = -1$. <br>
When the initial velocity is large enough, the particle can cross the barrier and reach the other minimum at $x = +1$. <br>

<div class="alert alert-block alert-info">
What is the <b>smallest initial velocity</b> for which the particle crosses the barrier and reaches the other minimum?
</div>

<h2 style="color:green;"> Step 3: Adding a thermostat (NVT ensemble) - Andersen thermostat </h2>

To study the behavior of a system at **constant temperature**, we need to allow the particle to exchange energy with a thermal bath. This corresponds to the canonical ensemble (NVT), where the temperature is fixed, but the total energy of the system can fluctuate.

This can be done by adding a thermostat. One simple approach is the Andersen thermostat. In this method:

- At each time step, each particle has a probability $\nu \, dt$ of having its velocity randomly reassigned according to the Maxwell-Boltzmann distribution at the target temperature $T$.

- The system evolves using the Velocity Verlet algorithm, just as in NVE, but occasionally the velocities are "thermalized" to maintain the desired temperature.
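The collision probability per step determines how often the thermostat acts. As a rough estimate, using the values of `nu`, `dt`, and `n_steps` that appear later in this notebook, the expected number of velocity reassignments in one run can be computed as follows:

```python
nu = 0.1         # collision frequency (value used later in this notebook)
dt = 0.01        # time step
n_steps = 5000   # number of integration steps

collision_probability = nu * dt                        # chance of a collision in one step
expected_collisions = collision_probability * n_steps  # average number of collisions per run
mean_time_between_collisions = 1.0 / nu                # in reduced time units

print(f"Collision probability per step: {collision_probability}")
print(f"Expected collisions in the run: {expected_collisions}")
print(f"Mean time between collisions:   {mean_time_between_collisions}")
```

With these settings the velocity is reassigned only a handful of times per run, so individual trajectories can differ noticeably from one run to the next. A larger `nu` couples the particle more strongly to the thermal bath.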

Try to modify the following function (where indicated) to add the Andersen thermostat.

Draw the new velocity from a normal distribution centered at $0$ with standard deviation $\sigma = \sqrt{k_B T / m}$.

(Hint: you can use `np.random.normal`)
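To see why this standard deviation corresponds to a temperature $T$, one can draw many velocities and check that the mean kinetic energy approaches the equipartition value $k_B T / 2$ for one degree of freedom. The sketch below is illustrative only: the temperature value is arbitrary, and it uses NumPy's `default_rng` generator instead of `np.random.normal` so that the result is reproducible.

```python
import numpy as np

kB = 1.0   # Boltzmann constant in reduced units
T = 0.5    # illustrative temperature, not a value prescribed by the exercise
m = 1.0    # mass

sigma = np.sqrt(kB * T / m)           # standard deviation of the 1D velocity distribution
rng = np.random.default_rng(seed=0)   # seeded generator so the check is reproducible
velocities = rng.normal(loc=0.0, scale=sigma, size=100_000)

mean_kinetic = np.mean(0.5 * m * velocities**2)
print(f"Mean kinetic energy: {mean_kinetic:.4f} (equipartition predicts {0.5 * kB * T:.4f})")
```

In one dimension, the Maxwell-Boltzmann velocity distribution is exactly this Gaussian, so a single draw per collision is all the thermostat needs.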

In [None]:
# --- Velocity Verlet integrator with Andersen thermostat ---

def velocity_verlet_NVT(x0, v0, m, dt, n_steps, T=1.0, nu=0.1):
    """
    Integrate the motion of a particle using Velocity Verlet with Andersen thermostat (NVT ensemble).
    
    Parameters:
        x0, v0 : initial position and velocity
        m      : mass
        dt     : time step
        n_steps: number of integration steps
        T      : temperature (for thermostat) in reduced units
        nu     : collision frequency for Andersen thermostat
    Returns:
        t_traj, x_traj, v_traj : arrays of times, positions, and velocities
    """

    kB = 1 # Boltzmann constant in reduced units
    
    x = x0
    v = v0
    F = double_well_force(x)
    
    x_traj = np.zeros(n_steps)
    v_traj = np.zeros(n_steps)
    t_traj = np.zeros(n_steps)
    
    for i in range(n_steps):
        # --- Velocity Verlet steps ---
        # 1. Update position
        ### START MODIFICATION FOR POSITION ###
        x += 
        ### END MODIFICATION FOR POSITION ###
        # 2. Compute new force
        F_new = double_well_force(x) 
        # 3. Update velocity
        ### START MODIFICATION FOR VELOCITY ###
        v +=   
        ### END MODIFICATION FOR VELOCITY ###
        # 4. Update force for next step
        F = F_new
        
        # --- Andersen thermostat ---
        if np.random.rand() < nu * dt:
            # Reassign velocity from Maxwell-Boltzmann distribution
            ### START MODIFICATION ###
            v = 
            ### END MODIFICATION ###

        # Store trajectories
        x_traj[i] = x
        v_traj[i] = v
        t_traj[i] = (i + 1) * dt  # the state stored after step i corresponds to time (i+1)*dt
        
    return t_traj, x_traj, v_traj

Now, as you did for the velocity reassignment, draw the initial velocity of the simulation from a normal distribution centered at $0$ with standard deviation $\sigma = \sqrt{k_B T / m}$.

(We are working in reduced units, with $k_B = 1$.)


In [None]:
# --- Simulation parameters ---

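''' Parameters:
x0     : initial position 
v0     : initial velocity (drawn from the Maxwell-Boltzmann distribution below)
m      : mass
dt     : time step
n_steps: number of integration steps
T      : temperature in reduced units
nu     : collision frequency of the Andersen thermostat
kB     : Boltzmann constant in reduced units '''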

x0 = -1.0         # start in left well
m = 1.0         
dt = 0.01
n_steps = 5000
T = 0.1           # reduced temperature
nu = 0.1          # collision frequency
kB = 1.0          # Boltzmann constant in reduced units

# initialize velocity from Maxwell-Boltzmann distribution
### ADD YOUR CODE HERE ###
v0 = 

# --- Run NVT simulation ---
t_nvt, x_nvt, v_nvt = velocity_verlet_NVT(x0, v0, m, dt, n_steps, T, nu)

In [None]:
# Plot trajectory
plt.figure(figsize=(8,6))
plt.plot(t_nvt, x_nvt, 'b-', label="Position x(t)")
plt.xlabel("Time")
plt.ylabel("x")
plt.title("Particle Trajectory in NVT Ensemble")
plt.legend()
plt.show()

# Plot total energy (kinetic + potential)
total_energy = 0.5 * m * v_nvt**2 + double_well_potential(x_nvt)
plt.figure(figsize=(8,6))
plt.plot(t_nvt, total_energy, 'r-', label="Total Energy")
plt.xlabel("Time")
plt.ylabel("Energy")
plt.title("Energy in NVT Ensemble")
plt.legend()
plt.ylim(total_energy.min()-0.02, total_energy.max()+0.02)
plt.show()

- At low temperatures: The particle may remain trapped in a single well for a long time.<br><br>

- At high temperatures: The particle gains enough kinetic energy from the thermal bath to cross the barrier and explore both wells.<br><br>

Run the simulation at different temperatures $T$ and observe the trajectory.<br><br>


<div class="alert alert-block alert-info">

1. What is the <b>lowest temperature</b> at which the particle can cross the barrier and reach the other minimum?

2. Is the total energy conserved?
</div>

<b>Observations:</b>

- At low T, the particle oscillates around one minimum, similar to a small initial velocity `v0` in NVE.

- As T increases, the particle occasionally crosses the barrier, similar to increasing the initial velocity in NVE.

Energy exchange with the thermal bath in NVT allows the system to explore states that would be inaccessible at a fixed, low total energy.


<div class="alert alert-block alert-warning">
Simulations in the NVT ensemble are important because most biological and chemical processes occur at constant temperature, not constant energy.

Using a thermostat in MD ensures that the system samples configurations corresponding to the canonical ensemble, which is crucial for calculating properties like free energies, probabilities, and reaction rates.
</div>
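In the canonical ensemble, the probability of finding the particle at position $x$ is proportional to the Boltzmann factor $e^{-V(x) / k_B T}$. The sketch below evaluates this density for the double-well potential on a grid and compares the top of the barrier with a minimum. The function `boltzmann_density`, the grid, and the temperatures are illustrative assumptions, not part of the exercise.

```python
import numpy as np

def boltzmann_density(x, T, a=1.0, b=1.0, kB=1.0):
    """Normalized Boltzmann density, proportional to exp(-V(x) / (kB * T)), on a uniform grid."""
    V = a * (x**2 - b**2) ** 2
    weights = np.exp(-V / (kB * T))
    dx = x[1] - x[0]
    return weights / (np.sum(weights) * dx)

x_grid = np.linspace(-2.0, 2.0, 401)      # grid that contains x = 0 and x = +/- 1
i_top = np.argmin(np.abs(x_grid))         # index of the barrier top (x = 0)
i_min = np.argmin(np.abs(x_grid - 1.0))   # index of the right minimum (x = 1)

# Illustrative temperatures in reduced units
for T_demo in [0.1, 0.5, 1.0]:
    p = boltzmann_density(x_grid, T_demo)
    ratio = p[i_top] / p[i_min]
    print(f"T = {T_demo}: p(0) / p(1) = {ratio:.3e}")
```

The ratio equals $e^{-a b^4 / k_B T}$, so the barrier region becomes exponentially less populated as the temperature decreases, which is consistent with the rare crossings observed at low $T$.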

<h2 style="color:orange;"> Exercise </h2>

Now that you have explored molecular dynamics using a simple one-dimensional model, it is time to move to a realistic biomolecular system. <br><br>

In this final exercise, you will run a complete all-atom MD simulation of a protein in explicit solvent using GROMACS.

You will follow the tutorial at https://tutorials.gromacs.org/md-intro-tutorial.html# (doi:10.5281/zenodo.11198375)

(You can also follow the tutorial online at https://hub.2i2c.mybinder.org/user/gromacs-online---intro-tutorial-l6bw0idt/notebooks/tutorial.ipynb)