# Extracting and visualising a free energy simulation

This notebook provides a step-by-step guide to extract and visualise a free energy simulation trajectory from a ``simulation.nc`` file using [openfe-analysis](https://github.com/OpenFreeEnergy/openfe_analysis), [MDAnalysis](https://github.com/MDAnalysis/mdanalysis) and [mdtraj](https://github.com/mdtraj/mdtraj). By the end, you should understand how to:

1. Extract the trajectory of a ``replica`` or ``single lambda state`` from a ``simulation.nc`` file
2. For a given hybrid topology trajectory, extract the relevant atom positions for the end states using `MDAnalysis`
3. Write out the trajectories using `MDAnalysis`
4. Centre the ligand in the simulation box using `mdtraj`

## Downloading the example data

First, download some example trajectory data. This may take a few minutes due to the size of the simulation file. Please skip this section if you have already done this!

In [1]:
! wget https://zenodo.org/records/15375081/files/simulation.nc
! wget https://zenodo.org/records/15375081/files/hybrid_system.pdb


## Extracting the trajectory with `MDAnalysis`

The `openfe-analysis` package provides an `MDAnalysis` reader to help extract the trajectory data from the `simulation.nc` file. As the file contains multiple replicas simulated at different lambda states, we must choose which of these to load as a single trajectory. There are two options for constructing the trajectory:
- `state_id`: constructs a trajectory that follows the single Hamiltonian lambda state with the specified index.
- `replica_id`: constructs a trajectory that follows the single replica with the specified index.
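The difference between the two options can be illustrated with a toy bookkeeping table. This is only a sketch under stated assumptions: the `states` array below is made-up data, and the helper names `replicas_for_state` and `states_for_replica` are hypothetical rather than part of `openfe-analysis`. It assumes that replicas swap lambda states between iterations, so following a state means picking frames from different replicas over time.

```python
import numpy as np

# Toy replica-exchange bookkeeping (hypothetical data, not read from simulation.nc):
# states[i, r] is the lambda state index occupied by replica r at iteration i.
states = np.array([
    [0, 1, 2],  # iteration 0: replica 0 in state 0, replica 1 in state 1, ...
    [1, 0, 2],  # iteration 1: replicas 0 and 1 have swapped states
    [1, 2, 0],  # iteration 2: replicas 1 and 2 have swapped states
])


def replicas_for_state(states, state_id):
    """For each iteration, return the replica that currently sits in `state_id`."""
    # Each state is occupied by exactly one replica per iteration, so argmax
    # over the boolean match picks out that replica.
    return np.argmax(states == state_id, axis=1)


def states_for_replica(states, replica_id):
    """For each iteration, return the state that `replica_id` currently occupies."""
    return states[:, replica_id]


state0_replicas = replicas_for_state(states, 0)
replica0_states = states_for_replica(states, 0)
print(state0_replicas)  # which replica supplies each frame of the state-0 trajectory
print(replica0_states)  # which states replica 0 visits over time
```

In this toy table, a trajectory for state 0 takes its frames from a different replica at every iteration, whereas a trajectory for replica 0 is continuous in time but moves between lambda states.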

In this example, which uses a trajectory from a relative binding free energy calculation, we will load the trajectory at `lambda=0`, the end state corresponding to Ligand A, and visualise it with `nglview`.

In [28]:
import MDAnalysis as mda
import mdtraj as md
from openfe_analysis import FEReader
import nglview as nv
import numpy as np

u_0 = mda.Universe("hybrid_system.pdb", "simulation.nc", format=FEReader, state_id=0)

w = nv.show_mdanalysis(u_0)
w



NGLWidget(max_frame=500)

<div class="alert alert-block alert-info"> <b>Note:</b> The OpenFE relative binding free energy protocol does not save water positions by default. This can be changed via the <a href="https://docs.openfree.energy/en/latest/reference/api/openmm_protocol_settings.html#openfe.protocols.openmm_utils.omm_settings.MultiStateOutputSettings.output_indices">output_indices</a> protocol setting. </div>


To view the final state at `lambda=1`, we can use negative indexing, which avoids having to know the total number of lambda states.

In [29]:
u_1 = mda.Universe("hybrid_system.pdb", "simulation.nc", format=FEReader, state_id=-1)

w = nv.show_mdanalysis(u_1)
w.center()
w



NGLWidget(max_frame=500)

## Extracting the end state positions with `MDAnalysis`

The trajectory data stored in the `simulation.nc` file contains the positions of the end-state ligands in their hybrid topology format. This means only atoms that are unique to the end-states have individual positions, with conserved core atoms sharing a single set of positions. As you might have noticed in the visualisation above, this can complicate the analysis and visualisation of the protein-ligand interactions. However, we can identify the atoms relevant to the end states or core atoms using the beta factors in the topology file:

- `0.0`: The non-alchemical atoms (protein, solvent, etc)
- `0.25`: The unique atoms of state A
- `0.5`: The conserved core atoms present in both end states
- `0.75`: The unique atoms of state B

With this information, we can easily extract the atom positions relevant to `state A` for `lambda=0`:

In [30]:
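# get atoms for state A: its unique atoms (0.25), the shared core (0.5)
# and the non-alchemical environment (0.0)
bfactor = 0.25
# `tempfactors` is the MDAnalysis name for the PDB beta-factor column
# (the `bfactor` alias is deprecated)
state_atoms = np.flatnonzero(np.isin(u_0.atoms.tempfactors, (bfactor, 0.5, 0.0)))
state = u_0.atoms[state_atoms]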

w = nv.show_mdanalysis(state)

w



NGLWidget(max_frame=500)
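The same selection logic applies to state B by swapping the unique-atom code from `0.25` to `0.75`. The sketch below shows this on a made-up six-atom array of beta factors, so it runs without the example data; the helper `end_state_indices` and the constant names are hypothetical and not part of `MDAnalysis` or `openfe-analysis`.

```python
import numpy as np

# Beta-factor codes as listed above
ENVIRONMENT, UNIQUE_A, CORE, UNIQUE_B = 0.0, 0.25, 0.5, 0.75


def end_state_indices(tempfactors, unique_code):
    """Indices of the environment, core and unique atoms of one end state."""
    wanted = (ENVIRONMENT, CORE, unique_code)
    return np.flatnonzero(np.isin(tempfactors, wanted))


# Hypothetical beta factors for a six-atom system:
# two environment atoms, one unique-A atom, two core atoms, one unique-B atom.
tempfactors = np.array([0.0, 0.0, 0.25, 0.5, 0.5, 0.75])

state_a = end_state_indices(tempfactors, UNIQUE_A)
state_b = end_state_indices(tempfactors, UNIQUE_B)
print(state_a)  # excludes the unique-B atom
print(state_b)  # excludes the unique-A atom
```

With the real data, and assuming the topology exposes the beta factors as `tempfactors`, the indices for state B would most naturally be taken from the `lambda=1` universe (`u_1`), since that is the end state in which Ligand B is fully interacting.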

## Saving the trajectory to file with `MDAnalysis`

We can now use `MDAnalysis` to save the trajectory of the `state A` atoms to a common file format. Note that we also need to write out a new topology file that can be used to load this trajectory:

In [31]:
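# write a new PDB topology file for the state A atoms only
state.write("state_a_topology.pdb")
# write the trajectory to an xtc file; the writer gets its own name so that
# it does not shadow the nglview widget `w`
with mda.Writer("out.xtc", n_atoms=state.n_atoms) as writer:
    for ts in u_0.trajectory:
        # `state` is tied to u_0, so its positions update on every frame
        writer.write(state)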



## Centring the Ligand with `mdtraj`

You may have noticed in the view above that the ligand seems to drift away from the protein. This is a visualisation artifact caused by periodic boundary conditions and the way in which `OpenMM` tries to ensure that all particle positions are written into a single periodic box. We can fix this using `mdtraj` and the [image_molecules](https://mdtraj.org/1.9.3/api/generated/mdtraj.Trajectory.html?highlight=image_molecules#mdtraj.Trajectory.image_molecules) method:

In [33]:
traj = md.load_xtc("out.xtc", top="state_a_topology.pdb")
traj = traj.image_molecules()

w = nv.show_mdtraj(traj)

w

NGLWidget(max_frame=500)
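To see why wrapping positions into a single periodic box can make a bound ligand appear far from the protein, consider the toy example below. It is a minimal sketch of the idea only, not how `image_molecules` is implemented: it assumes an orthorhombic box and single-point "molecules", all coordinates are made up, and the helper `minimum_image_shift` is hypothetical.

```python
import numpy as np


def minimum_image_shift(reference, position, box):
    """Shift `position` by whole box lengths into the periodic image that is
    closest to `reference` (orthorhombic boxes only)."""
    delta = position - reference
    return position - box * np.round(delta / box)


box = np.array([5.0, 5.0, 5.0])             # hypothetical box edge lengths in nm
protein_centre = np.array([4.8, 2.5, 2.5])  # protein sits near the +x face
ligand_true = np.array([5.3, 2.5, 2.5])     # ligand bound 0.5 nm away, just past the face

# Wrapping every particle into the primary box moves the ligand to the opposite face
ligand_wrapped = ligand_true % box
apparent = np.linalg.norm(ligand_wrapped - protein_centre)
print(apparent)  # apparent separation after wrapping

# Re-imaging relative to the protein recovers the true separation
ligand_imaged = minimum_image_shift(protein_centre, ligand_wrapped, box)
recovered = np.linalg.norm(ligand_imaged - protein_centre)
print(recovered)
```

The wrapped and re-imaged ligand positions are physically equivalent under periodic boundary conditions; only the choice of periodic image, and therefore the picture shown in the viewer, differs.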