# Simulate a Flexible, Confined Homopolymer

This notebook runs a Monte Carlo simulation of a confined, flexible homopolymer. The polymer includes 100,000 monomeric units, each separated by 5 persistence lengths. Although we invoke the stretchable, shearable wormlike chain (SSWLC) model, we choose a bead spacing that restricts polymer behavior to the flexible regime. Twist is not considered during the simulation. The polymer is confined by a spherical boundary with a 4,500 nm radius. Monomers are non-interacting; field energies are not evaluated, except to enforce the confinement. The simulation produces 100 snapshots, each separated by 40,000 iterations of each MC move.
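
As a quick sanity check on these numbers, the sketch below compares the size the chain would have without confinement to the confinement radius. It uses the values quoted in this notebook (265 nm bead spacing, 53 nm persistence length) and the long-chain wormlike-chain limit $\langle R^2 \rangle \approx 2 l_p L$, which is an assumption that holds only when the contour length $L$ is much larger than $l_p$. The name `confine_radius` is introduced here for illustration.

```python
import math

# Values quoted in the text of this notebook (all lengths in nm).
num_beads = 100_000
bead_spacing = 265.0
lp = 53.0
confine_radius = 4500.0  # illustrative name for the confinement radius

# Bead spacing measured in persistence lengths.
spacing_in_lp = bead_spacing / lp

# Contour length: one bond between each pair of neighboring beads.
contour_length = (num_beads - 1) * bead_spacing

# Long-chain wormlike-chain limit: <R^2> ~ 2 * lp * L (assumes L >> lp).
rms_end_to_end_free = math.sqrt(2.0 * lp * contour_length)

print(f"bead spacing / lp   : {spacing_in_lp:.1f}")
print(f"contour length (nm) : {contour_length:.3e}")
print(f"free RMS R (nm)     : {rms_end_to_end_free:.3e}")
print(f"free RMS R / radius : {rms_end_to_end_free / confine_radius:.1f}")
```

Under these assumptions the unconfined end-to-end distance is roughly an order of magnitude larger than the confinement radius, so the boundary strongly restricts the chain.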

### Setup

Install necessary modules, add the package root directory to the system path, and change working directory to root. Every simulation will involve similar setup steps.

**Do not run the setup cell more than once, except after restarting the kernel.** There is no way to consistently track the directory containing the notebook.

In [None]:
# Built-in modules
import os
import sys

# Insert package root to system path
cwd = os.getcwd()
parent_dir = cwd + "/../.."
sys.path.insert(1, parent_dir)

print("Directory containing the notebook:")
print(cwd)

In [None]:
# External modules
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Package modules
import chromo.mc as mc
from chromo.polymers import SSWLC
import chromo.binders
from chromo.fields import UniformDensityField
import chromo.mc.mc_controller as ctrl
from chromo.util.reproducibility import get_unique_subfolder_name
import chromo.util.poly_stat as ps

In [None]:
# Change working directory to package root
os.chdir(parent_dir)
print("Root Directory of Package: ")
print(os.getcwd())
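
The warning about rerunning the setup cell follows from how the root is computed: it is taken relative to the current working directory, which changes once `os.chdir` has run. The sketch below illustrates the drift with a made-up directory layout; `package_root` and the example path are hypothetical and not part of the package.

```python
import os

def package_root(start_dir, levels_up=2):
    """Return the directory `levels_up` levels above `start_dir`."""
    root = os.path.abspath(start_dir)
    for _ in range(levels_up):
        root = os.path.dirname(root)
    return root

# Hypothetical notebook location, used only for illustration.
notebook_dir = os.path.join(os.sep, "home", "user", "pkg", "doc", "examples")

# First run: the working directory is the notebook directory.
first_run = package_root(notebook_dir)

# Rerun after os.chdir(first_run): the same rule now starts from the root.
second_run = package_root(first_run)

print(first_run)
print(second_run)
```

The second result is no longer the package root, which is why the setup cells should run only once per kernel session.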

### Specify Reader Proteins

Reader proteins are collectively stored in a data frame that we call the "binder collection." We generate the binder collection using the `make_binder_collection` function in the `chromo.binders` module, which accepts a list of reader protein objects as input. Certain reader protein objects are pre-implemented in the `chromo.binders` module and can be instantiated by name using the module's `get_by_name` function.

All simulations require at least one reader protein to be defined, even if we want to simulate a homopolymer like here. We have defined a placeholder binder named `null_reader` which serves only to ensure compatibility with the rest of the code.

In [None]:
# Instantiate the placeholder `null_reader` binder, which is pre-defined in the `chromo.binders` module
null_reader = chromo.binders.get_by_name('null_reader')

# Create the binder collection
binders = chromo.binders.make_binder_collection([null_reader])

### Specify Confinement

The behavior of confining boundaries is defined in the `chromo.fields` module. The confinement type is specified as a string name. The meaning of the confinement length depends on the confinement type (as should be documented in the `chromo.fields` module), and the length is specified as a float.

Here we define a spherical confinement with a 4,500 nm radius.

In [None]:
confine_type = "Spherical"
confine_length = 4500
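
To make the role of `confine_length` concrete, here is a minimal sketch of a spherical confinement test. It assumes the sphere is centered at the origin, which matches how radial distances are computed later in this notebook; `inside_sphere` is a hypothetical helper, not a function from the package.

```python
import math

def inside_sphere(point, radius):
    """True if `point` (x, y, z) lies within `radius` of the origin."""
    return math.sqrt(sum(c * c for c in point)) <= radius

confine_length = 4500  # nm, as in this notebook

# Two made-up positions (nm): one inside the sphere, one outside.
print(inside_sphere((1000.0, 2000.0, 3000.0), confine_length))
print(inside_sphere((3000.0, 3000.0, 3000.0), confine_length))
```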

### Instantiate Polymer(s)

Various polymer models are defined in the `chromo.polymers` module. All polymer classes share the attributes and methods defined in the `PolymerBase` class. We instantiate a stretchable, shearable wormlike chain from the `SSWLC` class in the `chromo.polymers` module. The polymer has 100,000 beads, each separated by 265 nm, and we specify the persistence length of the polymer as being 53 nm. The polymer is initialized as a Gaussian random walk inside its confinement.

In [None]:
num_beads = 100000
bead_spacing = 265
lp = 53

polymer = SSWLC.confined_gaussian_walk(
    'poly_1',
    num_beads,
    bead_spacing,
    confine_type=confine_type,
    confine_length=confine_length,
    binder_names=np.array(['null_reader']),
    lp=lp
)
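
For intuition about the initialization, the sketch below builds a Gaussian random walk in which any step that would leave the sphere is redrawn. This is an illustrative stand-in written in plain Python, not the implementation behind `SSWLC.confined_gaussian_walk`; the function name, the choice of starting at the origin, and the redraw rule are all assumptions.

```python
import math
import random

def confined_gaussian_walk(num_beads, bead_spacing, radius, seed=0):
    """Hypothetical helper: Gaussian random walk whose steps are redrawn
    until the new bead lies inside an origin-centered sphere."""
    rng = random.Random(seed)
    # Per-axis standard deviation giving an RMS step length of bead_spacing.
    sigma = bead_spacing / math.sqrt(3.0)
    beads = [(0.0, 0.0, 0.0)]
    while len(beads) < num_beads:
        x, y, z = beads[-1]
        trial = (
            x + rng.gauss(0.0, sigma),
            y + rng.gauss(0.0, sigma),
            z + rng.gauss(0.0, sigma),
        )
        # Keep the step only if the new bead stays inside the sphere.
        if math.sqrt(sum(c * c for c in trial)) <= radius:
            beads.append(trial)
    return beads

# A short chain keeps the example fast; spacing and radius match the notebook.
walk = confined_gaussian_walk(num_beads=1000, bead_spacing=265.0, radius=4500.0)
max_radius = max(math.sqrt(sum(c * c for c in bead)) for bead in walk)
print(len(walk), max_radius <= 4500.0)
```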

In [None]:
x = polymer.r[:, 0]
y = polymer.r[:, 1]
z = polymer.r[:, 2]

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(projection='3d')
ax.plot3D(np.asarray(x), np.asarray(y), np.asarray(z))
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')

plt.show()

### Define Uniform Density Field

We leverage a field-theoretic approach to model interactions between beads and bound reader proteins. The field tracks the density of the polymer and all binders in cubical voxels of space. To instantiate the field, we need to specify the dimensions of the voxel grid, the confinement, and all polymers and binders contained.

Here we instantiate a field that contains the spherical confinement with voxels of 100 nm width in the x, y, and z directions. Since we are dealing with a homopolymer, there is no energy contribution by the field; however, the simulation requires that a field be defined.

In [None]:
n_bins_x = 90
n_bins_y = n_bins_x
n_bins_z = n_bins_x

x_width = 2 * confine_length
y_width = x_width
z_width = x_width

udf = UniformDensityField(
    polymers = [polymer],
    binders = binders,
    x_width = x_width,
    nx = n_bins_x,
    y_width = y_width,
    ny = n_bins_y,
    z_width = z_width,
    nz = n_bins_z,
    confine_type = confine_type,
    confine_length = confine_length,
    chi=0,
    vf_limit=1E99,                  # Completely non-interacting particles.
    assume_fully_accessible=1
)
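
The voxel width follows directly from the values in the cell above: a box of width `2 * confine_length` split into 90 bins per axis. The sketch below checks that arithmetic and shows one way to map positions to voxel indices. It assumes the grid is centered on the origin, which is an illustrative convention and not necessarily the one used by `UniformDensityField`; `voxel_index` is a hypothetical helper.

```python
from collections import Counter

confine_length = 4500.0          # nm
n_bins = 90                      # voxels per axis, as in the cell above
box_width = 2 * confine_length   # nm
voxel_width = box_width / n_bins

def voxel_index(point):
    """Map (x, y, z) in [-box_width/2, box_width/2) to integer voxel indices.

    Assumes an origin-centered grid (illustrative convention only).
    """
    return tuple(int((c + box_width / 2) // voxel_width) for c in point)

# A few made-up bead positions (nm).
beads = [(0.0, 0.0, 0.0), (50.0, 50.0, 50.0), (-4500.0, 0.0, 4499.0)]
counts = Counter(voxel_index(b) for b in beads)

print(voxel_width)
print(counts)
```

Counting beads per voxel in this way gives the kind of local density that a field-based energy would be evaluated on.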

### Specify Simulation

The simulator proposes and evaluates random configurational changes to the polymer; in the case of the homopolymer, moves involve physical transformations to the chain.
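
The notebook does not spell out how a proposed move is evaluated. A common choice in Monte Carlo simulations is the Metropolis criterion, sketched here under the assumption that energy changes are expressed in units of kT. Whether the package applies exactly this rule is not confirmed by this notebook, and `metropolis_accept` is a hypothetical helper.

```python
import math
import random

def metropolis_accept(delta_energy, rng):
    """Accept downhill moves always; uphill moves with probability exp(-dE).

    `delta_energy` is assumed to be in units of kT.
    """
    if delta_energy <= 0.0:
        return True
    return rng.random() < math.exp(-delta_energy)

rng = random.Random(0)
trials = 100_000

# Fraction of accepted moves for a fixed energy penalty of 1 kT.
accepted = sum(metropolis_accept(1.0, rng) for _ in range(trials))
print(accepted / trials, math.exp(-1.0))
```

The measured acceptance fraction should sit close to `exp(-1)`, the acceptance probability for a 1 kT penalty.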

##### Move and Bead Amplitudes

There are two ways to tune these configurational changes over the course of the simulation: we can adjust the size of the selection window for affected beads, or we can adjust the magnitude of the transformation itself. We define a "bead amplitude" to represent the maximum size of the selection window for a move and a "move amplitude" to represent the maximum magnitude of the move's transformation. We restrict the move and bead amplitudes between bounds to ensure that MC moves do not become too large or too small. The bounds depend on the polymers' sizes and are determined by the `get_amplitude_bounds` function defined in the `__init__` file of the `mc` directory.

In [None]:
amp_bead_bounds, amp_move_bounds = mc.get_amplitude_bounds(
    polymers = [polymer]
)

##### Define MC Moves

Here, we identify exactly which MC moves we wish to include in the simulation. We use the following moves in this example:

- **Slide:** Translation of a continuous segment of beads in a random direction
- **Crank-Shaft:** Rotation of an internal segment of the polymer about the axis containing the segment
- **End-Pivot:** Rotation of a segment on one end of the polymer about an arbitrary axis
- **Tangent Rotation:** Rotation of the tangent vectors of random beads in the polymer
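
To illustrate one of the moves listed above, the sketch below applies a crank-shaft rotation to a short chain using Rodrigues' rotation formula. It is written in plain Python for illustration and is not the package's implementation; `rotate_about_axis` and `crank_shaft` are hypothetical helpers, and the move here acts on positions only.

```python
import math

def rotate_about_axis(point, origin, axis, angle):
    """Rotate `point` by `angle` about the line through `origin` along
    `axis`, using Rodrigues' rotation formula."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]
    v = [p - o for p, o in zip(point, origin)]
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    c, s = math.cos(angle), math.sin(angle)
    rotated = [
        v[i] * c + k_cross_v[i] * s + k[i] * k_dot_v * (1.0 - c)
        for i in range(3)
    ]
    return tuple(r + o for r, o in zip(rotated, origin))

def crank_shaft(beads, start, end, angle):
    """Rotate the beads strictly between `start` and `end` about the axis
    joining those two beads; the two anchor beads stay fixed."""
    origin = beads[start]
    axis = [e - s for e, s in zip(beads[end], beads[start])]
    moved = list(beads)
    for i in range(start + 1, end):
        moved[i] = rotate_about_axis(beads[i], origin, axis, angle)
    return moved

# Made-up four-bead chain; rotate bead 1 by half a turn about the x-axis.
chain = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
new_chain = crank_shaft(chain, start=0, end=2, angle=math.pi)
print(new_chain[1])
```

Because the rotation axis passes through both anchor beads, the distances from the rotated bead to its neighbors are unchanged, which is what makes this move suitable for an internal segment.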

Since we often use this combination of MC moves, comprising all the physical transformations we've implemented so far, we have defined a helper function called `all_moves_except_binding_state` in the `chromo.mc.mc_controller` module to initialize the moves. The adaptable moves themselves are attributes of controllers that dynamically adjust the move and bead amplitudes over the course of the simulation. Since each move is associated with a log file that may be used to track its acceptance rate, we need to specify an output directory for the simulation. The `get_unique_subfolder_name` function in the `chromo.util.reproducibility` module identifies a unique output directory name for the simulation.

In [None]:
latest_sim = get_unique_subfolder_name("output/sim_")
moves_to_use = ctrl.all_moves_except_binding_state(
    log_dir=latest_sim,
    bead_amp_bounds=amp_bead_bounds.bounds,
    move_amp_bounds=amp_move_bounds.bounds,
    controller=ctrl.SimpleControl
)
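
The idea of a controller that adapts an amplitude within fixed bounds can be sketched as follows. The rule shown here (grow when the acceptance rate is above a target, shrink otherwise, then clamp) is a generic illustration and an assumption; it is not a description of how `SimpleControl` works, and `adapt_amplitude`, `target`, and `factor` are hypothetical names.

```python
def adapt_amplitude(amplitude, acceptance_rate, bounds,
                    target=0.5, factor=1.1):
    """Hypothetical rule: grow the amplitude when moves are accepted more
    often than `target`, shrink it otherwise, and clamp to `bounds`."""
    lower, upper = bounds
    if acceptance_rate > target:
        amplitude *= factor
    else:
        amplitude /= factor
    return min(max(amplitude, lower), upper)

# Made-up sequence of acceptance rates; the amplitude never leaves [1, 12].
amp = 10.0
for rate in (0.9, 0.9, 0.1, 0.9, 0.9, 0.9):
    amp = adapt_amplitude(amp, rate, bounds=(1.0, 12.0))
    print(round(amp, 3))
```

Clamping is what keeps moves from becoming too large or too small, which is the role the amplitude bounds play in this notebook.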

##### Simulation Length

We define the simulation by the number of snapshots we would like to generate and the number of iterations of each move between the snapshots. Here, we specify a simulation producing 100 snapshots, each with 40,000 iterations of each MC move.

In [None]:
num_snapshots = 100
mc_steps_per_snapshot = 40000

### Run the Simulation

The `polymer_in_field` function defined in the `__init__` file of the `mc` directory initiates the simulation. Running the code block below generates a unique output directory and runs the simulation.

In [None]:
%%capture
mc.polymer_in_field(
    polymers = [polymer],
    binders = binders,
    field = udf,
    num_save_mc = mc_steps_per_snapshot,
    num_saves = num_snapshots,
    bead_amp_bounds = amp_bead_bounds,
    move_amp_bounds = amp_move_bounds,
    output_dir = 'output',
    mc_move_controllers = moves_to_use
)

### Evaluate Convergence

With the simulation complete, we want to check for energy and configurational convergence in the Monte Carlo snapshots. First, we recall the latest simulation output directory. From that directory, we load the polymer configuration in each snapshot. We then evaluate the polymer's energy in each snapshot. Lastly, we check that the mean squared separation distance between beads $n = 50$ monomers apart on the chain also converges.

##### List Polymer Output Files

Polymer configurations are saved at each snapshot of the simulation in the simulation's output directory. All polymer configurations are saved with a `.csv` extension and are prefixed with the label `poly_1`. The snapshot number is indicated after a hyphen and before the file extension in the polymer output file name.

In [None]:
# Load names of polymer configuration output files
output_files = os.listdir(latest_sim)
output_files = [
    f for f in output_files if f.endswith(".csv") and f.startswith("poly_1")
]
snapshot = [int(f.split("-")[-1].split(".")[0]) for f in output_files]
sorted_snap = np.sort(np.array(snapshot))
output_files = [f for _, f in sorted(zip(snapshot, output_files))]
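
The reason for extracting the snapshot number before sorting is that a plain string sort does not order the files by snapshot. The sketch below shows the difference on a few made-up file names that follow the naming pattern described above; `snapshot_number` is a hypothetical helper that applies the same parsing as the cell above.

```python
def snapshot_number(filename):
    """Extract the integer after the last hyphen and before the extension."""
    return int(filename.split("-")[-1].split(".")[0])

# Made-up file names following the `poly_1-<snapshot>.csv` pattern.
files = ["poly_1-10.csv", "poly_1-2.csv", "poly_1-1.csv"]

lexicographic = sorted(files)                    # string order
by_snapshot = sorted(files, key=snapshot_number) # numeric snapshot order

print(lexicographic)
print(by_snapshot)
```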

##### Calculate Energies

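The polymer class has a `compute_E` method for calculating the total energy of a configuration. For each snapshot, we load the bead positions and tangent vectors from the output file into the polymer and evaluate this energy.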

In [None]:
polymer_energies = []

for f in output_files:
    output_path = str(latest_sim) + '/' + f

    r = pd.read_csv(
        output_path,
        header=0,
        skiprows=1,
        usecols=[1, 2, 3],
        dtype=float
    ).to_numpy()

    t3 = pd.read_csv(
        output_path,
        header=0,
        skiprows=1,
        usecols=[4, 5, 6],
        dtype=float
    ).to_numpy()

    polymer.r = r.copy()
    polymer.t3 = t3.copy()

    polymer_energy = polymer.compute_E()
    polymer_energies.append(polymer_energy)

##### Plot Energy Convergence

In [None]:
plt.figure(figsize=(5,4), dpi=300)
plt.plot(sorted_snap, polymer_energies)
plt.xlabel("Snapshot number")
plt.ylabel("Polymer Energy")
plt.tight_layout()
plt.show()

##### Calculate MSD

Check that the mean squared separation distance of beads $n = 50$ monomers apart on the chain converges during the simulation. The `PolyStats` class in the `chromo.util.poly_stat` module provides the `load_indices` and `calc_r2` methods used for this calculation.

In [None]:
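# Persistence length of DNA; in this example, `lp` has no effect.
# Note: this rebinds the notebook-level `lp` (set to 53 when the polymer
# was built), so later cells that read `lp` will see 100.
lp = 100
delta = 50  # Monomer-to-monomer separation at which to calculate the mean squared distance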

all_dists = []
for f in output_files:
    output_path = str(latest_sim) + '/' + f
    r = pd.read_csv(
        output_path,
        header=0,
        skiprows=1,
        usecols=[1, 2, 3],
        dtype=float
    ).to_numpy()
    poly_stat = ps.PolyStats(r, lp, "overlap")
    windows = poly_stat.load_indices(delta)
    all_dists.append(poly_stat.calc_r2(windows))
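
For readers who want to see what such a statistic looks like, the sketch below computes a mean squared separation over all overlapping windows of a chain. The normalization by $(2 l_p)^2$ is taken from the axis label used when plotting this quantity and is an assumption about the definition; this is not the code behind `PolyStats.calc_r2`, and `mean_squared_separation` is a hypothetical helper.

```python
def mean_squared_separation(r, delta, lp):
    """Mean squared distance between beads `delta` apart along the chain,
    averaged over all overlapping windows and divided by (2 * lp)**2."""
    n_windows = len(r) - delta
    total = 0.0
    for i in range(n_windows):
        total += sum((a - b) ** 2 for a, b in zip(r[i + delta], r[i]))
    return total / n_windows / (2.0 * lp) ** 2

# Straight rod with unit spacing along x: beads 3 apart are 3 units apart.
rod = [(float(i), 0.0, 0.0) for i in range(10)]
result = mean_squared_separation(rod, delta=3, lp=0.5)
print(result)
```

With `lp=0.5` the normalization factor is one, so the rod example returns the squared separation itself.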

##### Plot MSD Convergence

In [None]:
plt.figure(figsize=(8, 6))
plt.plot(sorted_snap, all_dists)
plt.xlabel("Snapshot number")
plt.ylabel(r"$\langle R^2 \rangle /(2l_p)^2$")
plt.tight_layout()
plt.show()

If energy or configurational convergence is not achieved, a longer simulation is required to allow the polymer to equilibrate.
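
Convergence is judged visually in this notebook. If a numerical check is preferred, one simple heuristic is to compare the averages of the last two blocks of snapshots, as sketched here. The block size, the tolerance, and the function `has_plateaued` are all hypothetical choices for illustration, not a criterion used by the package.

```python
from statistics import mean

def has_plateaued(series, block=10, rel_tol=0.05):
    """Hypothetical check: the means of the last two blocks of `block`
    snapshots differ by less than `rel_tol` in relative terms."""
    if len(series) < 2 * block:
        raise ValueError("need at least two full blocks of snapshots")
    previous = mean(series[-2 * block:-block])
    latest = mean(series[-block:])
    scale = max(abs(previous), abs(latest), 1e-12)
    return abs(latest - previous) / scale < rel_tol

# Synthetic traces, for illustration only.
relaxing = [100.0 * 0.9 ** i for i in range(40)]        # still decaying
settled = [100.0 * 0.5 ** i + 20.0 for i in range(40)]  # flat near 20

print(has_plateaued(relaxing), has_plateaued(settled))
```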


### Evaluate Consistency with WLC Statistics

We compare the results of this simulation to the theoretical radial densities of a flexible polymer. The radial density represents the average bead density at each radial position in the confinement. We discard the first 90 snapshots as equilibration, then average the radial densities over the remaining snapshots. We plot the radial densities from our simulation against the corresponding theoretical values.
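
For reference, the theoretical curve evaluated by the code in this section can be written out explicitly. The expression is a transcription of that code, not an independent derivation: $a$ is the confinement radius (`confine_length`), $b$ is the length assigned from `lp`, $N$ is the number of beads, and the sum is truncated at $n_{\max} = 1000$.

$$
\rho(r) = \frac{N}{\pi}\,\frac{\sin^2(\pi r / a)}{r^2}
+ \sum_{n=2}^{n_{\max}} \frac{(-1)^{n+1}}{n\pi}\,
\frac{\sin(\pi r / a)\,\sin(n \pi r / a)}{r^2\, b^2\, (n^2 - 1)}
$$

The resulting values are normalized numerically before plotting, so only the shape of $\rho(r)$ matters for the comparison.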

In [None]:
num_equilibration = 90
counts_all = []

for i, f in enumerate(output_files):

    if i < num_equilibration:
        continue

    output_path = str(latest_sim) + "/" + f
    r = pd.read_csv(
        output_path,
        header=0,
        skiprows=1,
        usecols=[1, 2, 3],
        dtype=float
    ).to_numpy()

    radial_dists = np.linalg.norm(r, axis=1)

    step_size = 100
    bins = np.arange(step_size, confine_length, step_size)
    counts, bin_edges = np.histogram(radial_dists, bins=bins)
    counts = counts.astype(float)
    counts_all.append(counts)

counts_all = np.array(counts_all)
# Sum over the retained snapshots; the overall scale drops out in the normalization below
counts_avg = np.sum(counts_all, axis=0)

# Correct densities based on volumes of spherical shells
for i in range(len(bin_edges)-1):
    volume = 4/3 * np.pi * ((bin_edges[i+1]/1E3)**3 - (bin_edges[i]/1E3)**3)
    counts_avg[i] /= volume

counts_avg /= np.sum(counts_avg)

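# Get theoretical radial densities
a = confine_length
# `lp` is whatever value is bound when this cell runs; running the notebook
# top to bottom, that is the 100 assigned in the MSD cell, not the 53 used
# to build the polymer.
b = lp
N = len(r)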
r_theory = np.arange(step_size, confine_length, 1)
n_max = 1000
rho = np.zeros(len(r_theory))
for n in range(2, n_max + 1):
    rho += (
        (-1)**(n+1) / (n * np.pi)
        * np.sin(np.pi * r_theory / a) * np.sin(n * np.pi * r_theory / a)
        / (r_theory**2 * b**2 * (n**2 - 1))
    )
rho += N / np.pi * np.sin(np.pi * r_theory / a)**2 / r_theory**2

normalize = np.sum(rho)
rho_theory = rho / normalize * step_size
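
The shell-volume correction in the cell above can be checked on a case with a known answer. The sketch below, written in plain Python with hypothetical helper names, divides counts by the volume of each spherical shell and normalizes the result. Counts that are proportional to shell volume correspond to a uniform bead density and should give equal values in every bin.

```python
import math

def shell_volumes(bin_edges):
    """Volume of each spherical shell between consecutive radii."""
    return [
        4.0 / 3.0 * math.pi * (outer ** 3 - inner ** 3)
        for inner, outer in zip(bin_edges[:-1], bin_edges[1:])
    ]

def radial_density(counts, bin_edges):
    """Counts per shell volume, rescaled so the densities sum to one."""
    density = [c / v for c, v in zip(counts, shell_volumes(bin_edges))]
    total = sum(density)
    return [d / total for d in density]

# Made-up counts proportional to shell volume, i.e. a uniform bead density.
edges = [1.0, 2.0, 3.0, 4.0]
counts = [7.0, 19.0, 37.0]   # 2^3 - 1^3, 3^3 - 2^3, 4^3 - 3^3
density = radial_density(counts, edges)
print(density)
```

Without the volume correction, the outer shells would appear denser simply because they are larger.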

In [None]:
font = {'family': 'serif',
        'weight': 'normal',
        'size': 18}
plt.rc('font', **font)

plt.figure(figsize=(8, 6))
plt.hist(bin_edges[:-1], bin_edges, weights=counts_avg, alpha=0.4, color="blue")
plt.plot(r_theory, rho_theory, color="blue")
plt.xlabel("Radial Distance (nm)")
plt.ylabel(r"Probability")
plt.tight_layout()
plt.show()

### Summary

We began by defining the components of a 100,000-bead flexible homopolymer. We then specified the Monte Carlo simulation, which entailed iterative geometric transformations of the polymer. Finally, we assessed the predicted polymer structures at various snapshots, verifying convergence and generating a plot of radial densities. Using a purely physics-based approach, we were able to match the theoretical radial density distribution of a flexible polymer.