# Creating a Waterbox and Performing a Quick Equilibration

This notebook walks you through **building a periodic box of water**, running an **energy minimisation**, and carrying out a short **NPT equilibration** with **LAMMPS**.  
By the end you will have:

* `waterbox_initial.lmp` – the as-built, un-equilibrated system  
* `equil_waterbox.lmp` – the box after minimisation + equilibration  
* Thermodynamic logs (`log.waterbox`, `log.equil`) and a density plot for quick inspection.


<!-- Cell 2 -->
### 1 · Environment preparation

1. Loads the required software modules  
2. Exports `LAMMPS_EXE` pointing to the **MBX + LAMMPS** build  
3. Prints the first two help-screen lines to confirm the binary is found and runnable


In [None]:
%%bash --login

module purge
module load shared slurm/expanse/23.02.7 sdsc/1.0 DefaultModules slurm/expanse/23.02.7 
module load cpu/0.17.3b intel/19.1.3.304/6pv46so intel-tbb/2020.3/lfesfxm intel-mpi/2019.10.317/ezrfjne fftw/3.3.10/tqkvj37

export LAMMPS_EXE=/expanse/projects/qstore/csd973/bin/lmp_mpi_mbx
echo
echo "LAMMPS_EXE = $LAMMPS_EXE"
"$LAMMPS_EXE" -h | head -n 2      # quick sanity-check

<!-- Cell 4 -->
### 2 · Creating the water-molecule template

`water.mol` will contain atom IDs, coordinates, and topology of a water molecule in **LAMMPS “molecule” format**.  


<!-- Cell 5 (code follows) -->
#### Create `water.mol`

In [None]:
from pathlib import Path

water_mol = r'''
# Water molecule

3 atoms
2 bonds
1 angles

Coords

1    0.00000  -0.06556   0.00000
2    0.75695   0.52032   0.00000
3   -0.75695   0.52032   0.00000

Types

1        1   # O
2        2   # H
3        2   # H

Charges

1       -0.834
2        0.417
3        0.417

Bonds

1   1      1      2
2   1      1      3

Angles

1   1      2      1      3
'''

Path('water.mol').write_text(water_mol)
print('water.mol is created.')
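
As a quick sanity check, the template geometry can be compared with the equilibrium values that `tip4pEw.param` uses later (0.9572 Å and 104.52°). The sketch below hard-codes the coordinates from `water.mol`; the variable names are illustrative and not part of the workflow.

```python
import numpy as np

# Coordinates copied from the Coords section of water.mol (Å)
O = np.array([0.00000, -0.06556, 0.0])
H1 = np.array([0.75695, 0.52032, 0.0])
H2 = np.array([-0.75695, 0.52032, 0.0])

# O-H bond vectors and their lengths
v1, v2 = H1 - O, H2 - O
r_oh1 = np.linalg.norm(v1)
r_oh2 = np.linalg.norm(v2)

# H-O-H angle from the dot product of the two bond vectors
cos_theta = np.dot(v1, v2) / (r_oh1 * r_oh2)
theta_deg = np.degrees(np.arccos(cos_theta))

print(f"O-H bond lengths: {r_oh1:.4f}, {r_oh2:.4f} Å")
print(f"H-O-H angle:      {theta_deg:.2f}°")
```

If either value drifts noticeably from the parameter file, SHAKE would have to pull every molecule into shape at the start of the run, so it is worth confirming that the two agree.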

<!-- Cell 6 (code follows) -->
#### Create `tip4pEw.param`

We will do a quick equilibration using the TIP4P-Ew water model.
`tip4pEw.param` holds the **TIP4P-Ew charges, LJ terms, and harmonic OH/HOH parameters** that will be included later.

This cell writes the full TIP4P-Ew parameter block, including:

* `pair_style lj/cut/tip4p/long` with the M-site offset  
* LJ coefficients for O–O and H–H  
* Grouping all O/H atoms into `h2o`


In [None]:
from pathlib import Path

tip4pEw = r'''
## H2O ## TIP4P-Ew
set             type ${O} charge  -1.04844
set             type ${H} charge   0.52422

bond_coeff      ${OH_bond} 5000.0  0.9572
angle_coeff     ${OH_angle} 5000.0  104.52

variable OM_dist equal 0.1250

# pair style
pair_style &
  lj/cut/tip4p/long ${O} ${H} ${OH_bond} ${OH_angle} ${OM_dist} 9.0 9.0

# LJ coefficients: epsilon (kcal/mol) and sigma (Angstrom)
# a soft pair style would take an extra last column, which is not used here
pair_coeff      ${O} ${O}  0.162750     3.16435
pair_coeff      ${H} ${H}  0.000000     1.0000   
pair_coeff      ${H}  *    0.000000     1.0000   

group h2o type ${O} ${H}
'''

Path('tip4pEw.param').write_text(tip4pEw)
print('tip4pEw.param is created.')
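
To make the "M-site offset" concrete, the sketch below places the massless M site on the H-O-H bisector, `OM_dist` away from the oxygen, and checks that the TIP4P-Ew charges sum to zero. It reuses the coordinates from `water.mol`; the variable names are illustrative, and the calculation only mirrors what the TIP4P pair style does internally rather than reproducing LAMMPS code.

```python
import numpy as np

# Values taken from tip4pEw.param and water.mol
q_O, q_H = -1.04844, 0.52422   # charges (e)
om_dist = 0.1250               # O-M distance (Å)

O = np.array([0.00000, -0.06556, 0.0])
H1 = np.array([0.75695, 0.52032, 0.0])
H2 = np.array([-0.75695, 0.52032, 0.0])

# Net molecular charge should vanish
net_charge = q_O + 2 * q_H

# Unit vector along the H-O-H bisector, pointing from O towards the hydrogens
bisector = (H1 - O) + (H2 - O)
bisector /= np.linalg.norm(bisector)

# M site: the point that carries the oxygen charge in a TIP4P-type model
M = O + om_dist * bisector

print(f"net charge = {net_charge:+.1e} e")
print(f"M site     = {M}")
```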

<!-- Cell 8 (code follows) -->
### 3 · Building the initial water box

#### Generate `build.in`

Key points:

* Calculates `box_dim` analytically from `num_water` and target `density`  
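* Uses `create_atoms … mol water` to place molecules at random, rejecting trial positions that violate the `overlap` distance (1.33 Å) to existing atoms  
* Includes the parameter file and sets `kspace_style`, but **defers LJ mixing and tail corrections** (`pair_modify`) to `equil.in` to keep the build input lightweight  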
* Writes `waterbox_initial.lmp` without coefficients (`nocoeff`)
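
The `box_dim` expression is the cube root of V = N·M / (N_A·ρ), converted from cm³ to Å³. The minimal sketch below reproduces that arithmetic in Python. The function name and the molar mass of 18.015 g/mol are assumptions made for illustration; the LAMMPS input rounds 10·M to 180.

```python
AVOGADRO = 6.02214076e23   # 1/mol
M_WATER = 18.015           # g/mol (assumed value; build.in uses 180 for 10*M)

def box_length_angstrom(num_water, density_g_cm3, molar_mass=M_WATER):
    """Edge length (Å) of a cubic box holding num_water molecules at the given density."""
    volume_cm3 = num_water * molar_mass / (AVOGADRO * density_g_cm3)
    volume_A3 = volume_cm3 * 1.0e24   # 1 cm^3 = 1e24 Å^3
    return volume_A3 ** (1.0 / 3.0)

box_dim = box_length_angstrom(256, 0.70)
print(f"box_dim ≈ {box_dim:.2f} Å")
```

With 256 molecules at 0.70 g cm⁻³ the edge comes out at roughly 22 Å. The NPT stage later shrinks the box as the density rises.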


In [None]:
from pathlib import Path

build_in = r'''
#############################################
#  build.in
#  – Create a water box and write it
#############################################

# Variables
variable     num_water      equal 256        # number of water molecules
variable     density        equal 0.70       # g cm-3 (loose – will relax later)

# Calculate cubic box dimension (Å) from number of waters, density, and Avogadro's number
variable     box_dim        equal (180*${num_water}/(6.022*${density}))^(1/3)

# System definition
processors * * * map xyz
units      real
atom_style full
bond_style harmonic
angle_style harmonic
boundary p p p

# Define a cubic simulation region of size box_dim
region box block 0 ${box_dim} 0 ${box_dim} 0 ${box_dim}
# Create simulation box for two atom types (O and H)
create_box 2 box bond/types 1 angle/types 1 &
           extra/bond/per/atom 2 extra/angle/per/atom 1 extra/special/per/atom 2

# Masses
mass 1 15.9994
mass 2 1.008

variable O        equal 1
variable H        equal 2
variable OH_bond  equal 1
variable OH_angle equal 1

molecule water water.mol
# Populate the box with water molecules at random positions
# 34564 = seed for the random positions, 25678 = seed for the molecule orientations
# overlap 1.33: reject trial insertions that come too close to existing atoms
create_atoms 0 random ${num_water} 34564 NULL mol water 25678 overlap 1.33

bond_style   harmonic
angle_style  harmonic
include      tip4pEw.param         # force-field coefficients
kspace_style pppm/tip4p 1.0e-5

# Write the box
write_data waterbox_initial.lmp nocoeff
'''

Path('build.in').write_text(build_in)
print('build.in is created.')

<!-- Cell 10 (code follows) -->
#### Build the initial box with LAMMPS

Runs `$LAMMPS_EXE` (`lmp_mpi_mbx`) in serial, logging output to `log.waterbox`.  
Expect this to finish in a few seconds because no dynamics are performed yet.
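
Once the build cell below has run, it is worth confirming that all molecules were actually inserted, since random insertion with an `overlap` criterion can place fewer molecules than requested. A minimal sketch, assuming the usual `N atoms` / `N bonds` / `N angles` header lines of a LAMMPS data file; the function names and the example header are hypothetical.

```python
import re

def header_counts(data_text):
    """Collect the atom, bond and angle counts from the header of a LAMMPS data file."""
    counts = {}
    pattern = re.compile(r"^\s*(\d+)\s+(atoms|bonds|angles)\s*$")
    for line in data_text.splitlines():
        match = pattern.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts

def expected_counts(num_water):
    """Counts for num_water three-site molecules with two bonds and one angle each."""
    return {"atoms": 3 * num_water, "bonds": 2 * num_water, "angles": num_water}

# Hypothetical header text, shaped like the top of a LAMMPS data file
example = """LAMMPS data file (illustrative header only)

768 atoms
2 atom types
512 bonds
1 bond types
256 angles
1 angle types
"""
print(header_counts(example) == expected_counts(256))
```

For the real file, pass `Path('waterbox_initial.lmp').read_text()` instead of `example`.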


In [None]:
%%bash --login
module purge
module load shared slurm/expanse/23.02.7 sdsc/1.0 DefaultModules slurm/expanse/23.02.7 
module load cpu/0.17.3b intel/19.1.3.304/6pv46so intel-tbb/2020.3/lfesfxm intel-mpi/2019.10.317/ezrfjne fftw/3.3.10/tqkvj37

export LAMMPS_EXE=/expanse/projects/qstore/csd973/bin/lmp_mpi_mbx

"$LAMMPS_EXE"  -in build.in  -log log.waterbox

<!-- Cell 11 -->
### 4 · Prepare minimisation & equilibration inputs – overview

We’ll minimise potential energy and then equilibrate at **298 K, 1 atm** for **10 ps** (10 000 × 1 fs).  
A fresh `equil.in` is created so the workflow remains reproducible.


<!-- Cell 12 (code follows) -->
#### Generate `equil.in`

Highlights:

* Defines temperature, timestep, pressure, and thermo print frequency  
* Reads `waterbox_initial.lmp` and re-applies `tip4pEw.param`  
* Performs a **conjugate-gradient minimisation** (`minimize 1.0e-4 1.0e-6 500 2000`; no `min_style` is set, so the LAMMPS default `cg` applies)  
* Switches to **NPT** with SHAKE constraints for rigid water geometry  
* Writes the equilibrated structure to `equil_waterbox.lmp`
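
The time-related settings in `equil.in` are all expressed as multiples of the timestep. The short sketch below spells out the resulting values; the variable names are illustrative, and the numbers are copied from the input file.

```python
dt_fs = 1.0          # timestep from equil.in (fs)
n_steps = 10_000     # argument of `run`
thermo_freq = 100    # thermo output interval (steps)

t_damp_fs = 100.0 * dt_fs     # thermostat damping time, $(100.0*dt)
p_damp_fs = 1000.0 * dt_fs    # barostat damping time,   $(1000.0*dt)
run_ps = n_steps * dt_fs / 1000.0
n_thermo = n_steps // thermo_freq + 1   # includes the first line of the run

print(f"Tdamp = {t_damp_fs:.0f} fs, Pdamp = {p_damp_fs:.0f} fs")
print(f"run length = {run_ps:.1f} ps, thermo lines in the run = {n_thermo}")
```

The run therefore spans about ten barostat damping periods, which is short but adequate for a first density relaxation.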


In [None]:
from pathlib import Path

equil_in = r'''
############################################################
#  equil.in
#  Minimise and NPT-equilibrate the pre-built water box
############################################################

# Variables
variable     temp           equal 298.0      # K
variable     dt             equal 1.0        # fs
variable     P              equal 1          # atm
variable     thermo_freq    equal 100        # steps

variable O        equal 1
variable H        equal 2
variable OH_bond  equal 1
variable OH_angle equal 1

# System initialisation
processors * * * map xyz
units      real
atom_style full

read_data  waterbox_initial.lmp

bond_style   harmonic
angle_style  harmonic
include      tip4pEw.param         # force-field coefficients

pair_modify  mix arithmetic tail yes
kspace_style pppm/tip4p 1.0e-5

# Neighbor list settings
neighbor 2.0 bin
neigh_modify every 1 delay 10 check yes

timestep ${dt}
thermo_style custom step time temp etotal pe press vol density lx
thermo ${thermo_freq}

# Initialize velocities to target temperature with Gaussian distribution
velocity all create ${temp} 428879 rot yes dist gaussian

# Energy minimisation
minimize 1.0e-4 1.0e-6 500 2000

# NPT equilibration
fix SHAKE all shake 1e-5 50 0 b ${OH_bond} a ${OH_angle}
fix NPT  all npt temp ${temp} ${temp} $(100.0*dt)  iso ${P} ${P} $(1000.0*dt)

run 10000

# Write final data file without force field coefficients
write_data equil_waterbox.lmp nocoeff
'''
Path('equil.in').write_text(equil_in)
print('equil.in created')

<!-- Cell 13 -->
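### 5 · Submitting the job

The next cell writes a Slurm batch script, `sub.sh`, that requests one node on the `debug` partition for 10 minutes and runs `equil.in`, logging to `log.equil`.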


In [None]:
sub_sh_script = r"""#!/bin/bash

#SBATCH --job-name="equil"
#SBATCH --output="equil.out"
#SBATCH --partition=debug
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=32GB
#SBATCH -A csd973
#SBATCH --export=ALL
#SBATCH -t 00:10:00

module purge
module load shared slurm/expanse/23.02.7 sdsc/1.0 DefaultModules slurm/expanse/23.02.7 cpu/0.17.3b intel/19.1.3.304/6pv46so intel-tbb/2020.3/lfesfxm intel-mpi/2019.10.317/ezrfjne fftw/3.3.10/tqkvj37

lammps=/expanse/projects/qstore/csd973/bin/lmp_mpi_mbx

export OMP_NUM_THREADS=16

$lammps -in equil.in  -log log.equil
"""

with open('sub.sh', 'w') as f:
    f.write(sub_sh_script)

In [None]:
!sbatch sub.sh

In [None]:
!squeue --me

In [None]:
!tail equil.out
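
Instead of inspecting the tail by eye, completion can be detected from the log itself, because LAMMPS ends a successful run with a `Total wall time` line. A minimal sketch; the helper name `lammps_finished` is hypothetical.

```python
from pathlib import Path

def lammps_finished(log_path="log.equil"):
    """Return True if the log ends with LAMMPS's 'Total wall time' summary line."""
    path = Path(log_path)
    if not path.exists():
        return False
    # Only the last few lines matter; tolerate odd bytes in a partially written log
    tail = path.read_text(errors="replace").splitlines()[-5:]
    return any(line.startswith("Total wall time") for line in tail)

print("finished" if lammps_finished() else "still running or not started")
```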

<!-- Cell 15 -->
### 6 · Checking the density

A quick plot of density vs. simulation time lets us judge whether the system has stabilised near **1 g cm⁻³**.


In [None]:
import numpy as np
import re
import matplotlib.pyplot as plt
import pathlib

# ---------- read log file ----------------------------------------------------
log_lines = pathlib.Path("log.equil").read_text().splitlines()

num_pat = re.compile(r"[+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?")
header = None
rows = []

for line in log_lines:
    if header is None:
        if line.lstrip().startswith("Step"):
            header = line.split()
        continue

    # pull out ONLY the numeric tokens on this line
    nums = num_pat.findall(line)
    if len(nums) == len(header):           # make sure it still lines up
        rows.append([float(x) for x in nums])

# convert to a NumPy array
data = np.array(rows)  # shape (n_steps, n_columns)

# find column indices
step_idx    = header.index("Step")
density_idx = header.index("Density")

# compute time in ps and extract density
time_ps  = data[:, step_idx] * 0.001     # 1 fs → ps
density  = data[:, density_idx]

# ----- Plot density only -----------------------------------------------------
plt.figure(figsize=(6, 4))
plt.plot(time_ps, density)
plt.title("Density vs. Time")
plt.xlabel("Time (ps)")
plt.ylabel(r"Density (g cm$^{-3}$)")
plt.tight_layout()
plt.show()
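
To put a number on the plateau, the mean and standard deviation of the density can be taken over the later part of the run. The sketch below is one simple way to do that. The helper name `plateau_stats` and the 5 ps cut-off are assumptions, and the example arrays are synthetic rather than simulation output.

```python
import numpy as np

def plateau_stats(time_ps, density, t_start_ps=5.0):
    """Mean and standard deviation of density for samples with time >= t_start_ps."""
    time_ps = np.asarray(time_ps, dtype=float)
    density = np.asarray(density, dtype=float)
    mask = time_ps >= t_start_ps
    if not mask.any():
        raise ValueError("no samples at or after t_start_ps")
    return density[mask].mean(), density[mask].std()

# Synthetic example (made-up relaxation curve, not simulation output)
t = np.linspace(0.0, 10.0, 101)
rho = 1.0 - 0.3 * np.exp(-t / 0.5)
mean_rho, std_rho = plateau_stats(t, rho)
print(f"plateau density = {mean_rho:.3f} ± {std_rho:.3f} g cm^-3")
```

With the arrays from the plotting cell, the call would be `plateau_stats(time_ps, density)`.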


<!-- Cell 17 -->
### What to expect

After ~2 ps the density should plateau near **1 g cm⁻³**; small fluctuations are normal.  
We will perform the actual equilibration and production run using the MB-pol potential in the next step.
