# Caffeine Solvation in Electrolyte Solution
Author, affiliation, etc.

## Introduction

We present a study of the solvation free energy of caffeine in electrolyte solutions using the energy representation method in combination with all-atom molecular dynamics simulations.

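The Setschenow coefficient, $k_s$, is defined on the $\ln$ scale as

$$ \ln \frac{S}{S_0} = \ln\gamma = k_s c_s = \beta\Delta\mu^{ex} $$

where $S$ and $S_0$ are the caffeine solubilities in pure water and in an electrolyte solution of concentration $c_s$, respectively, $\gamma$ is the activity coefficient of caffeine, $\beta = 1/k_BT$, and $\Delta\mu^{ex}$ is the excess chemical potential (solvation free energy) relative to pure water. A positive $k_s$ corresponds to salting out.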

_Note:_ The $\log_{10}$ scale is often used in the literature.
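
As a small illustration of the two conventions, the sketch below converts a Setschenow coefficient between the $\ln$ and $\log_{10}$ scales and turns a free energy difference in kcal/mol into an activity coefficient. The helper names are hypothetical, and a temperature of 298.15 K is assumed as the default.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)

def beta_kcal(temperature=298.15):
    """Return 1/(RT) in mol/kcal, i.e. the factor converting kcal/mol to kT."""
    return 1.0 / (R_KCAL * temperature)

def ks_ln_to_log10(ks_ln):
    """Convert a Setschenow coefficient from the ln scale to the log10 scale."""
    return ks_ln / math.log(10.0)

def ks_log10_to_ln(ks_log10):
    """Convert a Setschenow coefficient from the log10 scale to the ln scale."""
    return ks_log10 * math.log(10.0)

def activity_coefficient(delta_mu_kcal, temperature=298.15):
    """gamma = exp(beta * delta_mu), with delta_mu (kcal/mol) taken relative to pure water."""
    return math.exp(delta_mu_kcal * beta_kcal(temperature))
```

A coefficient reported on the $\log_{10}$ scale is smaller than its $\ln$-scale counterpart by a factor of $\ln 10 \approx 2.303$, which is worth checking before comparing with literature values.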

### Import of Python modules

In [None]:
# Notebook-dependent libraries
import os
import math
import numpy as np
import matplotlib.pyplot as plt
import mdtraj as md
import parmed as pmd
import yaml  # PyYAML, used to read the ERmod results file

# Simulation-specific libraries
import sys
from simtk.openmm import app
import simtk.openmm as mm
import openmmtools as mmtools
from parmed.openmm import StateDataReporter

homedir='/home/stefan/Caffeine_solubility'

### Simulation settings

In [None]:
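# Simulated states: isolated caffeine ('iso'), caffeine in solution ('sol'),
# and the aqueous solution without caffeine ('aqs')
states = ('iso', 'sol', 'aqs')
# Number of particles at each NaCl concentration (mol/l)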
salt_concentrations = {0.00: {'caffeine': 1, 'wat': 2760, 'Na':0,   'Cl':0},
                       0.25: {'caffeine': 1, 'wat': 2760, 'Na':16,  'Cl':16},
                       0.50: {'caffeine': 1, 'wat': 2760, 'Na':27,  'Cl':27},
                       0.55: {'caffeine': 1, 'wat': 2760, 'Na':30,  'Cl':30},
                       0.73: {'caffeine': 1, 'wat': 2760, 'Na':40,  'Cl':40},
                       1.00: {'caffeine': 1, 'wat': 2760, 'Na':55,  'Cl':55},
                       1.25: {'caffeine': 1, 'wat': 2760, 'Na':69,  'Cl':69},   # Check avg volume from sim.
                       1.50: {'caffeine': 1, 'wat': 2760, 'Na':83,  'Cl':83},   # Check avg volume from sim.
                       1.75: {'caffeine': 1, 'wat': 2760, 'Na':96,  'Cl':96},   # Check avg volume from sim.
                       2.00: {'caffeine': 1, 'wat': 2760, 'Na':110, 'Cl':110}   # Check avg volume from sim.
                      }
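
The comments above ask for the concentrations to be checked against the average simulation volume. A minimal sketch of that check follows, assuming the box volumes are available in nm³ (mdtraj exposes per-frame volumes as `Trajectory.unitcell_volumes`). The function names are hypothetical, and the 4.5 nm cube used in the example is only the initial Packmol box, not an equilibrated NPT volume.

```python
import numpy as np

AVOGADRO = 6.02214076e23   # particles per mol
NM3_TO_LITRE = 1e-24       # 1 nm^3 = 1e-24 l

def molarity(n_pairs, volume_nm3):
    """Molar salt concentration from the number of ion pairs and the
    (average) box volume in nm^3; accepts a scalar or an array of volumes."""
    mean_volume = float(np.mean(volume_nm3))
    return n_pairs / (AVOGADRO * mean_volume * NM3_TO_LITRE)

def ion_pairs(target_molar, volume_nm3):
    """Number of ion pairs (rounded) that gives the target molarity."""
    mean_volume = float(np.mean(volume_nm3))
    return int(round(target_molar * AVOGADRO * mean_volume * NM3_TO_LITRE))

# Example with the initial 4.5 nm packing cube (assumption, see above)
box_volume = 4.5**3
print('55 ion pairs  -> {:.3f} mol/l'.format(molarity(55, box_volume)))
print('1.00 mol/l    -> {} ion pairs'.format(ion_pairs(1.0, box_volume)))
```

Passing the array of per-frame volumes from a production run instead of the packing volume gives the concentration actually sampled in the NPT ensemble.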

### INPUT FILES: make PDB and .TOP files

In [None]:
%cd -q $homedir

for concentration, Nparticles in salt_concentrations.items():
    conc = '{0:.2f}'.format(concentration)
    %cd -q $homedir/Simulations/NaCl/$conc
    
    # Packmol Input
    packmol_script="""
tolerance 2.0
filetype pdb
output Caffeine_NaCl_sol.pdb
add_box_sides 1.0

structure ../../../PDB_files/single-caffeine-molecule.pdb
        number {N_caffeine}
        fixed 0. 0. 0. 0. 0. 0.
        centerofmass
end structure

structure ../../../PDB_files/water.pdb
        number {N_wat}
        inside cube 0. 0. 0. 45.
end structure

{salt}structure ../../../PDB_files/Na.pdb
{salt}        number {N_Na}
{salt}        inside cube 0. 0. 0. 45.
{salt}end structure

{salt}structure ../../../PDB_files/Cl.pdb
{salt}        number {N_Cl}
{salt}        inside cube 0. 0. 0. 45.
{salt}end structure
"""
    with open('packmol.in', 'w') as text_file:
        # fix for no salt:
        if concentration:
            salt=''
        else:
            salt='#'
        text_file.write(packmol_script.format(N_caffeine=Nparticles['caffeine'], N_wat=Nparticles['wat'],
                                              N_Na=Nparticles['Na'], N_Cl=Nparticles['Cl'], salt=salt))
    !packmol < packmol.in

    # Topology input
    topology="""
[ system ]
; Name
Caffeine in NaCl {conc} M aqueous solution.

[ defaults ]
; nbfunc        comb-rule       gen-pairs       fudgeLJ fudgeQQ
1               3               yes             0.5     0.8333

; Include all atomtypes
#include "/home/stefan/Caffeine_solubility/force_fields/atomtypes_spc.itp"
#include "/home/stefan/Caffeine_solubility/force_fields/atomtypes_ions.itp"
#include "/home/stefan/Caffeine_solubility/force_fields/atomtypes_caffeine.itp"

; Include all topologies
#include "/home/stefan/Caffeine_solubility/force_fields/ions.itp"
#include "/home/stefan/Caffeine_solubility/force_fields/spce.itp"
#include "/home/stefan/Caffeine_solubility/force_fields/caffeine-KBFF.itp"

[ molecules ]
; Compound         #mols
2S09               {N_caffeine}
SOL                {N_wat}
{salt}Na                 {N_Na}
{salt}Cl                 {N_Cl}
"""
    with open('Caffeine_NaCl_sol.top', 'w') as text_file:
        # fix for no salt:
        if concentration:
            salt=''
        else:
            salt=';'
        text_file.write(topology.format(conc=concentration, salt=salt,
                                        N_caffeine=Nparticles['caffeine'], N_wat=Nparticles['wat'],
                                        N_Na=Nparticles['Na'], N_Cl=Nparticles['Cl']))
    
    # Solvated state
    mol = pmd.load_file('Caffeine_NaCl_sol.top', xyz='Caffeine_NaCl_sol.pdb')
    mol.save('Caffeine_NaCl_sol.top', overwrite=True)
    
    # Isolated state
    mol.strip('!:2S09')
    mol.save('Caffeine_NaCl_iso.top', overwrite=True)
    mol.save('Caffeine_NaCl_iso.pdb', overwrite=True)
    
    # Aqueous state
    mol = pmd.load_file('Caffeine_NaCl_sol.top', xyz='Caffeine_NaCl_sol.pdb')
    mol.strip(':2S09')
    mol.save('Caffeine_NaCl_aqs.top', overwrite=True)
    mol.save('Caffeine_NaCl_aqs.pdb', overwrite=True)
    
    print('Wrote initial configurations and topology files to '+os.getcwd())

## Molecular dynamics simulations
All-atom molecular dynamics simulations of the systems prepared above were run using the OPLS-aa force field with the OpenMM 7.3.1 software package, extended with the openmmtools package. The initial configurations were created using Packmol. The simulations were run on the Aurora supercomputer in Lund. The code below generates, for each salt concentration and state, an OpenMM run script (`run_openMM_{state}.py`) containing the simulation setup, including constraints, barostat, thermostat, simulation length, and reported properties, followed by a submit script (`submit_{state}.sh`) for clusters operated with Slurm.

### Simulation setup using OpenMM

In [None]:
%cd -q $homedir
N_simulations = 0

for concentration in salt_concentrations:
    conc = '{0:.2f}'.format(concentration)
    %cd -q $homedir/Simulations/NaCl/$conc
    for state in states:
        
        openmm_script="""
# Imports
import sys
import os
from simtk.openmm import app
import simtk.openmm as mm
import openmmtools as mmtools
from parmed import load_file, unit as u
from parmed.openmm import StateDataReporter

print('Loading initial configuration and topology')
init_conf = load_file('Caffeine_NaCl_{state}.top', xyz='Caffeine_NaCl_{state}.pdb')

# Creating system
print('Creating OpenMM System')
system = init_conf.createSystem(nonbondedMethod=app.PME, ewaldErrorTolerance=0.0005,
                                nonbondedCutoff=1.2*u.nanometers, constraints=app.HBonds)
                                                    
# Positional restraints
state = '{state}'
if state == 'iso':
    force = mm.CustomExternalForce("k*((x-x0)^2+(y-y0)^2+(z-z0)^2)")
    force.addGlobalParameter("k", 5.0*u.kilocalories_per_mole/u.angstroms**2)
    force.addPerParticleParameter("x0")
    force.addPerParticleParameter("y0")
    force.addPerParticleParameter("z0")
    for i, atom_cmd in enumerate(init_conf.positions):
        force.addParticle(i, atom_cmd.value_in_unit(u.nanometers))
    system.addForce(force)

# Temperature-coupling by geodesic Langevin integrator (NVT)
integrator = mmtools.integrators.GeodesicBAOABIntegrator(K_r = 3,
                                                         temperature = 298.15*u.kelvin,
                                                         collision_rate = 1.0/u.picoseconds,
                                                         timestep = 2.0*u.femtoseconds
                                                        )

# Pressure-coupling by a Monte Carlo Barostat (NPT)
system.addForce(mm.MonteCarloBarostat(1*u.bar, 298.15*u.kelvin, 25))

platform = mm.Platform.getPlatformByName('CUDA')
properties = {{'CudaPrecision': 'mixed', 'CudaDeviceIndex': '0'}}

# Create the Simulation object
sim = app.Simulation(init_conf.topology, system, integrator, platform, properties)

# Set the particle positions
sim.context.setPositions(init_conf.positions)

# Minimize the energy
print('Minimizing energy')
sim.minimizeEnergy(tolerance=1*u.kilojoule/u.mole, maxIterations=500000)
    
# Draw initial MB velocities
sim.context.setVelocitiesToTemperature(298.15*u.kelvin)

# Set up the reporters
sim.reporters.append(app.StateDataReporter(sys.stdout, 1000, totalSteps=25000000,
    time=True, potentialEnergy=True, kineticEnergy=True, temperature=True, density=True,
    remainingTime=True, speed=True, separator='\t'))

# Set up trajectory reporter
sim.reporters.append(app.DCDReporter('trajectory_{state}.dcd', 1000, append=False))

# Run dynamics
print('Running dynamics! (NPT)')
sim.step(25000000) # 25000000*2 fs = 50 ns
"""

        with open('run_openMM_{state}.py'.format(state=state), 'w') as text_file:
            text_file.write(openmm_script.format(state=state))
            N_simulations+=1
    print('Wrote run_openMM.py files to '+os.getcwd())

print('Simulations about to be submitted: {}'.format(N_simulations))

### Submit script

In [None]:
for concentration in salt_concentrations:
    conc = '{0:.2f}'.format(concentration)
    %cd -q $homedir/Simulations/NaCl/$conc
    for state in states:
        
        submit_script="""#!/bin/bash
# Number of nodes
#SBATCH --nodes=1
# Number of cores per node
#SBATCH --ntasks-per-node=36
# Number of GPUs
#SBATCH --gres=gpu:1
# Name of job
#SBATCH --job-name={conc}_M_NaCl_{state}
# Error output
#SBATCH --error=run_{state}.err
# Output file name
#SBATCH --output=run_{state}.out

python run_openMM_{state}.py"""

        with open('submit_{state}.sh'.format(state=state), 'w') as text_file:
            text_file.write(submit_script.format(conc=conc, state=state))
    !sbatch submit_iso.sh
    !sbatch submit_sol.sh
    !sbatch submit_aqs.sh

## Load and plot solvation free energies

The results from the ERmod analysis are stored in the file `_results.yml`, which we now load and plot. The results are obtained from 50 ns long MD simulations of caffeine in electrolyte solutions.
We see that both NaCl and NaI lead to _salting out_, i.e. the solvation free energy of caffeine is increased compared to pure water.

In [None]:
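# Conversion factor from kcal/mol to kT at 298.15 K (the simulation temperature)
kcal_to_kT = 1.0 / (0.0019872041 * 298.15)

with open('_results.yml') as f: # open structured result file (YAML)
    
    r = yaml.load(f, Loader=yaml.Loader)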
    
    fig, (ax1, ax2) = plt.subplots(1,2, sharex=True)
    fig.set_size_inches(10,4.5)
    fig.tight_layout()

    for d in r['salts']: # loop over all salt types
        conc   = np.array(d['conc'])     # molar conc.
        mu     = np.array(d['muexcess']) # excess chem. pot.
        error  = np.array(d['error'])    # error on mu
        mu0    = mu[0]                   # in pure water
        gamma  = np.exp( (mu-mu0)*kcal_to_kT )          # activity coefficient
        fit    = np.polyfit(conc, mu, 1, w=1/error)
        fit_fn = np.poly1d(fit)
        #print("Setschenow coefficient ("+d['label']+") =", fit[0])
        ax1.errorbar(conc, mu, yerr=error, fmt='o', c=d['color'], alpha=0.6, ms=10)
        ax1.plot(conc, fit_fn(conc), label=d['label'], c=d['color'], lw=2, ms=10)
        ax2.plot(conc, gamma, label=d['label'], c=d['color'], lw=2, marker='o', ms=10)

    ax1.set_title('Solvation Free Energy')
    ax2.set_title('Activity Coefficient')
    ax1.set_xlabel('Salt concentration (mol/l)')
    ax1.set_ylabel(r'$\Delta G_{sol}$ (kcal/mol)')
    ax2.set_xlabel('Salt concentration (mol/l)')
    ax2.set_ylabel(r'$\gamma$')
    ax1.legend(loc=2, frameon=False, fontsize='large')
plt.savefig('solvation.pdf', bbox_inches='tight')
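
The slope of the weighted linear fit above is in kcal/mol per molar. The sketch below shows one way to turn such a fit into a dimensionless Setschenow coefficient on both the $\ln$ and $\log_{10}$ scales. The function name is hypothetical, the temperature is assumed to be 298.15 K, and the numbers in the example are made-up placeholders, not simulation results.

```python
import numpy as np

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)

def setschenow_coefficient(conc, mu_kcal, error_kcal, temperature=298.15):
    """Weighted linear fit of beta*mu versus salt concentration.

    Returns (k_s on the ln scale, k_s on the log10 scale), both in l/mol.
    The weights are 1/sigma, as expected by np.polyfit."""
    factor = 1.0 / (R_KCAL * temperature)            # kcal/mol -> kT
    conc = np.asarray(conc, dtype=float)
    beta_mu = np.asarray(mu_kcal, dtype=float) * factor
    sigma = np.asarray(error_kcal, dtype=float) * factor
    slope, _intercept = np.polyfit(conc, beta_mu, 1, w=1.0 / sigma)
    return slope, slope / np.log(10.0)

# Placeholder data (NOT simulation results): a perfectly linear trend
conc = np.array([0.0, 0.5, 1.0, 2.0])        # mol/l
mu = -12.0 + 0.2 * conc                      # kcal/mol
err = np.full_like(conc, 0.1)                # kcal/mol

ks_ln, ks_log10 = setschenow_coefficient(conc, mu, err)
print('k_s (ln scale)    = {:.3f} l/mol'.format(ks_ln))
print('k_s (log10 scale) = {:.3f} l/mol'.format(ks_log10))
```

Fitting $\beta\mu^{ex}$ rather than $\mu^{ex}$ keeps the slope directly comparable with tabulated Setschenow coefficients, and a zero entry in the error array would need special handling because the weights are $1/\sigma$.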

## Free Energy Decomposition by Solvent and Co-solutes

The ERmod output also contains information about how the solvent and co-solutes contribute to the total free energy. This is plotted below for the case of 2 M salt.

In [None]:
with open('_results.yml') as f: # open structured result file (YAML)
    r = yaml.load(f, Loader=yaml.Loader)
    fig, ax = plt.subplots()
    fig.set_size_inches(6,6)
    width=0.3
    offset=0
    index = np.arange(4)
    
    for d in r['salts']: # loop over all salt types
        mu = d['decomposition'][0]['mu']
        error = d['decomposition'][0]['error']
        ax.bar( index+offset, mu,
                color=d['color'], label=d['label'], width=width,
                alpha=0.7, yerr=error, capsize=2)
        offset = offset + width
ax.set_xticks(index + width / 2)
ax.set_xticklabels(('total', 'water', 'cations', 'anions'))
ax.set_ylabel(r'$\Delta G_{sol}$ (kcal/mol)')
#ax.set_title('Free energy decomposition (2 M)')
ax.legend(loc=0, frameon=False, fontsize='large')
plt.savefig('solvent-decomposition.pdf', bbox_inches='tight')

## Decomposition by Caffeine Motifs

Todo...