doc/ug/run.tex

% Copyright (C) 2010,2012,2013 The ESPResSo project
% Copyright (C) 2002,2003,2004,2005,2006,2007,2008,2009,2010 
%   Max-Planck-Institute for Polymer Research, Theory Group
%  
% This file is part of ESPResSo.
%   
% ESPResSo is free software: you can redistribute it and/or modify it
% under the terms of the GNU General Public License as published by the
% Free Software Foundation, either version 3 of the License, or (at your
% option) any later version.
%  
% ESPResSo is distributed in the hope that it will be useful, but
% WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
% General Public License for more details.
%  
% You should have received a copy of the GNU General Public License
% along with this program.  If not, see <http://www.gnu.org/licenses/>.
%
\chapter{Running the simulation}
\label{chap:run}

\section{\texttt{integrate}: Running the simulation}
\newescommand{integrate}

\begin{essyntax}
  \variant{1} integrate \opt{reuse_forces} \var{steps}
  \variant{2} integrate set \opt{nvt}
  \variant{3} integrate set npt_isotropic \var{p_{ext}} \var{piston} \opt{\var{x\: y\: z}} \opt{-cubic_box}
\end{essyntax}

\es uses the Velocity Verlet algorithm for the integration of the
equations of motion. The command \texttt{integrate} with an integer
\texttt{steps} as parameter integrates the system for \texttt{steps}
time steps.

Note that this implementation of the Velocity Verlet algorithm reuses forces,
that is, they are computed once in the middle of the time step, but used twice,
at the beginning and end. However, in the first time step after setting up,
there are no forces present yet. Therefore, \es{} has to compute them before the
first time step. That has two consequences: first, random forces are redrawn,
resulting in a narrower distribution of the random forces, which we compensate
by stretching. Second, coupling forces of e.\,g. the Lattice Boltzmann fluid cannot
be computed and are therefore lacking in the first half time step. In order to
minimize these effects, \es{} has a quite conservative heuristics to decide whether
a change makes it necessary to recompute forces before the first time step. Therefore,
calling hundred times \texttt{integrate 1} does the same as \texttt{integrate 100},
apart from some small calling overhead.

However, for checkpointing, there is no way for \es{} to tell that the forces that you
read back in actually match the parameters that are set. Therefore, \es{} would recompute
the forces before the first time step, which makes it essentially impossible to checkpoint
LB simulations, where it is vital to keep the coupling forces. Therefore, \texttt{integrate}
has an additional parameter \opt{reuse_forces}, which tells integrate to not recalculate
the forces for the first time step, but use that the values still stored with the particles.
Use this only if you are absolutely sure that the forces stored match your current setup!

Two methods for the integration can be set: For an NVT ensemble
(thermostat) and for an NPT isotropic ensemble (barostat). The current
method can be detected with the command \texttt{integrate set} without
any parameters.

The NVT integrator is set without parameters (the temperature can be
set with the thermostat). For the NPT ensemble, the parameters that
can be added are:

\begin{itemize}
\item \var{p_{ext}} The external pressure as float variable. This
  parameter is required.
\item \var{piston} The mass of the applied piston as float
  variable. This parameter is required.
\item \var{x}:\var{y}:\var{z} Three integers to set the box geometry for
  non-cubic boxes. This parameter is optional.
\item \texttt{-cubic_box} If this optional parameter is added, a cubic
  box is assumed.
\end{itemize}

\section{\texttt{time_integration}: Runtime of the integration loop}
\newescommand[time-integration]{time_integration}

\begin{essyntax}
  \variant{1} time_integration
  \variant{2} time_integration \var{steps}
\end{essyntax}

This command runs the integration as would the \texttt{integrate} command and
returns the wall runtime in seconds.

\section{\texttt{change_volume}: Changing the box volume}
\newescommand[change-volume]{change_volume}

\begin{essyntax}
  \variant{1} change_volume \var{V_\mathrm{new}} 
  \variant{2} change_volume \var{L_\mathrm{new}} \alt{x \asep y \asep z \asep xyz}
\end{essyntax}
Changes the volume of either a cubic simulation box to the new volume
\var{V_\mathrm{new}} or its given x-/y-/z-/xyz-extension to the new
box-length \var{L_\mathrm{new}}, and isotropically adjusts the
particles coordinates as well. The function returns the new volume of
the deformed simulation box.

\section{Stopping particles}

Use the following functions, also see Section~\ref{sec:Galilei}:
\begin{itemize}
\item \texttt{kill\_particle\_motion}: halts all particles in the
  current simulation, setting their velocities to zero, as well as
  their angular momentum if the feature ROTATION has been compiled in.
\item \texttt{kill\_particle\_forces}: sets all forces on the
  particles to zero, as well as all torques if the feature ROTATION
  has been compiled in.
\end{itemize}

\section{\texttt{velocities}: Setting the velocities}
\newescommand{velocities}
\begin{essyntax}
  velocities \var{v_\mathrm{max}} 
  \opt{start \var{pid}} 
  \opt{count \var{N}}
\end{essyntax}
Sets the velocities of the particles with particle IDs between
\var{pid} and $\var{pid}+\var{N}$ to a random vector with a length
less than \var{v_\mathrm{max}}, and returns the absolute value of the
total velocity assigned. By default, all particles are affected.

\section{\texttt{invalidate_system}}
\newescommand[invalidate-system]{invalidate_system}
\begin{essyntax}
  invalidate_system
\end{essyntax}

Forces a system re-init which, among others, causes the integrator to
also update the forces at its beginning (instead of re-using the
values from the previous integration step).  This is particularly
necessary to ensure continuity after setting a checkpoint:
\texttt{integrate} - \texttt{set_checkpoint} - \texttt{integrate} has
only one call to the force calculation routine, while
\texttt{read_checkpoint} - \texttt{integrate} has two at the beginning
of the \texttt{integrate} command (because loading a new system from
disk typically requires re-initializing the system), and since the
forces routine also uses the thermostat which in turn draws random
numbers, the two situations do not end up at the same segment of the
random number sequence, all random events will therefore slightly
differ.  To prevent this, simply include a call to
\texttt{invalidate_system} upon setting the checkpoint, because in
that case both scenarios will call the forces routine twice at the
beginning of the second integration phase thus having their random
number sequences in total sync.

Without applying this command directly before or after writing a
checkpoint, you will run into a different state of the random number
generator when reading the checkpoint to start again later!

\section{Parallel tempering}
\newescommand[parallel-tempering]{parallel_tempering}
\begin{essyntax}
  parallel_tempering::main
  -rounds \var{N}
  -swap \var{swap}
  -perform \var{perform}
  \opt{-init \var{init}}
  \opt{-values \var{\{T_i\}}}
  \opt{-connect \var{master}}
  \opt{-port \var{port}}
  \opt{-load \var{j_\mathrm{node}}}
  \opt{-resrate \var{N_\mathrm{reset}}}
  \opt{-info \var{info}}
\end{essyntax}

This command can be used to run a parallel tempering simulation. Since the simulation routines and
the calculation of the swap probabilities are provided by the user, the method is not limited to
sampling in the temperature space. However, we assume in the following that the sampled values are
temperatures, and call them accordingly. It is possible to use multiple processors via TCP/IP
networking, but the number of processors can be smaller than the number of temperatures.

\begin{arguments}
\item[\var{swap}] specifies the name of the routine calculating the
  swap probability for a system. The routine has to accept three
  parameters: the \var{id} of the system to evaluate, and two
  temperatures \var{T_1} and \var{T_2}. The routine should return a
  list containing the energy of the system at temperatures \var{T_1}
  and \var{T_2}, respectively.
\item[\var{perform}] specifies the name of the routine performing the
  simulation between two swap tries. The routine has to accept two
  parameters: the \var{id} of the system to propagate and the
  temperature \var{T} at which to run it. Return values are ignored.
\item[\var{init}] specifies the name of a routine initializing a
  system. This routine can for example create the particles, perform
  some intial equilibration or open output files.  The routine has to
  accept two parameters: the \var{id} of the system to initialize and
  its initial temperature \var{T}. Return values are ignored.
\item[\var{R}] specifies the number of swap trial rounds; in each
  round, neighboring temperatures are tried for swapping
  alternatingly, i.e. with four temperatures, The first swap trial
  round tries to swap $1\leftrightarrow 2$ and $3\leftrightarrow 4$,
  the second round $2\leftrightarrow 3$, and so on.
\item[\var{master}] the name of the host on which the
  parallel_tempering master node is running.
\item[\var{port}] the TCP/IP port on which the parallel_tempering
  master should listen. This defaults to 12000.
\item[\var{j_\mathrm{node}}] specifies how many systems to run per
  \es{}-instance. If this is more than 1, it is the user's
  responsibility to manage the storage of configurations, see below
  for examples.  This defaults to 1.
\item[\var{R_\mathrm{reset}}] specifies after how many swap trial
  rounds to reset the counters for the acceptance rate statistics.
  This defaults to 10.
\item[\var{info}] specifies which output the parallel tempering code
  should produce:
  \begin{itemize}
  \item[\texttt{none}] parallel tempering will be totally quiet,
    except for fatal errors
  \item[\texttt{comm}] information on client activities, such as
    connecting, is printed to stderr
  \item[\texttt{all}] print lots of information on swap energies and
    probabilities to stdout. This is useful for debugging and quickly
    checking the acceptance rates.
  \end{itemize}
  This defaults to \texttt{all}.
\end{arguments}

\subsubsection{Introduction}

The basic idea of parallel tempering is to run $N$ simulations with configurations $C_i$ in
parallel at different temperatures $T_1<T_2<\hdots<T_N$, and exchange configurations between
neighboring temperatures. This is done according to the Boltzmann rule, \ie the swap probability
for two configurations A and B at two different parameters $T_1$ and $T_2$ is given by
\begin{equation}
  \label{eq:ptacceptance}
  \min\left(1,\exp -\left[\beta(T_2)U_\text{A}(T_2) + \beta(T_1)U_\text{B}(T_1) -
    \beta(T_1)U_\text{A}(T_1) - \beta(T_2)U_\text{B}(T_2)\right]\right),
\end{equation}
where $U_C(T)$ denotes the potential energy of configuration $C$ at parameter $T$ and $\beta(T)$
the corresponding inverse temperature. If $T$ is the temperature, $U_C$ is indepedent of $T$, and
$\beta(T)=1/(k_BT)$. In this case, the swap probability reduces to the textbook result
\begin{equation}
  \min(1,\exp -\left[\left(1/T_2 - 1/T_1\right)\left(U_\text{A} - U_\text{B}\right)/k_B\right].
\end{equation}
However, $T$ can also be chosen to be any other parameter, for example the Bjerrum length, \ie the
the strength of the electrostatic interaction. In this case, $\beta(T)=\beta$ is a constant, but
the energy $U_C(T)$ of a configuration $C$ depends on $T$, and one needs the full
expression~\eqref{eq:ptacceptance}. \es{} always uses this expression.

In practice, one does not swap configurations, but temperatures, simply because exchanging
temperatures requires much less communication than exchanging the properties of all particles.

Th \es{} implementation of parallel tempering repeatedly propagates all configurations $C_i$ and
tries to swap neighboring temperatures. After the first propagation, the routine attempts to swap
temperatures $T_1$ and $T_2$, $T_3$ and $T_4$, and so on. After the second propagation, swaps are
attempted between temperatures $T_2$ and $T_3$, $T_4$ and $T_5$, and so on.  For the propagation,
parallel tempering relies on a user routine; typically, one will simply propagate the configuration
by a few 100 MD time steps.

\subsubsection{Details on usage and an example}

The parallel tempering code has to be loaded explicitely by {\tt source
  "scripts/parallel_tempering.tcl"} from the Espresso directory. To make use of the parallel
tempering tool, one needs to implement three methods: the propagation, the energy calculation and
an initialization routine for a configuration. A typical initialization routine will look roughly
like this:
\begin{verbatim}
proc init {id temp} {
  # create output files for temperature temp
  set f [open "out-$temp.dat" w]; close $f
  init_particle_positions
  thermostat langevin $temp 1.0
  equilibration_integration
  global config
  set config($id) "{[part]} [setmd time]"
}
\end{verbatim}
The last two lines are only necessary if each instance of \es{} handles more than one
configuration, \eg if you have 300 temperatures, but only 10 \es{} processes
(\ie {\tt -load 30}).  In this case, all
user provided routines need to save and restore the configurations. Saving the time is not
necessary because the simulation tine across swaps is not meaningful anyways; it is however
convenient for investigating the (temperature-)history of individual configurations.

A typical propagation routine accordingly looks like this
\begin{verbatim}
proc perform {id temp} {
  global config
  particle delete
  foreach p [lindex $config($id) 0] { eval part $p }
  setmd time [lindex $config($id) 1]
  thermostat langevin $temp 1.0
  set f [open "out-$temp.dat" a];
  integrate 1000
  puts $f "[setmd time] [analyze energy]"
  close $f
  set config($id) "{[part]} [setmd time]"
}
\end{verbatim}
Again, the saving and storing of the current particle properties in the config array are only
necessary if there is more than one configuration per process. In practice, one will rescale the
velocities at the beginning of perform to match the current temperature, otherwise the thermostat
needs a short time to equilibrate. The energies necessary to determine the swap probablility are
calculated like this:
\begin{verbatim}
proc swap {id temp1 temp2} {
  global config
  particle delete
  foreach p $config($id) { eval part $p }
  set epot [expr [analyze energy total] - [analyze energy kinetic]]
  return "[expr $epot/$temp1] [expr $epot/$temp2]"
}
\end{verbatim}
% make emacs happy: $
Note that only the potential energy is taken into account. The temperature enters only indirectly
through the inverse temperature prefactor, see Eqn.~\eqref{eq:ptacceptance}.

The simulation is then started as follows. One of the processes runs the command
\begin{verbatim}
for {set T 0} {$T < 3} {set T [expr $T + 0.01]} {
    lappend temperatures $T }
parallel_tempering::main -load 30 -values $temperatures -rounds 1000 \
    -init init -swap swap -perform perform
\end{verbatim}
This command turns the \es instance executing it into the master part of the parallel tempering
simulation. It waits until a sufficient number of clients has connected. This are additional \es{}
instances, which are identical to the master script, except that they execute
\begin{verbatim}
parallel_tempering::main -connect $host -load 30 \
    -init init -swap swap -perform perform
\end{verbatim}
% make emacs happy: $
Here, {\tt host} is a variable containing the TCP/IP hostname of the computer running the master
process. Note that the master process waits until enough processes have connected to start the
simulation. In the example, there are 300 temperatures, and each process, including the master
process, will deal with 30 of them. Therefore, 1 master and 9 slave processes are required. For a
typical queueing system, a starting routine could look like this:
\begin{verbatim}
master=
for h in $HOSTS; do
  if [ "$master" == "" ]; then
    ssh $h "cd run; ./pt_test.tcl"
    master=$h;
  else
    ssh $h "cd run; ./pt_test.tcl -connect $host"
  fi
done
\end{verbatim}
where {\tt pt_test.tcl} passes the {\tt -connect} option on to {\tt parallel_tempering::main}.

\subsubsection{Sharing data}

\begin{essyntax}
  parallel_tempering::set_shareddata \var{data}
\end{essyntax}

can be used at any time {\em by the master process} to specify additional data that is available on
all processes of the parallel_tempering simulation. The data is accessible from all processes as
{\tt parallel_tempering::shareddata}.

\section{Metadynamics}
\newescommand[metadynamics]{metadynamics}

\begin{essyntax}
  \variant{1} metadynamics 
  \variant{2} metadynamics set off
  \variant{3} metadynamics set distance \var{pid_1} \var{pid_2} \
  \var{d_\mathrm{min}} \var{d_\mathrm{max}} \var{b_\mathrm{height}} \
  \var{b_\mathrm{width}} \var{f_\mathrm{bound}} \var{d_\mathrm{bins}}
  \variant{4} metadynamics set relative_z \var{pid_1} \var{pid_2} \
  \var{z_\mathrm{min}} \var{z_\mathrm{max}} \var{b_\mathrm{height}} \
  \var{b_\mathrm{width}} \var{f_\mathrm{bound}} \var{z_\mathrm{bins}}
  \variant{5} metadynamics print_stat current_coord
  \variant{6} metadynamics print_stat coord_values
  \variant{7} metadynamics print_stat profile
  \variant{8} metadynamics print_stat force
  \variant{9} metadynamics load_stat \var{profile\_list} \
  \var{force\_list}

  Required features: METADYNAMICS
\end{essyntax}
Performs metadynamics sampling. Metadynamics is an efficient scheme to
calculate the potential of mean force of a system as a function of a
given reaction coordinate from a canonical simulation. The user first
chooses a reaction coordinate (\eg \texttt{distance}) between two
particles (\var{pid_1} and \var{pid_2}). As the system samples values
along this reaction coordinate (here the distance between \var{pid_1}
and \var{pid_2}), an iterative biased force pulls the system away from
the values of the reaction coordinate most sampled. Ultimately, the
system is driven in such a way that it self-diffuses along the
reaction coordinate between the two boundaries (here
\var{d_\mathrm{min}} and \var{d_\mathrm{max}}). The potential of mean
force (or free energy profile) can be extracted by reading the
\texttt{profile}. 

\begin{arguments}
\item[\var{pid_1}] ID of the first particle involved in the
  metadynamics scheme.
\item[\var{pid_2}] ID of the second particle involved in the
  metadynamics scheme.
\item[\var{d_\mathrm{min}}, \var{z_\mathrm{min}}]: minimum value of
  the reaction coordinate. While \var{d_\mathrm{min}} must be positive
  (it's a distance), \var{z_\mathrm{min}} can be negative since it's
  the relative height of \var{pid_1} with respect to \var{pid_2}.
\item[\var{d_\mathrm{max}}, \var{z_\mathrm{max}}]: maximum value of
  the reaction coordinate. 
\item[\var{b_\mathrm{height}}] height of the bias function.
\item[\var{b_\mathrm{width}}] width of the bias function.
\item[\var{f_\mathrm{bound}}] strength of the ramping force at the
  boundaries of the reaction coordinate interval.
\item[\var{d_\mathrm{bins}}, \var{z_\mathrm{bins}}]: number of bins of
  the reaction coordinate.
\item[\var{profile\_list}] Tcl list of a previous metadynamics
  profile.
\item[\var{force\_list}] Tcl list of a previous metadynamics force.
\end{arguments}

\subsubsection{Details on usage}

Variant \variant{1} returns the status of the metadynamics
routine. Variant \variant{2} turns metadynamics off (default
value). Variant \variant{3} sets a metadynamics scheme with the
reaction coordinate \texttt{distance}, which corresponds to the
distance between any two particles of the system (\eg calculate the
potential of mean force of the end-to-end distance of a
polymer). Variant \variant{4} sets a metadynamics scheme with the
reaction coordinate \texttt{relative_z}: relative height (\ie z
coordinate) of particle \var{pid_1} with respect to \var{pid_2} (\eg
calculate the potential of mean force of inserting one particle
\var{pid_1} through an interface with center of mass
\var{pid_2}). Variant \variant{5} prints the current value of the
reaction coordinate. Variant \variant{6} prints a list of the binned
values of the reaction coordinate (\eg \var{d_\mathrm{bins}} values
between \var{d_\mathrm{min}} and \var{d_\mathrm{max}}). Variant
\variant{7} prints the current potential of mean force for all values
of the reaction coordinate considered. Variant \variant{8} prints the
current force (norm rather than vector) for all values of the reaction
coordinate considered. Variant \variant{9} loads a previous
metadynamics sampling by reading a Tcl list of the potential of mean
force and applied force. This is especially useful to restart a
simulation.

Note that the metadynamics scheme works seamlessly with the
VIRTUAL_SITES feature, allowing to define centers of mass of groups of
particles as end points of the reaction coordinate. One can therefore
measure the potential of mean force of the distance between a particle
and a \emph{molecule} or \emph{interface}.

The metadynamics scheme has (as of now) only been implemented for one
processor: MPI usage is \emph{not} supported. However, one can speed up
sampling by communicating the \texttt{profile} and \texttt{force}
between independent simulations (denoted \emph{walkers}). The
\texttt{print_stat} and \texttt{load_stat} can be used to input/output
metadynamics information between walkers at regular
intervals. Warning: the information extracted from \texttt{print_stat}
contains the entire history of the simulation, while only the
\emph{last} increment of sampling should be communicated between
walkers in order to avoid counting the same samples multiple times.

\subsubsection{Details on implementation}

As of now, only two reaction coordinates have been implemented:
\texttt{distance} and \texttt{relative_z}. Many different reaction
coordinates can be set up, and it is rather easy to implement new
ones. See the code in \texttt{metadynamics.\{h,c\}} for further
details. 

The bias functions that are applied to the potential of mean force and
the biased force are not gaussian function (as in many metadynamics
codes) but so-called Lucy functions. See \cite{marsili09} for more
details. These avoid the calculation of exponentials.

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "ug"
%%% End: