# Run this cell first

In [None]:
# this code enables the automated feedback. If you remove this, you won't get any feedback
# so don't delete this cell!
try:
  import AutoFeedback
except (ModuleNotFoundError, ImportError):
  !pip install git+https://github.com/abrown41/AutoFeedback@notebook
  import AutoFeedback

try:
  from testsrc import test_main
except (ModuleNotFoundError, ImportError):
  !pip install "git+https://github.com/autofeedback-exercises/exercises.git@testpip#subdirectory=New-MTH4332/SingleParticleMD"
  from testsrc import test_main

def runtest(tlist):
  import unittest
  from contextlib import redirect_stderr
  from os import devnull
  with redirect_stderr(open(devnull, 'w')):
    suite = unittest.TestSuite()
    for tname in tlist:
      suite.addTest(eval(f"test_main.UnitTests.{tname}"))
    runner = unittest.TextTestRunner()
    try:
      runner.run(suite)
    except AssertionError:
      pass


# Calculating the potential

In the previous set of programming exercises, you learned how we can calculate approximate values for integrals by using a method called Monte Carlo, which works by generating a set of random microstates and computing an average.  By far the most common way that Monte Carlo simulation is used when doing research in statistical mechanics is through a variant known as molecular dynamics (MD).  In this next set of programming exercises, you are thus going to learn the basics of how to set up and run a molecular dynamics calculation.

In molecular dynamics calculations, configurations are generated by solving Newton's equations numerically.  We are thus not generating random variables except when we set up the initial velocities of the particles.  However, the fact that the initial velocities are set randomly is what generates the randomness.  You can show that small differences in the initial velocities of the particles quickly accumulate.  Trajectories that are started from configurations that are very similar thus quickly diverge.  You can thus treat the observables that emerge from an MD simulation as if they are random variables and analyse any time series that emerges from an MD trajectory as if it is a sequence of random variables.

The first thing you must learn in order to run MD is how to write a function to compute the potential energy for a configuration.  In this set of exercises we are going to learn to run molecular dynamics to investigate the behaviour of a single particle on a potential.  Our system will thus have one position coordinate x and a potential energy V(x) that is calculated as:

![](eq1.png)

Your task in this exercise is to write a function called `potential` that takes the x coordinate of the particle as input and that returns the potential energy by computing the function above.  Easy!


In [None]:
import numpy as np

def potential(x) :
  energy=0
  # Your code to calculate the potential goes here

  return energy

# Here are a few calls of the potential function
print( potential(0), potential(1), potential(2) )


In [None]:
runtest(['test_energy'])

# Calculating the forces

If we were writing a Monte Carlo code we would only need to calculate the potential as the moves are random.  We are writing an MD code, however, so we need to calculate the forces acting on each of the atoms so we can calculate the trajectory.

Writing a program to calculate the forces on the atoms is thus your task in this next exercise.  You must write a function called `potential` that takes a single scalar, x, as input.  The argument x tells the function the position of the particle on the potential and thus allows you to calculate V(x) using:

![](eq1.png)

Your function should then return two quantities:

1. A scalar value for the potential.
2. The force that is acting on the particle.

Remember the force is equal to the negative derivative of the potential:

![](eng2.png)

with respect to the atomic positions.  Good luck!

You should try to complete this exercise using the harmonic potential given in the first equation.  However, notice that I test your code by calculating the forces numerically.  The test will thus pass as long as the force the function `potential` returns is the negative derivative of the function that was used to calculate the potential.  When doing the assignment for this module you can thus use this exercise to test that you have calculated the forces for your selected potential correctly.
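
A numerical check of the kind mentioned above can be sketched as follows.  This is only an illustration under stated assumptions: the double-well function `toy_potential`, the helper `numerical_force` and the step size `h` are hypothetical choices made for this sketch, and they are not necessarily what the test itself uses.

```python
import numpy as np

def toy_potential(x):
    # Hypothetical double-well potential, used only for illustration
    energy = x**4 - 2 * x**2
    # Analytic force: minus the derivative of the energy
    force = -(4 * x**3 - 4 * x)
    return energy, force

def numerical_force(pot, x, h=1e-5):
    # Central finite-difference estimate of -dV/dx
    e_plus, _ = pot(x + h)
    e_minus, _ = pot(x - h)
    return -(e_plus - e_minus) / (2 * h)

for x in [-1.5, 0.3, 2.0]:
    _, analytic = toy_potential(x)
    print(x, analytic, numerical_force(toy_potential, x))
```

If the two force columns agree to several decimal places, the analytic force is consistent with the energy that the same function returns.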


In [None]:
import numpy as np

def potential(x) :
  energy = 0
  force = 0
  # Your code goes here

  return energy, force


# This calculates and prints the potential energy and the force at a few positions.
pot, force = potential(-1)
print( 'The potential at x=-1 is', pot, 'and the force is', force )
pot, force = potential(0)
print( 'The potential at x=0 is', pot, 'and the force is', force )
pot, force = potential(1)
print( 'The potential at x=1 is', pot, 'and the force is', force )


In [None]:
runtest(['test_forces3'])

# Generating a trajectory

The fact that we can now calculate the forces means we are in a position to write a molecular dynamics code and to generate our first trajectory.  We are going to be using the velocity Verlet algorithm to solve Newton's equation of motion.  In the code in `main.py` I have set the initial position and velocity of the particle using the variables `init_pos` and `init_vel`.  You can if you wish change the values of these two variables to see what happens when the initial conditions change.  However, these variables only need to be set in one place.  The position and velocity variables that you will use and update when you implement the MD algorithm are called `pos` and `vel`, which you can see are initially set equal to `init_pos` and `init_vel`.

Before we start the main MD loop we first need to calculate the potential energy of the particle in its initial configuration as well as the forces that are acting upon the particle in this configuration.

Once these steps of initialisation are completed you then need to write a loop that performs the following four steps multiple times.  In this way your trajectory of positions is generated:

1. The velocity, v(t), is updated by a half timestep using:

![](eq1.png)

where F(t) is the force at x(t).

2. The position, x(t), is updated a full timestep using:

![](eq2.png)

3. The force, $F(t+\delta)$, is calculated at the new position.

4. The new values for the forces are used to update the velocities another half-timestep as follows:

![](eq3.png)

Your task is to implement this algorithm in `main.py`.  As in the previous exercise, you need to start by writing a function called `potential` to calculate the potential and the forces.  This time the potential should be the harmonic potential:

![](eq4.png)

I have written an outline for an MD code that computes a trajectory.  You need to complete this code by implementing the velocity Verlet algorithm that is described above.  You will notice that I use three variables to hold the position (`pos`), velocity (`vel`) and force (`forces`).  Notice that whenever you update the velocity, position or force in the velocity Verlet algorithm you never again need the old position, velocity or force.  You thus can (and should) use these three variables to hold the instantaneous positions, velocities and forces.  I have written some code that will keep track of the position the particle adopts during the trajectory and that can be used to visualise what has occurred during the calculation.

The final result from your calculation should be a graph that shows how the position of the particle changed during the trajectory.

N.B. Please note the variable timestep, $\delta$, in the algorithm described above should be set to a suitable value.  I have set it to a sensible value in the code below.  It is easy to tell if the value of the timestep is not sensible as the total energy will not be conserved.
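
The link between the timestep and energy conservation can be sketched as follows.  This is a minimal illustration, not the code for the exercise: the quartic function `toy_potential`, the helper `max_energy_drift`, the starting point and the two timesteps are all assumptions made for this sketch, and a unit mass is assumed throughout.

```python
import numpy as np

def toy_potential(x):
    # Hypothetical quartic potential V(x) = x^4 / 4 (an assumption for this sketch)
    return 0.25 * x**4, -x**3

def max_energy_drift(timestep, nsteps, pos=1.0, vel=0.0):
    # Integrate with velocity Verlet (unit mass) and return the largest
    # deviation of the total energy from its initial value
    eng, force = toy_potential(pos)
    e0 = eng + 0.5 * vel**2
    worst = 0.0
    for _ in range(nsteps):
        vel = vel + 0.5 * timestep * force   # half-step velocity update
        pos = pos + timestep * vel           # full-step position update
        eng, force = toy_potential(pos)      # force at the new position
        vel = vel + 0.5 * timestep * force   # second half-step velocity update
        worst = max(worst, abs(eng + 0.5 * vel**2 - e0))
    return worst

small = max_energy_drift(0.005, 2000)
large = max_energy_drift(0.5, 20)
print('max energy deviation, timestep 0.005:', small)
print('max energy deviation, timestep 0.5  :', large)
```

Both runs cover the same total time.  The run with the larger timestep should show a much larger deviation in the total energy, which is the symptom described above.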




In [None]:
import matplotlib.pyplot as plt
import numpy as np

def potential(x) :
  energy = 0
  forces = 0
  # Your code to calculate the potential and the forces goes here

  return energy, forces

# Set the initial position for the particle (you can change as I assume the initial particle is at init_pos when I test your code)
init_pos, init_vel = 3, 1
# This is the value to use for the timestep (the delta in the equations on the other side)
timestep = 0.005
# Now set the position equal to the initial position that was set above
pos, vel = init_pos, init_vel
# This calculates the initial values for the forces
eng, forces = potential(pos)

# We now run 500 steps of molecular dynamics
nsteps = 500
# And store every 10th frame
stride = 10
trajectory = np.zeros([int(nsteps/stride)])
for step in range(nsteps) :
  # First update the velocity a half timestep
  # fill in the blanks in the code here
  vel =

  # Now update the positions using the new velocities
  # You need to add code here
  pos =

  # Recalculate the forces at the new position
  # You need to add code here
  eng, forces =

  # And update the velocities another half timestep
  # You need to add code here


  # I have stored the trajectory here so we can plot how the positions
  # of each of the atoms changes with time.
  if step%stride==0 :
    trajectory[int(step/stride)] = pos

# This will plot how the position of the particle changes during the trajectory
times = np.linspace( 0, (nsteps-stride)*timestep, len(trajectory) )
plt.plot( times, trajectory, 'ko'  )
plt.xlabel('time')
plt.ylabel('position')
plt.savefig( 'trajectory.png' )


In [None]:
runtest(['test_potential1', 'test_trajectory1'])

# The kinetic energy

One thing that is particularly important to check whenever we run an MD simulation is whether or not the energy is being conserved.  To check the energy is conserved we need to store the potential that is calculated every time the forces are calculated.  As well as calculating the potential though we also need to calculate the kinetic energy.  The kinetic energy can be computed using a simple function of the velocities of the atoms, which you should know.

Your task in this exercise is thus to write a function called `kinetic` that takes in a list of velocities and that returns a single scalar value for the total kinetic energy of all the particles.  For the assignment you will only have one particle and thus one velocity.  To make this exercise more interesting I am asking you to write a function to calculate the total kinetic energy for a system of N particles moving about on your energy landscape.

N.B.  In this exercise, we are operating in natural units.  We have thus set the masses of all the atoms equal to one.
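
For reference, the usual expression for the total kinetic energy is one half of the mass times the squared velocity, summed over the particles.  The sketch below shows one way this could be written with NumPy.  The name `kinetic_sketch` and the optional `mass` argument are hypothetical and are used only to keep this illustration separate from the function you are asked to write.

```python
import numpy as np

def kinetic_sketch(vel, mass=1.0):
    # Total kinetic energy: sum over particles of (1/2) m v^2
    vel = np.asarray(vel, dtype=float)
    return 0.5 * mass * np.sum(vel**2)

print(kinetic_sketch([1.0, -2.0, 3.0]))  # 0.5 * (1 + 4 + 9) = 7.0
```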


In [None]:
import numpy as np

def kinetic( vel ) :
  ke = 0
  # Your code to calculate the kinetic energy should go here

  return ke

# This command sets the velocities of the particles to 10 random numbers
vel = np.random.normal(size=10)
# This prints out the total kinetic energy of the atoms whose velocities are specified
# in vel.
print( kinetic( vel ) )


In [None]:
runtest(['test_kinetic3'])

# Visualising your results

We now really have everything we need in order to generate molecular dynamics trajectories as:

1. We know how to calculate the forces for a given configuration.
2. We know how to code the Verlet algorithm and to thus work out how the positions and the velocities of the atoms change with time.
3. We know how to compute the instantaneous values of the potential and kinetic energy.  We can use these values to check if we are using a sufficiently small value for the timestep as, if the timestep is too large, energy will not be conserved during our simulation.

Your task in this final exercise is thus to write one final MD code that incorporates all these elements.  In doing so you will need to:

1. Write a function called `potential` that computes the potential energy and the forces for each of the configurations you generate.
2. Write a function called `kinetic` that calculates the instantaneous kinetic energy.
3. Use your potential function to write code that uses the velocity Verlet algorithm to integrate the equations of motion.
4. Every 10 MD steps store the instantaneous values of the potential, kinetic and total energies in the lists called `p_energy`, `k_energy` and `t_energy`

I have written a skeleton code in `main.py` to get you started.  If this code is completed correctly a graph should be generated that shows how the potential, kinetic and total energy change as the simulation progresses.


In [None]:
import matplotlib.pyplot as plt
import numpy as np

def potential(x) :
  energy = 0
  forces = 0
  # Your code to calculate the potential goes here

  return energy, forces

def kinetic(v) :
  ke = 0
  # Your code to calculate the kinetic energy from the velocities goes here

  return ke

# Set the initial position for the particle (you can change as I assume the initial particle is at init_pos when I test your code)
init_pos, init_vel = 3, 1
# This is the value to use for the timestep (the delta in the equations on the other side)
timestep = 0.005
# Now set the position equal to the initial position that was set above
pos, vel = init_pos, init_vel
# This calculates the initial values for the forces
eng, forces = potential(pos)

# We now run 500 steps of molecular dynamics
nsteps, stride = 500, 10
times = np.zeros(int(nsteps/stride))
k_energy = np.zeros(int(nsteps/stride))
p_energy = np.zeros(int(nsteps/stride))
t_energy = np.zeros(int(nsteps/stride))
for step in range(nsteps) :
  # First update the velocities a half timestep
  # fill in the blanks in the code here
  vel =

  # Now update the positions using the new velocities
  # You need to add code here


  # Recalculate the forces at the new position
  # You need to add code here
  eng, forces =

  # And update the velocities another half timestep
  # You need to add code here


  # This is where we want to store the energies and times
  if step%stride==0 :
    times[int(step/stride)] = step
    p_energy[int(step/stride)] = eng
    # Write code to ensure the proper values are saved here
    k_energy[int(step/stride)] =
    t_energy[int(step/stride)] =


# This will plot the potential, kinetic and total energy as a function of
# time
plt.plot( times, p_energy, 'b-', label='potential' )
plt.plot( times, k_energy, 'r-', label='kinetic' )
plt.plot( times, t_energy, 'k-', label='total' )
plt.xlabel('time')
plt.ylabel('energy')
plt.savefig( 'energies.png' )


In [None]:
runtest(['test_potential2', 'test_kinetic5', 'test_trajectory2'])

# Setting the initial velocity

The MD codes you have written thus far can be used to sample from the microcanonical (NVE) ensemble, which is useful.  In general, however, we would like to develop methods to sample from the canonical (NVT) ensemble, as it is easier to control the temperature in a lab setting than it is to control the energy.  Our ultimate goal should surely be to compare the results that we obtain from our simulations with the results that our colleagues obtain from their experiments.

We are thus going to learn how to modify our MD code to make the system sample the canonical ensemble.  Codes for running constant temperature MD will use a thermostat to control the temperature in the simulations.  There are many different ways of implementing a thermostat and we will only have time to introduce one particular method in this exercise.  The basic idea of these thermostats is always the same, however:

* We know, from the derivation of the ideal gas law, that if we are sampling from the canonical (NVT) ensemble the distribution of momentum along each degree of freedom is given by:

![](eq1.png)

In other words, for each degree of freedom, the momentum is a sample from a normal distribution with mean 0 and variance k_B*T*m, where k_B is Boltzmann's constant, T is the temperature and m is the mass of the particle.

* We also know that classical equipartition (which is like a microscopic version of Gibbs' phase rule) will ensure that energy is distributed equally between all the various degrees of freedom in the system if we are sampling from the NVT ensemble.  If we set the momentum of the atoms in accordance with the distribution above, the distribution of potential energies will thus be set in accordance with the desired temperature.

* We can thus employ a thermostat that exchanges energy between the system and a reservoir.  This thermostat works by ensuring that the momenta of the atoms in the system are set in accordance with the distribution given above.

In the exercises after this one, you are going to learn how to code such a thermostat.  Before getting on to that, however, you first need to complete the function called `gen_vel` in `main.py`.  This function takes a single parameter called `temp`, which gives the temperature at which the simulation is to be run.  `gen_vel` should return a single scalar.  This scalar should be a reasonable initial velocity for a particular degree of freedom.  In other words, `gen_vel` should return a sample from the distribution of velocities given above.

Remember that we are operating with natural units and that as such k_B=1 and m=1.  Furthermore, you may find the following function, which returns a sample from a standard normal random variable with mean 0 and variance 1, useful:

````
random_normal = np.random.normal()
````
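
As a hedged illustration of the scaling involved, the sketch below rescales a standard normal sample so that it has the variance described above.  Since the momentum has variance k_B\*T\*m, the velocity p/m has variance k_B\*T/m.  The function name `sample_velocity`, the explicit `mass` and `k_B` arguments, the seeded generator and the temperature of 2.0 are all assumptions made for this sketch.

```python
import numpy as np

def sample_velocity(temp, rng, mass=1.0, k_B=1.0):
    # The momentum has variance m*k_B*T, so the velocity p/m has
    # variance k_B*T/m and standard deviation sqrt(k_B*T/m)
    return np.sqrt(k_B * temp / mass) * rng.normal()

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
samples = np.array([sample_velocity(2.0, rng) for _ in range(20000)])
print('sample mean    :', samples.mean())
print('sample variance:', samples.var())
```

With natural units the sample variance should come out close to the temperature that was passed in, and the sample mean should be close to zero.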


In [None]:
import matplotlib.pyplot as plt
import numpy as np

def gen_vel(temp) :
  #Your code goes here

  vel = 0
  return vel


# To test your code I have written the following code, which will generate multiple
# random initial velocities and illustrate the distribution of values for these
# velocities
velocities = np.zeros(1000)
for i in range(1000) : velocities[i] = gen_vel(1.0)
plt.hist( velocities )
plt.savefig('velocity_dist.png')


In [None]:
runtest(['test_vel'])

# Implementing thermostated MD

Now that we know how an MD code is written and what the distribution of velocities for a system at a particular temperature is, we should be in a position to implement a constant temperature MD algorithm.  As was already discussed in the previous sections, a wide variety of different techniques can be used to couple the velocities of the atoms in the system with a heat bath.  In this exercise, we are going to learn about the so-called Langevin thermostat.  This thermostat works by adjusting the velocities before and after the steps of the velocity Verlet algorithm using the following expression:

![](eq1.png)

Here $\delta$ is half the simulation timestep, $\gamma$ is a parameter known as the friction of the thermostat and N(0,1) is used to indicate a sample from a standard normal random variable.

The details as to how this expression is derived are beyond the scope of this module.  We can begin to understand how it works by looking at its structure.  The $e^{-\gamma\delta}$ in the first term, for instance, is less than one.  By multiplying the old velocity by this expression repeatedly we thus cause the old velocity to decay away to zero.  Ignoring the factor of $\sqrt{1-e^{-2\gamma\delta}}$ in the second term gives us the expression that we used to set the initial velocities of the atoms.  Consequently, if we use a very large time step, this factor is close to one, the first term is close to zero, and it is as if we are simply setting the velocities as we did in the previous exercise.  Notice last of all that the coefficients of the terms in the sum above are constant.  You can thus calculate these coefficients outside the main MD loop to improve computational efficiency.

Finally, notice that by setting the coefficient of the second term equal to $\sqrt{1-e^{-2\gamma\delta}}$ we ensure that the square of the coefficient of $v_{old}$ in the first part, $e^{-2\gamma\delta}$, and the square of this factor, $1-e^{-2\gamma\delta}$, add up to one.  In other words, in setting the coefficients we are using Pythagoras' theorem in some way.
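
The behaviour described above can be sketched numerically.  The update used here, $v_{new} = e^{-\gamma\delta} v_{old} + \sqrt{k_B T/m}\,\sqrt{1-e^{-2\gamma\delta}}\,N(0,1)$, is inferred from the description in the text, so treat it as an assumption and check it against the equation given in the exercise.  The names `c1` and `c2`, the number of particles and the seeded generator are likewise choices made only for this illustration.

```python
import numpy as np

temperature, friction, timestep = 1.0, 2.0, 0.005
delta = 0.5 * timestep  # the thermostat acts for half a timestep

# Both coefficients are constant, so compute them once outside the loop
# (natural units, k_B = 1 and m = 1, are assumed)
c1 = np.exp(-friction * delta)
c2 = np.sqrt(temperature * (1.0 - c1**2))

rng = np.random.default_rng(1)   # seeded only for reproducibility
vel = np.ones(5000)              # 5000 independent particles, all starting at v = 1
for _ in range(4000):
    # Repeated thermostat steps: the old velocity decays, noise is added
    vel = c1 * vel + c2 * rng.normal(size=vel.size)

print('c1^2 + (c2^2 / T) =', c1**2 + c2**2 / temperature)
print('mean velocity     :', vel.mean())
print('velocity variance :', vel.var())
```

After many applications the memory of the starting velocity has decayed away, and the velocities should be distributed with mean close to zero and variance close to the temperature.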

Let's now turn to what you need to do in order to complete the code.  As in the previous exercise, I have written a skeleton code for the MD algorithm that you need to fill in.  To complete this you need to:

1. Write a function called `potential` that computes the potential energy and the forces for each of the configurations you generate.
2. Write a function called `kinetic` that calculates the instantaneous kinetic energy.
3. Every 10 MD steps store the instantaneous values of the potential, kinetic and total energies in the lists called `p_energy`, `k_energy` and `t_energy`

In addition, you then need to complete the skeleton code that implements the dynamics.  In each loop this code should:

1. Use the equation given above to modify the velocities.  When using the above equation in this step $\delta$ should be set equal to half the simulation timestep.  This step is the first step in controlling the simulation temperature.

2. Update the velocities, v, by a half timestep using:

![](eq2.png)

3. Update the positions, x, by a full timestep using:

![](eq3.png)

4. Recalculate the forces $F(t+\delta)$ at the new position.

5. Use the new values of the forces to update the velocities by another half timestep using:

![](eq4.png)

6. Use the equation given above to modify the velocities once more.  As in the first step $\delta$ should be set equal to half the simulation timestep as you do this.

Notice that the algorithm I have just described is very similar to the velocity Verlet algorithm that you implemented in the previous exercise.  In this new algorithm, the steps of the velocity Verlet algorithm are just sandwiched between the initial and final steps that control the simulation temperature.

In the outline code, you will notice that I have created three variables to hold the positions (`pos`), velocities (`vel`) and forces (`forces`).  Notice that whenever you update the velocities, positions or forces in the velocity Verlet algorithm you never again need the old positions, velocities or forces.  You thus can (and should) use these three variables to hold the instantaneous positions, velocities and forces.  I have written some code that will keep track of the velocities the particle takes during the trajectory and that can be used to visualise what happens to the velocity as the calculation proceeds.

The final result from the calculation should be a graph showing how the velocity of the particle changes with time during the simulation.  If the code has been implemented correctly you should see that the value of the velocity fluctuates around 0.  The variance of the distribution of velocities the particle takes during the simulation should be equal to the temperature.


In [None]:
import matplotlib.pyplot as plt
import numpy as np

def potential(x) :
  energy = 0
  # Your code to calculate the potential goes here
  forces = 0
  return energy, forces

def kinetic(v) :
  ke = 0
  # Your code to calculate the kinetic energy from the velocities goes here

  return ke

# Set the initial position for the particle (you can change as I assume the initial particle is at init_pos when I test your code)
init_pos, init_vel = 3, 1
# This is the value to use for the timestep (the delta in the equations on the other side)
timestep = 0.005
# This is the value of the temperature
temperature = 1.0
# Now set the position equal to the initial position that was set above
pos, vel = init_pos, init_vel
# This is the value of the friction for the thermostat (the gamma in the equations on the other side)
friction = 2.0
# This calculates the initial values for the forces
eng, forces = potential(pos)

# We now run 5000 steps of molecular dynamics
nsteps = 5000
stride=10
times = np.zeros(int(nsteps/stride))
vels = np.zeros(int(nsteps/stride))
for step in range(nsteps) :
  # Apply the thermostat for a half timestep
  vel =

  # Update the velocities a half timestep
  # fill in the blanks in the code here
  vel =

  # Now update the positions using the new velocities
  # You need to add code here


  # Recalculate the forces at the new position
  # You need to add code here
  eng, forces =

  # Update the velocities another half timestep
  # You need to add code here


  # And finish by applying the thermostat for the second half timestep

  # This is where we want to store the velocities and times
  if step%stride==0 :
    times[int(step/stride)] = step*timestep
    # Write code to ensure the proper values are saved here
    vels[int(step/stride)] = vel


# This will plot the velocity as a function of time
plt.plot( times, vels, 'r-' )
plt.xlabel('time')
plt.ylabel('velocity')
plt.savefig( 'velocity.png' )


In [None]:
runtest(['test_potential3', 'test_kinetic9', 'test_trajectory3'])

# Testing your implementation

In this final exercise, I would like to finish by exploring how we can ensure that we have implemented the thermostat correctly.  I have already alluded to the first test you should perform on any code in the previous exercise.  You should ensure that the average kinetic energy has a value that is consistent with the predictions of equipartition.  Equipartition states that a system with N momentum coordinates should have an average kinetic energy of Nk_BT / 2.  If the kinetic energy does not fluctuate around this value then we know that our thermostat is implemented wrongly.

The second thing that needs to be checked is whether or not an appropriate conserved quantity has been conserved.  It is hopefully obvious that the total energy (i.e. the sum of the kinetic and potential energies) is not conserved when we run this sort of constant temperature molecular dynamics.  After all, energy is exchanged between the system and the reservoir whenever the thermostat changes the velocities of the atoms.  There is, however, no way that energy can leave the coupled system composed of the atoms and thermostat.  If we thus track how much energy is transferred from the atoms to the thermostat over the course of the simulation and add this to the total energy of the atoms this final quantity should be conserved.

To check the energy is conserved we, therefore:

1. Introduce a variable called `therm` and set its value equal to zero before our main MD loop.
2. Before applying the equation for the thermostat below, compute the total kinetic energy of the particles and add it to the variable called `therm`:

![](eq1.png)

3. After applying the equation for the thermostat, compute the kinetic energy again and subtract it from `therm`.

Notice, furthermore, that this business of updating `therm` must be done twice during the main MD loop as the thermostat is applied at the start and end of each iteration.

The final conserved quantity is computed by adding together the kinetic and potential energies of the particles and the variable called `therm`.  This quantity is conserved because, if `therm` is computed using the scheme described above, then this variable tracks the total amount of energy that has been transferred from the system to the thermostat's heat bath.
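
The bookkeeping described above can be sketched as follows.  This is a minimal illustration under several assumptions rather than the reference solution: the harmonic form V(x) = x²/2 with unit force constant, the thermostat update (inferred from the description in the previous exercise), the helper name `thermostat` and the seeded random generator are all choices made for this sketch.

```python
import numpy as np

def harmonic(x):
    # Assumed harmonic potential V(x) = x^2 / 2 with unit force constant
    return 0.5 * x**2, -x

def kinetic(v):
    return 0.5 * v**2   # one particle, unit mass

temperature, friction, timestep = 1.0, 2.0, 0.005
c1 = np.exp(-friction * 0.5 * timestep)
c2 = np.sqrt(temperature * (1.0 - c1**2))
rng = np.random.default_rng(2)   # seeded only for reproducibility

def thermostat(vel, therm):
    # Add the kinetic energy before the thermostat acts and subtract it after
    therm += kinetic(vel)
    vel = c1 * vel + c2 * rng.normal()
    therm -= kinetic(vel)
    return vel, therm

pos, vel, therm = 3.0, 1.0, 0.0
eng, force = harmonic(pos)
start = eng + kinetic(vel) + therm
conserved = []
for step in range(500):
    vel, therm = thermostat(vel, therm)   # thermostat, first half timestep
    vel += 0.5 * timestep * force         # velocity Verlet
    pos += timestep * vel
    eng, force = harmonic(pos)
    vel += 0.5 * timestep * force
    vel, therm = thermostat(vel, therm)   # thermostat, second half timestep
    conserved.append(eng + kinetic(vel) + therm)

print('largest deviation:', max(abs(c - start) for c in conserved))
```

The largest deviation printed at the end should be very small compared with the energies involved, even though the kinetic and potential energies themselves change substantially during the run.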

To get you started on implementing a constant temperature MD code that checks for a conserved quantity I have written the outline code in the cell below.  As in previous exercises, to complete this code you will need to:

* Write a function called `potential` that computes the potential energy and the forces for each of the configurations you generate.
* Write a function called `kinetic` that calculates the instantaneous kinetic energy.
* Use your potential function to write code that uses the velocity Verlet algorithm to integrate the equations of motion.
* Every 10 MD steps store the instantaneous value of the conserved quantity in the array called `conserved_quantity`.

If the code is completed correctly a graph should be generated that shows that the conserved quantity does not change with time over the course of the simulation.  Small fluctuations in the value of the conserved quantity are inevitable due to machine error.  This quantity should not drift upwards or downwards with time, however.

You will get an error about the graph being plotted with wrong data if you write an MD code in which the energy is not conserved.



In [None]:
import matplotlib.pyplot as plt
import numpy as np

def potential(x) :
  energy = 0
  forces = 0
  # Your code to calculate the potential goes here

  return energy, forces

def kinetic(v) :
  ke = 0
  # Your code to calculate the kinetic energy from the velocities goes here

  return ke

# Set the initial position for the particle (you can change as I assume the initial particle is at init_pos when I test your code)
init_pos, init_vel = 3, 1
# This is the value to use for the timestep (the delta in the equations on the other side)
timestep = 0.005
# This is the value of the temperature
temperature = 1.0
# Now set the position equal to the initial position that was set above
pos, vel = init_pos, init_vel
# This is the value of the friction for the thermostat (the gamma in the equations on the other side)
friction = 2.0
# This calculates the initial values for the forces
eng, forces = potential(pos)
# This is the variable that you should use to keep track of the quantity of energy that is exchanged with
# the reservoir of the thermostat.
therm = 0

# We now run 500 steps of molecular dynamics
nsteps, stride = 500, 10
times = np.zeros(int(nsteps/stride))
conserved_quantity = np.zeros(int(nsteps/stride))
for step in range(nsteps) :
  # Apply the thermostat for a half timestep
  vel =

  # Update the velocities a half timestep
  # fill in the blanks in the code here
  vel =

  # Now update the positions using the new velocities
  # You need to add code here


  # Recalculate the forces at the new position
  # You need to add code here
  eng, forces =

  # Update the velocities another half timestep
  # You need to add code here


  # And finish by applying the thermostat for the second half timestep
  vel =

  # This is where we want to store the energies and times
  if step%stride==0 :
    times[int(step/stride)] = step*timestep
    # Write code to ensure the proper values are saved here
    conserved_quantity[int(step/stride)] =


# This will plot the conserved quantity as a function of time
plt.plot( times, conserved_quantity, 'r-' )
plt.ylim([min(conserved_quantity)-0.05, max(conserved_quantity)+0.05 ])
plt.xlabel('time')
plt.ylabel('conserved quantity / energy units')
plt.savefig( 'conserved_quantity.png' )


In [None]:
runtest(['test_potential4', 'test_kinetic8', 'test_trajectory4'])

# Calculating the average

The exercises in this classroom are going to introduce you to the technique of block averaging while also showing you why it is necessary.

I have run an MD calculation much like the ones that you have just performed and have copied and pasted the content of the energies file that was output to the file called `energies` on this online system.  You can see the contents of this file by clicking on the file called `energies`.  Furthermore, if you prefer you can replace what is in that file with the output from your simulation.  The exercise should still work regardless.

I would like you to write some Python code that calculates the average value the energy took over the simulation.  To complete the exercise you will need to have a variable called `average`, which should be set equal to the average value that the energy took during the trajectory.  This quantity should, obviously, be calculated using:

![](equation.png)

In this expression N is the number of frames in the trajectory and E_t is the value the energy took at time t.
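
As a small illustration of this formula, the sketch below computes the mean of some made-up data in two equivalent ways.  The array `fake_eng` is synthetic and is only a stand-in for the contents of the file, and the names `explicit` and `shortcut` are hypothetical.

```python
import numpy as np

# Synthetic stand-in for the energies (made-up data, not the contents of the file)
rng = np.random.default_rng(3)
fake_eng = 1.0 + 0.1 * rng.normal(size=1000)

# Explicit form of the formula: sum the N values and divide by N
explicit = sum(fake_eng) / len(fake_eng)
# NumPy shortcut that computes the same quantity
shortcut = np.mean(fake_eng)
print(explicit, shortcut)
```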




In [None]:
import numpy as np

# Read in the energies from a file
eng = np.loadtxt('energies')[:,1]

# Your code goes here


In [None]:
runtest(['test_mean'])

# Calculating block averages

In this exercise we are going to calculate block averages.

The input file `energies` that I provided you with contains 1000 values for the energy.  For this exercise I want you to calculate:

* The average over the first 100 energies in this file
* The average over the second 100 energies in this file
* The average over the third 100 energies in the file
* and so on.

The final result should thus be a list containing 10 values for the average energy.  I have set up an array with 10 elements that you can use to hold these averages.  The array is called `av_eng`.

Once you have calculated the elements of `av_eng` I would like you to draw a graph of the results.  The x-coordinates for the 10 points in your graph should be the integers from 1 to 10.  The y-coordinates
should be the values of the 10 block averages that you have obtained.  The point with x-coordinate 1 should be the block average from the first 100 energies, the point with x-coordinate 2 should be the block
average from the second 100 energies and so on.

The x-axis label for your graph should be 'Index' and the y-axis label should be 'Average energy / natural units'
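As a rough illustration of the slicing involved, the sketch below computes block averages for a synthetic array.  The names `demo_eng`, `demo_av` and `block_len` are hypothetical and the data is random noise rather than the contents of `energies`.

```python
import numpy as np

# Synthetic stand-in for 1000 energies (NOT the real data file)
rng = np.random.default_rng(1)
demo_eng = rng.normal(size=1000)

block_len = 100
n_blocks = len(demo_eng) // block_len

# Loop version: average each consecutive slice of block_len values
demo_av = np.zeros(n_blocks)
for i in range(n_blocks):
    demo_av[i] = np.mean(demo_eng[i * block_len:(i + 1) * block_len])

# Equivalent vectorised version: one row per block, then average each row
demo_av_fast = demo_eng[:n_blocks * block_len].reshape(n_blocks, block_len).mean(axis=1)

# x-coordinates 1..10 for plotting the block averages
indices = np.arange(1, n_blocks + 1)
```

The loop is closer to the way the exercise is worded, while the `reshape` version is a common NumPy idiom for the same calculation.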


In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Read in the energies from a file
eng = np.loadtxt('energies')[:,1]

# Create an array with 10 elements that you will use to hold the average energies
av_eng = np.zeros(10)

# Your code goes here




In [None]:
runtest(['test_energies'])

# Calculating the standard deviation

In this exercise we are going to remind ourselves how to compute the variance.  We will also see that computing the error is not simply a matter of computing the variance.

Recall that the sample variance is given by:

![](equation.png)

For this exercise I want you to calculate this quantity for:

* The first 100 energies in this file
* The second 100 energies in this file
* The third 100 energies in the file
* and so on.

The values for these 10 variances should be stored in the array called `variances`, which I have already created for you.

In addition to computing these 10 values for the block variance I would also like you to compute the variance using all the data in the trajectory.  The value of this variance should be stored in a variable called `total_var`.

To complete the exercise you will need to plot a graph with two data series.  You will use the first data series to show the variances from each of the blocks.  The x-coordinates of the 10 points of this line should thus be the integers
from 1 to 10.  The y-coordinates will then be the values of the 10 block variances that you have obtained.  The point with x-coordinate 1 should be the block variance from the first 100 energies, the point with x-coordinate 2 should be the block
variance from the second 100 energies and so on.

The second data series is a line indicating the total variance for all the data.  You can plot this with a command like the following:

```python
plt.plot( [1,10], [total_var,total_var], 'r-' )
```

This command ensures that a red horizontal line is drawn to indicate the value of the total variance.  You should find that the black dots illustrating the block variances are all reasonably close to the red line.  This makes sense - both sets of calculations
that you are performing here are estimating the same quantity.  The only difference is that when you compute the variances from each block of data you have fewer data points.

The x-axis label for your graph should be 'Index' and the y-axis label should be 'Variance / energy^2'
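The sketch below shows one way the block variances and the total variance might be computed on synthetic data.  It assumes the sample variance divides by n-1 (`ddof=1` in NumPy); if the formula you are using divides by n instead, use `ddof=0`.  All names starting with `demo_` are hypothetical.

```python
import numpy as np

# Synthetic stand-in for 1000 energies (NOT the real data file)
rng = np.random.default_rng(2)
demo_eng = rng.normal(loc=0.0, scale=0.5, size=1000)

# One row per block of 100 values
demo_blocks = demo_eng.reshape(10, 100)

# ASSUMPTION: ddof=1 divides by (n-1); ddof=0 would divide by n
demo_variances = demo_blocks.var(axis=1, ddof=1)
demo_total_var = demo_eng.var(ddof=1)
```

For this uncorrelated test data the ten block variances scatter around the total variance, which is the behaviour described above.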


In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Read in the energies from a file
eng = np.loadtxt('energies')[:,1]
# Create a list with 10 elements that you will use to hold the variances
variances = np.zeros(10)
# Your code goes here



In [None]:
runtest(['test_graph1'])

# The variance for the block averages

In this exercise we are going to compute the variance for the block averages.

The previous exercise showed you how to compute the variance over the whole trajectory.  We also learned that this variance is not going to be useful when we calculate the error bars for our ensemble averages.  The error bars on the ensemble average will instead be computed from the block averages.  In other words, we are going to calculate N block averages, one for each of the M-frame blocks in our trajectory, using:

![](equation-1.png)

We will assume that these N block averages represent N samples of the same random variable.  We can thus calculate the average for this random variable as:

![](equation-2.png)

Furthermore, because we have N samples, we can calculate the standard deviation (the error) for this average using:

![](equation-3.png)

I would like you to insert code in `main.py` that computes the average energy from the blocks using the second equation on this page and the error in this quantity using the third equation on this page.  To do this you are going to first have to compute block averages over the first 100 frames, the second 100 frames, the third 100 frames and so on, as you have done in previous exercises.  You are then going to have to compute the quantities above from these block averages.  The final value that you get for the average energy should be saved in a variable called `average` and the final value for the error should be saved in a variable called `error`.
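The following sketch shows the general shape of this calculation on synthetic data.  It assumes that the error is the square root of the unbiased variance of the block averages divided by the number of blocks; check this against the third equation above, because the normalisation used there may differ.  The `demo_` names are hypothetical.

```python
import numpy as np

# Synthetic stand-in for 1000 energies (NOT the real data file)
rng = np.random.default_rng(3)
demo_eng = rng.normal(loc=2.0, scale=0.3, size=1000)

block_len = 100
n_blocks = len(demo_eng) // block_len

# Block averages: one row per block, then average along each row
demo_block_means = demo_eng[:n_blocks * block_len].reshape(n_blocks, block_len).mean(axis=1)

# Average of the block averages
demo_average = demo_block_means.mean()

# ASSUMPTION: error = sqrt( unbiased variance of block averages / number of blocks )
demo_error = np.sqrt(demo_block_means.var(ddof=1) / n_blocks)
```

Because every block has the same length here, the average of the block averages coincides with the plain average over all frames.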


In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Read in the energies from a file
eng = np.loadtxt('energies')[:,1]

# Create a list to hold the block averages
blocks = np.zeros(10)

# Your code goes here


In [None]:
runtest(['test_average_correct', 'test_error_correct'])

# The relationship between the error and the size of the blocks

In this final exercise we are going to bring together everything we have learned in order to look at how the block averaging technique allows us to resolve the problems that we would otherwise have in estimating errors with correlated variables.

The previous four exercises have shown you how to compute the average and the standard deviation by block averaging.   In this exercise we are going to look at how the size of the error depends on the size of the blocks.  To do this we will need to encapsulate
the code that we have written to calculate block averages and errors in a function that takes as input the data and the length of the block, M, into which to divide the data.  In `main.py` I have written the first line of this function for you as follows

```python
def block_average( M, data ) :
    # Your code goes here

    return error
```

You should then use the function you have written to plot a graph that shows how the size of the error depends on the length of the blocks.  In drawing this graph you should calculate the error when block averages with the following lengths
are used in the calculation of the error:

```python
xvals = [10,20,30,40,60,100,120,200,300,400]
```

The x-coordinates of the points in your graph should be equal to the numbers in the list above.  The y-coordinates of these points should then be the corresponding values of the error for that size of block.  The y value for the point at x=10 is thus
the error that is calculated from block averages that are calculated from the first 10 points, the second 10 points and so on.  In practice you can calculate these y-values by using the function `block_average` that you will have written.

You should see that the error is initially small.  It will then grow as the size of the blocks is increased before plateauing at a constant value.

The x axis label of your graph should be 'Size of blocks' and the y axis label should be 'Error'
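To see why the error grows with the block size when the data is correlated, the sketch below applies a block-error function to a synthetic correlated time series.  Everything here is an assumption made for illustration: the function name `demo_block_error`, the choice to discard any frames left over when M does not divide the length of the data, the error formula (unbiased variance of the block averages divided by the number of blocks), and the AR(1) model used to generate the data.

```python
import numpy as np

def demo_block_error(M, data):
    # Number of complete blocks of length M; leftover frames are discarded (assumption)
    n_blocks = len(data) // M
    block_means = np.asarray(data[:n_blocks * M]).reshape(n_blocks, M).mean(axis=1)
    # ASSUMPTION: error = sqrt( unbiased variance of block averages / number of blocks )
    return np.sqrt(block_means.var(ddof=1) / n_blocks)

# Synthetic correlated series: each value remembers 90% of the previous one
rng = np.random.default_rng(4)
n_points, phi = 10000, 0.9
noise = rng.normal(size=n_points)
series = np.zeros(n_points)
for t in range(1, n_points):
    series[t] = phi * series[t - 1] + noise[t]

sizes = [1, 10, 100, 500]
demo_errors = [demo_block_error(M, series) for M in sizes]
```

With blocks of length 1 the correlations are ignored and the error is underestimated.  Once the blocks are much longer than the correlation time the block averages are close to independent, and the estimate of the error levels off.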


In [None]:
import matplotlib.pyplot as plt
import numpy as np

def block_average( M, data ) :
  # Your code goes here

  return error

# Read in the energies from a file
eng = np.loadtxt('energies')[:,1]



In [None]:
runtest(['test_blockVals', 'test_plot'])

# The average energy and error bars

This exercise is going to teach you how we can calculate heat capacities from molecular dynamics simulation.  This first task is going to revise the material that was covered in the previous set of exercises on molecular dynamics simulation.  We will also review what you have learned about the method of block averaging.  The objective is to use molecular dynamics to calculate an estimate for the ensemble average of the energy.  As we are computing an estimate for the ensemble average for the energy, we must also compute suitable error bars for this estimate, and we will calculate these errors using block averaging.

You will notice that I have written an outline for a constant temperature MD code in `main.py`.  This outline should by now be getting somewhat familiar, and you should know that to complete this code you will need to:

1. Write a function called `potential` that computes the potential energy and the forces for each of the configurations you generate.
2. Write a function called `kinetic` that calculates the instantaneous kinetic energy.
3. Use your potential function to write code that uses the velocity Verlet algorithm and the thermostat to integrate the equations of motion.
4. Every `stride` MD steps, store the instantaneous values of the potential energy, kinetic energy, total energy and conserved quantity in the lists called `p_energy`, `k_energy`, `t_energy` and `conserved_quantity`.
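As a reminder of the order of operations in velocity Verlet, here is a minimal sketch for a single particle of unit mass.  It deliberately leaves out the thermostat, and it assumes a harmonic potential V(x) = x^2/2 purely for illustration; the potential used in the exercise and the names `demo_potential` and `demo_kinetic` are not taken from the course material.

```python
import numpy as np

def demo_potential(x):
    # ASSUMPTION: harmonic well, V = x^2 / 2, so the force is -x
    return 0.5 * x * x, -x

def demo_kinetic(v):
    # Kinetic energy of one particle with unit mass (assumption)
    return 0.5 * v * v

pos, vel, timestep, nsteps = 3.0, 1.0, 0.005, 2000
eng, forces = demo_potential(pos)
demo_total = np.zeros(nsteps)

for step in range(nsteps):
    # (a thermostat half step would go here in constant temperature MD)
    vel = vel + 0.5 * timestep * forces     # first half-step velocity update
    pos = pos + timestep * vel              # full-step position update
    eng, forces = demo_potential(pos)       # forces at the new position
    vel = vel + 0.5 * timestep * forces     # second half-step velocity update
    # (a second thermostat half step would go here)
    demo_total[step] = eng + demo_kinetic(vel)
```

Without a thermostat the total energy should stay very close to its initial value of 5.0, which is a useful sanity check before the thermostat is added.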

Notice that, unlike in previous exercises, a function called `gen_traj` has been written.  This function generates the molecular dynamics trajectory and takes seven input arguments:

1. `pos` - the initial positions of the atoms
2. `vel` - the initial velocities of the atoms
3. `nsteps` - the number of steps in the trajectory that will be generated
4. `timestep` - the simulation timestep
5. `stride` - the frequency with which to store the energies - energies are stored every stride MD steps.
6. `temperature` - the temperature at which to run the simulation
7. `friction` - the friction parameter for the thermostat

It then returns five lists:

1. `times` - the times at which data has been collected
2. `p_energy` - the potential energy as a function of time.
3. `k_energy` - the kinetic energy as a function of time
4. `t_energy` - the total energy of the system as a function of time
5. `conserved_quantity` - the value of the conserved quantity as a function of time.

Notice that to get this function for generating the trajectory to work correctly, you are going to need to fill in the blanks in the code within it using what you have learned about constant temperature molecular dynamics.  Furthermore, also notice that encapsulating this code for generating trajectories in a function like the one we have written here is useful.  For the project, you will need to run MD simulations at multiple different temperatures so you will need to use the MD code multiple times.

You will notice that there is a call to `gen_traj` after the definition of this function and that arrays called `tt`, `potential_e`, `kinetic_e`, `total` and `conserved` are used to hold the values that the energy took during the trajectory that was generated.  You need to use the data in these arrays to compute block averages and errors with blocks of trajectory that are 200 steps, 400 steps, 600 steps, 800 steps, 1000 steps and 1200 steps long.  The errors should be indicative of the 90% confidence limit.

You will only need to compute these block averages for the data in one of these lists.  Remember I would like the ensemble average for the __total energy__.  The value of the average of the block averages should be stored in the list called `averages` and the error bar for a 90 % confidence limit should be stored in the list called `errors`.  Look back at the notes you made on block averaging if you are struggling with this part of the exercise.

The final result should be a graph showing how the error bar changes as the size of the blocks from which it is computed changes.
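The sketch below shows one way a 90% confidence limit might be obtained from block averages using `scipy.stats`.  It assumes that the average of the block averages is normally distributed, so that the error bar is the standard error multiplied by the 95th percentile of the standard normal distribution.  Your notes on block averaging may use a different convention (for example a Student's t distribution through `st.t.ppf`), so treat this as an illustration only.  The data and the `demo_` names are made up.

```python
import numpy as np
import scipy.stats as st

# Synthetic stand-in for 2400 total energies (NOT output from gen_traj)
rng = np.random.default_rng(5)
demo_total = rng.normal(loc=1.5, scale=0.4, size=2400)

blocksize = 200
n_blocks = len(demo_total) // blocksize
block_means = demo_total[:n_blocks * blocksize].reshape(n_blocks, blocksize).mean(axis=1)

# Standard error on the average of the block averages (unbiased variance assumed)
standard_error = np.sqrt(block_means.var(ddof=1) / n_blocks)

# A two-sided 90% interval leaves 5% in each tail, hence the 0.95 quantile
z90 = st.norm.ppf(0.95)
demo_error_90 = z90 * standard_error
```

The multiplier `z90` is roughly 1.645, so the 90% error bar is somewhat wider than the bare standard error.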






In [None]:
import matplotlib.pyplot as plt
import scipy.stats as st
import numpy as np

def potential(x) :
  energy = 0
  forces = 0
  # Your code to calculate the potential goes here

  return energy, forces

def kinetic(v) :
  ke = 0
  # Your code to calculate the kinetic energy from the velocities goes here

  return ke

def gen_traj( pos, vel, nsteps, timestep, stride, temperature, friction ) :
  # This calculates the initial values for the forces
  eng, forces = potential(pos)
  # This is the variable that you should use to keep track of the quantity of energy that is exchanged with
  # the reservoir of the thermostat.
  therm = 0

  times = np.zeros(int(nsteps/stride))
  k_energy = np.zeros(int(nsteps/stride))
  p_energy = np.zeros(int(nsteps/stride))
  t_energy = np.zeros(int(nsteps/stride))
  conserved_quantity = np.zeros(int(nsteps/stride))
  for step in range(nsteps) :
    # Apply the thermostat for a half timestep


    # Update the velocities a half timestep
    # fill in the blanks in the code here


    # Now update the positions using the new velocities
    # You need to add code here


    # Recalculate the forces at the new position
    # You need to add code here
    eng, forces =

    # Update the velocities another half timestep
    # You need to add code here


    # And finish by applying the thermostat for the second half timestep


    # This is where we want to store the energies and times
    if step%stride==0 :
      times[int(step/stride)] = step*timestep
      # Write code to ensure the proper values are saved here
      p_energy[int(step/stride)] =
      k_energy[int(step/stride)] =
      t_energy[int(step/stride)] =
      conserved_quantity[int(step/stride)] =

  return times, p_energy, k_energy, t_energy, conserved_quantity

# Set the initial position for the particle (you can change this, as I assume the particle starts at init_pos when I test your code)
init_pos, init_vel = 3, 1
# This command runs the molecular dynamics and generates a trajectory
temperature = 1.0   # This variable must be defined to pass the tests

# Generate the trajectories.  Please do not change the names of the variables on the left hand side of the
# equals sign here.  I look for variables with these names when I test your code
tt, potential_e, kinetic_e, total, conserved = gen_traj( init_pos, init_vel, 2400, 0.005, 1, temperature, 2.0 )

# This is the part to compute the block averages for the error estimation
# I use the variable called errors to test your code.  This should contain a 90% confidence limit on your estimate of the error
bsize, averages, errors = [200,400,600,800,1000,1200], np.zeros(6), np.zeros(6)
for blocksize in bsize :
  # Your code to calculate the block averages goes here


# This will plot the average energy, with error bars, as a function of the block length
plt.errorbar( bsize, averages, yerr=errors, fmt='ko' )
plt.xlabel('Length of block')
plt.ylabel('Average energy / natural units')
plt.savefig( 'average_energy.png' )


In [None]:
runtest(['test_block_averages1', 'test_block_errors1', 'test_conserved1', 'test_kinetic1', 'test_forces1'])

# The average square energy

This next exercise is like the previous one in that you are going to compute an ensemble average.  This time, however,  you should calculate the ensemble average for the square of the total energy.  In other words, I want you to square each of the total energies that you obtain from your trajectory and to calculate the average of these squared quantities.  As in the previous exercise, you must perform block averaging to get suitable error bars because the final value that you obtain is an estimate for the average squared energy. Thus the uncertainty in this estimate must be quantified.

As in the last exercise, I have written an outline for a constant temperature MD code in the cell on the right.  This outline is the same as the outline in the previous exercise so you must:

1. Write a function called `potential` that computes the potential energy and the forces for each of the configurations you generate.
2. Write a function called `kinetic` that calculates the instantaneous kinetic energy.
3. Use your potential function to write code that uses the velocity Verlet algorithm and the thermostat to integrate the equations of motion.
4. Every `stride` MD steps, store the instantaneous values of the potential energy, kinetic energy, total energy and conserved quantity in the lists called `p_energy`, `k_energy`, `t_energy` and `conserved_quantity`.

Furthermore, as in the previous exercise, a function called `gen_traj` has been written.  This function generates the molecular dynamics trajectory and takes seven input arguments:

1. `pos` - the initial positions of the atoms
2. `vel` - the initial velocities of the atoms
3. `nsteps` - the number of steps in the trajectory that will be generated
4. `timestep` - the simulation timestep
5. `stride` - the frequency with which to store the energies - energies are stored every stride MD steps.
6. `temperature` - the temperature at which to run the simulation
7. `friction` - the friction parameter for the thermostat

It then returns five lists:

1. `times` - the times at which data has been collected
2. `p_energy` - the potential energy as a function of time.
3. `k_energy` - the kinetic energy as a function of time
4. `t_energy` - the total energy of the system as a function of time
5. `conserved_quantity` - the value of the conserved quantity as a function of time.

As in the last exercise, you are going to need to fill in the blanks in the code within it using what you have learned about constant temperature molecular dynamics to get this function working correctly.

You will notice that there is a call to `gen_traj` after the definition of this function and that arrays called `tt`, `potential_e`, `kinetic_e`, `total` and `conserved` are used to hold the values that the energy took during the trajectory that was generated.  You need to use the data in these arrays to compute block averages and errors with blocks of trajectory that are 200 steps, 400 steps, 600 steps, 800 steps, 1000 steps and 1200 steps long.

Remember that you are computing block averages for the __square of the total energy__ so you must square the energy __before__ you add it to the variable that accumulates the mean.  The value of the average of the block averages should be stored in the list called `averages`, and the error bar for a 90 % confidence limit should be stored in the list called `errors`.

The final result should be a graph showing how the size of the error bar changes as the size of the blocks from which it is computed changes.
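The order of squaring and averaging matters, as the small sketch below illustrates on synthetic numbers.  The array `demo_total` is a hypothetical stand-in for the total energies.

```python
import numpy as np

# Synthetic stand-in for a set of total energies (NOT output from gen_traj)
rng = np.random.default_rng(6)
demo_total = rng.normal(loc=1.5, scale=0.4, size=2400)

# Correct order for <E^2>: square each energy first, then average
demo_mean_of_square = np.mean(demo_total**2)

# Wrong order for <E^2>: this is <E>^2, which is a different quantity
demo_square_of_mean = np.mean(demo_total)**2

# The two differ by exactly the (population) variance of the energies
demo_gap = demo_mean_of_square - demo_square_of_mean
```

The gap between the two numbers is the variance of the energy, which is why the two quantities cannot be interchanged.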


In [None]:
import matplotlib.pyplot as plt
import scipy.stats as st
import numpy as np

def potential(x) :
  energy = 0
  forces = 0
  # Your code to calculate the potential goes here

  return energy, forces

def kinetic(v) :
  ke = 0
  # Your code to calculate the kinetic energy from the velocities goes here

  return ke

def gen_traj( pos, vel, nsteps, timestep, stride, temperature, friction ) :
  # This calculates the initial values for the forces
  eng, forces = potential(pos)
  # This is the variable that you should use to keep track of the quantity of energy that is exchanged with
  # the reservoir of the thermostat.
  therm = 0

  times = np.zeros(int(nsteps/stride))
  k_energy = np.zeros(int(nsteps/stride))
  p_energy = np.zeros(int(nsteps/stride))
  t_energy = np.zeros(int(nsteps/stride))
  conserved_quantity = np.zeros(int(nsteps/stride))
  for step in range(nsteps) :
    # Apply the thermostat for a half timestep


    # Update the velocities a half timestep
    # fill in the blanks in the code here


    # Now update the positions using the new velocities
    # You need to add code here


    # Recalculate the forces at the new position
    # You need to add code here
    eng, forces =

    # Update the velocities another half timestep
    # You need to add code here


    # And finish by applying the thermostat for the second half timestep


    # This is where we want to store the energies and times
    if step%stride==0 :
      times[int(step/stride)] = step*timestep
      # Write code to ensure the proper values are saved here
      p_energy[int(step/stride)] =
      k_energy[int(step/stride)] =
      t_energy[int(step/stride)] =
      conserved_quantity[int(step/stride)] =

  return times, p_energy, k_energy, t_energy, conserved_quantity

# Set the initial position for the particle (you can change this, as I assume the particle starts at init_pos when I test your code)
init_pos, init_vel = 3, 1
# This command runs the molecular dynamics and generates a trajectory
temperature = 1.0   # This variable must be defined to pass the tests

# Generate the trajectories.  Please do not change the names of the variables on the left hand side of the
# equals sign here.  I look for variables with these names when I test your code
tt, potential_e, kinetic_e, total, conserved = gen_traj( init_pos, init_vel, 2400, 0.005, 1, temperature, 2.0 )

# This is the part to compute the block averages for the error estimation
# I use the variable called errors to test your code.  This should contain a 90% confidence limit on your estimate of the error
# REMEMBER TO SQUARE THE ENERGY AND TO CALCULATE THE AVERAGE OF THE SQUARE
bsize, averages, errors = [200,400,600,800,1000,1200], np.zeros(6), np.zeros(6)
for blocksize in bsize :
  # Your code to calculate the block averages goes here


# This will plot the average squared energy, with error bars, as a function of the block length
plt.errorbar( bsize, averages, yerr=errors, fmt='ko' )
plt.xlabel('Length of block')
plt.ylabel('Average squared energy')
plt.savefig( 'average_squared_energy.png' )


In [None]:
runtest(['test_block_averages2', 'test_block_errors2', 'test_conserved2', 'test_kinetic2', 'test_forces2'])

# Heat capacities by finite differences

Now that you can compute these ensemble averages you know how to run all the molecular dynamics simulations that you will need to compute the heat capacities for your report.  In these final four exercises, I am thus going to give you the ensemble averages and error bars that I computed by running constant temperature MD simulations at a number of different temperatures.  You are then going to use this data to compute the heat capacity as a function of temperature.  We are going to use two different methods to compute the heat capacity.  The first of these is based on the definition of the heat capacity that you learned from classical thermodynamics.  In that part of this course, you learned that the heat capacity was the partial derivative of the internal energy with respect to temperature at constant volume.  Given this it makes sense to calculate the heat capacity at a temperature midway between T_1 and T_2 using:

![](eq.png)

In other words, we run simulations at two temperatures T_2 and T_1 and compute the ensemble average for the total energy in these two simulations.  We then insert these two approximate ensemble averages into the finite-difference formula above as <E>(T_2) and <E>(T_1) to get an approximate value for the derivative at the midpoint between the two temperatures.

The exercise in the code cell on the left will allow you to test whether or not you have understood this idea.  Ensemble averages and errors from the MD simulations that I have run for you have been imported from the input file called `md_results.txt`.  Five lists have been created from this data that are named as follows:

1. `temperatures` - the temperatures at which the simulations were run
2. `energies` - the ensemble average for the total energies at each temperature
3. `error_energies` - the error bars for each of the average energies computed at each temperature.
4. `energies2` - the ensemble average for the square of the total energy at each temperature.
5. `error_energies2` - the error bars for each of the average squared energies computed at each temperature.

Your task is to use these 10 data points to calculate the heat capacity at nine different temperatures using the formula above.  The values of the temperature at which you have computed the heat capacity should be stored in the list called `cv_temperatures` and the final values for the heat capacity should be stored in the list called `cv`.  If you complete the exercise correctly a graph showing the value of the heat capacity as a function of temperature will be generated.
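A minimal sketch of the finite-difference calculation is shown below.  The data is made up: the energies are simply set equal to the temperatures, so the heat capacity should come out as 1 everywhere.  The `demo_` names are hypothetical and the numbers are not those in `md_results.txt`.

```python
import numpy as np

# Made-up data: ten temperatures and an energy that grows linearly with T
demo_T = np.linspace(0.2, 2.0, 10)
demo_E = 1.0 * demo_T

# Temperature midway between each neighbouring pair (nine values)
demo_cv_T = 0.5 * (demo_T[1:] + demo_T[:-1])

# Finite-difference estimate of dE/dT for each neighbouring pair
demo_cv = np.diff(demo_E) / np.diff(demo_T)
```

Ten data points give nine neighbouring pairs, which is why nine values for the heat capacity are obtained.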


In [None]:
import matplotlib.pyplot as plt
import numpy as np

# This loads the data from the input file and generates the lists
# that are described in the text on the right.
data = np.loadtxt('md_results.txt')
temperatures = data[:,0]
energies = data[:,1]
error_energies = data[:,2]
energies2 = data[:,3]
error_energies2 = data[:,4]

# These are the lists that hold the temperatures at which the
# heat capacity has been computed and the values that you obtained for
# the heat capacity.
cv_temperatures, cv = np.zeros(9), np.zeros(9)

# Your code to calculate the values of the heat capacity goes here


# This will plot a graph of the heat capacity as a function of temperature
plt.plot( cv_temperatures, cv, 'ko' )
plt.xlabel('temperature / natural units')
plt.ylabel('heat capacity / natural units')
plt.savefig('heat_capacity.png')


In [None]:
runtest(['test_graph3'])

# Heat capacity by finite differences with errors

As you should be keenly aware by now we are not yet done.  The values of the ensemble averages that we used in the formula below:

![](eq1.png)

were estimates as they were computed from MD simulations of finite length.  Consequently, the values that we obtained for the heat capacity were also estimates.  We thus need to compute suitable errors by propagating the errors that were computed for the ensemble averages of the energy.

We can calculate these errors by noting that the maximum possible value we could have computed for the heat capacity is given by:

![](eq2.png)

where $\Delta E_2$ and $\Delta E_1$ are the errors for the ensemble averages computed from the simulations at $T_2$ and $T_1$ respectively.  This expression is derived by considering the steepest possible gradient that still passes through the two error bars.  Using similar logic, we can consider the shallowest possible gradient that passes through the two error bars and obtain the minimum possible value for the heat capacity as follows:

![](eq3.png)

The difference between these two values gives the range of possible values that the heat capacity might take and is equal to:

![](eq4.png)

This range is symmetric around the value for the heat capacity that is computed using the first formula above, so we can write our final value for the width of the error bar on the heat capacity as:

![](eq5.png)

To complete this exercise you must, therefore, recompute the heat capacities from the data in the input as you did in the previous exercise.  This time, however, you need to also compute the error bars for the heat capacities that you obtain.  Just in case you have forgotten from the last exercise we import the following lists from `md_results.txt` at the start of the calculation:

* `temperatures` - the temperatures at which the simulations were run
* `energies` - the ensemble average for the total energies at each temperature
* `error_energies` - the error bars for each of the average energies computed at each temperature.
* `energies2` - the ensemble average for the square of the total energy at each temperature.
* `error_energies2` - the error bars for each of the average squared energies computed at each temperature.

You will calculate the heat capacity at nine different temperatures.  The values of the temperature at which you have computed the heat capacity should be stored in the list called `cv_temperatures` and the final values for the heat capacity should be stored in the list called `cv`.  The errors on the values of the heat capacity should be stored in the list called `cv_errors`.  If you complete the exercise correctly a graph showing the value of the heat capacity as a function of temperature with suitable error bars will be generated.

N.B.  Please do not change the names of the lists called  `cv_temperatures`, `cv` and `cv_errors`.  If the names of these lists are changed your code will fail the tests.
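The steepest and shallowest gradients described above can be sketched as follows.  The data is made up (the energies equal the temperatures and every error bar is 0.01) and all of the `demo_` names are hypothetical.  The final error is taken as half the difference between the maximum and minimum values, which is my reading of the description above.

```python
import numpy as np

# Made-up data: ten temperatures, energies and error bars on the energies
demo_T = np.linspace(0.2, 2.0, 10)
demo_E = demo_T.copy()
demo_dE = np.full(10, 0.01)

dT = np.diff(demo_T)
demo_cv = np.diff(demo_E) / dT

# Steepest gradient: raise the upper point and lower the lower point
cv_max = ((demo_E[1:] + demo_dE[1:]) - (demo_E[:-1] - demo_dE[:-1])) / dT
# Shallowest gradient: lower the upper point and raise the lower point
cv_min = ((demo_E[1:] - demo_dE[1:]) - (demo_E[:-1] + demo_dE[:-1])) / dT

# Half the range gives a symmetric error bar around demo_cv
demo_cv_err = 0.5 * (cv_max - cv_min)
```

Written out, half the range reduces to the sum of the two energy errors divided by the temperature difference, so for this made-up data every error bar is 0.02/0.2 = 0.1.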




In [None]:
import matplotlib.pyplot as plt
import numpy as np

# This loads the data from the input file and generates the lists
# that are described in the text on the right.
data = np.loadtxt('md_results.txt')
temperatures = data[:,0]
energies = data[:,1]
error_energies = data[:,2]
energies2 = data[:,3]
error_energies2 = data[:,4]

# These are the lists that hold the temperatures at which the
# heat capacity has been computed and the values that you obtained for
# the heat capacity.
# N.B. I check that the variables in cv_errors have been computed correctly
# when I run tests.  Please ensure that there is a variable with this name in your code
cv_temperatures, cv, cv_errors = np.zeros(9), np.zeros(9), np.zeros(9)

# Your code to calculate the values of the heat capacity and the errors goes here


# This will plot a graph of the heat capacity as a function of temperature
plt.errorbar( cv_temperatures, cv, yerr=cv_errors, fmt='ko' )
plt.xlabel('temperature / natural units')
plt.ylabel('heat capacity / natural units')
plt.savefig('heat_capacity.png')


In [None]:
runtest(['test_graph4', 'test_errors1'])

# Heat capacity from fluctuations

Now that you know how to compute a heat capacity, you can run a simulation and produce a result that can be compared with the results from an experiment or with the result that we would expect given the body of theory that we have learned during this course on statistical mechanics.

In these final two exercises, we are going to consider an alternative method that we can use to calculate the heat capacity from a constant temperature MD simulation.  For this method, we will use the fact that the heat capacity is related to the average fluctuations in the internal energy by this formula:

![](eq1.png)

We learned about this formula when we studied the canonical ensemble.  Furthermore, at some point during your degree (or before), you will have learned that the variance can be computed using either the part of the expression shown in angle brackets above or by computing:

![](eq2.png)

Because we have been given the average for the square of the energy, it is clear that you are going to use this second formula in your codes.

With all that in mind, your task here is to compute the heat capacity using the formulas above.  In doing so, you should use the data from the file `md_results.txt`.  As always you have been given the following data:

* `temperatures` - the temperatures at which the simulations were run
* `energies` - the ensemble average for the total energies at each temperature
* `error_energies` - the error bars for each of the average energies computed at each temperature.
* `energies2` - the ensemble average for the square of the total energy at each temperature.
* `error_energies2` - the error bars for each of the average squared energies computed at each temperature.

You should be able to use this data and the formulas above to compute the heat capacity at ten different temperatures.  The values of the temperature at which you have calculated the heat capacity should be stored in the list called `cv_temperatures`, and the final values for the heat capacity should be stored in the list called `cv`.

If you complete the exercise correctly, a graph showing the value of the heat capacity as a function of temperature will be generated.
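The sketch below evaluates the fluctuation route to the heat capacity on made-up data.  It assumes the standard canonical-ensemble relation, in which the heat capacity is the variance of the energy divided by k_B T^2, and it assumes natural units in which k_B = 1; check both against the formulas above.  The made-up averages are chosen so that the variance of the energy equals T^2, which gives a heat capacity of 1 at every temperature.

```python
import numpy as np

# Made-up data: <E> = T and <E^2> = 2 T^2, so that var(E) = T^2
demo_T = np.linspace(0.2, 2.0, 10)
demo_E = demo_T.copy()
demo_E2 = 2.0 * demo_T**2

# ASSUMPTION: natural units with Boltzmann's constant equal to one
k_B = 1.0

# Variance from the two averages, then divide by k_B T^2
demo_var = demo_E2 - demo_E**2
demo_cv = demo_var / (k_B * demo_T**2)
```

Unlike the finite-difference method, this route gives one value of the heat capacity for each simulated temperature, hence ten values rather than nine.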


In [None]:
import matplotlib.pyplot as plt
import numpy as np

# This loads the data from the input file and generates the lists
# that are described in the text on the right.
data = np.loadtxt('md_results.txt')
temperatures = data[:,0]
energies = data[:,1]
error_energies = data[:,2]
energies2 = data[:,3]
error_energies2 = data[:,4]

# These are the lists that hold the temperatures at which the
# heat capacity has been computed and the values that you obtained for
# the heat capacity.
cv_temperatures, cv = np.zeros(10), np.zeros(10)

# Your code to calculate the values of the heat capacity goes here


# This will plot a graph of the heat capacity as a function of temperature
plt.plot( cv_temperatures, cv, 'ko' )
plt.xlabel('temperature / natural units')
plt.ylabel('heat capacity / natural units')
plt.savefig('heat_capacity.png')


In [None]:
runtest(['test_graph5'])

# Heat capacity from fluctuations with errors

As always, we are not quite done, because we haven't worked out how to determine the error bars for this second method of calculating the heat capacity.  In this exercise, we are thus going to learn how to compute these errors.  We are again going to use the propagation of errors to determine the error on the heat capacity.  We can do this because the first two tasks in this exercise showed us how errors for $\langle E \rangle$ and $\langle E^2 \rangle$ can be computed using block averaging.  Let's call the errors on these two quantities $\Delta E$ and $\Delta E^2$.  We now note that the heat capacity is a function of $\langle E \rangle$ and $\langle E^2 \rangle$ and that we can thus calculate the maximum value that it can take using:

![](eq1.png)

where we have truncated the expansion in the second term at first order in the error.  Similarly, the minimum value that the heat capacity can take is:

![](eq2.png)

Taking the difference between these two values gives:

![](eq3.png)

Because this range of values is symmetric around the value for the heat capacity that we computed by inserting the averages, we arrive at a final value for the error bar of:

![](eq4.png)
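
The max/min argument above can be tried out numerically on a toy function.  The sketch below is an illustration with made-up numbers, not the solution to the exercise: `g(a, b) = b - a**2` is a hypothetical stand-in for a quantity built from a mean `a` and a mean square `b`, and it assumes `a > 0` so that the largest value of `g` is reached at `a - da`.

```python
# Toy illustration of the max/min error argument (all numbers are made up).
def g(a, b):
    """Hypothetical stand-in: a function of a mean `a` and a mean square `b`."""
    return b - a**2

a, da = 2.0, 0.01   # value and error bar for the first input
b, db = 5.0, 0.02   # value and error bar for the second input

# For a > 0, g is largest when b is as large and a as small as the errors allow
g_max = g(a - da, b + db)
g_min = g(a + da, b - db)

# Half the range between the two extremes serves as the error bar
half_range = (g_max - g_min) / 2

# First-order propagation of errors: |dg/db| * db + |dg/da| * da
first_order = db + 2 * a * da

print(half_range, first_order)
```

For this particular function the two estimates coincide, because the terms that are quadratic in `da` cancel when the minimum is subtracted from the maximum.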

To complete this exercise you must, therefore, recompute the heat capacities from the data in the input as you did in the previous exercise.  This time, however, you also need to compute the error bars for the heat capacities that you obtain.  Just in case you have forgotten from the last exercise, we import the following lists from `md_results.txt` at the start of the calculation:

* `temperatures` - the temperatures at which the simulations were run.
* `energies` - the ensemble average for the total energies at each temperature.
* `error_energies` - the error bars for each of the average energies computed at each temperature.
* `energies2` - the ensemble average for the square of the total energy at each temperature.
* `error_energies2` - the error bars for each of the average squared energies computed at each temperature.

You will calculate the heat capacity at ten different temperatures.  The values of the temperature at which you have computed the heat capacity should be stored in the list called `cv_temperatures` and the final values for the heat capacity should be stored in the list called `cv`.  The errors on the values of the heat capacity should be stored in the list called `cv_errors`.  If you complete the exercise correctly a graph showing the value of the heat capacity as a function of temperature with suitable error bars will be generated.

N.B.  Please do not change the names of the lists called  `cv_temperatures`, `cv` and `cv_errors`.  If the names of these lists are changed your code will fail the tests.


In [None]:
import matplotlib.pyplot as plt
import numpy as np

# This loads the data from the input file and generates the lists
# that are described in the text above.
data = np.loadtxt('md_results.txt')
temperatures = data[:,0]
energies = data[:,1]
error_energies = data[:,2]
energies2 = data[:,3]
error_energies2 = data[:,4]

# These are the lists that hold the temperatures at which the
# heat capacity has been computed and the values that you obtained for
# the heat capacity.
# N.B. I check that the variables in cv_errors have been computed correctly
# when I run tests.  Please ensure that there is a variable with this name in your code
cv_temperatures, cv, cv_errors = np.zeros(10), np.zeros(10), np.zeros(10)

# Your code to calculate the values of the heat capacity and the errors goes here


# This will plot a graph of the heat capacity as a function of temperature
plt.errorbar( cv_temperatures, cv, yerr=cv_errors, fmt='ko' )
plt.xlabel('temperature / natural units')
plt.ylabel('heat capacity / natural units')
plt.savefig('heat_capacity.png')


In [None]:
runtest(['test_graph6', 'test_errors2'])

# Analytic derivations for thermodynamic quantities

The exercises you have just completed showed you how to run molecular dynamics simulations of a particle in a harmonic potential.  We then learned how to extract thermodynamic quantities from our simulations by calculating ensemble averages.  This exercise is instructive for a harmonic oscillator but slightly unnecessary.  As I have shown you in the videos, we can derive an analytic expression for the partition function of a harmonic oscillator.  There is thus no need to calculate it approximately as we have done here.  The exercises you have completed are not without merit, however, because for many Hamiltonians of interest such derivations are not possible.

For completeness, I want to show you how we can use the computer algebra package SymPy to perform the derivations that I did in the video.  In `main.py` I have written code to:

1. Define the Hamiltonian function.
2. Evaluate the canonical partition function by doing the double integral over the position and momentum coordinates.
3. Differentiate the logarithm of the partition function to extract the ensemble average of the energy.

I would like you to extend this code by using SymPy to also calculate the heat capacity.  You should set the variable `CV` equal to the heat capacity.  To pass the tests you will need to modify `main.py` so that it calculates the heat capacity for the following potential:

![](eq1.png)

The variable `CV` must be set equal to the heat capacity of one particle on an energy landscape with the potential given above.

In the videos I have shown that the ensemble average of the energy is calculated by taking the following derivatives:

![](eq2.png)

The following short derivation should convince you that the derivative above is equivalent to the expression that I have used to calculate the ensemble average of the energy in `main.py`.

![](eq3.png)
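
The equivalence can also be checked symbolically.  The sketch below is a minimal illustration under stated assumptions: it uses units in which the Boltzmann constant is one (as the code in `main.py` does when it writes the weight as `exp(-H/T)`), and it takes the partition function of the harmonic Hamiltonian `(x*x + p*p)/2`, which works out to `2*pi*T`.  The names `Z_T`, `Z_beta`, `E_from_T` and `E_from_beta` are made up for this example.

```python
import sympy as sy

T, beta = sy.symbols('T beta', positive=True)

# Partition function of H = (x^2 + p^2)/2 written in terms of T and of beta = 1/T
Z_T = 2 * sy.pi * T
Z_beta = 2 * sy.pi / beta

# Ensemble average of the energy from the derivative with respect to beta
E_from_beta = -sy.diff(sy.log(Z_beta), beta)

# The same quantity from the derivative with respect to T
E_from_T = T**2 * sy.diff(sy.log(Z_T), T)

# Substituting beta = 1/T should make the two expressions identical
difference = sy.simplify(E_from_beta.subs(beta, 1 / T) - E_from_T)
print(E_from_beta, E_from_T, difference)
```

A difference of zero confirms that the two ways of differentiating the logarithm of the partition function give the same ensemble average for this Hamiltonian.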

Notice that you may be able to use SymPy and the code in `main.py` to calculate the heat capacity for the potential you chose for your assignment.


In [None]:
import sympy as sy

# Let's first define some symbols
# x = position of particle
# p = momentum of particle
# T = temperature
x, p, T  = sy.Symbol('x'), sy.Symbol('p'), sy.Symbol('T', real=True, positive=True )

# Now calculate the partition function
# First the Hamiltonian
H = ( x*x + p*p ) / 2
# And the Boltzmann weight
f = sy.exp( - H / T )

# Now integrate along p
pint = sy.integrate( f, (p,-sy.oo,sy.oo) )
# and integrate the result from the last step along x to get Z
Z = sy.integrate( pint, (x,-sy.oo,sy.oo) )
print('The partition function is', Z )

# Now get the ensemble average for the energy by differentiating log(Z) with respect to T
# (this is equivalent to taking minus the derivative of log(Z) with respect to beta = 1/T)
E = (T**2)*sy.diff( sy.log(Z), T )
print('The ensemble average of the energy is', E )

# And finally get the heat capacity
# N.B. I test that the variable called CV has the correct value when I test your code
CV =
print('The heat capacity is', CV )


In [None]:
runtest(['test_cv'])