## CHY1610: Introduction to Scientific Computing for Chemists
## Dr Daniel Cole
* Room: BEDB.2.29
* email: daniel.cole@ncl.ac.uk

## Frequently Asked Questions

Before we start with Workshop 4, here are the responses to a couple of FAQs from last week:

**Q1.** Is it possible to append to a string? (Recall that we have only covered appending to a list so far).

In [None]:
# Yes!

string1 = "gold"
string2 = "fish"
string3 = string1 + string2
print(string3)

# or we can use augmented assignment as we did before for numbers:
output = ''
output += 'd'
output += 'o'
output += 'g'
print(output)

# or the same idea in a loop
output = ''
for char in 'cat':
    output += char
print(output)
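
When many pieces need to be combined, a related approach is the string method `join`, which glues together the items of a list. A minimal sketch (the variable names `pieces`, `joined` and `spaced` are made up for illustration):

```python
# Build a string from a list of pieces
pieces = ['gold', 'fish']
joined = ''.join(pieces)       # '' is the separator placed between the pieces
print(joined)

# A different separator can be placed between the pieces
spaced = ' '.join(['d', 'o', 'g'])
print(spaced)
```

For two or three strings, `+` and `+=` are perfectly fine; `join` becomes convenient when the pieces are already stored in a list.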

**Q2.** This wasn't really a question, but just a reminder that once a function has been successfully defined, you can use it over and over without typing it all out again (see Q4 from last week).

In [None]:
# Define function

def myfunction(a, b, c):
    """Function to compute a + b - c. User inputs a,b,c."""
    value = a + b - c
    return value

# Now we can use it as often as we like, with different input arguments,
# without altering the function code itself:

print(myfunction(2,3,1))
print("The result of the sum 5 + 4 - 3 is {0}".format(myfunction(5,4,3)))

# We can even use the function in loops:
for i in range(5):
    print(myfunction(i,4,2))

# Or use in additional mathematical operations:
print()
newvalue = 6 - myfunction(10,6,2)
print("New value is {0}".format(newvalue))

## Workshop 4: Plotting, Errors & Computational Modelling.

### Plotting

Now that we know how to read from and write to data files, and how to manipulate data using functions, we may wish to produce plots to visualise the data. You may come across the widely used *Matplotlib* library, a very flexible Python library for plotting data. Here, however, we will focus on the simpler, but still effective, `pylab` interface, which is installed as part of Matplotlib:

In [None]:
import pylab

Note that if you're **not** using the Windows virtual desktop, and the above cell results in an error, you might need to install Matplotlib on your computer (ask if you need advice).

Let's take our data from the file `data.csv` from question 3 last week, which records the decay in concentration of a reactant with time (if you haven't yet completed the question, either have another go or ask for help). Make sure you have copied this data file into the directory that you're working from this week before continuing.

In [None]:
# create two empty lists for storing the data:
xdata = []
ydata = []

# open the data file for reading:
f3 = open('data.csv', 'r')

# read in the data from the file:
for line in f3.readlines():
    fields = line.split(',') # split the string
    xdata.append(float(fields[0])) # append the data in the 1st column to the list xdata
    ydata.append(float(fields[1])) # append the data in the 2nd column to the list ydata

f3.close()
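
The same reading loop can also be written with a `with` block, which closes the file automatically, even if an error occurs part-way through. The sketch below is only illustrative: it first writes a tiny made-up file (`example_data.csv`, with invented numbers) to a temporary directory so that it runs on its own, whereas in the workshop you would open `data.csv` directly.

```python
import os
import tempfile

# Create a small stand-in data file (hypothetical values, not the real data.csv)
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'example_data.csv')
with open(path, 'w') as f:
    f.write('0.0,1.00\n1.0,0.61\n2.0,0.37\n')

xs = []
ys = []

# The file is closed automatically when the with block ends
with open(path, 'r') as f:
    for line in f:                      # loop over the lines of the file
        fields = line.strip().split(',')
        xs.append(float(fields[0]))     # 1st column
        ys.append(float(fields[1]))     # 2nd column

print(xs)
print(ys)
```

Either style works; the `with` form simply removes the need to remember the final `close()`.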

In [None]:
pylab.plot(xdata, ydata)
pylab.show()

Here we have created a 2D line plot, and displayed it on the screen. The concentration is plotted on the y-axis and time is plotted on the x-axis. I'll show you how to label the axes a bit later.

We can instead plot the x and y data as a scatter plot if we like (note that in this case, there's so much data that it essentially appears as a line anyway), and save the resulting plot to a file:

In [None]:
pylab.scatter(xdata,ydata)
## saves a pdf file to your computer 
## (can also output e.g. .png or .eps files)
pylab.savefig('plot.pdf')

What if we want to plot a function, such as $y = \sin^{2} x$? We could generate lists of numbers to represent `x` and `y` by hand, but it is much easier to make use of the pylab functionality. To generate an array of 11 regularly spaced x-coordinates between -5 and 5 (inclusive):

In [None]:
x = pylab.linspace(-5, 5, 11)
print(x)

pylab functions can then act on the whole array at once to create a y-coordinate for each x point (note that `pylab.sin` is used here in place of `math.sin`, which only accepts a single number):

In [None]:
y = pylab.sin(x)**2
print(y)

In [None]:
pylab.plot(x,y)
pylab.show()

Note that the function is not very well sampled. Go back and increase the number of grid points from 11 to, e.g., 1001 to create a smoother plot.
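
To see why more grid points give a smoother curve, it can help to look at how the spacing between the points is worked out. The function below is a plain-Python sketch written for illustration (the name `simple_linspace` is made up, and this is not how pylab is implemented); it mimics `pylab.linspace` for the simple case used here:

```python
def simple_linspace(start, stop, num):
    """Return num evenly spaced values from start to stop (inclusive)."""
    step = (stop - start) / (num - 1)   # num points means num - 1 gaps
    return [start + i * step for i in range(num)]

coarse = simple_linspace(-5, 5, 11)
fine = simple_linspace(-5, 5, 1001)

print(coarse)                  # spacing of 1.0 between points
print(fine[1] - fine[0])       # spacing of roughly 0.01 between points
```

With 11 points the curve is drawn from straight segments 1.0 apart, which is too coarse to follow the oscillations of the function.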

We can also add further functions, legends, axis labels and style settings. Experiment with the plot below and see Chapter 3 of Hill for a full description of pylab options (note that these settings are pretty reasonable for producing easy-to-read plots for your report later):

In [None]:
import pylab
x = pylab.linspace(0.01, 5, 1001)
y1 = x**2
y2 = pylab.log(x)
pylab.plot(x, y1, label='x^2', color='red', linestyle='-', linewidth=4.)
pylab.plot(x, y2, label='ln(x)', color='blue', linestyle='--', linewidth=4.)
pylab.xlabel('x Label / units', fontsize=16)
pylab.ylabel('y Label / units', fontsize=16)
pylab.legend(fontsize=16)
pylab.show()

You might also find the *histogram* function useful:

In [None]:
import pylab
## Produce a histogram of data values falling within 5 equally spaced bins
data = [36.2, 45.3, 56.3, 34.9, 37.5, 45.3, 44.2, 47.9, 39.2, 34.5, 36.2, 38.6, 41.9, 45.2, 56.3, 55.1]
pylab.hist(data, bins=5)
pylab.show()
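
To make clear what "equally spaced bins" means, the counting can be reproduced by hand. This is a sketch under the assumption that the bins span the range from the smallest to the largest data value, and that the largest value is counted in the last bin (the variable names are chosen just for this example):

```python
data = [36.2, 45.3, 56.3, 34.9, 37.5, 45.3, 44.2, 47.9,
        39.2, 34.5, 36.2, 38.6, 41.9, 45.2, 56.3, 55.1]
nbins = 5

lowest = min(data)
width = (max(data) - lowest) / nbins    # every bin has the same width

counts = [0] * nbins
for value in data:
    index = int((value - lowest) / width)
    index = min(index, nbins - 1)       # put the maximum value in the last bin
    counts[index] += 1

print(counts)
```

The printed counts should match the heights of the bars in the histogram above.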

**Question 1**. The interaction between two atoms can be modelled using the Lennard-Jones potential, which comprises a steep repulsion at short distances (due to the overlap of electron clouds) and a longer-ranged attraction (due to van der Waals interactions):

$U(r) = 4\epsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right]$

where, for an argon atom, $\epsilon$ = 0.185 kcal/mol and $\sigma$ = 3.54 Å. Write a function that computes $U(r)$ and use it to plot the interatomic potential close to typical atom-atom separation distances.

Use your plot to read off the equilibrium (lowest energy) separation distance and the energy at that position. What does the energy tend to as the separation gets very large?

### Errors and Exceptions

We have seen already in these workshops some examples of code that produce errors when run. Let's now look at the types of error that may occur, and how we can avoid them and/or handle them more cleanly.

The first type of error is a *syntax error*. You have probably accidentally encountered many of these already. These are mistakes in the grammar of the Python language. For example, what is wrong with the following code snippets?

In [None]:
print((3*4/(2*4))

In [None]:
a = 3
if a = 3:
    print('a is equal to 3')

In each case note that a message is produced indicating the SyntaxError, and pointing to the location of the error.

The other type of error is known as an *exception*. Exceptions are usually more serious and difficult to track down. They occur when an invalid operation is attempted, using an otherwise correct expression:

In [None]:
print(undefined_variable)

In [None]:
a, b = 0, 5
print(b / a)

This time the error message depends on the type of error that occurs. In the first example, we have tried to print the value of a variable that has not yet been set and in the second we are trying to divide by zero (see Table 4.1 of Hill for more types of exception).

As written above, the sources of the error are quite easy to spot, but when embedded in a much longer piece of code, these errors may be harder to track down. The [Y2K bug](https://en.wikipedia.org/wiki/Year_2000_problem) was a famous example of this type of error: it affected many codes when calendars changed from the year 1999 to 2000.

Let's take our Lennard-Jones function from above, and try evaluating it at a range of different interatomic distances:

In [None]:
def lj(r):
    u_r = (sigma/r)**12 - (sigma/r)**6
    u_r = 4 * eps * u_r
    return u_r

sigma = 3.54 # Angstrom
eps = 0.185 # kcal/mol

for r in range(0, 5):
    print(lj(r))

In [None]:
def lj(r):
    try:
        u_r = (sigma/r)**12 - (sigma/r)**6
        u_r = 4 * eps * u_r
        return u_r
    except ZeroDivisionError:
        return ("The function is undefined at r = 0")

sigma = 3.54 # Angstrom
eps = 0.185 # kcal/mol
    
for r in range(0, 5):
    print(lj(r))

This second example is called *exception handling*. We acknowledge that there are certain situations in which the function might raise an error, and deal with them more gracefully by handling the exception and then resuming operation. The code that might fail goes in the `try:` clause; if an exception is raised there (in this case a `ZeroDivisionError`, but other types can be used), it is caught by the matching `except` clause, which here returns a custom message. The execution of the code now continues after the exception.
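
A possible drawback of the version above is that `lj` sometimes returns a number and sometimes a string. An alternative design, sketched below, leaves the original function unchanged and handles the exception where the function is called, so that only numerical energies are stored (the list names `distances` and `energies` are chosen just for this example):

```python
sigma = 3.54  # Angstrom
eps = 0.185   # kcal/mol

def lj(r):
    """Lennard-Jones energy (kcal/mol) at separation r (Angstrom)."""
    u_r = (sigma / r)**12 - (sigma / r)**6
    return 4 * eps * u_r

distances = []
energies = []

for r in range(0, 5):
    try:
        u = lj(r)                 # raises ZeroDivisionError when r = 0
        distances.append(r)
        energies.append(u)
    except ZeroDivisionError:
        # r = 0 is skipped, and the loop carries on with the next value
        print("Skipping r = {0}: the function is undefined there".format(r))

print(distances)
print(energies)
```

Keeping the return value numerical means the two lists can be passed directly to a plotting function.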

**Question 2.** Fix the code below so that it exits gracefully for both values of the input variable `var`. *(Hint: try running it first to work out what type of error you should be looking out for).*

In [None]:
var = 'hello'
#var = 10

number = int(var)
print ("you entered number", number)

#### Testing 

The best way to avoid errors is to continually test our code, not only for syntax errors but also to check that it produces the expected output. In the example below, the `calculate_distance` function calculates the distance between two points in 3D (note that c1 and c2 are both lists of length 3, representing the x,y,z coordinates). Check you understand how it works:

In [None]:
import math

def calculate_distance(c1, c2):
    """Calculate distance in 3D between coordinates c1 and c2
    as sqrt((x1-x2)**2 + (y1-y2)**2 + (z1-z2)**2)."""
    dist = 0.0
    for i in range(3):
        dist = dist + (c1[i] - c2[i])**2
    return math.sqrt(dist)

We now check that the code works by writing a test for the function. For the two sets of coordinates defined below, we know what the answer should be (we expect a distance of 5.0). We therefore call the function and test whether the observed and expected results match. The `assert` keyword raises an exception (an `AssertionError`) if the expression is not `True`. Try introducing an error into the function above to see the test fail.

In [None]:
def test_calculate_distance():
    coord1 = [0, 0, 0]
    coord2 = [3, 4, 0]
    expected = 5.0
    observed = calculate_distance(coord1, coord2)
    assert observed == expected, 'the observed distance is not as expected'
    
test_calculate_distance()
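
The test above compares two floating-point numbers with `==`, which works here because the answer is exactly 5.0. In general, rounding errors mean that two results which should agree can differ in the last few decimal places, so it is often safer to compare within a tolerance. A sketch using `math.isclose` from the standard library (the test name and the second pair of coordinates are invented for illustration):

```python
import math

def calculate_distance(c1, c2):
    """Calculate distance in 3D between coordinates c1 and c2."""
    dist = 0.0
    for i in range(3):
        dist = dist + (c1[i] - c2[i])**2
    return math.sqrt(dist)

def test_calculate_distance_tolerance():
    coord1 = [0.1, 0.2, 0.3]
    coord2 = [0.4, 0.6, 0.3]
    expected = 0.5
    observed = calculate_distance(coord1, coord2)
    # Passes if the two values agree to within a relative tolerance of 1e-9
    assert math.isclose(observed, expected, rel_tol=1e-9), \
        'the observed distance is not as expected'

# Rounding error in action: this prints False, although mathematically 0.1 + 0.2 = 0.3
print(0.1 + 0.2 == 0.3)

test_calculate_distance_tolerance()
```

The tolerance of `1e-9` is an arbitrary choice for this example; a sensible value depends on the precision you need.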

### Computational Modelling

#### Estimating $\pi$. 
Please see the accompanying lecture material for an introduction to computational modelling and the example problem below, which aims to estimate the value of $\pi$ by 'throwing a dart' at a circle drawn inside a square. Check that you understand the code, and can successfully run it, then answer question 3 below.

In [None]:
import math
import random
import pylab

def estimate_pi(samples): 
    ''' Estimates the value of pi by finding the ratio of the area of 
    a quarter of a unit circle to the area of a unit square.
    Input is the number of points to sample.
    Output is the estimate of pi and x,y coordinates of sample points.
    '''
    # Initialise the sample count and coordinate lists
    in_circle = 0
    xcoords = []
    ycoords = []
    
    # Loop over the number of sample points
    for i in range(samples):
        
        # Assign random x,y coordinates to the point
        # in the range 0 to 1
        x_rand = random.random()
        y_rand = random.random()
        
        # Calculate the distance from the origin
        dist = math.sqrt(x_rand**2 + y_rand**2)
        
        # If the distance <= 1, the point lies in the circle
        if dist <= 1:
            in_circle += 1
            xcoords.append(x_rand)
            ycoords.append(y_rand)
    
    # calculate estimate of pi
    pi = 4 * (in_circle / samples) 
    return (pi, xcoords, ycoords)

# call function and assign outputs to variables
calc_pi, xcoords, ycoords = estimate_pi(100)

# print our estimate of pi
print("Our estimate of pi is {0:8.4f}".format(calc_pi))

# plot the x,y coordinates of the samples in the circle
pylab.xlabel('x coordinate', fontsize=16)
pylab.ylabel('y coordinate', fontsize=16)
pylab.scatter(xcoords,ycoords)
pylab.show()
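
Because the points are random, each run of the cell above gives a slightly different estimate. While testing, it can be convenient to make the results repeatable by seeding the random number generator with `random.seed`. The sketch below uses a cut-down version of the function that returns only the estimate (the name `estimate_pi_value` and the seed value 1610 are arbitrary choices for this example):

```python
import random

def estimate_pi_value(samples):
    """Estimate pi from the fraction of random points that fall
    inside a quarter of a unit circle."""
    in_circle = 0
    for i in range(samples):
        x_rand = random.random()
        y_rand = random.random()
        # Comparing squared distances avoids the need for a square root
        if x_rand**2 + y_rand**2 <= 1:
            in_circle += 1
    return 4 * in_circle / samples

random.seed(1610)                 # fix the sequence of random numbers
first = estimate_pi_value(10000)

random.seed(1610)                 # same seed, so the same sequence again
second = estimate_pi_value(10000)

print(first, second)              # the two estimates are identical
```

Remove the calls to `random.seed` when you want independent estimates from one run to the next.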

**Question 3.** Adapt the code above to use a `for` loop to calculate pi for a range of sample sizes (e.g. 100 to 10,000,000), and plot a graph of the estimated value of pi as a function of the number of sample points. Comment on your plot.

(Hint: Use the line `pylab.xscale("log")` to plot the samples on a logarithmic scale. And note that there is no need for the function `estimate_pi` to return the x and y coordinates here.)

### Learning outcomes

In today's workshop, you have:
* Learned how to plot your data using pylab;
* Learned how to identify errors and handle exceptions;
* Started to recognise how to build elements of a computational model.