<a href="https://colab.research.google.com/github/schilds29/OCIII/blob/main/workshop1-python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

$x={-b \pm \sqrt{b^2 - 4ac} \over 2a}$ for
$ax^2 + bx + c = 0$

# Python Basics

Libraries – Python uses libraries so that users can perform tasks without writing code from scratch. The libraries used in this workshop are Math, Numpy and Matplotlib (Example 3b also uses pandas to read CSV files). Scipy is another useful library for scientific computing.

## 1. Math – https://docs.python.org/3/library/math.html

### Example 1a
```python
from math import factorial
factorial(20)
```

### Example 1b
```python
import math
math.factorial(20)
```

In [1]:
from math import factorial
factorial(20)

2432902008176640000

In [2]:
import math
math.factorial(20)

2432902008176640000
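
Examples 1a and 1b give the same result; they differ only in how the function name is made available. A minimal sketch of the two styles side by side, plus a third common form (an alias, in the same way numpy is imported as `np` later in this workshop). The variable names are chosen here for illustration only.

```python
import math                    # whole module: call functions as math.<name>
from math import factorial     # single name: call factorial(...) directly
import math as m               # alias: call functions as m.<name>

via_module = math.factorial(20)
via_name = factorial(20)
via_alias = m.factorial(20)

# All three refer to the same function, so the results are identical.
print(via_module == via_name == via_alias)  # True
print(via_module)                           # 2432902008176640000
```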

### Example 1c – a short Python script that evaluates an equation

$k = Ae^{-E_a \over {RT}}$

${\ln} k = {\ln} A - {E_a \over {RT}}$

${E_a \over {RT}} = {\ln} A - {\ln} k$

${E_a} = {RT}({\ln} A - {\ln} k)$

In [3]:
import math
R = 8.314 #in J per mol per K
T = 750 #in K
A = 1.86e15 #in per sec
k = 0.00018 #in per sec

Ea = R * T * (math.log(A) - math.log(k))

print ("The activation energy = " + str(Ea) + " J mol" + u'\u207B\u00B9')

The activation energy = 273002.0785401358 J mol⁻¹
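
One way to check the rearrangement is to substitute the computed $E_a$ back into the original expression for $k$ and confirm that it returns the starting rate constant. A small sketch using the same values as Example 1c; the name `k_check` is introduced here for illustration.

```python
import math

R = 8.314      # in J per mol per K
T = 750        # in K
A = 1.86e15    # in per sec
k = 0.00018    # in per sec

# Rearranged form: Ea = RT (ln A - ln k)
Ea = R * T * (math.log(A) - math.log(k))

# Substitute back into k = A exp(-Ea / RT)
k_check = A * math.exp(-Ea / (R * T))

print("Ea = " + str(round(Ea / 1000, 1)) + " kJ per mol")
print("Recovered k matches the input: " + str(math.isclose(k, k_check)))
```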


### Question 1

(a) Write a script using math.factorial and math.log to calculate ${\ln} n!$, where $n$ is a 2-digit number of your choice.


In [5]:
import math
n = 64 # my choice of 2-digit number
x = math.log(math.factorial(n))
print ("ln(64!) = " + str(x))

ln(64!) = 205.1681994826412


(b) Copy the script above and add a line that uses Stirling’s approximation, $n! \approx \sqrt{2 \pi n} \left({n \over e}\right)^n$, to calculate ${\ln} n!$

In [8]:
import math
n = 64
stirApprox = (math.sqrt(2*math.pi*n))*math.pow((n / math.e),n)
x = math.log(stirApprox)
print ("Stirling's approximation = " + str(stirApprox))
print ("hence, ln(64!) = " + str(x))

Stirling's approximation = 1.2672182368568135e+89
hence, ln(64!) = 205.1668974099035


(c) Add another line to calculate the difference between the real value from (a) and the approximated value. Then calculate the %difference, which is (difference / real value) × 100%.

In [11]:
import math
n = 64
a = math.log(math.factorial(n))

stirApprox = (math.sqrt(2*math.pi*n))*math.pow((n / math.e),n)
b = math.log(stirApprox)

percDiff = (a - b) / a * 100  # (real - approximated) / real, as a percentage
print ("Percentage difference = " + format(percDiff, ".3e") + " %")

Percentage difference = 6.346e-04 %


(d) Add two lines to print the following statements.

${\ln} n!$ is (the real value) and approximately (the approximated value) according to Stirling’s approximation.

The difference is (%difference) of the real value.

In [13]:
import math
n = 64
a = math.log(math.factorial(n))

stirApprox = (math.sqrt(2*math.pi*n))*math.pow((n / math.e),n)
b = math.log(stirApprox)

percDiff = (a - b) / a * 100  # (real - approximated) / real, as a percentage

print ("ln(n!) is " + str(a) + " and approximately " + str(b) + " according to Stirling's approximation.")
print ("The difference is " + format(percDiff, ".3e") + "% of the real value.")

ln(n!) is 205.1681994826412 and approximately 205.1668974099035 according to Stirling's approximation.
The difference is 6.346e-04% of the real value.


(e) Repeat tasks (a) to (d) for (i) ${\ln} 100!$ and (ii) ${\ln}10000!$

In [14]:
import math
n = 100
a = math.log(math.factorial(n))

stirApprox = (math.sqrt(2*math.pi*n))*math.pow((n / math.e),n)
b = math.log(stirApprox)

percDiff = (a - b) / a * 100  # (real - approximated) / real, as a percentage

print ("ln(n!) is " + str(a) + " and approximately " + str(b) + " according to Stirling's approximation.")
print ("The difference is " + format(percDiff, ".3e") + "% of the real value.")

ln(n!) is 363.73937555556347 and approximately 363.73854222500785 according to Stirling's approximation.
The difference is 2.291e-04% of the real value.


In [15]:
import math
n = 10000
a = math.log(math.factorial(n))

stirApprox = (math.sqrt(2*math.pi*n))*math.pow((n / math.e),n)
b = math.log(stirApprox)

percDiff = (a - b) / a * 100  # (real - approximated) / real, as a percentage

print ("ln(n!) is " + str(a) + " and approximately " + str(b) + " according to Stirling's approximation.")
print ("The difference is " + format(percDiff, ".3e") + "% of the real value.")

OverflowError: math range error

In [16]:
import math
n = 10000
a = math.log(math.factorial(n))

# Use math.log and properties of logarithms to avoid calculating large numbers directly:
# ln(stirApprox) = ln(sqrt(2*pi*n) * (n/e)^n)
#               = ln(sqrt(2*pi*n)) + ln((n/e)^n)
#               = 0.5 * ln(2*pi*n) + n * ln(n/e)
#               = 0.5 * (ln(2) + ln(pi) + ln(n)) + n * (ln(n) - ln(e))
#               = 0.5 * (ln(2) + ln(pi) + ln(n)) + n * (ln(n) - 1)
b = 0.5 * (math.log(2) + math.log(math.pi) + math.log(n)) + n * (math.log(n) - 1)

percDiff = (a - b) / a * 100  # (real - approximated) / real, as a percentage

print ("ln(n!) is " + str(a) + " and approximately " + str(b) + " according to Stirling's approximation.")
print ("The difference is " + format(percDiff, ".3e") + "% of the real value.")

ln(n!) is 82108.92783681436 and approximately 82108.92782848103 according to Stirling's approximation.
The difference is 1.015e-08% of the real value.
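
The log form of Stirling's approximation works for any $n$, so all three cases can be handled by one small function in a loop. The sketch below is one possible arrangement, not part of the original exercise: the function name `ln_stirling` is made up here, and `math.lgamma(n + 1)`, which returns $\ln n!$ without building the huge integer first, is swapped in for `math.log(math.factorial(n))`.

```python
import math

def ln_stirling(n):
    # ln of sqrt(2*pi*n) * (n/e)^n, expanded with log rules to avoid overflow
    return 0.5 * math.log(2 * math.pi * n) + n * (math.log(n) - 1)

for n in (64, 100, 10000):
    exact = math.lgamma(n + 1)          # ln(n!), since gamma(n + 1) = n!
    approx = ln_stirling(n)
    perc_diff = (exact - approx) / exact * 100
    print("n = " + str(n) + ": %difference = " + format(perc_diff, ".3e"))
```

The %difference shrinks as $n$ grows, which is the expected behaviour of Stirling's approximation.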


## 2. Numpy – https://numpy.org/doc/stable

### Example 2a – array
```python
import numpy as np
a = np.array([1,2,3,4,5,6])
print("My first Python array is " + str(a))
```

### Example 2b – linearly spaced array
```python
import numpy as np
b = np.linspace(0,50,51)
print("My first linearly spaced array in Python is " + str(b))
```

### Example 2c – zeros & ones
```python
import numpy as np
c = np.zeros(5)
d = np.ones(5)
print(str(c) + str(d))
```
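
As a short sketch of what the arguments in Examples 2b and 2c mean, the code below uses small arrays so that the output is easy to read. The variable name `e` is chosen here for illustration and is not used elsewhere in the workshop.

```python
import numpy as np

# np.linspace(start, stop, num) returns num evenly spaced values,
# with both start and stop included by default
e = np.linspace(0, 1, 5)
print(e)            # [0.   0.25 0.5  0.75 1.  ]
print(e.size)       # 5

# np.zeros and np.ones take the number of elements
print(np.zeros(3))  # [0. 0. 0.]
print(np.ones(3))   # [1. 1. 1.]

# Arithmetic between arrays of the same length acts element by element
print(np.zeros(3) + np.ones(3))  # [1. 1. 1.]
```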


In your pre-workshop exercise, there is an example (In [34]) showing how square values can be calculated. In Python, exponents are written with the \*\* operator (numpy is not needed for \*\* because it is built into the language). For example, 2\*\*2 is 4 (try it) and 3\*\*3 is 27.
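
A quick illustration of \*\* on plain Python numbers, with no imports needed. The variable names are chosen here for illustration only.

```python
# ** raises the left operand to the power of the right operand
square = 2**2
cube = 3**3
root = 16**0.5   # a fractional exponent gives a root

print(square)  # 4
print(cube)    # 27
print(root)    # 4.0

# The same operator works inside a list comprehension
squares = [value**2 for value in range(5)]
print(squares)  # [0, 1, 4, 9, 16]
```

When \*\* is applied to a numpy array, it acts on each element separately.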

### Question 2

(a) Use numpy to create a linearly spaced array of even numbers ranging from 0 to 50 (i.e., [0. 2. 4. 6. …… 50.]). Show your work.

(b) Use numpy to create an array of 51 ones. Then add this array to Array b (from Example 2b) to form a new array that ranges from 1 to 51 (i.e., [1. 2. 3. 4. …… 51.]). Show your work.

(c) Use the \*\* operator to create a new array whose values are the squares of those in Array b, i.e., [0. 1. 4. 9. 16. …… 2500.]. Show your work.



## 3. Matplotlib – https://matplotlib.org/stable/index.html

### Example 3a – making a linear plot
```python
import numpy as np
b = np.linspace(0,50,51)
c = b
import matplotlib.pyplot as plt
plt.figure(figsize=(8,6))
plt.plot(b,c,'o',label="linear")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
```

Spot the difference – rerun the code above by replacing line 6 with:

1.
```python
plt.plot(b,c,'o-',label="linear")
```

2.
```python
plt.plot(b,c,'ro-',label="linear")
```

3.
```python
plt.plot(b,c,'gx',label="linear")
```
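
The third argument of `plt.plot` is a short format string that can combine a colour, a marker and a line style. As a sketch of how the three variants can be compared side by side rather than one at a time, the code below draws each on its own panel. The `plt.subplots` layout and the names `formats`, `fig` and `axes` are choices made here, not part of the original example.

```python
import numpy as np
import matplotlib.pyplot as plt

b = np.linspace(0, 50, 51)
c = b

# Each format string combines an optional colour letter,
# a marker symbol and an optional line style
formats = ['o-', 'ro-', 'gx']

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, fmt in zip(axes, formats):
    ax.plot(b, c, fmt, label="linear")
    ax.set_title("format string: " + repr(fmt))
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
```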

### Question 3
(a) The plot above has a slope of 1. Modify the code above to make a plot with a slope of 2. Set the figure size to (10, 10). Label the x-axis “Number” and the y-axis “Intensity”. Show the data points as magenta crosses connected by a dotted line. Provide a legend that reads “Magenta linear dots”.

### Example 3b – loading data to make a plot
Download the files abs.csv & fluor.csv and upload them to this working directory.

```python
import pandas as pd
dataframe = pd.read_csv("abs.csv", header=None)
data = dataframe.values
x, y = data[:,0], data[:,1]
import matplotlib.pyplot as plt
plt.figure(figsize=(8,6))
plt.plot(x,y,'b-',label="Abs")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
```

Note the maximum y-value of this plot. Now replace the 4th line with:
```python
x, y = data[:,0], data[:,1]/max(data[:,1])
```

What is the difference with this modification? What does the max function do?

Your comments:

Next, remove
```python
/max(data[:,1])
```
from the line, replace _abs.csv_ with _fluor.csv_, label the plot _“Fluor”_ and run the code.

Once again, note the maximum y-value, then use the max function on the y-data of _fluor.csv_.

Question 3(b) Plot both sets of data with a y-maximum of 1 in one plot by using the hints below.

(i) Use two distinct variables (e.g. dataframe1 & dataframe2) to hold the data read from abs.csv and fluor.csv.

(ii) Use the same approach for the next step, which extracts the values from each dataframe.

(iii) Define the additional x, y values (e.g. x2, y2).

(iv) Make the y-maximum value 1 using the example above.

(v) Add a line to plot the second set of data, e.g. plt.plot(x2,y2,'r-',label="Fluor").

(vi) Make the absorption curve blue (solid line) and the fluorescence curve red.

(vii) Label the x-axis “Wavelength (nm)” and the y-axis “Normalised absorbance or fluorescence”.

Question 3(c) Download carpathite.csv & dawsonite.csv, load them in Python, and plot both sets of data (blue and red) with normalised intensity (maximum y-value of 1) in the same plot. Label the y-axis “Normalised intensity” and the x-axis “Raman shift (cm$^{-1}$)”.

Hint: use _(cm$^{-1}$)_ in your x-axis label.
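
Matplotlib typesets text placed between dollar signs as math, so the hint can be passed directly as part of a label string. A minimal sketch with empty axes and no data loaded:

```python
import matplotlib.pyplot as plt

plt.figure(figsize=(8,6))
# The part between the dollar signs is rendered as a superscript -1
plt.xlabel("Raman shift (cm$^{-1}$)")
plt.ylabel("Normalised intensity")
```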

## 4. Scipy – https://www.scipy.org

### Example 4a – curve fitting
Spend about 10 minutes working through the code.

```python
import matplotlib.pyplot as plt
import numpy as np
import scipy.optimize as sp

def func(x, a, b, c):
    return a * np.exp(-b * x) + c

#Generate the data to be fit with some noise:
xdata = np.linspace(0, 4, 50)
y = func(xdata, 2.5, 1.3, 0.5)
np.random.seed(1729)
y_noise = 0.1 * np.random.normal(size=xdata.size)
ydata = y + y_noise

plt.figure(figsize=(8,6))
plt.plot(xdata, ydata, 'bo', mfc='none', label='data')

#Fit1: fit for the parameters a, b, c of the function func:
popt, pcov = sp.curve_fit(func, xdata, ydata)
#popt is the array of the optimised values and pcov contains the estimated covariance of popt

plt.plot(xdata, func(xdata, *popt), 'r-',
         label='fit: a=%.3f, b=%5.3f, c=%5.3f' % tuple(popt))

#Fit2: constrain the optimization to the region of 0 <= a <= 3, 0 <= b <= 1 and 0 <= c <= 0.5:
popt, pcov = sp.curve_fit(func, xdata, ydata, bounds=([0.0, 0.0, 0.0], [3.0, 1.0, 0.5]))

plt.plot(xdata, func(xdata, *popt), 'g--',
         label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))

plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
```
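
The comment in Example 4a notes that `pcov` holds the estimated covariance of `popt`. A common way to turn it into one-standard-deviation uncertainties on each parameter is to take the square root of its diagonal. The sketch below repeats the data generation and the unconstrained fit from Example 4a; the name `perr` is introduced here for illustration.

```python
import numpy as np
import scipy.optimize as sp

def func(x, a, b, c):
    return a * np.exp(-b * x) + c

# Same synthetic data as Example 4a
xdata = np.linspace(0, 4, 50)
np.random.seed(1729)
ydata = func(xdata, 2.5, 1.3, 0.5) + 0.1 * np.random.normal(size=xdata.size)

popt, pcov = sp.curve_fit(func, xdata, ydata)

# The diagonal of pcov holds the variance of each fitted parameter
perr = np.sqrt(np.diag(pcov))

for name, value, error in zip("abc", popt, perr):
    print(name + " = %.3f +/- %.3f" % (value, error))
```

Comparing `perr` between fits is one way to put numbers on how the noise level affects the fit results.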

### Question 4
(a) Write code using the following information.

(i) Define a linear function, $ax + b$ (use values other than $a=1$ and/or $b=0$).

(ii) Generate data ($x$ ranges from -10 to 10 with 101 points) with some noise (~10% of the maximum $y$ value).

(iii) Fit for the parameters $a, b$ of the function func.

(iv) Plot the data as red empty circles and fit curve as a black solid curve.


Question 4(b) Write code using the following information.

(i) Use the same linear function as above for this part.

(ii) Generate data ($x$ ranges from -10 to 10 with 101 points) with some noise (<span style='color:Red'> only ~1% </span> of the maximum $y$ value).

(iii) Fit for the parameters $a, b$ of the function func.

(iv) Plot the data as red empty circles and fit curve as a black solid curve.



Provide a brief comment on the effect of less noise on the fit results.

Your comment:

Question 4(c) Write code using the following information.

(i) Define a quadratic function using variables $a, b$ and $c$.

(ii) Generate data ($x$ ranges from -10 to 10 with 101 points) with some noise (~10% of the maximum $y$ value).

(iii) Fit for the parameters $a, b, c$ of the function func.

(iv) Plot the data as blue circles and fit curve as a black curve.

