# Checkpoint 1

**Due: Tuesday, 18 October, 2022 at 11:00am BST**

Total points: 100

### Read This First
1. Use the constants provided in the cells. Do not use your own constants.

2. Wherever you see `raise NotImplementedError()`, remove that line and put your code there.

3. Put the code that produces the output for a given task in the cell indicated. You are welcome to add as many cells as you like for imports, function definitions, variables, etc.

4. Your notebook must run correctly when executed once from start to finish. Your notebook will be graded based on how it runs, not how it looks when you submit it. To test this, go to the *Kernel* menu and select *Restart & Run All*.

5. Once you are happy with it, clear the output by selecting *Restart & Clear Output* from the *Kernel* menu.

6. Submit through Noteable.

In [None]:
from matplotlib import pyplot as plt
%matplotlib inline
import numpy as np
import time

In [None]:
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 14

# Problem 1 - 20 points

## Interpolation
You are given arrays of x and y measurements that you need to interpolate at new locations.

The file *ch1_1_data.txt* is a text file that contains two arrays of Xs and Ys in two rows that need to be interpolated. The file *ch1_1_test.txt* is a text file that contains an array of X values on which you need to evaluate the interpolated function.

You will need to do the interpolation by choosing the best interpolation technique among linear interpolation, cubic splines and smoothing splines with different values of the smoothing parameter.

You need to write the code that

* selects the best among different interpolation methods for a provided dataset.
* returns the array of the results of evaluating the best interpolation method on the test dataset. Note, the returned array of interpolated Y values should correspond directly to the X values from the test file. That is, the first returned Y value should correspond to the first X value and so on.

The resulting array will then be verified to provide a mean square error (MSE) with respect to the true values of **MSE < 0.1**, where

$
\large
\begin{align}
MSE = \frac{1}{N} \sum_{i=1}^{N} (y_{interp, i} - y_{true, i})^2.
\end{align}
$
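One possible way to compare candidate methods is to hold out part of the data and score each method by the MSE defined above. The sketch below is only an illustration: the data are synthetic (not the checkpoint files), and the helper names `mse` and `cv_score`, the noise level and the smoothing value are assumptions made for this example.

```python
import numpy as np
from scipy import interpolate

def mse(y_interp, y_true):
    """Mean square error, as in the formula above."""
    y_interp = np.asarray(y_interp, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return np.mean((y_interp - y_true) ** 2)

def cv_score(build, x, y, nsplit=3):
    """Average held-out MSE over interleaved folds (x must be sorted)."""
    pos = np.arange(len(x))
    scores = []
    for i in range(nsplit):
        held_out = pos % nsplit == i          # every nsplit-th point is held out
        model = build(x[~held_out], y[~held_out])
        scores.append(mse(model(x[held_out]), y[held_out]))
    return np.mean(scores)

# Synthetic noisy data standing in for the real measurements (assumption).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 60)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

# Each entry builds an interpolant from the fitting subset.
candidates = {
    "linear": lambda xs, ys: interpolate.interp1d(xs, ys, fill_value="extrapolate"),
    "cubic spline": lambda xs, ys: interpolate.CubicSpline(xs, ys),
    "smoothing spline": lambda xs, ys: interpolate.UnivariateSpline(xs, ys, s=len(xs) * 0.3**2),
}

scores = {name: cv_score(build, x, y) for name, build in candidates.items()}
best = min(scores, key=scores.get)
print(scores)
print("Lowest held-out MSE:", best)
```

The method with the lowest held-out score would then be refitted on the full dataset before evaluating it at the test locations.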

In [None]:
import scipy.interpolate

In [None]:
def solve_task1():
    """
    This function needs to select the best interpolation method for the provided data
    and return the numpy array of interpolated values at the locations specified in ch1_1_test.txt
    """
    # Read the data file: the first row holds the X values, the second row the Y values
    with open("ch1_1_data.txt", "r") as f1:
        data = [line.strip() for line in f1]
    xdata_list = list(map(float, data[0].split()))
    ydata_list = list(map(float, data[1].split()))
    
    # Convert the lists into numpy arrays
    X = np.array(xdata_list)
    Y = np.array(ydata_list)
    
    # Read the test X values (parsed as one value per line)
    with open("ch1_1_test.txt", "r") as f2:
        data = [line.strip() for line in f2]
    testdata = np.array(list(map(float, data)))
    
    #sorts data arrays
    ind = np.argsort(X)
    Xsort = X[ind]
    Ysort = Y[ind]
    
    #all values needed for computation of errors, the smoothing parameter was obtained through trial and error
    smooth = 0.173
    nsplit = 3 
    N = len(Xsort)
    pos = np.arange(len(Xsort))
    ret1 = 0
    ret2 = 0
    ret3 = 0
    
    #splits data into 3 sets and compares
    for i in range(nsplit):
        testsubset = pos%nsplit ==i 
        fitsubset = ~testsubset 
        curx = Xsort[fitsubset]
        cury = Ysort[fitsubset]
        testx = Xsort[testsubset]
        testy = Ysort[testsubset]
        
        Int1 = scipy.interpolate.UnivariateSpline(curx, cury, s = smooth) #.173 best 0.10225809269031416
        Int2 = scipy.interpolate.interp1d(curx, cury)
        Int3 = scipy.interpolate.CubicSpline(curx, cury)
        
        ret1 = ret1 + np.mean((Int1(testx) - testy)**2)
        ret2 = ret2 + np.mean((Int2(np.clip(testx,curx[0],curx[-1])) - testy)**2)
        ret3 = ret3 + np.mean((Int3(testx) - testy)**2)
        
    ret1 =ret1/nsplit
    ret2 =ret2/nsplit
    ret3 =ret3/nsplit
    
    #if statements determining best interpolation method
    if ret1 < ret2 and ret1 < ret3:
        Int1 = scipy.interpolate.UnivariateSpline(Xsort,Ysort, s = smooth)
        return Int1(testdata)
    elif ret2 < ret1 and ret2 < ret3:
        Int2 = scipy.interpolate.interp1d(Xsort, Ysort)
        return Int2(testdata)
    else:
        Int3 = scipy.interpolate.CubicSpline(Xsort, Ysort)
        return Int3(testdata)

We will add tests to the cell below when grading.

In [None]:
# This function will be tested with this 
# assert ( np.mean((solve_task1()- YTRUE )**2) < 0.1)

print ("Testing, testing...")



# Problem 2 - 80 points

This problem is divided into 5 tasks, worth the following point values:

1. 20 points
2. 15 points
3. 15 points
4. 20 points
5. 10 points

## The 1D time-independent Schrödinger equation

In one dimension, the time-independent Schrödinger equation is given by

$
\large
\begin{align}
\mathbf{H}\ \mathbf{\Psi} = E\ \mathbf{\Psi}
\end{align}
$,

where $\mathbf{H}$ is the Hamiltonian. Here, $E$ and $\mathbf{\Psi}$ are the eigenvalues and eigenvectors of $\mathbf{H}$, respectively. The Hamiltonian is expressed as

$
\Large
\begin{align}
H = -\frac{\hbar^2}{2m} \nabla^2 + V(r),
\end{align}
$

where $V(r)$ is the electric potential energy, given by

$
\Large
\begin{align}
V(r) = -\frac{e^{2}}{4 \pi \epsilon_{0} r}.
\end{align}
$

In matrix form, the Schrödinger equation is solved for N equally spaced values of r, such that r goes from ($r_{max}$/N) to $r_{max}$, where $r_{max} \sim 1.5$ nm is a sensible choice. To turn the Schrödinger equation into a matrix, $\textbf{V(r)}$ should be a diagonal matrix with the values of the potential at each r along the diagonal.
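A minimal sketch of this discretisation, assuming the second derivative is approximated by the central difference $(\psi_{i+1} - 2\psi_{i} + \psi_{i-1}) / \Delta r^{2}$ with $\Delta r = r_{max}/N$ and that the wavefunction vanishes just outside the grid. The helper names `second_derivative_matrix` and `potential_matrix` are made up for this illustration; the constants are the ones listed for this problem.

```python
import numpy as np
from scipy import sparse

c1 = 0.0380998  # nm^2 eV, hbar^2 / (2m)
c2 = 1.43996    # nm eV, e^2 / (4 pi epsilon_0)

def second_derivative_matrix(N, rmax):
    """Tridiagonal central-difference approximation to d^2/dr^2."""
    dr = rmax / N
    stencil = sparse.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(N, N))
    return stencil * (1.0 / dr**2)

def potential_matrix(values):
    """Diagonal matrix with the potential at each grid point."""
    return sparse.diags([np.asarray(values, dtype=float)], [0])

N, rmax = 200, 1.5
r = np.linspace(rmax / N, rmax, N)

D2 = second_derivative_matrix(N, rmax)
H = -c1 * D2 + potential_matrix(-c2 / r)

# Sanity check: the stencil reproduces the second derivative of r^2 (which is 2)
# on the interior rows; the last row is affected by the boundary assumption.
d2psi = D2 @ (r**2)
interior_error = np.max(np.abs(d2psi[1:-1] - 2.0))
print(H.shape, interior_error)
```

The resulting matrix is symmetric and tridiagonal, which is what makes sparse or banded eigenvalue solvers applicable.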

For this problem, the constants for the above equations have been defined for you in the cell below. Please use these for your calculations.
* $\frac{\hbar^{2}}{2m} = 0.0380998\ nm^{2} eV$ (called `c1` below)
* $\frac{e^{2}}{4 \pi \epsilon_{0}} = 1.43996\ nm\ eV$ (called `c2` below)
* $r_{0} = 0.0529177\ nm$ (the Bohr radius, called `r0` below)
* Planck constant $h = 6.62606896\times10^{-34} J s$ (`h`)
* Speed of light $c = 299792458\ m/s$ (`c`)

In [None]:
# Constants (use these)
c1 = 0.0380998 # nm^2 eV
c2 = 1.43996 # nm eV
r0 = 0.0529177 # nm
h  = 6.62606896e-34 # J s
c  = 299792458. # m/s
hc = 1239.8419 # eV nm

## Task 1 - 20 points

For this task, you will create the matrix representing $\mathbf{H}$ and find the two lowest eigenvalues. These correspond to the two lowest energy levels of the Hydrogen atom.

In terms of the constants defined above, the theoretical values for the first two energy levels are given by

$
\Large
\begin{align}
e_{n} = -\frac{c_2}{2 r_0 n^2},
\end{align}
$

where $r_{0}$ is the Bohr radius, given by

$
\Large
\begin{align}
r_{0} = \frac{4 \pi \epsilon_{0} \hbar^{2}}{m e^{2}}.
\end{align}
$
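As a quick worked example, the theoretical levels can be evaluated directly from the provided constants. The sketch assumes the bound-state sign convention used by the test cell at the end of this notebook (negative energies); the function name `energy_level` is made up for this illustration.

```python
c2 = 1.43996    # nm eV
r0 = 0.0529177  # nm

def energy_level(n):
    """Theoretical hydrogen energy level in eV (negative for bound states)."""
    return -c2 / (2.0 * r0 * n**2)

e1, e2 = energy_level(1), energy_level(2)
print(f"E_1 = {e1:.4f} eV, E_2 = {e2:.4f} eV")
```

This gives roughly -13.6 eV and -3.4 eV, the familiar ground and first excited levels of hydrogen.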

In the cells below, write a function that creates a matrix representing the Hamiltonian and returns the two lowest eigenvalues. **This function should take a single argument, N, for the size of the matrix.**

Use your function to determine the minimum value of N (within a factor of 2) required to compute the two lowest energy levels to within **0.05\%** of the theoretical values. Print the values of the two energy levels and the error for each, where the error is $abs((E_{calc} - E_{theo}) / E_{theo})$. $E_{calc}$ is the calculated value and $E_{theo}$ is the theoretical value. **Note, your code should iteratively call your function while increasing N (e.g., doubling it each time) and stop when the desired error is reached. It is not sufficient to simply run the code at a single value of N that meets the criteria.**

In [None]:
from scipy import sparse
from scipy.sparse import linalg as splinalg

In [None]:
def hamiltonian(N):
    #When provided with N creates a N by N hamiltonian which is then solved for its eigenvalues
    rmax = 1.5
    r = np.linspace(rmax/N,rmax,N)
    
    #hamiltonian using diags function
    K = ((N/rmax)**2) * sparse.diags([1,-2,1],[-1,0,1],shape = (N,N))
    V = sparse.diags([c2/r],[0],shape = (N,N))
    H = -c1 * K - V
    
    #use of scipy.sparse.linalg.eigsh to calculate the two smallest (algebraic) eigenvalues
    eigval, eigvec = splinalg.eigsh(H, k = 2, which = "SA")
    eig_1 = eigval[0]
    eig_2 = eigval[1]
    return eig_1, eig_2

In [None]:
N = 3
e1 = -c2/(2*r0*1)
e2 = -c2/(2*r0*4)

#loop to calculate eigenvalues and errors, doubling N until both errors are below 0.05%
while True:
    e1calc, e2calc = hamiltonian(N)
    
    error1 = abs((e1calc - e1)/e1)
    error2 = abs((e2calc - e2)/e2)
    
    if error1 < 0.0005 and error2 < 0.0005:
        break
    N = N*2

#print statements
print("Energy for n = 1: " + str(e1calc))
print("Error for n = 1: " + str(error1))
print("Energy for n = 2: " + str(e2calc))
print("Error for n = 2: " + str(error2))

## Task 2 - 15 points

Now, imagine the Coulomb law has a minor modification to it, and is now given by:

$
\Large
\begin{align}
F(r) = -\frac{e^{2}}{4 \pi \epsilon_{0} r^{2}} \left( \frac{r}{r_{0}} \right)^{\alpha},
\end{align}
$

where $\alpha = 0.01$ and $r_{0}$ is the Bohr radius, given by:

$
\Large
\begin{align}
r_{0} = \frac{4 \pi \epsilon_{0} \hbar^{2}}{m e^{2}}.
\end{align}
$

The electric potential is given by:

$
\Large
\begin{align}
V(r) = \int_{r}^{\infty} F(r^{\prime}) dr^{\prime}
\end{align}
$

Using the constants defined previously, write a function to calculate V(r) using the modified Coulomb law by numerically integrating the equation above. This function need only accept a single value of radius and not an entire array. Your function must agree with the analytical value to within $10^{-5}$ eV.
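For $\alpha < 1$ the integral can also be done by hand, giving $V(r) = c_{2}\, r^{\alpha - 1}\, r_{0}^{-\alpha} / (\alpha - 1)$, which reduces to $-c_{2}/r$ at $\alpha = 0$. The sketch below compares `scipy.integrate.quad` over an infinite upper limit against that closed form at a single radius. The names `force` and `potential_closed_form` and the chosen radius are assumptions for this illustration.

```python
import numpy as np
from scipy import integrate

c2 = 1.43996    # nm eV
r0 = 0.0529177  # nm

def force(r, alpha):
    """Modified Coulomb force (the integrand of V)."""
    return -c2 / r**2 * (r / r0) ** alpha

def potential_closed_form(r, alpha):
    """Analytic result of the integral, valid for alpha < 1."""
    return c2 * r ** (alpha - 1.0) * r0 ** (-alpha) / (alpha - 1.0)

r, alpha = 0.5, 0.01

# quad returns the integral and an estimate of its absolute error.
value, abserr = integrate.quad(force, r, np.inf, args=(alpha,))
difference = abs(value - potential_closed_form(r, alpha))
print(value, abserr, difference)

# With alpha = 0 the ordinary Coulomb potential -c2 / r should be recovered.
coulomb, _ = integrate.quad(force, r, np.inf, args=(0.0,))
```

Note that `args` is passed as a one-element tuple, `(alpha,)`.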

Your function should go in the cell below using the template for `potential_numerical`.

In another cell, make a plot of V(r) over the range of r values used in Task 1. Remember to label axes and show units.

In [None]:
from scipy import integrate

In [None]:
#modified Coulomb force F(r), the integrand of V(r)
def my_function(r,alpha):
    return -c2/(r**2) * (r/r0) ** alpha

In [None]:
#integrates the modified Coulomb force from r to infinity to get the potential at a single r
def potential_numerical(r, alpha):
    V, error = integrate.quad(my_function, r, np.inf, args=(alpha,))
    return V

The cell below will test your function for a few values of radius.

In [None]:
def potential_exact(r, alpha):
    return c2*np.power(r,alpha-1)*np.power(r0,-alpha) / (alpha-1)

for my_r in np.linspace(0.01, 1, 100):
    diff = abs(potential_numerical(my_r, 0.01) - potential_exact(my_r, 0.01))
    assert(diff <= 1e-5)

Plot V(r) in the cell below.

In [None]:
#cell for plotting V(r) against r
rmax = 1.5
r_list = []
V_list = []

#for loop to append r and Vr values to lists
r_range = np.linspace(rmax/N,rmax,N)
for i in r_range:
    V = potential_numerical(i,0.01)
    r_list.append(i)
    V_list.append(V)

plt.plot(r_list,V_list)
plt.xlabel("r (nm)")
plt.ylabel("V(r) (eV)")
plt.show()

## Task 3 - 15 points

Write a function to calculate the first 2 energy levels (eigenvalues of $H$) for $\alpha = 0.01$ and print out the values in eV. The values must be accurate to 0.01 eV. Use the function template `calculate_energy_levels_modified` below for your function. It is fine to call functions you've already written. 

In the cell after, plot the difference $\Delta E$ between the two lowest energy levels as a function of $\alpha$ for $\alpha = 0$ and $0.01$. Remember axes labels and units.

In [None]:
from scipy import linalg

In [None]:
def calculate_energy_levels_modified(N, alpha):
    #Similar to previous energy level calculator, but with modified potential taken into account
    rmax = 1.5
    r = np.linspace(rmax/N,rmax,N)
    
    K = ((N/rmax)**2) * sparse.diags([1,-2,1],[-1,0,1],shape = (N,N)).toarray()
    
    #creates a list of V(r) values at the different r
    V_list = []
    for i in r:
        V_list.append(potential_numerical(i,alpha))
        
    V = sparse.diags(V_list,shape = (N,N)).toarray()

    H = (-c1 * K) + V 
    eigval = linalg.eigh(H, eigvals_only=True, subset_by_index=[0, 1])  # two lowest eigenvalues only

    eig_1 = eigval[0]
    eig_2 = eigval[1]
    
    return eig_1, eig_2

The cell below will test your function against the correct values.

In [None]:
N = 1024
alpha = 0.01
E1, E2 = calculate_energy_levels_modified(N, alpha)

In the cell below, make the plot of $\Delta E$ vs. $\alpha$ as instructed above.

In [None]:
#creates lists of delta E and alpha to plot
deltaE = []
alpha_list = np.linspace(0,0.01,5)

for i in alpha_list:
    E1,E2 = calculate_energy_levels_modified(N,i)
    deltaE.append(abs(E2 - E1))

In [None]:
plt.plot(alpha_list,deltaE)
plt.xlabel("alpha")
plt.ylabel("deltaE (eV)")

plt.show()

## Task 4 - 20 points

The transition between the 1st and 2nd states is known as the Lyman-$\alpha$ transition. The photon emitted by this transition will have a wavelength, $\lambda$, given by

$
\Large
\begin{align}
\lambda = \frac{hc}{\Delta E}.
\end{align}
$
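As a worked example for the unmodified case ($\alpha = 0$), the wavelength can be estimated from the theoretical energy levels and the provided value of $hc$. This is only a rough check using the analytic levels, not the numerical eigenvalues.

```python
hc = 1239.8419  # eV nm
c2 = 1.43996    # nm eV
r0 = 0.0529177  # nm

e1 = -c2 / (2.0 * r0)   # n = 1 level in eV
e2 = e1 / 4.0           # n = 2 level in eV
delta_e = e2 - e1       # positive transition energy in eV

wavelength = hc / delta_e
print(f"Delta E = {delta_e:.4f} eV, lambda = {wavelength:.3f} nm")
```

The result lands close to the measured 121.5 nm, so the unmodified Coulomb law is consistent with the measurement.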

Imagine the wavelength of this transition has been measured as $\lambda = 121.5 \pm 0.1$ nm. What is the maximum value of $\alpha > 0$ consistent with this measurement (i.e., the largest $\alpha$ such that the predicted and measured wavelengths differ by less than 0.1 nm)?

Using the template `find_alpha_max`, write a function that performs the above computation and returns the value of $\alpha_{max}$. Your value for $\alpha_{max}$ should be within 1% of the correct answer.
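One way to locate such a limit is bracketed root finding on the mismatch between predicted and measured wavelength. The sketch below uses `scipy.optimize.brentq` on a made-up linear stand-in, `toy_wavelength`, so that it runs quickly; the stand-in and the bracket are assumptions for illustration, and the approach presumes the mismatch crosses the tolerance exactly once inside the bracket.

```python
from scipy import optimize

LAMBDA_MEASURED = 121.5   # nm
LAMBDA_TOLERANCE = 0.1    # nm

def toy_wavelength(alpha):
    """Hypothetical stand-in for the real wavelength calculation."""
    return 121.5 + 50.0 * alpha

def find_alpha_limit(wavelength, lo=0.0, hi=0.1):
    """Return alpha where |lambda(alpha) - measured| equals the tolerance."""
    def excess(alpha):
        # Negative while the prediction is still within tolerance.
        return abs(wavelength(alpha) - LAMBDA_MEASURED) - LAMBDA_TOLERANCE
    return optimize.brentq(excess, lo, hi, rtol=1e-6)

alpha_limit = find_alpha_limit(toy_wavelength)
print(alpha_limit)
```

In practice `toy_wavelength` would be replaced by a function that computes the two lowest eigenvalues at the given $\alpha$ and converts their difference to a wavelength.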

In [None]:
def find_alpha_max():
    #calculates the maximum value of alpha such that the difference between the predicted and measured wavelength is 0.1nm
    N = 1024
    lamda  = 121.5
    alpha = 0
    alpha_max = 0
    step = 0.01
    rep = 0
    
    #while statement to compute eigenvalues, calculate wavelength, then compare to measured result
    while rep < 50:
        E1, E2 = calculate_energy_levels_modified(N, alpha)
        lamda_calc = hc/(E2 - E1)
        diff = abs(lamda - lamda_calc)
            
        if diff < 0.1:
            alpha_max = alpha
            alpha = alpha + step
        else:
            alpha = alpha - step
            alpha_max = alpha
            step = step/10
            
        rep = rep + 1
            
    return alpha_max

The cell below will run your function. You will not be told the correct answer.

In [None]:
amax = find_alpha_max()
print (f"alpha_max = {amax}.")

## Task 5 - 10 points

Knowing the shape of the matrix form of $\textbf{H}$, is it possible to greatly increase the accuracy of the energy level calculation without a significant increase in computation time? In the cell below, write a function to compute the first two energy levels using the original (unmodified) potential. Your function should run in 15 seconds or less and compute the first two energy levels each to within an accuracy of $5\times10^{-6}$.
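A symmetric tridiagonal matrix is fully described by its main diagonal and one off-diagonal, so a banded solver can avoid storing or factorising the full matrix. The sketch below shows the lower banded storage expected by `scipy.linalg.eig_banded` on a small made-up matrix and checks it against a dense solve. The helper name `tridiagonal_to_lower_band` and the matrix entries are assumptions for illustration.

```python
import numpy as np
from scipy import linalg

def tridiagonal_to_lower_band(diagonal, off_diagonal):
    """Pack a symmetric tridiagonal matrix into lower banded form."""
    n = len(diagonal)
    band = np.zeros((2, n))
    band[0] = diagonal               # row 0: main diagonal
    band[1, :n - 1] = off_diagonal   # row 1: sub-diagonal, last entry unused
    return band

n = 6
diagonal = np.arange(1.0, n + 1.0)
off_diagonal = np.full(n - 1, -0.5)

band = tridiagonal_to_lower_band(diagonal, off_diagonal)
banded_vals = linalg.eig_banded(band, lower=True, eigvals_only=True,
                                select="i", select_range=(0, 1))

# Dense reference for comparison.
dense = np.diag(diagonal) + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1)
dense_vals = linalg.eigh(dense, eigvals_only=True)[:2]
print(banded_vals, dense_vals)
```

Because only two rows of length N are stored, much larger N becomes affordable. `scipy.linalg.eigh_tridiagonal` is another routine that exploits the same structure.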

In [None]:
from scipy.linalg import eig_banded

In [None]:
def calculate_energy_levels_super():
    #creates an array of the bands of the H matrix then calculates the eigenvalues
    N = 16384
    rmax = 1.5
    r = np.linspace(rmax/N,rmax,N)
    
    K = (np.ones(N)* -2 * -c1 * ((N/rmax)**2)) - (np.ones(N) * c2/r)
    K2 = np.zeros(N)
    K2[:N-1] = -c1 * ((N/rmax)**2)
    A = np.array([K,K2])
    
    w = eig_banded(A, lower=True, eigvals_only=True,select='i', select_range=[0,1])
    w1 = w[0]
    w2 = w[1]
    
    return w1,w2

In [None]:
t1 = time.time()
my_e1, my_e2 = calculate_energy_levels_super()
t2 = time.time()
print (f"Calculation took {t2-t1} seconds.")

e1_th = -c2 / (2 * r0)
e2_th = e1_th / 4

er1 = abs((my_e1 - e1_th) / e1_th)
er2 = abs((my_e2 - e2_th) / e2_th)
print (f"Err1 = {er1}, Err2 = {er2}.")