# Quiz 2: Python essentials and data analysis

#### Date: 20 September 2024

#### Credits: 20 points

- When you finish, please send a single **.ipynb** file via email to wbanda@yachaytech.edu.ec


- This classwork is **individual**. Please include your name in the notebook.


- Copying and pasting code from **AI applications is a breach of academic integrity**. Code should be your own!


- Within a **single python notebook**, solve the following problems:

## Name: 

### Problem 1 (Fermi-Dirac distribution, 10 points):

The **Fermi-Dirac distribution** describes the probability that a quantum state with energy $E$ is occupied by a fermion (e.g. by an electron) at thermal equilibrium. The Fermi-Dirac distribution is:

$$f(E) = \frac{1}{e^{(E - E_{\rm F})/k_B T} + 1}$$

Where:

- $f(E)$ is the probability that a state with energy $E$ is occupied.

- $E$ is the energy of the state.

- $E_{\rm F}$ is the chemical potential (also called the Fermi level at $T=0\,\rm K$).

- $k_B$ is the Boltzmann constant.

- $T$ is the absolute temperature.

The exponential term $e^{(E - E_{\rm F})/k_B T}$ controls how the occupancy changes with energy and temperature.
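As a quick numerical check of the formula above, the following minimal sketch evaluates $f(E)$ for a single energy using only the standard library. The function name `fermi_dirac`, its argument order, and the hard-coded value of $k_B$ in $\rm eV/K$ are illustrative assumptions, not part of the quiz statement. The $T=0$ branch uses the step-function limit of the expression.

```python
import math

# Boltzmann constant in eV/K: exact SI value of k_B (J/K) divided by the
# exact SI elementary charge (C). Hard-coded so the snippet is self-contained.
K_B_EV = 1.380649e-23 / 1.602176634e-19


def fermi_dirac(energy, e_fermi, temperature):
    """Occupation probability f(E), with energies in eV and temperature in K."""
    if temperature == 0:
        # Zero-temperature limit: a step function at E = E_F.
        if energy < e_fermi:
            return 1.0
        if energy > e_fermi:
            return 0.0
        return 0.5  # value the finite-T formula gives at E = E_F
    x = (energy - e_fermi) / (K_B_EV * temperature)
    if x > 700:
        # exp(x) would overflow a float; the occupation is effectively zero.
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)


print(fermi_dirac(5.5, 5.5, 100.0))  # 0.5 at E = E_F
```

The explicit guards for $T=0$ and for very large exponents avoid a division by zero and a float overflow. A vectorised version would need the equivalent handling.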

### Tasks:

(a) Create a python lambda function to convert the Boltzmann constant from SI units ($\rm J/K$) to units of $\rm eV/K$ (**Hint:** using the fundamental constants from scipy may be helpful).
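One possible reading of task (a), sketched under stated assumptions: converting from joules to electronvolts is a division by the elementary charge expressed in coulombs. The name `j_to_ev` is illustrative. The two constants are hard-coded with their exact SI values so that the snippet runs without extra packages; `scipy.constants.k` and `scipy.constants.e` hold the same numbers.

```python
# Exact SI values; scipy.constants.k and scipy.constants.e provide the same numbers.
K_B_SI = 1.380649e-23        # Boltzmann constant in J/K
E_CHARGE = 1.602176634e-19   # elementary charge in C, i.e. joules per eV

# Lambda converting a quantity from J/K to eV/K.
j_to_ev = lambda value_si: value_si / E_CHARGE

k_b_ev = j_to_ev(K_B_SI)
print(f"k_B = {k_b_ev:.6e} eV/K")
```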

(b) Create a python function that reads in the Fermi level, the absolute temperature, and the energy of the state, and then returns the (Fermi-Dirac) probability distribution function, $f(E)$. The Boltzmann constant should be in units of $\rm eV/K$.

(c) Define a Python dictionary containing the symbols of the following $5$ materials as "keys" with their respective Fermi levels ($E_{\rm F}$) as values.

- Silicon ($\rm Si$) has $E_{\rm F}=1.1\,\rm eV$

- Gallium Arsenide ($\rm GaAs$) has $E_{\rm F}=1.4\,\rm eV$

- Gold ($\rm Au$) has $E_{\rm F}=5.5\,\rm eV$

- Copper ($\rm Cu$) has $E_{\rm F}=7\,\rm eV$

- Aluminum ($\rm Al$) has $E_{\rm F}=11.6\,\rm eV$
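The list above maps naturally onto a dictionary. The sketch below is one way to write it; the variable name `fermi_levels` and the use of plain strings for the chemical symbols are assumptions made for illustration.

```python
# Fermi levels in eV, keyed by chemical symbol (values from the list above).
fermi_levels = {
    "Si": 1.1,
    "GaAs": 1.4,
    "Au": 5.5,
    "Cu": 7.0,
    "Al": 11.6,
}

# items() yields (symbol, E_F) pairs, which is convenient when the same
# calculation has to be repeated for every material.
for symbol, e_fermi in fermi_levels.items():
    print(f"{symbol}: E_F = {e_fermi} eV")
```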



(d) Generate a 1D energy vector covering a reasonable range of energies in $\rm eV$, and fix the temperature of the gas at $100\,\rm K$.

(e) Using the values of Fermi levels ($E_{\rm F}$) from the dictionary in point (c), and the fixed energy vector and temperature defined in point (d), call the function created in point (b) to obtain a set of arrays with the Fermi-Dirac distributions, $f(E)$, for each material. (**Hint:** a for loop can help access the dictionary elements).

(f) Use matplotlib to make a single high-quality labeled plot of the energy distribution of all 5 materials for the set temperature. The plot should have $f(E)$ on the Y axis and $E$ on the X axis.

(g) Repeat steps (d), (e), and (f) for 3 more temperatures ($0\,\rm K$, $400\,\rm K$, and $1000\,\rm K$), and report all the results in a single 4-panel high-quality labeled figure. Each panel should show the results for one temperature ($0\,\rm K$, $100\,\rm K$, $400\,\rm K$, or $1000\,\rm K$).


### Analysis:

Based on your plots, answer the following questions:

(h) What happens with the energy distributions at low temperatures? Particularly, at $0\,\rm K$?

(i) What happens with the fermion distributions in the materials when we increase the temperature? Why?

(j) Can we classify the materials in groups using their $f(E)$ distributions at a fixed temperature? Why do some materials have higher $E_{\rm F}$ than others?

### Problem 2 (Analysis of atmospheric $^{14}\text{CO}_2$, 10 points):

The value of $\Delta^{14}\text{CO}_2$ is defined as the relative difference in the ratio of $^{14}\text{C}$ to $^{12}\text{C}$ in a sample compared to a standard, corrected for isotopic fractionation and radioactive decay. It is expressed as:

$$
\Delta^{14}\text{CO}_2 = \left( \frac{\left( \frac{{^{14}\text{C}}}{{^{12}\text{C}}} \right)_{\text{sample}}}{\left( \frac{{^{14}\text{C}}}{{^{12}\text{C}}} \right)_{\text{standard}}} - 1 \right) \times 1000
$$

where:

-  $\left( \frac{{^{14}\text{C}}}{{^{12}\text{C}}} \right)_{\text{sample}} $ is the ratio of $^{14}\text{C}$ to $^{12}\text{C}$ in the sample,

- $\left( \frac{{^{14}\text{C}}}{{^{12}\text{C}}} \right)_{\text{standard}} $ is the ratio in a standard reference material.


The result is given in permil (‰).
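To make the definition concrete, here is a small sketch of the arithmetic. The function name `delta14c` and the input ratios are hypothetical. The ratios are assumed to be already corrected for fractionation and decay, since the formula above does not apply those corrections itself.

```python
def delta14c(ratio_sample, ratio_standard):
    """Relative deviation of the sample ratio from the standard, in permil."""
    return (ratio_sample / ratio_standard - 1.0) * 1000.0


# Hypothetical ratios: a sample 5 % richer in 14C than the standard,
# which corresponds to about +50 permil.
print(delta14c(1.05e-12, 1.00e-12))
```

A sample identical to the standard gives $0$ ‰, and negative values indicate a sample depleted in $^{14}\text{C}$ relative to the standard.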


### Data file:

Please download the data file from here:

https://github.com/wbandabarragan/physics-teaching-data/tree/main/1D-data/BHD_14CO2_datasets_20211013.csv


This data file has 60 years of $\Delta^{14}\text{CO}_2$ measurements from New Zealand. The measurements show the rise of $^{14}\rm C$ due to the so-called **bomb spike** (from nuclear bomb testing), and the subsequent decline in $\Delta^{14}\text{CO}_2$ due to the ban of nuclear bomb tests, the natural carbon cycle, and the increase of fossil fuel-based $\rm CO_2$ emissions. Scientists use this data to understand how quickly atmospheric $\rm CO_2$ flows in and out of the oceans and terrestrial ecosystems. We will solely use the first two columns of the file, following the header, i.e. the columns labeled as "Date" and "D14C_trend".

#### Reference:
https://doi.org/10.5194/acp-17-14771-2017

### Tasks:

(a) Inspect the structure of the data file. Then, create an appropriate IO Python function that reads the filename, opens the data file using pandas, skips the header lines, places the first two columns ("Date" and "D14C_trend") into pandas objects, and returns them as numpy arrays.

(b) Call your IO function developed in (a) and obtain the time/date axis (in $[\rm yr]$) and $\Delta^{14}\text{CO}_2$ (in ‰). Then, make a high-quality labeled plot of $\Delta^{14}\text{CO}_2$ (on the Y axis) versus time (on the X axis).

(c) Is the relation between the two variables linear? Is it monotonic?

(d) Create a python function that identifies the year/date ($t_{\rm max}$) at which the bomb spike reached a maximum ($\Delta^{14}_{\rm max}\text{CO}_2$), and returns the peak coordinate pair: $(t_{\rm max},\Delta^{14}_{\rm max}\text{CO}_2)$.
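A peak search of this kind reduces to locating the index of the largest value and reading both sequences at that index. The sketch below uses plain lists and made-up numbers that are not taken from the data file; the name `find_peak` is an assumption. With numpy arrays, `np.argmax` plays the role of the index search.

```python
def find_peak(times, values):
    """Return (t_max, v_max) for the largest entry of `values`."""
    i_max = max(range(len(values)), key=lambda i: values[i])
    return times[i_max], values[i_max]


# Made-up illustrative numbers, not taken from the data file.
t_demo = [1960.0, 1963.0, 1966.0, 1970.0]
d_demo = [200.0, 400.0, 650.0, 500.0]
print(find_peak(t_demo, d_demo))  # (1966.0, 650.0)
```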

(e) Propose a physically-motivated model for the data. Write down your proposed model in a markdown cell, and clearly indicate what the variables and free parameters are. Justify the number of free parameters that you have chosen. **Hint:** Since $\Delta^{14}\text{CO}_2$ involves radioactive decay, using a piece-wise function with exponentials ($\propto \exp{(\pm k\,t)}$ with $k$ being the growth/decay rate) may be a good choice. You should also use the peak coordinate pair $(t_{\rm max},\Delta^{14}_{\rm max}\text{CO}_2)$ computed in (d) to define your model and reduce the number of free parameters.
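The hint in (e) can be illustrated with a two-branch exponential that is continuous at the peak. This is only a sketch of one candidate form, not the expected answer. The function name, the parameter names `k_rise` and `k_decay`, and the numbers in the usage lines are hypothetical.

```python
import math


def piecewise_exp(t, t_max, d_max, k_rise, k_decay):
    """Exponential rise before the peak and exponential decay after it."""
    if t < t_max:
        return d_max * math.exp(k_rise * (t - t_max))
    return d_max * math.exp(-k_decay * (t - t_max))


# Hypothetical peak coordinates and rates, for illustration only.
for year in (1960.0, 1965.0, 1990.0):
    print(year, piecewise_exp(year, 1965.0, 700.0, 0.5, 0.06))
```

Fixing $(t_{\rm max},\Delta^{14}_{\rm max}\text{CO}_2)$ from (d) leaves two free parameters in this form. Both branches tend to zero far from the peak, so an additive offset would add one more free parameter if the data do not.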

(f) Carry out a regression using Python tools (e.g. scipy's **curve_fit** function). Report the best-fit function, and comment: what is the decay rate, $k_{\rm fit}$, of $\Delta^{14}\text{CO}_2$? **Hint:** Since the fitting function is not a simple polynomial function, in some implementations it may help to give curve_fit initial guesses for the free parameters in the regression (see: p0 argument).

(g) Report the result of your regression including the uncertainties associated with each free parameter in your model, and calculate the global uncertainty obtained via error propagation.
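For reference, the standard first-order propagation formula for a model $f(t;\,p_1,\dots,p_n)$ with parameter covariance matrix $C$ is written below. When using curve_fit, $C$ is the returned covariance matrix, and its diagonal holds the parameter variances $\sigma_{p_i}^2$. The second expression applies the formula to a single decay branch $f(t) = \Delta^{14}_{\rm max}\text{CO}_2\;e^{-k\,(t - t_{\rm max})}$ in which only $k$ is uncertain. That choice of model is an illustrative assumption, not the required one.

$$
\sigma_f^2(t) = \sum_{i,j} \frac{\partial f}{\partial p_i}\,\frac{\partial f}{\partial p_j}\,C_{ij}
\quad\Rightarrow\quad
\sigma_f(t) = |t - t_{\rm max}|\,|f(t)|\,\sigma_k
$$

The second expression follows from $\partial f/\partial k = -(t - t_{\rm max})\,f(t)$, so the model uncertainty vanishes at the peak and grows with the distance from it.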

(h) Make a high-quality labeled plot showing $\Delta^{14}\text{CO}_2$ (on the Y axis) versus time (on the X axis), showing both the empirical data and the best-fit model (obtained from your physically-motivated model, including the uncertainties). Does your model explain the empirical data?

### Analysis:

(i) Now, you will compare your findings with four semi-empirical decay rate predictions, taken from the literature. Create a function that uses your model, but takes four predictions for the decay rate ($k_{\rm atm}$, $k_{\rm bio}$, $k_{\rm oce}$, and $k_{\rm sed}$) to return four semi-empirical $\Delta^{14}\text{CO}_2$ decay lines as arrays. Consider the following decay rates based on four contributing factors:

- **Atmospheric decay** due to the mixing and exchange of carbon dioxide in the atmosphere predicts $k_{\rm atm}=0.10\,\rm yr^{-1}$.

- **Biosphere decay** due to vegetation, soil, and organic matter predicts $k_{\rm bio}=0.02\,\rm yr^{-1}$.

- **Ocean decay** due to the long-term ocean absorption and storage of carbon predicts $k_{\rm oce}=0.01\,\rm yr^{-1}$.

- **Sedimentary decay** due to decomposition and mineralization predicts $k_{\rm sed}=0.001\,\rm yr^{-1}$.
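The four rates above can be kept in a dictionary and fed to the same decay expression. The sketch below assumes a pure exponential decay from the peak, $\Delta^{14}_{\rm max}\text{CO}_2\;e^{-k\,(t - t_{\rm max})}$. The function name `decay_lines`, the peak coordinates, and the time grid are hypothetical, and lists are used instead of arrays to keep the snippet free of dependencies.

```python
import math

# Decay rates in 1/yr, copied from the list above.
DECAY_RATES = {"atm": 0.10, "bio": 0.02, "oce": 0.01, "sed": 0.001}


def decay_lines(times, t_max, d_max, rates=DECAY_RATES):
    """Return {label: [values]} of d_max * exp(-k * (t - t_max)) for each rate."""
    return {
        label: [d_max * math.exp(-k * (t - t_max)) for t in times]
        for label, k in rates.items()
    }


# Hypothetical peak and time grid, for illustration only.
lines = decay_lines([1965.0, 1975.0, 1985.0], t_max=1965.0, d_max=700.0)
for label, values in lines.items():
    print(label, [round(v, 1) for v in values])
```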

(j) Make a high-quality labeled plot showing $\Delta^{14}\text{CO}_2$ (on the Y axis) versus time (on the X axis), showing the empirical data, your best-fit model (obtained from your physically-motivated model, including the uncertainties), and four additional lines (one for every contributing factor above). Can we explain the observed decay of $\Delta^{14}\text{CO}_2$ as due to a single one of these contributing factors? If not, propose a possible decay model for $k_{\rm fit}$ based on all of them.