# Exercise:
## Arrhenius Plot

In this exercise, we will revisit linear regression and learn how to use the logarithm of the data to linearize an exponential relationship.

- We will learn how to use `np.log`.
- We will learn how to select a specific range of data.
- We will learn how to use `scipy.stats.linregress` to perform linear regression.



## Exercise B.3 -- Arrhenius Curve Fitting

In this exercise we have a look at diffusion data (either from experiment or theoretical calculations) and want to obtain the activation energy $E_A$.

Many dynamical properties in chemistry (such as rate constants, diffusion, etc.) follow an [Arrhenian temperature dependency](https://en.wikipedia.org/wiki/Arrhenius_plot) given as 

$$ y(T) = y_0 e^{-\frac{E_A}{RT}} $$ 

where $y_0$ is the pre-exponential factor, and $R$ and $T$ are the molar gas constant (8.3145 J mol$^{-1}$ K$^{-1}$) and the temperature, respectively.

One potential option to obtain $E_A$ is a non-linear exponential fit, but in practice this is often less robust than its linear counterpart!

Consequently, Mr. [Svante Arrhenius](https://en.wikipedia.org/wiki/Svante_Arrhenius) used a linearization of the equation, which in the case of the diffusion coefficient $D$ looks like this:

$$ \ln(D) = \ln(D_0) - \frac{E_A}{R}\frac{1}{T} $$

This corresponds to a linear equation

$$ y = a\cdot x + b $$

with

$$ y = \ln(D) $$

$$ x = \frac{1}{T} $$

Applying standard linear regression we obtain the slope $a$ and the intercept $b$:

$$ a = -\frac{E_A}{R} $$

$$ b = \ln(D_0) $$

From this we can directly access the activation energy $E_A$ via:
$$ E_A = - a \cdot R $$

Easy! :D
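
The linearization above can be checked with a minimal sketch on synthetic data. The values of `E_A_true`, `D_0_true` and the temperatures are made up for illustration and are not taken from the course files. The data are generated from the Arrhenius equation and the parameters are then recovered from the slope and intercept of the fit.

```python
import numpy as np
from scipy.stats import linregress

R = 8.3145  # molar gas constant in J mol^-1 K^-1

# Synthetic example data (made-up values, not from the course files)
E_A_true = 15000.0  # assumed activation energy in J/mol
D_0_true = 2.0      # assumed pre-exponential factor in nm^2/ps
T = np.array([250.0, 275.0, 300.0, 325.0, 350.0])  # temperatures in K
D = D_0_true * np.exp(-E_A_true / (R * T))

# Linearize: x = 1/T, y = ln(D)
x = 1.0 / T
y = np.log(D)

# Linear regression: slope = -E_A/R, intercept = ln(D_0)
fit = linregress(x, y)
E_A = -fit.slope * R
D_0 = np.exp(fit.intercept)

print(f"E_A = {E_A:.1f} J/mol")
print(f"D_0 = {D_0:.3f} nm^2/ps")
```

Because the synthetic data contain no noise, the fit should reproduce the input values up to floating-point precision. With real data, `fit.rvalue` and `fit.stderr` give a first impression of the fit quality.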


This time the files are in [CSV format](https://en.wikipedia.org/wiki/Comma-separated_values) (= comma-separated values), *i.e.* the different data columns are separated by commas.

Luckily, we can again use the command `np.loadtxt()`, but we have to indicate the comma by adding `delimiter=","`.

Most programs, such as Excel, Origin and other scientific software, can write data sets in this format. If you want to use Python in your research, this is most likely the file format you will use most often to read in your data sets.
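
Reading comma-separated data with `np.loadtxt` can be sketched as follows. To keep the example self-contained, it reads from an in-memory string instead of a file. The numbers and the column layout (temperature first, diffusion coefficient second, no header row) are assumptions for illustration, so check the actual files before relying on them.

```python
import io
import numpy as np

# Made-up CSV content standing in for a real file.
# Assumption: column 0 is T in K, column 1 is D in nm^2/ps, no header row.
csv_text = """250.0,0.0015
275.0,0.0028
300.0,0.0049
"""

# delimiter="," tells loadtxt that the columns are separated by commas
data = np.loadtxt(io.StringIO(csv_text), delimiter=",")

# Split the 2D array into its columns
T = data[:, 0]
D = data[:, 1]

print(data.shape)
print(T)
print(D)
```

For a file on disk, the `io.StringIO(...)` argument would be replaced by the file name. If a file starts with a header line, `skiprows=1` can be passed to `np.loadtxt` to skip it.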

### Data

We have three data sets with diffusion coefficients $D$ in nm$^2$/ps at different temperatures $T$ in K.

- `D_vs_T_v1.csv` (Diffusion coefficient vs. temperature)
- `D_vs_T_v2.csv` (Diffusion coefficient vs. temperature)
- `D_vs_T_v3.csv` (Diffusion coefficient vs. temperature)


### Data Path:
https://raw.githubusercontent.com/stkroe/PythonForChemists/main/course/data/exercises/Arrhenius/

### Task

- Load the data from the files `D_vs_T_v1.csv`, `D_vs_T_v2.csv` and `D_vs_T_v3.csv` into numpy arrays.
- Create a plot of the diffusion coefficient $D$ vs. temperature $T$.
- Create a plot of the logarithm of the diffusion coefficient $\ln(D)$ vs. $1/T$.
- Perform a linear regression with `scipy.stats.linregress`, using only the linear region of the data.
- Calculate the activation energy $E_A$ from the slope of the linear regression.
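
Restricting the regression to the linear region can be done with a boolean mask. The following is a minimal sketch on made-up data, not a solution for the course files: the synthetic data follow the Arrhenius equation above 300 K and level off to a constant value below it. The threshold of 300 K is an assumption of this example. For real data, the linear region has to be identified from the plot of $\ln(D)$ vs. $1/T$.

```python
import numpy as np
from scipy.stats import linregress

R = 8.3145  # molar gas constant in J mol^-1 K^-1

# Made-up data: Arrhenius behaviour at high T, constant plateau below 300 K
T = np.array([200.0, 225.0, 250.0, 275.0, 300.0, 325.0, 350.0, 375.0, 400.0])
D_arrhenius = 5.0 * np.exp(-20000.0 / (R * T))
plateau = 5.0 * np.exp(-20000.0 / (R * 300.0))
D = np.maximum(D_arrhenius, plateau)

x = 1.0 / T
y = np.log(D)

# Boolean mask: True only for points in the (assumed) linear region
mask = T >= 300.0

# Fit only the selected points
fit = linregress(x[mask], y[mask])
E_A = -fit.slope * R

# For comparison: fit all points, including the non-linear region
fit_all = linregress(x, y)
E_A_all = -fit_all.slope * R

print(f"E_A (linear region only): {E_A:.1f} J/mol")
print(f"E_A (all points):         {E_A_all:.1f} J/mol")
```

Including the plateau flattens the fitted slope, so the fit over all points underestimates the activation energy in this example. This is why the regression range matters.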


### Questions

- Which data set has the highest activation energy?