# Exercises 8
This notebook contains exercises for the eighth meeting.

## More exercises with SciPy
These exercises focus mostly on the use of scipy.optimize.minimize.
Some of them can be quite difficult, but scipy.optimize.minimize is a very powerful tool once you know how to use it.

Remember that you can use np.genfromtxt("filename", skip_header=x, delimiter=",") to load in well-structured datafiles.
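
As a small illustration of the call above, the sketch below reads a made-up two-column dataset from an in-memory string instead of a real file. The column names and numbers are invented for the example; with a real file you would pass the filename instead of the `io.StringIO` object.

```python
import io

import numpy as np

# Made-up CSV content standing in for a real file on disk.
csv_text = "x,y\n0.0,1.0\n1.0,3.0\n2.0,5.0\n"

# skip_header=1 skips the single line holding the column names.
data = np.genfromtxt(io.StringIO(csv_text), skip_header=1, delimiter=",")

x_column = data[:, 0]  # first column
y_column = data[:, 1]  # second column
print(data.shape)
```

Each column of the file ends up as a column of the returned array, so slicing with `data[:, k]` picks out column `k`.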

You can call functions from within other functions. As an example:

In [None]:
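from functools import partial

import numpy as np

def f(x, a, b):
    # Straight line with slope a and intercept b.
    return a*x + b

def error_function(parameters, x_values, y_values):
    # Sum of the absolute deviations between the data and f.
    total_error = 0
    for i in range(0, len(x_values)):
        total_error += abs(y_values[i] - f(x_values[i], parameters[0], parameters[1]))
    return total_error

# Placeholder data (made up for illustration); replace with your own x and y.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Fix the data arguments, so only the parameters are left as input.
baked_error_function = partial(error_function, x_values=x, y_values=y)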

Look at the above example code and compare it with the solution to Ex 7.4.
You can see that we can call f(x, a, b) from within the function that returns the total error. 
When doing it like this, we still have the original function f(x, a, b) that we can use for plotting and other purposes.
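
The example above stops at the baked error function. A possible next step, sketched here under some assumptions, is to hand that function to scipy.optimize.minimize. The data (`x_demo`, `y_demo`), the initial guess and the choice of the Nelder-Mead method are all picked for this illustration and are not taken from the solution to Ex 7.4.

```python
from functools import partial

import numpy as np
from scipy.optimize import minimize

def f(x, a, b):
    return a*x + b

def error_function(parameters, x_values, y_values):
    total_error = 0
    for i in range(len(x_values)):
        total_error += abs(y_values[i] - f(x_values[i], parameters[0], parameters[1]))
    return total_error

# Made-up, noise-free data lying on the line y = 2x + 1.
x_demo = np.linspace(0, 10, 21)
y_demo = f(x_demo, 2.0, 1.0)

# After baking in the data, the function only takes the parameters.
baked = partial(error_function, x_values=x_demo, y_values=y_demo)

# Nelder-Mead needs no derivatives, which suits the absolute-value error.
result = minimize(baked, x0=[1.0, 0.0], method="Nelder-Mead")
a_fit, b_fit = result.x
print(a_fit, b_fit, result.fun)
```

Because `f` is still a separate function, the fitted line can be plotted directly with `f(x_demo, a_fit, b_fit)`.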

After solving these exercises you will be able to solve a lot of different complex tasks.

## Ex 8.1 - functools.partial reminder
In this exercise we want to fit some data to a function of the form:

$$f(x) = A\cdot \sin (B\cdot x + C)$$

The data to be fitted against can be generated by the following code:

In [None]:
import numpy as np

x = np.linspace(0, 20, 200)
y = 13*np.sin(0.5*x+7) + np.random.normal(size=200)

Now try to make a fit of $f(x)$ to the generated data. 
This can be done by following the same procedure used in Ex 7.4.

## Ex 8.2 - Multi-linear fit
A widely used method in computational chemistry is called docking.
In docking a molecule is placed inside a protein, and the energy is evaluated in a rough way.
The total energy is given as:

$$\Delta G_\text{tot} = \Delta G_\text{van der Waals} + \Delta G_\text{cavity} + \Delta G_\text{electrostatic}$$

Often this method yields very poor results, and needs tuning.
In "data/4HEU.csv" the data needed can be found. 
The data in the columns are $\Delta G_\text{experimental}$, $\Delta G_\text{van der Waals}$, $\Delta G_\text{cavity}$ and $\Delta G_\text{electrostatic}$.

Try to make a plot of $\Delta G_\text{experimental}$ against $\Delta G_\text{tot}$.

To improve the results, some scaling constants can be introduced.
$\Delta G_\text{tot}$ can now be written as:

$$\Delta G_\text{tot} = a_1\cdot \Delta G_\text{van der Waals} + a_2\cdot \Delta G_\text{cavity} + a_3\cdot \Delta G_\text{electrostatic}$$

Use scipy.optimize.minimize to minimize the error between $\Delta G_\text{tot}$ and $\Delta G_\text{experimental}$.

- Hint: Make a function that returns the total error between $\Delta G_\text{tot}$ and $\Delta G_\text{experimental}$. This function should take $a_1$, $a_2$, $a_3$, $\Delta G_\text{van der Waals}$, $\Delta G_\text{cavity}$ and $\Delta G_\text{electrostatic}$ as arguments. 
- Hint: Use partial to set the $\Delta G_\text{van der Waals}$, $\Delta G_\text{cavity}$ and $\Delta G_\text{electrostatic}$ to constant values.
- Hint: Use scipy.optimize.minimize to find the optimal $a$'s.
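
The three hints can be sketched as follows. This is only one possible setup: the arrays are randomly generated stand-ins rather than the contents of "data/4HEU.csv", the error is assumed to be a sum of squared differences, the scaling constants are passed as one sequence `a` because minimize hands over a single array, and all names (`total_error`, `dg_vdw`, `made_up_a`, ...) are chosen for this example.

```python
from functools import partial

import numpy as np
from scipy.optimize import minimize

def total_error(a, dg_exp, dg_vdw, dg_cav, dg_ele):
    # Scaled total energy for every data point.
    dg_tot = a[0]*dg_vdw + a[1]*dg_cav + a[2]*dg_ele
    # Assumed error measure: sum of squared differences.
    return np.sum((dg_tot - dg_exp)**2)

# Randomly generated stand-in data, NOT the real docking data.
rng = np.random.default_rng(0)
dg_vdw = rng.normal(size=50)
dg_cav = rng.normal(size=50)
dg_ele = rng.normal(size=50)
made_up_a = np.array([0.5, 1.5, 0.1])
dg_exp = made_up_a[0]*dg_vdw + made_up_a[1]*dg_cav + made_up_a[2]*dg_ele

# Bake in the data, so only the scaling constants remain as input.
baked_error = partial(total_error, dg_exp=dg_exp, dg_vdw=dg_vdw,
                      dg_cav=dg_cav, dg_ele=dg_ele)

result = minimize(baked_error, x0=[1.0, 1.0, 1.0])
print(result.x)
```

Since the stand-in "experimental" values were built from `made_up_a`, the fit should recover those constants closely, which is a convenient way to check the setup before switching to the real data.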

Try to make a new plot of $\Delta G_\text{experimental}$ against $\Delta G_\text{tot}$, now using the fitted scaling constants.

## Ex 8.3 - Gaussian fit
In KE522 you learned about Gaussian basis sets used in quantum chemistry. In practice, complex algorithms are used to construct the best basis set. In this exercise we will investigate the justification learned in KE522 for using Gaussian functions.

The 1s orbital of the hydrogen atom has the following form in one dimension:

$$f(x) = \frac{1}{\sqrt{\pi}}\exp (-|x|)$$

Try to make a plot of this function.

A Gaussian function is of the form:

$$g(x) = c\cdot \exp (-a\cdot x^2)$$

Try to make the following three fits of $f(x)$:

$$f_\text{approximated}(x) = c\cdot \exp (-a\cdot x^2)$$

and,

$$f_\text{approximated}(x) = c_1\cdot \exp (-a_1\cdot x^2) + c_2\cdot \exp (-a_2\cdot x^2)$$

and,

$$f_\text{approximated}(x) = c_1\cdot \exp (-a_1\cdot x^2) + c_2\cdot \exp (-a_2\cdot x^2) + c_3\cdot \exp (-a_3\cdot x^2)$$

Make a plot of each fit on top of the "real" function $f(x)$. 
Does it seem justified that we can build basis sets using Gaussian functions?
Try to also plot the individual Gaussians from the fits, to see what the different contributions look like.

- Hint: Make the fit in the interval -10 to 10. You can get the x-values by np.linspace(-10, 10, 1000). You can get the target values by evaluating f(x) for the generated x-values.
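
As a starting point, the first of the three fits could be set up roughly as below. The squared-error measure, the initial guess and the function names are assumptions made for this sketch; the fits with two and three Gaussians follow the same pattern with more parameters.

```python
from functools import partial

import numpy as np
from scipy.optimize import minimize

def f(x):
    # One-dimensional 1s orbital as defined above.
    return np.exp(-np.abs(x)) / np.sqrt(np.pi)

def gaussian(x, c, a):
    return c * np.exp(-a * x**2)

def error_function(parameters, x_values, target_values):
    c, a = parameters
    # Assumed error measure: sum of squared differences.
    return np.sum((gaussian(x_values, c, a) - target_values)**2)

# Grid and target values as suggested in the hint.
x_values = np.linspace(-10, 10, 1000)
target_values = f(x_values)

baked_error = partial(error_function, x_values=x_values,
                      target_values=target_values)

# The initial guess is an arbitrary choice for this sketch.
result = minimize(baked_error, x0=[0.5, 0.5])
c_fit, a_fit = result.x
print(c_fit, a_fit, result.fun)
```

Plotting `gaussian(x_values, c_fit, a_fit)` on top of `target_values` shows where a single Gaussian struggles, in particular at the cusp at $x = 0$.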

## Ex 8.4 - Fitting charges
With quantum mechanical methods the electrostatic potentials of molecules can be calculated. 
We know that this electrostatic potential can be approximated by atomic charges, following the equation:

$$V(r,q) = \frac{q}{r}$$

Remember that:

$$r = \sqrt{(x_\text{atom} - x_\text{point})^2+(y_\text{atom} - y_\text{point})^2+(z_\text{atom} - z_\text{point})^2}$$

All the equations are given in atomic units, and the given data is also in atomic units, so no conversions are required.

In this exercise we will consider a water molecule.
The water molecule has the following coordinates:

Atom | x | y | z
--- | --- | --- | ---
H | 1.638 | 1.137 | 0.000
O | 0.000 | -0.143 | 0.000
H | -1.638 | 1.137 | 0.000

The calculated quantum mechanical electrostatic potential can be found in the file "data/ESP_data.csv". 
The data is structured as:

$x_\text{point}$ | $y_\text{point}$ | $z_\text{point}$ | $QM_\text{ESP}$
--- | --- | --- | ---

We now want to determine the optimal atomic charges for reproducing the quantum mechanical electrostatic potential.
We want to minimize the error between $QM_\text{ESP}$ and the following approximation:

$$QM_\text{ESP, approximated} = \sum_i \left[ V(r_\text{atom 1; point i}, q_\text{atom 1}) + V(r_\text{atom 2; point i}, q_\text{atom 2}) + V(r_\text{atom 3; point i}, q_\text{atom 3}) \right]$$

- Hint: Write the function of V(r, q) as V(charge, x_point, y_point, z_point, x_atom, y_atom, z_atom).
- Hint: If stuck take a look at the solution, close the solution and try to make your own solution.
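
The first hint could look roughly like the sketch below, which combines the expressions for $r$ and $V(r,q)$ given above. The charge of -0.8 and the test point are made-up values used only to check the function; the atom position is the oxygen from the table.

```python
import numpy as np

def V(charge, x_point, y_point, z_point, x_atom, y_atom, z_atom):
    # Distance between the atom and the point (atomic units).
    r = np.sqrt((x_atom - x_point)**2
                + (y_atom - y_point)**2
                + (z_atom - z_point)**2)
    # Potential of a point charge at distance r.
    return charge / r

# Made-up charge on the oxygen atom and a made-up point at distance 5.
value = V(-0.8, 3.0, -0.143, 4.0, 0.000, -0.143, 0.000)
print(value)

# Because only NumPy operations are used, whole columns of points work too.
x_points = np.array([3.0, 6.0])
y_points = np.array([-0.143, -0.143])
z_points = np.array([4.0, 8.0])
values = V(-0.8, x_points, y_points, z_points, 0.000, -0.143, 0.000)
```

Passing arrays for the point coordinates means the potential at all points in the data file can be evaluated in a single call per atom.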

Try to take the sum of your fitted charges. 
As you can see, the sum differs from zero. 
We know that the total charge of water should be zero.
Now we will try to fit the charges under the assumption that the total charge is zero.
I.e. we know:

$$q_\text{total} = q_\text{atom 1} + q_\text{atom 2} + q_\text{atom 3}$$

This we can also write as:

$$q_\text{atom 3} = q_\text{total} - q_\text{atom 1} - q_\text{atom 2}$$

We can use the above equation to write the new approximation whose error we need to minimize:

$$QM_\text{ESP, approximated} = \sum_i \left[ V(r_\text{atom 1; point i}, q_\text{atom 1}) + V(r_\text{atom 2; point i}, q_\text{atom 2}) + V(r_\text{atom 3; point i}, q_\text{total} - q_\text{atom 1} - q_\text{atom 2}) \right]$$

Remember that we know $q_\text{total} = 0$. We therefore only have two parameters to fit instead of three. 
I.e. we only need to fit $q_\text{atom 1}$ and $q_\text{atom 2}$.
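
The substitution can be sketched in code as below. To keep the example self-contained, the potential is generated from made-up charges on randomly placed points instead of being read from "data/ESP_data.csv"; the squared-error measure and all function and variable names are likewise assumptions of this sketch.

```python
from functools import partial

import numpy as np
from scipy.optimize import minimize

# Atom coordinates from the table above (H, O, H).
atoms = np.array([[1.638, 1.137, 0.000],
                  [0.000, -0.143, 0.000],
                  [-1.638, 1.137, 0.000]])

def approximated_esp(charges, points):
    # Distances between every point and every atom, shape (n_points, 3).
    r = np.linalg.norm(points[:, None, :] - atoms[None, :, :], axis=2)
    # Sum q/r over the three atoms for each point.
    return np.sum(charges / r, axis=1)

def constrained_error(free_charges, points, qm_esp, q_total):
    q1, q2 = free_charges
    # The third charge follows from the total charge.
    charges = np.array([q1, q2, q_total - q1 - q2])
    return np.sum((approximated_esp(charges, points) - qm_esp)**2)

# Stand-in data: random points on a sphere of radius 6 around the origin,
# with a potential generated from made-up charges (NOT the real QM data).
rng = np.random.default_rng(1)
directions = rng.normal(size=(200, 3))
points = 6.0 * directions / np.linalg.norm(directions, axis=1, keepdims=True)
made_up_charges = np.array([0.4, -0.8, 0.4])
qm_esp = approximated_esp(made_up_charges, points)

baked = partial(constrained_error, points=points, qm_esp=qm_esp, q_total=0.0)
result = minimize(baked, x0=[0.0, 0.0])

q1, q2 = result.x
q3 = 0.0 - q1 - q2
print(q1, q2, q3)
```

Only two numbers are optimized, and the third charge is computed afterwards, so the charges sum to the chosen total regardless of how well the fit converges.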

Try to make a new fit where you employ the above equations.
This way you are guaranteed that the total charge is zero.