# Workbook 3 - Exercises

Before we start, let's get Colab set up for this workbook:

In [None]:
!pip install h5py
!pip install lmfit
!pip install numpy
!pip install matplotlib

!git clone https://github.com/timsnow/advanced_sas_training_course
%cd 'advanced_sas_training_course/02 - Data Handling and Plotting'

With Colab set up, let's set up this workbook for the exercises:

In [None]:
import h5py
import numpy as np
from lmfit.models import *
import matplotlib.pyplot as plt

hdf_file_path = 'data/i22-363058.h5'
text_file_path = 'data/i22-363110.dat'
internal_hdf_data_path = '/entry/data/data'

Using the `with xxx as yyy:` pattern, load the data from the HDF5 and text files named above into the following variables:

 - From the .h5 file, load the `/entry/data/data` dataset into a variable called `two_d_dataset`
 - From the .dat file, load the data using numpy, then create both an `x_dataset` and a `y_dataset`
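
As a rough illustration of the `with` pattern, the sketch below writes two small stand-in files and reads them back. The file names, array shapes and the assumption that the text file holds x in its first column and y in its second are all made up for the example; the real course files may be laid out differently.

```python
import os
import tempfile

import h5py
import numpy as np

# Build small stand-in files so the sketch runs on its own (assumption:
# the real course files will have different shapes and column layouts).
temp_dir = tempfile.mkdtemp()
demo_hdf_path = os.path.join(temp_dir, 'demo.h5')
demo_text_path = os.path.join(temp_dir, 'demo.dat')

with h5py.File(demo_hdf_path, 'w') as hdf_file:
    hdf_file.create_dataset('/entry/data/data', data=np.arange(12.0).reshape(3, 4))
np.savetxt(demo_text_path, np.column_stack((np.arange(5.0), np.arange(5.0) ** 2)))

# Read the HDF5 dataset; [()] copies it into memory before the file closes
with h5py.File(demo_hdf_path, 'r') as hdf_file:
    two_d_dataset = hdf_file['/entry/data/data'][()]

# Read the text file; np.loadtxt also accepts a file path directly
with open(demo_text_path, 'r') as text_file:
    text_data = np.loadtxt(text_file)

# Assumption: column 0 holds the x values and column 1 holds the y values
x_dataset = text_data[:, 0]
y_dataset = text_data[:, 1]
```

If the real text file has header lines or extra columns, `np.loadtxt` arguments such as `skiprows` or `usecols` may be needed.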

Using matplotlib, create a basic plot of each of these datasets:
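
One possible layout, sketched with stand-in arrays rather than the course data, is an image for the two-dimensional dataset next to a line plot for the one-dimensional one. The array contents here are invented; a real detector dataset may also carry extra leading dimensions that need indexing or `np.squeeze` before it can be shown as an image.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in arrays (assumptions) in place of the data loaded from the files
two_d_dataset = np.random.rand(64, 64)
x_dataset = np.linspace(0.01, 0.2, 100)
y_dataset = 1.0 / x_dataset**2

# One figure with two panels: an image and a line plot
fig, (image_axis, line_axis) = plt.subplots(1, 2, figsize=(10, 4))
image_axis.imshow(two_d_dataset)
image_axis.set_title('two_d_dataset')
line_axis.plot(x_dataset, y_dataset)
line_axis.set_xlabel('x_dataset')
line_axis.set_ylabel('y_dataset')
plt.show()
```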

Run the following cell to define a function that will let you generate some data:

In [None]:
def noisy_gaussian_function(x_axis, a, b, c, noise_factor):
    length_of_x_axis = len(x_axis)
    # Noise grows linearly with x (equivalent to dividing by 1 / x_axis)
    noise_to_add = np.random.randn(length_of_x_axis) * noise_factor * x_axis
    return (a * np.exp(-((x_axis - b)**2 / (2 * c**2)))) + noise_to_add

Using `numpy`, generate a dataset called `x_axis`, then use the function above on it to generate a `y_axis` dataset:
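
A minimal sketch of one way to do this is shown below. The function is repeated (with the noise scaling written as a multiplication) so the cell runs on its own, and the axis range and the values chosen for `a` (peak height), `b` (peak centre), `c` (peak width) and `noise_factor` are arbitrary choices for illustration.

```python
import numpy as np

def noisy_gaussian_function(x_axis, a, b, c, noise_factor):
    # Gaussian of height a, centre b and width c, with noise that grows with x
    noise_to_add = np.random.randn(len(x_axis)) * noise_factor * x_axis
    return a * np.exp(-((x_axis - b)**2 / (2 * c**2))) + noise_to_add

# 500 evenly spaced points between 1 and 100 (arbitrary range)
x_axis = np.linspace(1, 100, 500)

# Arbitrary example values: height 10, centre 50, width 5, light noise
y_axis = noisy_gaussian_function(x_axis, a=10, b=50, c=5, noise_factor=0.005)
```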

Use the following cell to plot this data:

In [None]:
plt.plot(x_axis, y_axis)
plt.show()

Using `lmfit` and the `GaussianModel()` class, fit this data:

### Extra credit - for after the training course

Using the `lmfit` webpage on multiple peak fitting, attempt a fit on the following dataset, which is a simulation of an actual SAS dataset:

In [None]:
def gaussian_function(x_axis, a, b, c):
    return a * np.exp(-((x_axis - b)**2 / (2 * c**2)))

noise_factor = 30
peak_height = 10
intensity_value = 1
q_data = np.linspace(2, 200, 198)
q_data /= 1000
i_data = intensity_value / q_data**4

peak_one =   gaussian_function(q_data, (i_data[38]  * peak_height), 0.04, 0.001)
peak_two =   gaussian_function(q_data, (i_data[78]  * peak_height), 0.08, 0.001)
peak_three = gaussian_function(q_data, (i_data[118] * peak_height), 0.12, 0.001)
peak_four =  gaussian_function(q_data, (i_data[158] * peak_height), 0.16, 0.001)

i_data += peak_one + peak_two + peak_three + peak_four
di_data = (np.random.randn(len(i_data)) * noise_factor) + np.sqrt(i_data)

plt.fill_between(q_data, i_data - di_data, i_data + di_data, color='gray', alpha=0.5)
plt.yscale('log')
plt.show()

The following code snippet will be of use:

In [None]:
gauss1 = GaussianModel(prefix='g1_')
gauss2 = GaussianModel(prefix='g2_')
gauss3 = GaussianModel(prefix='g3_')
gauss4 = GaussianModel(prefix='g4_')
background = ExpressionModel('bgval / x**4')
overall_model = gauss1 + gauss2 + gauss3 + gauss4 + background

overall_parameters = gauss1.make_params()
overall_parameters.update(gauss2.make_params())
overall_parameters.update(gauss3.make_params())
overall_parameters.update(gauss4.make_params())
overall_parameters.update(background.make_params())

# Fill in an initial guess after each `=` below. Note that `height` is derived
# by lmfit from `amplitude` and `sigma`, so the width to set is `sigma`.
overall_parameters['g1_amplitude'].value =
overall_parameters['g1_center'].value =
overall_parameters['g1_sigma'].value =
overall_parameters['g2_amplitude'].value =
overall_parameters['g2_center'].value =
overall_parameters['g2_sigma'].value =
overall_parameters['g3_amplitude'].value =
overall_parameters['g3_center'].value =
overall_parameters['g3_sigma'].value =
overall_parameters['g4_amplitude'].value =
overall_parameters['g4_center'].value =
overall_parameters['g4_sigma'].value =
overall_parameters['bgval'].value =