# 3. Loading and fitting multiple diffraction patterns

If a number of diffraction patterns are collected over time they can be automatically fitted and the change in peaks over time measured. The change in the positions or heights of the peaks can then be correlated to material properties. We call the fitting of multiple **spectra** an **Experiment**.

In the example files folder there is a sequence of 10 diffraction patterns which we will use for demonstration.

## 3.1. Fitting a peak at multiple times

The first thing to do is to create a `FittingExperiment` object. This contains some metadata about the experiment and will hold one `FitSpectrum` for each diffraction pattern. The *first_cake_angle*, *cakes_to_fit*, *peak_params* and *merge_cakes* parameters are the same as before. The *spectrum_time* parameter sets the number of seconds between each spectrum in the sequence and is used to label the x-axis on plots. The *file_stub* parameter is used to locate the files. To use all of the sequentially numbered spectra in a folder, provide the stub of the file name with a star (wildcard).
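As a rough illustration of what the wildcard does, the sketch below uses Python's standard `glob` module to expand a stub into a sorted list of sequentially numbered files. The file names are hypothetical and are created in a temporary folder. Whether `FittingExperiment` expands the stub in exactly this way is an assumption.

```python
import glob
import os
import tempfile

# Hypothetical file names, created in a temporary folder purely for illustration.
with tempfile.TemporaryDirectory() as folder:
    for index in range(3):
        path = os.path.join(folder, "Strain_{:05d}.dat".format(index))
        open(path, "w").close()

    # The stub ends in a star, so it matches every numbered file in the folder.
    file_stub = os.path.join(folder, "Strain_*")
    matched_files = sorted(glob.glob(file_stub))
    matched_names = [os.path.basename(name) for name in matched_files]

print(matched_names)
```

Sorting matters here because `glob` makes no promise about the order of the matches, while a time series needs the files in sequence.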

Once the `FittingExperiment` object is loaded, the `run_analysis` method runs the fit over all of the specified files.

In [None]:
#%load_ext autoreload
#%autoreload 2

%matplotlib inline

import xrdfit.spectrum_fitting as spectrum_fitting
from xrdfit.spectrum_fitting import PeakParams
from xrdfit.spectrum_fitting import FitSpectrum
import ipywidgets as widgets
from ipywidgets import fixed

first_cake_angle = 90

cakes_to_fit = [1, 2]
#cakes_to_fit = list(range(1, 11))

## PeakParams('(10-11) & (200)', (3.1, 3.5), [(3.15, 3.32), (3.338, 3.45)]), PeakParams('(311) & (20-20) & Other', (5.4, 6.25), [(5.527, 5.565), (5.61, 5.7), (5.78, 5.82), (5.913, 5.95), (5.992, 6.04)])

peak_params = [PeakParams('(10-10) & (111)', (2.70, 3.1),  [(2.883, 2.92), (2.94, 2.98)]),
               PeakParams('(220) & (11-20)', (4.55, 5.25), [(4.79, 4.9), (5.0, 5.05)])]

merge_cakes = False

spectrum_time = 1
file_stub = r'C:\Users\mbgnwob2\Dropbox (The University of Manchester)\2. Project\Python Script\Single Peak Fitting Script ORIGINAL - Copy\Test_data_Sample2_tension_strain/Strain_00000.dat'

In [None]:
spectral_data = FitSpectrum(file_stub, first_cake_angle)

for cake in cakes_to_fit:
    for params in peak_params: 
        spectral_data.fit_peaks(params, cake)
        spectral_data.plot_peak_params(params, cake, show_points=True)

In [2]:
file_stub = r'C:\Users\mbgnwob2\Dropbox (The University of Manchester)\2. Project\Python Script\Single Peak Fitting Script ORIGINAL - Copy\Test_data_Sample2_tension_strain\Strain_*'

experiments = []
spectrums = []
spectral_data = []

for cake in cakes_to_fit:
    experiments.append(spectrum_fitting.FittingExperiment(spectrum_time, file_stub, first_cake_angle, cake, peak_params, merge_cakes))
    
for experiment in experiments: 
    experiment.run_analysis(reuse_fits=True)

Processing 59 diffraction patterns.



The following fits took over 500 fitting iterations. The quality of these fits should be checked.
Fit for peak (10-10) & (111) at timesteps: [16, 58]
Fit for peak (220) & (11-20) at timesteps: [2, 42, 45, 49]


Analysis complete.
Processing 59 diffraction patterns.



TypeError: Improper input: N=9 must not exceed M=0

The results are hierarchically stored in the `FitExperiment` object and can be accessed directly if you wish. However, it is probably easier to use the `FitExperiment` helper methods described below to plot the results rather than accessing the raw fit data directly.

The `peak_names` method lists the names of the fitted peaks specified in the `PeakParams` objects. The `fit_parameters` method gives a list of the names of the fit parameters for a particular peak.

To plot a parameter for a fit over time use the `plot_fit_parameter` method. To plot all of the parameters for all of the peaks use a loop:

The blue line is the value of the parameter, while the light blue shaded area is ±1 standard error on the determination of the parameter from the fit. In this case the errors on the parameters are small, indicating a good fit. However, the peak does not do anything very interesting because the 10 example files represent only a short time period where little was changing in the material.

Error bars are plotted by default. To turn them off, use the `show_error` parameter:

## 3.2. Fitting multiple peaks at multiple times

We can set up a larger analysis to fit multiple peaks over time. A good workflow is to determine suitable `PeakParams` on a single file first, check that the fits are good, and then use them to run over multiple files. Here we take the `PeakParams` determined in the first tutorial notebooks.

In [None]:
print(experiment.peak_names())
for peak_name in experiment.peak_names():
    print(experiment.fit_parameters(peak_name))

Here we can see that we have the four peaks we have fitted. Notice that the third peak has two sets of fit parameters. Since the third peak is fitted with a compound fit, the first set corresponds to the first maximum and the second set to the second maximum.

As before, we can loop over the names, though this time we plot just the centre of the fit. The second maximum will only be plotted for the peak where it exists.

In [None]:
for peak_name in experiment.peak_names():
    for param in experiment.fit_parameters(peak_name):
        experiment.plot_fit_parameter(peak_name, param)

## 3.3. Plotting fits

If you notice the fits going very slowly at certain times, or the plotted fit parameters show unexpected values, it is likely that the fit parameters need adjusting a little. To get an idea of how the fits went, you can plot the fits from a time series using the `plot_fits` method. By default the method plots 5 timesteps, evenly spaced in time, with one plot for each fitted peak.

In [None]:
def steps_slide_bar():
    step_slider = widgets.IntSlider(value=0, min = 0, max = 57, step = 1)
    return step_slider

In [None]:
timestep = steps_slide_bar()
timestep

In [None]:
timestep_val = [timestep.value]
timestep_val

In [None]:
widgets.interact(experiments[0].plot_fits, timesteps = [1, 2, 3, 4], peak_names = fixed(['(10-10) & (111)']))

In [None]:
for experiment in experiments:
    experiment.plot_fits(num_timesteps = 59)

This could potentially be a lot of plots, so you can narrow down the plots you want by providing different arguments to the `plot_fits` method. `num_timesteps` sets how many evenly spaced timesteps to plot (default: 5). `peak_names` is a list of one or more peak names to plot (default: all fitted peaks). `timesteps` is a list of integer values specifying which timesteps to plot. If `timesteps` is provided then `num_timesteps` is ignored.
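The interaction between `num_timesteps` and `timesteps` can be pictured with the small sketch below. The helper `select_timesteps` is hypothetical and is not part of xrdfit. It assumes zero-based timestep indices and only shows one plausible way of picking evenly spaced timesteps while letting an explicit list take priority.

```python
def select_timesteps(total_timesteps, num_timesteps=5, timesteps=None):
    """Return the timestep indices to plot (illustrative helper, not part of xrdfit)."""
    if timesteps is not None:
        # An explicit list takes priority, so num_timesteps is ignored.
        return list(timesteps)
    if num_timesteps >= total_timesteps:
        # Asking for at least as many plots as timesteps selects every timestep.
        return list(range(total_timesteps))
    if num_timesteps == 1:
        return [0]
    last = total_timesteps - 1
    # Integer arithmetic spreads the indices evenly from the first to the last timestep.
    return [i * last // (num_timesteps - 1) for i in range(num_timesteps)]


print(select_timesteps(59))                    # [0, 14, 29, 43, 58]
print(select_timesteps(59, timesteps=[2, 3]))  # [2, 3]
```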

In [None]:
experiment.plot_fits(peak_names=["1", "4"], timesteps=[2, 3])

## 3.4. Fitting a subset of timesteps

Sometimes we do not want to process all spectra in a series. Perhaps the sampling frequency is too high and we only want to fit every other spectrum or every 10th spectrum. Perhaps the interesting data is at the end, so we want to skip the first 100 spectra. This can be done by supplying an extra parameter to the `FitExperiment` object.

The *frames_to_load* parameter is a list of integer values specifying which files to load. The file stub also has to be modified here: add a Python format string where the numbers need to be substituted in the file name. In this example `:05d` corresponds to a 5 digit wide integer padded with zeros. This means 1 will become 00001, 10 will become 00010 etc. For more on Python string formatting see here: https://pyformat.info/#number

The below example will be just the same as the one above except it will only load spectra 1, 3 and 4 from the example folder. Notice how the x-axis on the parameter plots scales correctly - leaving a gap at 2 seconds where there is no data.

In [None]:
frames_to_load = [1, 3, 4]
file_stub = "../example_data/adc_041_7Nb_NDload_700C_15mms_{:05d}.dat"

experiment = spectrum_fitting.FittingExperiment(spectrum_time, file_stub, first_cake_angle, 
                                                cakes_to_fit, peak_params, merge_cakes, frames_to_load)

experiment.run_analysis()
for peak_name in experiment.peak_names():
    experiment.plot_fit_parameter(peak_name, "maximum_1_center", show_points=True)
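The format string substitution described above can be checked on its own with plain Python. The sketch below expands the same stub for the same three frames. That the substitution is done with `str.format` is an assumption, although the `{:05d}` syntax suggests it.

```python
frames_to_load = [1, 3, 4]
file_stub = "../example_data/adc_041_7Nb_NDload_700C_15mms_{:05d}.dat"

# Substitute each frame number into the stub; {:05d} pads to five digits with zeros.
file_names = [file_stub.format(frame) for frame in frames_to_load]

for name in file_names:
    print(name)
```

Printing the expanded names before running a long analysis is a quick way to confirm that the padding width matches the files on disk.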

To get the raw data you can use the `get_fit_parameter` method. The first column is the time (x-data), the second column is the requested parameter (y-data) and the third column is the standard error on the fit parameter (y-error) estimated from the fitting covariance matrix.

In [None]:
experiment.get_fit_parameter("1", "maximum_1_center")
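To make the three-column layout concrete, the sketch below splits rows of (time, value, standard error) into separate columns and computes the edges of a ±1 standard error band like the shaded area on the parameter plots. The numbers are made-up placeholders and are not results from any fit. The rows are plain tuples here, so if the method returns an array type the columns may be easier to take by slicing.

```python
# Made-up placeholder rows in the (time, value, standard error) layout described above.
rows = [
    (1.0, 2.900, 0.002),
    (3.0, 2.902, 0.002),
    (4.0, 2.905, 0.003),
]

# Split the rows into separate columns.
times, values, errors = (list(column) for column in zip(*rows))

# Lower and upper edges of a +/- 1 standard error band around each value.
lower = [value - error for value, error in zip(values, errors)]
upper = [value + error for value, error in zip(values, errors)]

for time, low, high in zip(times, lower, upper):
    print(time, low, high)
```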

Once you have done an experiment it may be desirable to save the fits to be able to refer back to them later.

This can be done using the `save` method of the `FitExperiment` object.

In [None]:
experiment.save("experiment.dump")

This file is a compressed binary file and so is not human readable. Note that although the file is compressed, the output may well be large - typically on the order of the size of the input data since the input data is embedded in the object.
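To see why such a file is not human readable, and why its size follows the size of the data stored in the object, the sketch below round-trips a small object through `pickle` and `bz2` from the Python standard library. This is only an analogy. The format that `save` actually uses is not stated here, and the object and file name below are hypothetical.

```python
import bz2
import os
import pickle
import tempfile

# Hypothetical stand-in for an object that embeds its input data.
stored_object = {"peak_name": "example", "intensities": list(range(1000))}

with tempfile.TemporaryDirectory() as folder:
    dump_path = os.path.join(folder, "example.dump")

    # Serialise to bytes, compress, and write as a binary file.
    with bz2.open(dump_path, "wb") as dump_file:
        pickle.dump(stored_object, dump_file)

    # Reading the file back reverses the two steps.
    with bz2.open(dump_path, "rb") as dump_file:
        restored_object = pickle.load(dump_file)

print(restored_object == stored_object)
```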

To read in a previously saved `FitExperiment` object, use the `spectrum_fitting.load_dump` method. This returns a new `FitExperiment` with the saved fits which you can operate on just as before.

In [None]:
old_experiment = spectrum_fitting.load_dump("experiment.dump")
# Plot from the reloaded object to confirm the saved fits were restored.
old_experiment.plot_fit_parameter("1", "maximum_1_center", show_points=True)

## 3.5. Using previous fits as a starting point for the next fit

It is possible to use the result of a fit from a previous time step as the starting condition for the next fit. You can do this by using the `reuse_fits` parameter of the `run_analysis` method.

In [None]:
experiment.run_analysis(reuse_fits=True)

This may be useful if the fits change little between timesteps as in theory the previous fit should be a good starting point for the next one. If the fits are quite different between timesteps it is likely better to not reuse the fits. In this case the code will make an educated guess about the parameters instead.

We have previously found that reusing the fits can cause poor fitting performance, taking many iterations to complete each fit. The reason for this is not currently understood. Try with and without reusing fits and see which works best for your dataset.