# Spectral Analysis for Large Data Sets

Author: Garrek Stemo \
Date created: August 31, 2020 \
Institution: Nara Institute of Science and Technology

This is an interactive notebook for analyzing relatively large sets of spectral data, tuned specifically for angle-resolved FTIR Fabry-Perot spectra. I initially created a command-line tool for curve fitting and spectral analysis, but it was cumbersome: the data vary from set to set, and each set needs to be carefully truncated and given its own bounds. Interactivity also makes it easier to try different fitting models.

The notebook uses the lmfit package, which wraps SciPy's `optimize` module. Lmfit is much more user-friendly and customizable than calling SciPy directly, which makes it well suited to an interactive workflow like this one. It comes with almost all of the fitting functions one might need, but I have included additional functions in a custom `pmath.py` module. The most important of these are the asymmetric Gaussian, Lorentzian, and Voigt functions. Asymmetric broadening of spectra occurs in a wide range of materials, including crystalline solids, nanoparticles, molecular solids, and liquids. The scheme used here to model asymmetries is described in [Korepanov and Sedlovets, 2018](https://arxiv.org/abs/1804.06083).

I also wrote a set of functions in the `polariton_processing.py` module to pull data from FTIR experiments and transfer matrix simulations. These functions assume the files are named a certain way and are in a particular format, so check that module for details or write your own.


### Import Modules

The `%matplotlib widget` magic at the end of the next cell turns on interactive plots. It requires the ipympl package in addition to matplotlib and ipywidgets, so make sure these are installed.

In [None]:
import matplotlib.pyplot as plt
import ipywidgets as widgets
import numpy as np
import lmfit as lm
import polariton_processing as pp
import pmath

%matplotlib widget

### Load Data

Assign `data_directory` the path to a directory containing your angle-resolved data. The name of each .csv data file must contain "deg##.##_", where "##.##" is the angle, which may be an integer or a decimal. The trailing underscore is required, since the program uses it to extract the angle from the file name. Then assign `output_directory` the path where you would like output data to go.

In [None]:
data_directory = ''
output_directory = ''

angle_data, absorbance_data = pp.get_angle_data_from_dir(data_directory, convert_units=('um', 'cm-1'))
sample, params = pp.get_sample_params(data_directory)

# pp.write_angle_spec_to_file(angle_data, sample, output_directory)  # Write the spectrum to a file
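The file-naming convention described above can be illustrated with a short regular expression. This is only a sketch of the idea: `angle_from_filename` and the example file names are hypothetical, and the actual parsing in `polariton_processing.py` may work differently.

```python
import re

# Hypothetical helper; the real parsing lives in polariton_processing.py
# and may differ. The pattern matches "deg", then an integer or decimal
# number, then the required underscore.
ANGLE_PATTERN = re.compile(r"deg(\d+(?:\.\d+)?)_")


def angle_from_filename(filename):
    """Return the angle encoded in a file name as a float, or None if absent."""
    match = ANGLE_PATTERN.search(filename)
    return float(match.group(1)) if match else None


# Made-up file names for illustration.
print(angle_from_filename("sample_deg12.50_run1.csv"))
print(angle_from_filename("sample_deg7_run1.csv"))
print(angle_from_filename("sample_run1.csv"))
```

A file name without the "deg##.##_" tag returns `None` here, which makes it easy to spot files that do not follow the convention before loading a whole directory.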

### Set Bounds

The spectra, whether simulated or experimental, probably span a large range of frequencies. Here you can truncate the data to isolate just the peaks you want to analyze. Change the index into `angle_data` to view the spectrum from a different angle.

In [None]:
lower_bound = 1940
upper_bound = 2172

spectrum = angle_data[0]
wavenumber, transmittance = pp.truncate(spectrum[1], spectrum[2], lower_bound, upper_bound)

fig1, ax = plt.subplots()

ax.plot(wavenumber, transmittance)
ax.set_title("Set wavenumber bounds")
ax.set_xlabel(r'Wavenumber (cm$^{-1}$)')
ax.set_ylabel('Transmittance %')
plt.show()

In [None]:
plt.close(fig1)
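For reference, truncating a spectrum to a wavenumber window amounts to a boolean mask. The function below is a minimal sketch with a hypothetical name (`truncate_sketch`) and made-up data. It is not the code in `pp.truncate`, which may handle its bounds differently.

```python
import numpy as np


def truncate_sketch(x, y, lower, upper):
    """Keep only the points whose x value lies in [lower, upper] (inclusive)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = (x >= lower) & (x <= upper)
    return x[mask], y[mask]


# Made-up grid: 1900, 1950, ..., 2200 cm^-1 with dummy intensities 0..6.
x = np.linspace(1900.0, 2200.0, 7)
y = np.arange(7.0)

x_cut, y_cut = truncate_sketch(x, y, 1940, 2172)
print(x_cut)  # the points from 1950 through 2150 remain
print(y_cut)
```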

### Define the Model and Parameters

Now test the fitting functions available in pmath.py to find the one you would like to use. The lmfit package has lots of built-in functions. You can use these, fitting functions in pmath, or your own.

In [None]:
# def make_model(params):

Set the model, make initial guesses for the fit parameters, and apply constraints.

In [None]:
model = lm.Model(pmath.asym_voigt)
init_params = model.make_params(amp=0.1, w_0=2077, gamma=10, a=0.0, m=0.3)

init_params['m'].set(min=0.0, max=1.0)
init_params['gamma'].set(min=0.0)

print('parameter names: {}'.format(model.param_names))
print('independent variables: {}'.format(model.independent_vars))
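To make the roles of `amp`, `w_0`, `gamma`, `a`, and `m` more concrete, here is a sketch of one possible asymmetric pseudo-Voigt. It is an assumption made for illustration: the width is taken to vary sigmoidally with frequency, so that `a = 0` gives the symmetric profile, and `m` mixes a Lorentzian (`m = 1`) with a Gaussian (`m = 0`). The function name is hypothetical, and the form and normalization may differ from both `pmath.asym_voigt` and the scheme in the cited paper.

```python
import numpy as np


def asym_pseudo_voigt_sketch(w, amp, w_0, gamma, a, m):
    """Illustrative asymmetric pseudo-Voigt; NOT the pmath implementation.

    amp is the peak height, w_0 the peak position, gamma the full width at
    half maximum of the symmetric (a = 0) profile, a the asymmetry, and m
    the Lorentzian fraction.
    """
    w = np.asarray(w, dtype=float)
    # Assumed sigmoidal frequency-dependent width; equals gamma when a = 0.
    width = 2.0 * gamma / (1.0 + np.exp(a * (w - w_0)))
    half = width / 2.0
    # Both components are scaled to unit height at w_0.
    lorentz = half ** 2 / ((w - w_0) ** 2 + half ** 2)
    gauss = np.exp(-np.log(2.0) * ((w - w_0) / half) ** 2)
    return amp * (m * lorentz + (1.0 - m) * gauss)


w_grid = np.linspace(2000.0, 2150.0, 301)
symmetric = asym_pseudo_voigt_sketch(w_grid, amp=0.1, w_0=2077, gamma=10, a=0.0, m=0.3)
skewed = asym_pseudo_voigt_sketch(w_grid, amp=0.1, w_0=2077, gamma=10, a=0.05, m=0.3)
print(symmetric.max(), skewed.max())
```

With this particular form, a positive `a` broadens the low-wavenumber side and narrows the high-wavenumber side. Because the independent variable is the first argument, a function like this could be wrapped with `lm.Model` in the same way as `pmath.asym_voigt`.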

In [None]:
first_result = model.fit(transmittance, init_params, w=wavenumber)
# print(first_result.fit_report())
first_result.params.pretty_print()  # pretty_print() prints the table itself and returns None
# first_result.plot()

In [None]:
fig2, ax = plt.subplots()
ax.plot(wavenumber, transmittance, 'bo', markersize=5)
ax.plot(wavenumber, first_result.init_fit, 'k--', label='initial fit')
ax.plot(wavenumber, first_result.best_fit, 'r-', label='best fit')
# ax.vlines(result.params['w1'].value, 0, 0.02, linestyles='dashed', color='blue')
ax.vlines(first_result.params['w_0'].value, 0, 0.02, linestyles='dashed', color='blue')

plt.legend()
plt.show()

In [None]:
plt.close(fig2)

### Automatically Fit Peaks From Multiple Data Sets

Now that we have a good fit for the first peak, fit the remaining spectra one after another. The results from the first spectrum are the initial guesses for the second, and each later fit starts from the best-fit values of the one before it.

In [None]:
guess = first_result.values
model = lm.Model(pmath.asym_voigt)
num_sets_to_analyze = 10
data_sets = []
results = []

In [None]:
# Never index past the end of angle_data, even if num_sets_to_analyze is larger.
for i in range(min(num_sets_to_analyze, len(angle_data))):
    wavenumber, transmittance = pp.truncate(angle_data[i][1], angle_data[i][2], lower_bound, upper_bound)

    # Use a separate name so the sample `params` loaded earlier is not overwritten.
    fit_params = model.make_params(amp=guess['amp'],
                                   w_0=guess['w_0'],
                                   gamma=guess['gamma'],
                                   a=guess['a'],
                                   m=guess['m'])

    fit_params['m'].set(min=0.0, max=1.0)
    fit_params['gamma'].set(min=0.0)

    result = model.fit(transmittance, fit_params, w=wavenumber)
    
    data_sets.append((wavenumber, transmittance))
    results.append(result)
    guess = result.values

### Examine Data

Next, inspect the results to make sure the fitting procedure worked. It's not feasible to inspect *every* spectrum (I won't stop you!), but you can sample a few by changing the `examine_spectrum` index. The plot shows the actual data (blue dots), the best fit (red), and the initial fit (black dashed line). Since each fit starts from the best-fit values of the previous data set, the initial-fit curve is essentially the previous data set's best fit drawn alongside the current one.

In [None]:
examine_spectrum = 0
results[examine_spectrum].params.pretty_print()  # prints the table itself; returns None

In [None]:
result = results[examine_spectrum]
wavenumber, transmittance = data_sets[examine_spectrum]

fig3, ax = plt.subplots()

ax.plot(wavenumber, transmittance, 'bo', markersize=3, label='raw data')
ax.plot(wavenumber, result.init_fit, 'k--', label='initial fit')
ax.plot(wavenumber, result.best_fit, 'r-', label='best fit')
# ax.vlines(result.params['w1'].value, 0, 0.02, linestyles='dashed', color='blue')
ax.vlines(result.params['w_0'].value, 0, 0.03, linestyles='dashed', color='blue')

plt.legend()
plt.show()

In [None]:
plt.close(fig3)
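Since inspecting every plot by hand is impractical, a quick numerical screen can suggest which spectra to look at first. The sketch below flags fits whose reduced chi-square (the `redchi` attribute of an lmfit result) is far above the median of the batch. The helper name and the threshold `factor` are arbitrary choices for illustration, and the stand-in objects only mimic the one attribute that is used.

```python
from statistics import median
from types import SimpleNamespace


def flag_suspect_fits(results, factor=5.0):
    """Return indices of fits whose reduced chi-square is far above the median.

    `factor` is an arbitrary threshold chosen for illustration.
    """
    redchis = [r.redchi for r in results]
    typical = median(redchis)
    return [i for i, value in enumerate(redchis) if value > factor * typical]


# Stand-in objects with made-up values; with real data, pass the `results` list.
fake_results = [SimpleNamespace(redchi=v) for v in (1.0e-6, 1.2e-6, 9.0e-5, 1.1e-6)]
print(flag_suspect_fits(fake_results))  # [2]
```

Any index returned here is a good candidate for `examine_spectrum`. A flagged fit is not necessarily wrong, and an unflagged one is not guaranteed to be right, so this complements the plots and does not replace them.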

### Write Data to File

We'll write the data to a .csv file because you might need to run this a few times and collect the results before analyzing the Rabi splitting parameter, which is coming up next.

In [None]:
output_file = ''

In [None]:
# for r in results:
#     print(r.values['w_0'])
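One way to write such a table is with the standard `csv` module, one row per fitted spectrum. This is a sketch under assumptions: `write_fit_table`, the column names, and the numbers are all made up for illustration, and the demo writes to a temporary directory instead of `output_file`.

```python
import csv
import os
import tempfile


def write_fit_table(rows, path, fieldnames=('angle', 'w_0', 'gamma', 'a', 'm', 'amp')):
    """Write one row per fitted spectrum to a .csv file.

    `rows` is an iterable of dicts. Keys not listed in `fieldnames` are
    ignored, and missing keys are written as empty cells.
    """
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction='ignore')
        writer.writeheader()
        for row in rows:
            writer.writerow(row)


# Made-up numbers for illustration only; they are not real fit results.
example_rows = [
    {'angle': 0.0, 'w_0': 2077.1, 'gamma': 10.2, 'a': 0.01, 'm': 0.30, 'amp': 0.10},
    {'angle': 2.5, 'w_0': 2079.4, 'gamma': 10.5, 'a': 0.02, 'm': 0.28, 'amp': 0.09},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, 'fit_results_example.csv')
    write_fit_table(example_rows, example_path)
    # Read the file back to confirm what was written.
    with open(example_path, newline='') as f:
        rows_read_back = list(csv.DictReader(f))

print(rows_read_back[0]['w_0'])
```

With real fits, each row could start from the `values` dictionary of a result in `results`, with the angle added as an extra key. How to look up the angle depends on how `angle_data` stores it, so check `polariton_processing.py` first.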

### Calculate Rabi Splitting Parameter

