# XPS data analysis example

In this notebook, an XPS measurement file from a SPECS detector (recorded in the native SPECS .sle format) that has already been converted into the [NXmpes](https://manual.nexusformat.org/classes/contributed_definitions/NXmpes.html#nxmpes) NeXus standard is read, and some basic data analysis (a fit of one Au 4f spectrum) is performed.

## View the data with H5Web

H5Web is a tool for visualizing any data in the HDF5 (h5) data format. Since the NeXus format builds upon h5, it can be used to view this data as well. We just import the package and call H5Web with the name of the NeXus file produced by the conversion.

You can also view this data with the H5Viewer or other tools from your local filesystem.

In [None]:
from jupyterlab_h5web import H5Web

In [None]:
H5Web("Au_25_mbar_O2_no_align.nxs")

## Analyze data

First, we need to import the necessary packages. We use h5py for reading the NeXus file, lmfit for fitting, and the class `XPSRegion` from the provided `xps_region.py` file.

In [None]:
import h5py
from xps_region import XPSRegion

from lmfit.models import GaussianModel

### Load data and plot

We want to load the Au 4f spectrum from the Au foil from our measurement file. Feel free to adapt to different regions in the file by changing the `MEASUREMENT` variable.

In [None]:
MEASUREMENT = "Au_in_vacuum__Au4f"

with h5py.File("Au_25_mbar_O2_no_align.nxs", "r") as xps_file:
    binding_energy = xps_file[f"/{MEASUREMENT}/data/energy"][:]
    cps = xps_file[f"/{MEASUREMENT}/data/data"][:]
    cps_err = xps_file[f"/{MEASUREMENT}/data/data_errors"][:]
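
To find out which values `MEASUREMENT` can take, it can help to list the entries in the file. The following is a minimal sketch using only `h5py`. So that it runs without the real measurement file, it first writes a small stand-in file (`demo.nxs`) with made-up entry names and dummy values. Only the `/<entry>/data/energy` layout is taken from the cell above; everything else is an assumption for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Build a tiny stand-in file that mimics the /<entry>/data/... layout used above.
# Entry names and values are made up for illustration only.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.nxs")
with h5py.File(demo_path, "w") as demo_file:
    for entry in ("Au_in_vacuum__Au4f", "Au_in_vacuum__Survey"):
        data_group = demo_file.create_group(f"{entry}/data")
        data_group.create_dataset("energy", data=np.linspace(80.0, 94.0, 5))
        data_group.create_dataset("data", data=np.ones(5))

# List the top-level entries; any of them could be assigned to MEASUREMENT.
with h5py.File(demo_path, "r") as demo_file:
    entries = sorted(demo_file.keys())
    n_points = demo_file[f"/{entries[0]}/data/energy"].shape[0]

print(entries)
print(n_points)
```

For the real file, only the second `with` block is needed, with `demo_path` replaced by the path to the `.nxs` file.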

`XPSRegion` also provides a convenience function that loads the data directly:

In [None]:
au4f = XPSRegion.load("Au_25_mbar_O2_no_align.nxs", MEASUREMENT) 

Alternatively, we can create the `au4f` `XPSRegion` ourselves from the arrays loaded with h5py above.

In [None]:
au4f = XPSRegion(binding_energy=binding_energy, counts=cps, counts_err=cps_err) 

`XPSRegion` provides a method to visualize the loaded data:

In [None]:
au4f.plot()

### Fit data

From the preview plot we can identify two symmetric peaks, which result from the spin-orbit splitting of the Au 4f level into the Au 4f7/2 (lower binding energy) and Au 4f5/2 (higher binding energy) components. The prefixes `Au4f52_` and `Au4f32_` in the code below are only labels for the lower- and higher-binding-energy peak, respectively. To illustrate a typical analysis routine, we construct two Gaussian peaks with the lmfit `GaussianModel` and initialize them with appropriate start values. Here we simply use reasonable initial guesses for the start values. Eventually, however, these could be deduced from data inside NOMAD as soon as enough data is available, e.g. similar to the peak detection in other XPS analysis programs. lmfit offers other peak shapes as well, such as Lorentzian, Voigt, pseudo-Voigt, or skewed models. Please refer to the package's documentation for further details on these models and on how to use them.

In [None]:
peak_1 = GaussianModel(prefix="Au4f52_")
peak_1.set_param_hint("amplitude", value=3300)
peak_1.set_param_hint("sigma", value=0.5)
peak_1.set_param_hint("center", value=84.2)

peak_2 = GaussianModel(prefix="Au4f32_")
peak_2.set_param_hint("amplitude", value=1600)
peak_2.set_param_hint("sigma", value=0.5)
peak_2.set_param_hint("center", value=87.2)

We can simply add the two models together to create a composite model.

In [None]:
comp = peak_1 + peak_2
params = comp.make_params()

We also set a constraint that fixes the area ratio of the two peaks. For a spin-orbit doublet, the areas scale with the $2j+1$ degeneracy of the two components. For an f level ($j = 5/2$ and $j = 7/2$) this gives $6:8$, so the area of `peak_2` is constrained to 0.75 times the area of `peak_1`:

$$\text{area of peak 2} = 0.75 \times \text{area of peak 1}$$

lmfit's `GaussianModel` is parametrized as

$$ f(x) = \frac{\text{amplitude}}{\sigma \sqrt{2\pi}} \exp\left(-\frac{(x - \text{center})^2}{2\sigma^2}\right) $$

so its `amplitude` parameter is already the integrated area $A$, while the peak maximum is the derived `height` parameter:

$$ A = \text{height} \times \sigma \times \sqrt{2\pi} = \text{amplitude} $$

The widths therefore do not enter the area constraint, and the expression for the amplitude of `peak_2` is:

$$ \text{amplitude}_2 = 0.75 \times \text{amplitude}_1 $$

Therefore, we can write:


In [None]:
params['Au4f32_amplitude'].expr = '0.75 * Au4f52_amplitude'
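
As a quick numerical sanity check of how height, width, and area of a Gaussian are related, the sketch below integrates a Gaussian by hand. It assumes the normalized form that lmfit documents for `GaussianModel`, in which the prefactor (`amplitude`) is the integrated area and the peak maximum equals `amplitude / (sigma * sqrt(2 * pi))`. The function name and the energy grid are chosen for illustration only; the parameter values are the start values of the first peak.

```python
import numpy as np


def normalized_gaussian(x, amplitude, center, sigma):
    """Gaussian whose prefactor `amplitude` is the integrated area."""
    norm = amplitude / (sigma * np.sqrt(2 * np.pi))
    return norm * np.exp(-((x - center) ** 2) / (2 * sigma**2))


amplitude, center, sigma = 3300.0, 84.2, 0.5
x = np.linspace(70.0, 100.0, 30001)
y = normalized_gaussian(x, amplitude, center, sigma)

# Trapezoidal integration written out by hand.
area = float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# The same area recovered from the peak maximum and the width.
height = float(y.max())
area_from_height = float(height * sigma * np.sqrt(2 * np.pi))

print(area, area_from_height)
```

Both numbers agree with `amplitude` to within numerical precision, which illustrates that tying two `amplitude` parameters in this parametrization ties the peak areas, regardless of the widths.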

In the next step, we perform the actual fit. First, since the data in `Au_in_vacuum__Au4f` covers a very wide scan range, we select only the region containing the Au 4f doublet (with `fit_region(...)`). Then, we calculate a Shirley baseline with `calc_baseline()`, set the fit model (`.fit_model(comp)`), and perform the fit (`.fit()`). All of these functions can also be used independently. The fit takes the measurement uncertainties into account as weights.

Finally, the model is plotted with the previously used `plot()` method. Since we performed a fit, the plot now also shows the baseline and the fitted peaks.

In [None]:
au4f.fit_region(start=80, stop=94).calc_baseline().fit_model(comp).fit(params).plot()
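
For readers unfamiliar with the Shirley baseline, the following is a minimal sketch of the usual iterative scheme, written with NumPy only. It is not the implementation behind `calc_baseline()`; the function name `shirley_baseline`, its arguments, and the synthetic spectrum are assumptions made for illustration. The sketch assumes that the binding energy axis is in ascending order, so that the background rises from the first to the last point.

```python
import numpy as np


def shirley_baseline(energy, counts, max_iter=50, tol=1e-6):
    """Iterative Shirley background for binding energies in ascending order."""
    energy = np.asarray(energy, dtype=float)
    counts = np.asarray(counts, dtype=float)
    i_low, i_high = counts[0], counts[-1]  # background levels at both ends
    baseline = np.full_like(counts, i_low)
    for _ in range(max_iter):
        signal = counts - baseline
        # Cumulative trapezoidal integral of the signal from the low-energy end.
        steps = 0.5 * (signal[1:] + signal[:-1]) * np.diff(energy)
        cumulative = np.concatenate(([0.0], np.cumsum(steps)))
        if cumulative[-1] == 0.0:
            break  # flat data, nothing to distribute
        updated = i_low + (i_high - i_low) * cumulative / cumulative[-1]
        converged = np.max(np.abs(updated - baseline)) < tol
        baseline = updated
        if converged:
            break
    return baseline


# Synthetic spectrum (made-up numbers): one peak on a Shirley-type step.
energy = np.linspace(80.0, 94.0, 281)
peak = 3000.0 / (0.5 * np.sqrt(2 * np.pi)) * np.exp(-((energy - 84.0) ** 2) / (2 * 0.5**2))
peak_steps = 0.5 * (peak[1:] + peak[:-1]) * np.diff(energy)
peak_cumulative = np.concatenate(([0.0], np.cumsum(peak_steps)))
true_background = 10.0 + 50.0 * peak_cumulative / peak_cumulative[-1]
counts = peak + true_background

baseline = shirley_baseline(energy, counts)
net = counts - baseline
recovered_area = float(np.sum(0.5 * (net[1:] + net[:-1]) * np.diff(energy)))
print(recovered_area)
```

At each point the background is proportional to the peak area accumulated up to that energy, which is why the baseline has to be found iteratively: the peak area itself depends on the baseline that is subtracted.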

The fit result is stored in the `fit_result` attribute and is displayed here to extract, e.g., the peak central energies. Please note that the reported parameter errors are plain fitting errors and do not represent a full propagation of the measurement uncertainties.

In [None]:
au4f.fit_result.params 

We can also extract a fitting parameter shared across the different peaks, e.g. the peak central energies. The argument refers to the text after the prefix in the model parameter's name, so we select `center` here to get the central energies.

In [None]:
au4f.peak_property('center')

Typically, we are also interested in the peak areas, which can be calculated with `peak_areas()`

In [None]:
(areas := au4f.peak_areas())

and their ratios relative to the largest area

In [None]:
areas / areas.max()

To assess the quality of the fit, the fit residual can be viewed with `plot_residual()`.

In [None]:
au4f.plot_residual()
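
Besides inspecting the residual visually, a single number such as the reduced chi-square can summarize the fit quality. The sketch below shows the calculation with NumPy on synthetic data; the function `reduced_chi_square` and all values are made up for illustration and are not part of `XPSRegion`. lmfit fit results also report this quantity as `redchi`, although how it should be interpreted depends on the weights that were passed to the fit.

```python
import numpy as np


def reduced_chi_square(data, model, sigma, n_free_params):
    """Sum of squared, uncertainty-weighted residuals per degree of freedom."""
    weighted_residual = (data - model) / sigma
    degrees_of_freedom = data.size - n_free_params
    return float(np.sum(weighted_residual**2) / degrees_of_freedom)


# Synthetic example: a constant "model" and data scattered around it
# with a known standard deviation.
rng = np.random.default_rng(seed=1)
model = np.full(1000, 100.0)
sigma = np.full(1000, 10.0)
data = model + rng.normal(scale=sigma)

chi2_red = reduced_chi_square(data, model, sigma, n_free_params=0)
print(chi2_red)
```

A value close to 1 indicates that the residuals are consistent with the stated uncertainties. Values much larger than 1 point to a model that misses structure in the data or to underestimated uncertainties.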