# Automatic Analysis of XAS *In-Situ* Data with Measured Standards
Notebook showing an example workflow used for automatic analysis of XAS *in-situ* data when standards for both unreduced precursors and reduced metal foils have been measured on the same instrument.

# Imports
Here the required packages and functions are imported.

The `%matplotlib` magic, which controls how Matplotlib figures are displayed in the notebook, is also set here.

In [1]:
# Functions written for the analysis of XAS data
from autoXAS.data import *
from autoXAS.LCA import *
from autoXAS.plotting import *

%matplotlib inline

# Global variables

In [2]:
synchrotron = 'ESRF'

# Boolean flags
Here the values of boolean flags (True/False) that occur throughout the notebook can be changed.

In [3]:
# Decide if transmission or absorption data should be used for normalization and analysis
use_transmission = False
# Decide if subtraction of the pre-edge fit should be used for normalization
use_preedge = True
# Decide if plots should be interactive or static
interactive = True
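
The two signal types selected by `use_transmission` can be illustrated with a short sketch. It assumes that `MonEx` is the incident intensity, `Ion1` the transmitted intensity and `xmap_roi00` the fluorescence counts. These roles are inferred from the `Absorption` and `Transmission` values of the Ir rows in the table shown later in this notebook, not taken from the autoXAS source. The function names are hypothetical.

```python
import math

def fluorescence_signal(i0, fluorescence_counts):
    """Fluorescence-yield signal: detector counts divided by incident flux."""
    return fluorescence_counts / i0

def transmission_signal(i0, i1):
    """Transmission-mode absorbance: ln(I0 / I1)."""
    return math.log(i0 / i1)

# First Ir row of the experiment table in this notebook (assumed roles):
# MonEx = 83329 (I0), xmap_roi00 = 389 (fluorescence), Ion1 = 44793 (I1)
absorption = fluorescence_signal(83329.0, 389.0)
transmission = transmission_signal(83329.0, 44793.0)
print(absorption, transmission)
```

Both values agree with the `Absorption` and `Transmission` columns of that row to the printed precision.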

# Splitting of .dat files
Split .dat files containing multiple measurements into single-measurement files.

In [4]:
# split_dat_file(
#     data_folder='../Data/ESRF_BM31/IrPtPdRuRh_XAS/IrPtPdRuRh_XAS/XAS1/',
#     filename='Pd',
#     header_length=81,
#     data_length=161, # This value will most likely be different for each measured edge
#     footer_length=5,
# )
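
The idea behind the three length arguments can be sketched as follows. This is a minimal illustration, not the autoXAS implementation: it assumes that every measurement occupies exactly `header_length + data_length + footer_length` consecutive lines, and the helper name `split_into_measurements` is hypothetical.

```python
def split_into_measurements(lines, header_length, data_length, footer_length):
    """Cut a list of lines into consecutive measurement blocks.

    Each block is assumed to be header + data + footer lines long.
    Trailing lines that do not fill a whole block are ignored.
    """
    block_length = header_length + data_length + footer_length
    n_blocks = len(lines) // block_length
    blocks = []
    for i in range(n_blocks):
        start = i * block_length
        blocks.append(lines[start:start + block_length])
    return blocks

# Toy file: 3 measurements with 2 header, 4 data and 1 footer line each
toy_lines = []
for m in range(3):
    toy_lines += [f"#H{m}"] * 2 + [f"{m} {k}" for k in range(4)] + [f"#F{m}"]

blocks = split_into_measurements(
    toy_lines, header_length=2, data_length=4, footer_length=1
)
print(len(blocks), len(blocks[0]))
```

Because the block length is fixed, a wrong `data_length` shifts every later block, which is why that value has to be checked for each measured edge.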

# Standards and preprocessing
Here the measured standards (metal foils and precursors) are loaded and preprocessed. 

This section only needs to be run once, as it applies to all experiments measured on the same instrument.

## Metal foils

In [5]:
# Specify data location
folder_metal_foils = '../Data/ESRF_BM31/Standards/Standards/'

# Load data
df_foils = load_xas_data(
    folder_metal_foils,
    synchrotron='ESRF', 
    file_selection_condition='mono', # It will look for files with this substring in the filename. This can also be a list of substrings
    negated_condition=True, # Files containing the above substring will be ignored(True)/loaded(False).
)

# Initial data processing
df_foils = processing_df(df_foils, synchrotron='ESRF')

# Specify the measurements to use when averaging. 
# This can be given as either a list or a range.
measurements_to_average = range(1,3) # Change this to fit the number of repeat measurements for the standards

# Create dataframe with the reference spectra for reduced metals
df_foils = average_measurements(df_foils, measurements_to_average)

Loading data: 100%|██████████| 6/6 [00:00<00:00, 19.92it/s, Currently loading Rufoil.dat]
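
The effect of `file_selection_condition` together with `negated_condition` can be sketched with plain substring matching. This is an assumption about the behaviour described in the comments above, not the autoXAS code. In particular, treating a list of substrings as "match any" is a guess, and the `_mono` filenames are made up for the example.

```python
def select_files(filenames, condition, negated=False):
    """Return filenames that contain (or, if negated, lack) any substring."""
    substrings = [condition] if isinstance(condition, str) else list(condition)
    selected = []
    for name in filenames:
        matches = any(s in name for s in substrings)
        # Keep matching files normally, non-matching files when negated
        if matches != negated:
            selected.append(name)
    return selected

files = ["Pdfoil.dat", "Pdfoil_mono.dat", "Rufoil.dat", "Rufoil_mono.dat"]
kept = select_files(files, "mono", negated=True)
print(kept)
```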


### Edge energy corrections
The energy shifts observed at the different edges are systematic instrument errors. Each shift is therefore consistent across measurements, and the measured data can be corrected using the theoretical edge energies.
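
A minimal sketch of such a correction is given here. It assumes that the edge position is taken as the energy of the steepest rise in the foil spectrum and that the correction is the value added to the measured energies. Both are common conventions but are not confirmed to match `calc_edge_correction`. The spectrum and the reference energy are synthetic, and the function names are hypothetical.

```python
import math

def edge_energy(energies, signal):
    """Estimate the edge position as the energy of the steepest rise."""
    best_slope, best_energy = float("-inf"), None
    for k in range(len(energies) - 1):
        slope = (signal[k + 1] - signal[k]) / (energies[k + 1] - energies[k])
        if slope > best_slope:
            best_slope = slope
            # Midpoint of the interval with the largest finite-difference slope
            best_energy = 0.5 * (energies[k] + energies[k + 1])
    return best_energy

def edge_correction(energies, signal, reference_edge):
    """Shift to add to measured energies so the edge lands on the reference."""
    return reference_edge - edge_energy(energies, signal)

# Synthetic foil spectrum: smooth step centred 2.5 eV below the reference
reference_edge = 24350.0  # illustrative reference value in eV
energies = [24300.0 + k for k in range(101)]  # 1 eV steps
signal = [0.5 + math.atan((e - 24347.5) / 2.0) / math.pi for e in energies]

shift = edge_correction(energies, signal, reference_edge)
print(shift)
```

The resulting shift would play the role of one entry in the `edge_correction_energies` dictionary.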

In [6]:
# # Calculate the edge energy shift at each edge
# edge_correction_energies = {
#     'Pd':calc_edge_correction(df_foils, metal='Pd', edge='K', transmission=use_transmission),
#     'Ag':calc_edge_correction(df_foils, metal='Ag', edge='K', transmission=use_transmission),
#     'Rh':calc_edge_correction(df_foils, metal='Rh', edge='K', transmission=use_transmission),
#     'Ru':calc_edge_correction(df_foils, metal='Ru', edge='K', transmission=use_transmission),
#     'Mn':calc_edge_correction(df_foils, metal='Mn', edge='K', transmission=use_transmission),
#     'Ir':calc_edge_correction(df_foils, metal='Ir', edge='L3', transmission=use_transmission),
#     'Pt':calc_edge_correction(df_foils, metal='Pt', edge='L3', transmission=use_transmission),
# }

# Use a dictionary filled with zeros if no foils have been measured
edge_correction_energies = {
    'Pd':0,
    'Ag':0,
    'Rh':0,
    'Ru':0,
    'Mn':0,
    'Ir':0,
    'Pt':0,
}


### Normalization
Normalization consists of correcting the energy shifts, subtracting the minimum measured value, and dividing by a fit to the post-edge data. A fit to the pre-edge data can also be subtracted from the data, but this sometimes leads to overcorrection.

The pre- and post-edge fits can be visually inspected using the "plot_non_normalized_xas()" function with the optional arguments "pre_edge=True" and "post_edge=True". 

All data **must** be normalized using the same procedure!
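
The normalization steps can be sketched in plain Python. This is an illustration under stated assumptions, not the body of `normalize_data`: the fits are straight lines, the pre-edge line replaces the minimum subtraction when `subtract_preedge` is true, and the fit windows (`pre_margin`, `post_margin`) are hypothetical parameters chosen for the example.

```python
def linear_fit(x, y):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def normalize(energies, signal, edge, shift=0.0, subtract_preedge=True,
              pre_margin=30.0, post_margin=50.0):
    """Shift energies, remove a baseline, divide by a post-edge line."""
    e = [ei + shift for ei in energies]
    if subtract_preedge:
        # Baseline: straight line fitted well below the edge
        pre = [(ei, si) for ei, si in zip(e, signal) if ei < edge - pre_margin]
        slope, intercept = linear_fit(*zip(*pre))
        corrected = [si - (slope * ei + intercept) for ei, si in zip(e, signal)]
    else:
        # Baseline: minimum measured value
        lowest = min(signal)
        corrected = [si - lowest for si in signal]
    # Scale: straight line fitted well above the edge
    post = [(ei, ci) for ei, ci in zip(e, corrected) if ei > edge + post_margin]
    slope, intercept = linear_fit(*zip(*post))
    return [ci / (slope * ei + intercept) for ei, ci in zip(e, corrected)]

# Synthetic spectrum: sloping background plus a step of height 2 at the edge
energies = [24200.0 + k for k in range(301)]
signal = [0.2 + 0.001 * (e - 24200.0) + (2.0 if e >= 24350.0 else 0.0)
          for e in energies]

norm_pre = normalize(energies, signal, edge=24350.0, subtract_preedge=True)
norm_min = normalize(energies, signal, edge=24350.0, subtract_preedge=False)
print(norm_pre[0], norm_pre[-1], norm_min[0], norm_min[-1])
```

In both variants the post-edge region ends up close to 1, which is what makes spectra from standards and experiments comparable.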

In [7]:
# Normalization of the data
normalize_data(
    df_foils, 
    edge_correction_energies, 
    subtract_preedge=use_preedge, 
    transmission=use_transmission,
)
df_foils.head()

Unnamed: 0,Filename,Experiment,Measurement,ZapEnergy,MonEx,xmap_roi00,Ion1,Metal,Precursor,Energy,Temperature,Absorption,Transmission,Relative Time,Energy_Corrected,Normalized,pre_edge,post_edge
0,Agfoil.dat,Agfoil,1,25.400257,1838.0,1642.0,47334.0,Ag,Avg,25400.257812,25.771809,0.892292,-3.248945,0,25400.257812,0.003856,-0.187556,57.054533
1,Agfoil.dat,Agfoil,1,25.40056,1856.0,1549.0,47827.0,Ag,Avg,25400.560547,25.771809,0.859137,-3.249295,0,25400.560547,0.003179,-0.181941,57.053846
2,Agfoil.dat,Agfoil,1,25.401297,1854.0,1587.0,47723.0,Ag,Avg,25401.296875,25.771809,0.864015,-3.249086,0,25401.296875,0.003026,-0.168285,57.052175
3,Agfoil.dat,Agfoil,1,25.40201,1855.0,1650.0,47774.0,Ag,Avg,25402.009766,25.771809,0.886617,-3.248814,0,25402.009766,0.003191,-0.155064,57.050557
4,Agfoil.dat,Agfoil,1,25.402702,1855.0,1670.0,47787.0,Ag,Avg,25402.703125,25.771809,0.892608,-3.248996,0,25402.703125,0.003072,-0.142205,57.048982


### Plotting
It is always a good idea to visually inspect the data to see if it behaves as it should.

In [8]:
plot_non_normalized_xas(df_foils, 'Pdfoil', pre_edge=True, post_edge=True, transmission=use_transmission, interactive=interactive)

## Precursors

In [9]:
# Specify data location
folder_precursor_standards = '../Data/ESRF_BM31/wheel/wheel/'

# Load data
df_precursors = load_xas_data(
    folder_precursor_standards, 
    synchrotron='ESRF', 
    file_selection_condition='mono', 
    negated_condition=True,
)

# Initial data processing
df_precursors = processing_df(df_precursors, synchrotron='ESRF')

# Specify the measurements to use when averaging. 
# This can be given as either a list or a range.
measurements_to_average = range(1,3)

# Create dataframe with the reference spectra for the unreduced precursors
df_precursors = average_measurements(df_precursors, measurements_to_average)

Loading data: 100%|██████████| 8/8 [00:00<00:00, 20.10it/s, Currently loading Ruacac.dat]


Incomplete measurement detected!
Not all edges were measured 3 times, but only 2 times.
Incomplete measurements will be removed unless keep_incomplete="True".

Incomplete measurements were removed!
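
Averaging repeat measurements can be pictured as a point-by-point mean over the selected measurement numbers. The sketch below is a simplified stand-in for `average_measurements`, assuming all repeats share the same energy grid; the helper name and the toy values are hypothetical. Note that `range(1,3)` selects measurements 1 and 2 only.

```python
def average_repeats(spectra, measurements):
    """Point-by-point mean of the selected repeat measurements.

    spectra maps a measurement number to a list of signal values; all
    selected spectra are assumed to share the same energy grid.
    """
    selected = [spectra[m] for m in measurements]
    n_points = len(selected[0])
    if any(len(s) != n_points for s in selected):
        raise ValueError("Selected measurements have different lengths")
    return [sum(values) / len(selected) for values in zip(*selected)]

# Measurement 3 is deliberately different and is not selected below
spectra = {1: [0.0, 0.5, 1.0], 2: [0.5, 1.0, 1.0], 3: [9.0, 9.0, 9.0]}
averaged = average_repeats(spectra, range(1, 3))  # uses measurements 1 and 2
print(averaged)
```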


### Normalization

In [10]:
# Normalization of the data
normalize_data(
    df_precursors, 
    edge_correction_energies, 
    subtract_preedge=use_preedge, 
    transmission=use_transmission
)
df_precursors.head()

Unnamed: 0,Filename,Experiment,Measurement,ZapEnergy,MonEx,xmap_roi00,Ion1,Metal,Precursor,Energy,Temperature,Absorption,Transmission,Relative Time,Energy_Corrected,Normalized,pre_edge,post_edge
0,Agacac.dat,Agacac,1,25.400257,15827.0,2483.0,38828.0,Ag,Avg,25400.257812,1100.0,0.159598,-0.897378,0,25400.257812,0.003925,-0.031966,8.113111
1,Agacac.dat,Agacac,1,25.40056,15951.0,2585.0,39134.0,Ag,Avg,25400.560547,1100.0,0.162173,-0.897445,0,25400.560547,0.004133,-0.031083,8.113207
2,Agacac.dat,Agacac,1,25.401253,15932.0,2544.0,39093.0,Ag,Avg,25401.251953,1100.0,0.160627,-0.897563,0,25401.251953,0.003696,-0.029068,8.113428
3,Agacac.dat,Agacac,1,25.402031,15940.0,2685.0,39104.0,Ag,Avg,25402.03125,1100.0,0.163802,-0.897325,0,25402.03125,0.003808,-0.026797,8.113677
4,Agacac.dat,Agacac,1,25.402941,15921.0,2678.0,39048.0,Ag,Avg,25402.941406,1100.0,0.165325,-0.897333,0,25402.941406,0.003671,-0.024145,8.113968


### Plotting
It is always a good idea to visually inspect the data to see if it behaves as it should.

In [11]:
plot_non_normalized_xas(df_precursors, 'Pdacac', pre_edge=True, post_edge=True, transmission=use_transmission, interactive=interactive)

# Experiments
Here the measured data from different experiments are loaded, preprocessed and analysed. 

This section needs to be run every time a new experiment is analysed.

## Preprocessing

### Single dataset
Use either this section or *Stitching together datasets*.

In [12]:
# # Specify data location
# folder_XAS_data = '../Data/ESRF_BM31/IrPtPdRuRh_XAS/IrPtPdRuRh_XAS/XAS1/'

# # Load data
# df_data = load_xas_data(
#     folder_XAS_data, 
#     synchrotron='ESRF', 
#     file_selection_condition='mono', 
#     negated_condition=True
# )

# # Initial data processing
# df_data = processing_df(df_data, synchrotron='ESRF')

### Stitching together datasets
Use either this section or *Single dataset*.

In [13]:
# Specify all data locations
list_of_folders = [
    '../Data/ESRF_BM31/IrPtPdRuRh_XAS/IrPtPdRuRh_XAS/XAS1/',
    '../Data/ESRF_BM31/IrPtPdRuRh_XAS/IrPtPdRuRh_XAS/XAS2/',
    '../Data/ESRF_BM31/IrPtPdRuRh_XAS/IrPtPdRuRh_XAS/XAS3/',
    '../Data/ESRF_BM31/IrPtPdRuRh_XAS/IrPtPdRuRh_XAS/XAS4/',
]

# Create empty list to hold all datasets
list_of_datasets = []

# Load data
for folder in list_of_folders:
    df_data = load_xas_data(
        folder, 
        synchrotron='ESRF', 
        file_selection_condition='mono', 
        negated_condition=True, 
        verbose=False,
    )

    # Initial data processing
    df_data = processing_df(df_data, synchrotron='ESRF')

    # Append to list of datasets
    list_of_datasets.append(df_data)

# Combine the datasets
df_data = combine_datasets(list_of_datasets)

Loading data: 100%|██████████| 5/5 [00:00<00:00, 26.46it/s, Currently loading Ru.dat]
Loading data: 100%|██████████| 5/5 [00:00<00:00, 18.91it/s, Currently loading Ru.dat]
Loading data: 100%|██████████| 5/5 [00:00<00:00, 35.55it/s, Currently loading Ru.dat]
Loading data: 100%|██████████| 5/5 [00:00<00:00, 17.31it/s, Currently loading Ru.dat]
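
How `combine_datasets` numbers the measurements is not shown in this notebook. One plausible approach, sketched here purely as an assumption, is to offset the measurement numbers of each dataset so that they continue from the previous one. The helper name `combine_runs` and the toy rows are hypothetical.

```python
def combine_runs(runs):
    """Concatenate runs, renumbering measurements so they keep increasing."""
    combined = []
    offset = 0
    for run in runs:
        for row in run:
            new_row = dict(row)  # copy so the input rows stay untouched
            new_row["Measurement"] = row["Measurement"] + offset
            combined.append(new_row)
        if run:
            # The next run continues after the highest number used so far
            offset += max(row["Measurement"] for row in run)
    return combined

run_a = [{"Metal": "Pd", "Measurement": 1}, {"Metal": "Pd", "Measurement": 2}]
run_b = [{"Metal": "Pd", "Measurement": 1}, {"Metal": "Pd", "Measurement": 2}]
stitched = combine_runs([run_a, run_b])
print([row["Measurement"] for row in stitched])
```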


### Normalization

In [14]:
# Normalization of the data
normalize_data(
    df_data, 
    edge_correction_energies, 
    subtract_preedge=use_preedge, 
    transmission=use_transmission
)
df_data.head()

Unnamed: 0,index,Filename,Experiment,Measurement,ZapEnergy,MonEx,xmap_roi00,Ion1,Metal,Precursor,Energy,Temperature,Absorption,Transmission,Relative Time,Energy_Corrected,Normalized,pre_edge,post_edge
0,0,Ir.dat,Ir,1,11.170101,83329.0,389.0,44793.0,Ir,,11170.101562,39.459183,0.004668,0.620745,0,11170.101562,0.008543,-0.001677,0.224033
1,1,Ir.dat,Ir,1,11.170307,83551.0,377.0,44911.0,Ir,,11170.307617,39.459183,0.004512,0.620774,0,11170.307617,0.007512,-0.001599,0.223956
2,2,Ir.dat,Ir,1,11.170659,83612.0,384.0,44955.0,Ir,,11170.65918,39.459183,0.004593,0.620525,0,11170.65918,0.007287,-0.001466,0.223823
3,3,Ir.dat,Ir,1,11.171026,83536.0,369.0,44922.0,Ir,,11171.026367,39.459183,0.004417,0.62035,0,11171.026367,0.0059,-0.001327,0.223684
4,4,Ir.dat,Ir,1,11.171415,83525.0,409.0,44919.0,Ir,,11171.415039,39.459183,0.004897,0.620285,0,11171.415039,0.007387,-0.00118,0.223537


#### Saving results as .csv file

In [15]:
# save_data(df_data, filename='Normalized_XAS_data.csv')

## Data inspection
It is always a good idea to visually inspect the data to see if it behaves as it should.

In [29]:
plot_data(df_data, 'Pd', foils=df_foils, precursors=df_precursors, precursor_suffix=None, interactive=interactive)

## Linear combination analysis
This section performs linear combination analysis (LCA) for every two-component system consisting of one metal foil and one precursor of the same metal.

The estimated uncertainty of the dependent parameter behaves erratically when the independent parameter is approximately zero. In the column "StdCorrected" this is handled by using the same uncertainty for both parameters.
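
A two-component fit of this kind can be sketched with a closed-form least-squares solution. The sketch assumes that the two weights sum to one, so that only the product weight is fitted and the precursor weight follows as its complement. It is not the autoXAS fitting routine, it does not constrain the weight to the range 0 to 1, and the function name and toy spectra are hypothetical.

```python
import math

def two_component_lca(spectrum, product, precursor):
    """Fit spectrum ~ w * product + (1 - w) * precursor by least squares.

    Returns (w, stderr). Because the weights sum to one, the precursor
    weight is 1 - w and carries the same standard error.
    """
    diff = [p - q for p, q in zip(product, precursor)]
    target = [s - q for s, q in zip(spectrum, precursor)]
    sdd = sum(d * d for d in diff)
    w = sum(t * d for t, d in zip(target, diff)) / sdd
    residuals = [t - w * d for t, d in zip(target, diff)]
    rss = sum(r * r for r in residuals)
    # Standard error of a single fitted parameter (n - 1 degrees of freedom)
    stderr = math.sqrt(rss / (len(spectrum) - 1) / sdd)
    return w, stderr

# Toy reference spectra and a mixture with 30 % product
product = [0.0, 0.2, 0.9, 1.0, 1.0]
precursor = [0.0, 0.6, 1.4, 1.1, 1.0]
mixture = [0.3 * p + 0.7 * q for p, q in zip(product, precursor)]

weight, weight_err = two_component_lca(mixture, product, precursor)
print(weight, 1.0 - weight, weight_err)
```

Since the dependent weight is a fixed function of the fitted one, assigning both the same uncertainty is the natural choice in this simplified picture.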

In [17]:
# # LCA using measured references
# df_results = linear_combination_analysis(
#     data = df_data, 
#     products = df_foils, 
#     precursors = df_precursors,
# )
# df_results.head()

In [18]:
# LCA using specific measurements in the experiment as references
df_results = LCA_internal(
    df_data, 
    initial_state_index = 1, 
    # intermediate_state_index = None, 
    final_state_index = -1,
)
df_results.head()

LCA progress: : 225it [00:02, 87.63it/s, Analysing frame 1 + 45]           


Unnamed: 0,Experiment,Metal,Product,Intermediate,Precursor,Precursor Type,Measurement,Temperature,Temperature Average,Temperature Std,Parameter,Value,StdErr,StdCorrected,Energy Range,Basis Function
0,Frame 1 + last,Ir,last,,1,Internal,1,41.438393,41.440155,1.34725,product_weight,6.106227e-15,1.789563e-16,1.789563e-16,"[11170.1015625, 11170.3076171875, 11170.659179...","[0.005100496047539261, 0.0107640321886341, 0.0..."
1,Frame 1 + last,Ir,last,,1,Internal,1,41.438393,41.440155,1.34725,precursor_weight,1.0,0.0,1.789563e-16,"[11170.1015625, 11170.3076171875, 11170.659179...","[0.008543213423572045, 0.007511793678159319, 0..."
2,Frame 1 + last,Ir,last,,1,Internal,2,46.650097,43.886864,1.941293,product_weight,4.218847e-15,0.008453071,0.008453071,"[11170.083984375, 11170.287109375, 11170.66699...","[0.005100496047539261, 0.010429008441956175, 0..."
3,Frame 1 + last,Ir,last,,1,Internal,2,46.650097,43.886864,1.941293,precursor_weight,1.0,0.0,0.008453071,"[11170.083984375, 11170.287109375, 11170.66699...","[0.008543213423572045, 0.0076144468281766995, ..."
4,Frame 1 + last,Ir,last,,1,Internal,3,47.809467,47.003166,2.54706,product_weight,2.220446e-16,0.01560961,0.01560961,"[11170.138671875, 11170.3037109375, 11170.6630...","[0.005664691477536346, 0.010800257553905195, 0..."


#### Saving results as .csv file

In [19]:
# save_data(df_results, filename='LCA_results.csv')

### Results plotting

#### Temperature curves

In [20]:
plot_temperatures(
    df_results, 
    with_uncertainty=True, 
    interactive=interactive
)

#### Waterfall plots


In [27]:
plot_insitu_waterfall(
    df_data, 
    experiment='Pd', 
    # lines=[5,33,109],
    vmin=0.8, 
    # vmax=1.3, 
    y_axis='Measurement',
    time_unit='m',
    interactive=interactive,
    homogenize_measurements=True,
)

In [26]:
plot_insitu_change(
    df_data, 
    experiment='Pd', 
    reference_measurement=1,
    # lines=[5,348],
    vmin=-0.25, 
    vmax=0.25, 
    y_axis='Measurement',
    time_unit='m',
    interactive=interactive,
    homogenize_measurements=True,
)

#### Plot of a single LCA fit
Plot showing the measurement that is being fitted, the contributions from the components and the residual.
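
The quantities in such a plot can be computed as in the sketch below, which assumes a two-component model whose weights sum to one. The function name and the toy values are hypothetical and only illustrate what "contributions" and "residual" mean here.

```python
def lca_decomposition(spectrum, product, precursor, product_weight):
    """Return the weighted components, their sum and the fit residual."""
    product_part = [product_weight * p for p in product]
    precursor_part = [(1.0 - product_weight) * q for q in precursor]
    fit = [a + b for a, b in zip(product_part, precursor_part)]
    residual = [s - f for s, f in zip(spectrum, fit)]
    return product_part, precursor_part, fit, residual

product = [0.0, 0.4, 1.0]
precursor = [0.0, 0.8, 1.2]
measured = [0.0, 0.7, 1.2]

parts = lca_decomposition(measured, product, precursor, product_weight=0.25)
product_part, precursor_part, fit, residual = parts
print(fit, residual)
```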

In [23]:
plot_LCA(
    df_results, 
    df_data, 
    experiment='Frame 1 + last', 
    metal='Pd',
    measurement=27, 
    interactive=interactive
)

#### Plot of LCA component weights over time
Plot showing how the weight of each component changes over time.

In [24]:
plot_LCA_change(df_results, product='last', precursor=1, metal='Pd', x_axis='Measurement', with_uncertainty=True, interactive=interactive)

#### Comparison of reduction times of different metals
Plot showing the weight of the metal foil component over time for the different metal species in the sample. 

In [25]:
plot_reduction_comparison(
    df_results, 
    precursor_type='all', 
    x_axis='Measurement', 
    with_uncertainty=True, 
    interactive=interactive
)