<a href="https://colab.research.google.com/github/fsacconeUBA/PyMossFit/blob/paper/PyMossFit_paper.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# IMPORTANT: RUN THIS CELL IN ORDER TO IMPORT YOUR KAGGLE DATA SOURCES,
# THEN FEEL FREE TO DELETE THIS CELL.
# NOTE: THIS NOTEBOOK ENVIRONMENT DIFFERS FROM KAGGLE'S PYTHON
# ENVIRONMENT SO THERE MAY BE MISSING LIBRARIES USED BY YOUR
# NOTEBOOK.
import kagglehub
fabiodanielsaccone_iguazu_sample_path = kagglehub.dataset_download('fabiodanielsaccone/iguazu-sample')
fabiodanielsaccone_opportunity_path = kagglehub.dataset_download('fabiodanielsaccone/opportunity')

print('Data source import complete.')


# PyMossFit: A Google Colab option for Mössbauer Spectra Fitting

# Introduction
Mössbauer spectroscopy is a highly specialized technique that probes the recoil-free resonant absorption of gamma rays by atomic nuclei. The Mössbauer effect, observed primarily in isotopes such as iron-57 ($^{57}Fe$), provides insight into hyperfine interactions through the isomer shift, the quadrupole splitting, and the magnetic hyperfine field. These parameters offer detailed information about the electronic, magnetic, and structural environment of the probe nucleus, making the technique invaluable in materials science, chemistry, and condensed matter physics. The resonant gamma-ray transition in $^{57}Fe$ nuclei has an energy of *14.4 keV*.

In a typical Mössbauer experiment, a radioactive source emits gamma rays, which are absorbed by the nuclei in the sample. Resonant absorption occurs only for the small fraction of gamma rays that are emitted and absorbed without recoil. The detector measures the intensity of the transmitted radiation as a function of the velocity of the gamma-ray source, which tunes the photon energy through the Doppler effect. The result is a Mössbauer spectrum, typically characterized by sharp absorption dips at the resonance velocities, corresponding to the different hyperfine interactions within the sample. The most common experimental setup is the *Transmission Geometry*, in which the observed lines come from a reduction of the *14.4 keV* gamma count rate at the detector relative to the background signal. Another option is *Conversion Electron Mössbauer Spectroscopy (CEMS)*, which detects the backscattered electrons emitted after gamma absorption.
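To make the velocity axis concrete: moving the source at velocity $v$ shifts the photon energy by $\Delta E = E_\gamma v / c$ (first-order Doppler effect). The short sketch below evaluates this shift for the *14.4 keV* transition quoted above. The function and constant names are illustrative only and are not part of PyMossFit.

```python
# Speed of light in mm/s (from the exact SI value 299 792 458 m/s).
C_MM_PER_S = 299_792_458.0 * 1e3
# 57Fe transition energy as quoted in the text, in eV.
E_GAMMA_EV = 14.4e3


def doppler_shift_ev(velocity_mm_s, e_gamma_ev=E_GAMMA_EV):
    """First-order Doppler energy shift (eV) for a source velocity in mm/s."""
    return e_gamma_ev * velocity_mm_s / C_MM_PER_S


shift = doppler_shift_ev(1.0)
print(f"1 mm/s corresponds to {shift * 1e9:.1f} neV")
```

A velocity of 1 mm/s therefore corresponds to roughly 48 neV, which is why velocities of a few mm/s are enough to scan the hyperfine structure of $^{57}Fe$.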

In a recent review [[Grandjean2021]](https://doi.org/10.1021/acs.chemmater.1c00326), Grandjean and Long made a series of suggestions on how good measurements should be taken and on good practice for Mössbauer data treatment and the presentation of the corresponding fits.

In this context, Google Colab is a useful tool for collaborative data analysis: it requires no locally installed packages, and users can run their code from multiple devices.

# Mössbauer Spectra and Curve Shapes
The shape of a Mössbauer spectrum depends on the nature of the hyperfine interactions in the sample. For example, a simple paramagnetic material might produce a single absorption line (Lorentzian, Gaussian or PseudoVoigt) displaced by the isomer shift. Materials with quadrupole splitting generate a doublet (two lines), while materials experiencing magnetic hyperfine interactions often exhibit more complex six-line (sextet) patterns. These spectral features often overlap, making it challenging to isolate and quantify each component manually.

Curve fitting plays a crucial role in Mössbauer spectral analysis, as it allows for the decomposition of these complex spectra into individual contributions, each associated with specific hyperfine parameters. The challenge lies in accurately reproducing the spectral shapes using mathematical models and adjusting the parameters until a satisfactory fit is achieved.
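As an illustration of the line shapes mentioned in this section, the sketch below builds a Lorentzian, a Gaussian and a PseudoVoigt profile with a common full width at half maximum, and uses them to synthesise a singlet and a doublet in transmission. The function names, velocity range and numerical values are assumptions chosen for the example, not PyMossFit's own definitions.

```python
import numpy as np


def lorentzian(x, center, fwhm):
    """Lorentzian normalised to unit height at its centre."""
    return (fwhm / 2) ** 2 / ((x - center) ** 2 + (fwhm / 2) ** 2)


def gaussian(x, center, fwhm):
    """Gaussian normalised to unit height, parametrised by its FWHM."""
    sigma = fwhm / (2 * np.sqrt(2 * np.log(2)))
    return np.exp(-((x - center) ** 2) / (2 * sigma ** 2))


def pseudo_voigt(x, center, fwhm, eta):
    """Weighted sum eta * Lorentzian + (1 - eta) * Gaussian, with 0 <= eta <= 1."""
    return eta * lorentzian(x, center, fwhm) + (1 - eta) * gaussian(x, center, fwhm)


velocity = np.linspace(-4, 4, 801)  # mm/s, illustrative range

# Singlet: one Lorentzian dip of 10 % depth at an isomer shift of 0.3 mm/s.
singlet = 1 - 0.10 * lorentzian(velocity, 0.3, 0.30)

# Doublet: two PseudoVoigt dips split by 1.8 mm/s around the same centroid.
doublet = 1 - 0.10 * (pseudo_voigt(velocity, 0.3 - 0.9, 0.30, 0.5)
                      + pseudo_voigt(velocity, 0.3 + 0.9, 0.30, 0.5))
```

All three profiles fall to half of their maximum at `center ± fwhm / 2`; they differ in the tails, which decay much more slowly for the Lorentzian.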

# Curve Fitting and Least Squares Method
To extract physical parameters from Mössbauer spectra, fitting procedures are applied to the experimental data. One of the most common approaches is the *least-squares method*, which minimizes the difference between the experimental data points and the theoretical model curve. The objective is to adjust the parameters of the theoretical model (such as isomer shift, quadrupole splitting, and line broadening) so that the calculated spectrum fits the experimental data as closely as possible. This requires minimizing the $\chi^2$ statistic, defined as:

$$\chi^2=\sum_{i=1}^{N}\frac{(y_i^{exp}-y_i^{model}(\vec{p}))^2}{\epsilon_i^2}$$

where $y_i^{exp}$ is the measured value in channel $i$, $y_i^{model}(\vec{p})$ is the model evaluated with the parameter vector $\vec{p}$, $\epsilon_i$ is the uncertainty of that data point, and $N$ is the number of data points.
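The definition of $\chi^2$ translates directly into a few lines of NumPy. The sketch below also computes the reduced $\chi^2$ (the value divided by the number of data points minus the number of fitted parameters), which is the quantity that fit reports usually list next to it. The numbers are made up for illustration, and the choice $\epsilon_i = \sqrt{y_i^{exp}}$ assumes Poisson counting statistics on raw counts.

```python
import numpy as np


def chi_square(y_exp, y_model, eps):
    """Weighted sum of squared residuals, following the chi-square definition."""
    return float(np.sum(((y_exp - y_model) / eps) ** 2))


def reduced_chi_square(y_exp, y_model, eps, n_params):
    """Chi-square divided by the degrees of freedom (N - n_params)."""
    return chi_square(y_exp, y_model, eps) / (len(y_exp) - n_params)


# Illustrative numbers only: four channels of raw counts and a model prediction.
counts = np.array([10000.0, 9900.0, 9500.0, 9950.0])
model = np.array([10050.0, 9880.0, 9520.0, 9940.0])
eps = np.sqrt(counts)  # assumption: Poisson counting statistics

print(chi_square(counts, model, eps))
print(reduced_chi_square(counts, model, eps, n_params=1))
```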

In Python, this process can be implemented with libraries such as *Lmfit*, *SciPy*, or *NumPy*, which provide robust tools for least-squares curve fitting. The general approach involves defining a model function that represents the expected shape of the Mössbauer spectrum, which could be a sum of several Lorentzian or Gaussian functions, depending on the number and type of spectral components. Lorentzian functions are preferred for crystalline samples, while PseudoVoigts (weighted sums of a Lorentzian and a Gaussian function) are appropriate for Fe sites in disordered materials. The least-squares fitting algorithm iteratively adjusts the parameters of the model until the sum of squared residuals between the experimental and calculated spectra is minimized.
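A minimal sketch of this workflow, using `scipy.optimize.curve_fit` on a synthetic Lorentzian doublet, is given below. The model function, the "true" parameters, the noise level and the initial guesses are all assumptions made for the example; a real analysis would start from measured data.

```python
import numpy as np
from scipy.optimize import curve_fit


def doublet_model(x, a, b, m, d):
    """Transmission doublet: unit baseline minus two Lorentzians of FWHM b."""
    def line(c):
        return 2 * a * b / (np.pi * (b ** 2 + 4 * (x - c) ** 2))
    return 1 - line(m - d / 2) - line(m + d / 2)


# Synthetic "experimental" spectrum: known parameters plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-4, 4, 256)
true_params = (0.06, 0.33, 1.07, 2.90)  # a, b (mm/s), m (mm/s), d (mm/s)
y = doublet_model(x, *true_params) + rng.normal(0, 1e-3, x.size)

# Least-squares fit starting from rough initial guesses.
p0 = (0.10, 0.30, 1.00, 2.80)
popt, pcov = curve_fit(doublet_model, x, y, p0=p0)
perr = np.sqrt(np.diag(pcov))  # 1-sigma parameter uncertainties

for name, val, err in zip("abmd", popt, perr):
    print(f"{name} = {val:.3f} +/- {err:.3f}")
```

The fit only converges reliably when the initial guesses place each line within roughly one linewidth of its true position, which is why inspecting the spectrum before fitting is worthwhile.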

# Description of the Python code for Google Colab (PyMossFit)
The Python code is available as several Jupyter notebooks covering typical examples found in practice [[Saccone2024a]](https://github.com/fsacconeUBA/Mossbauer/releases/tag/PyMossFit-V3)[[Saccone2024b]](http://dx.doi.org/10.13140/RG.2.2.20717.81127). Some selected parts are described throughout this page. (The comments and code section headings are in Spanish.)

The code is structured in three cells. The first one installs *Lmfit*, the core of the data fitting, and imports some packages from *SciPy*, *Pandas*, *Matplotlib* and *NumPy*, among others. It then mounts Google Drive, which asks the user for permission.

In [None]:
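!pip install lmfit
from pathlib import Path

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from google.colab import drive
from lmfit import Parameters, minimize, fit_report
from scipy.constants import *  # provides pi, used by the line-shape functions
from scipy.integrate import trapezoid as trapz  # trapz is no longer available in recent SciPy releases
from scipy.signal import savgol_filter

# Mount Google Drive (asks the user for permission)
drive.mount('/content/drive/', force_remount=True)

# Placeholder path to the data file; adjust it to your own Drive layout
img = '/content/drive/MyDrive/Colab Notebooks/My Directory/My File'

path = Path(img)
name = path.stem          # file name without extension
title = path.parent.name  # name of the containing folder
full = path.parents[0]    # full path of the containing folder
print(name, title, full)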

The second cell reads the data file (its format should be inspected beforehand to set the "delimiter", "columns" and "skiprows" parameters). The required inputs are the date (in YYYYMMDD format) and the maximum velocity associated with the extreme channels.
In this cell, the spectrum is folded with the help of a Discrete Fourier Transform routine, the NumPy fft. The procedure relies on the Nyquist-Shannon sampling theorem, which helps to determine the folding channel from the symmetry of the discrete Fourier spectrum [[Kong2020]](https://pythonnumericalmethods.studentorg.berkeley.edu/notebooks/chapter24.02-Discrete-Fourier-Transform.html). The data can also be smoothed with a Savitzky-Golay filter (savgol_filter from SciPy is used). After folding, a new data file is saved for fitting in the next cell.
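One possible way to locate a folding point with the FFT is sketched below. It is not necessarily the routine used in PyMossFit: the sketch assumes that the second half of the unfolded spectrum is the mirror image of the first half, and finds the mirror position as the maximum of the FFT-based self-convolution of the spectrum. The function names, the synthetic spectrum and `v_max` are hypothetical.

```python
import numpy as np


def find_mirror_index_sum(counts):
    """Return s such that counts[i] is best matched by counts[s - i].

    The self-convolution conv[s] = sum_i y[i] * y[s - i] peaks where the
    spectrum overlaps its own mirror image; it is evaluated with the FFT.
    """
    y = np.asarray(counts, dtype=float)
    y = y - y.mean()                    # remove the baseline before convolving
    n = y.size
    spectrum = np.fft.rfft(y, 2 * n)    # zero-padding avoids circular wrap-around
    conv = np.fft.irfft(spectrum ** 2, 2 * n)
    return int(np.argmax(conv))


def fold(counts, s):
    """Add each channel i to its mirror partner s - i (pairs with i < s - i)."""
    counts = np.asarray(counts, dtype=float)
    i = np.arange(counts.size)
    j = s - i
    valid = (j < counts.size) & (i < j)
    return counts[i[valid]] + counts[j[valid]]


# Synthetic unfolded spectrum: 256 channels followed by their mirror image.
v_max = 10.0                            # mm/s, hypothetical maximum velocity
half_velocity = np.linspace(-v_max, v_max, 256)
half_counts = 1e5 * (1 - 0.1 * 0.25 / (0.25 + 4 * (half_velocity - 1.0) ** 2))
unfolded = np.concatenate([half_counts, half_counts[::-1]])

s = find_mirror_index_sum(unfolded)     # the folding channel is s / 2
folded = fold(unfolded, s)
velocity = np.linspace(-v_max, v_max, folded.size)
```

With real data the two halves differ by counting noise, so the maximum of the self-convolution is broader; checking the folded spectrum visually, as in the plots of this section, remains advisable.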

An example of the output plot, before and after folding, is shown below. The sample corresponds to iron-zinc mixed ferrites [[Ferrari2015]](http://dx.doi.org/10.1109/TMAG.2014.2377132).

![imagen.png](attachment:b90bc39b-8d8e-4c8a-9135-8672b4529ac9.png)

![imagen.png](attachment:96e9ca11-185c-4bc5-b1ff-beb12d24fe79.png)

The last cell performs the fit, prints a statistical report and plots the original data together with the fitted model. The fitting functions are defined as follows (doublet_v and sextet_v stand for the PseudoVoigt versions):

In [None]:
import numpy as np

def lorentz(b, c, x):
    # Lorentzian kernel of FWHM b centred at c (integrates to 1/2)
    return b/(np.pi*(b**2+4*(x-c)**2))

def gauss(s, c, x):
    # Unit-area Gaussian of standard deviation s centred at c
    return np.exp(-(x-c)**2/(2*s**2))/(s*np.sqrt(2*np.pi))

def sextet_lines(m, d, q, l23):
    # (relative intensity, position) of the six lines, intensities 3:l23:1:1:l23:3
    return [(3, m+5*d), (l23, m+3*d-q), (1, m+d), (1, m-d), (l23, m-3*d+q), (3, m-5*d)]

def singlet(a, b, m, x):
    return 2*a*lorentz(b, m, x)

def doublet(a, b, m, d, x):
    return 2*a*(lorentz(b, m-d/2, x)+lorentz(b, m+d/2, x))

def sextet(a, b, m, d, q, l23, x):
    return sum(w*a*lorentz(b, c, x) for w, c in sextet_lines(m, d, q, l23))

def doublet_v(av, mv, bl, bg, dl, dg, x):
    # PseudoVoigt doublet: Lorentzian fraction av, Gaussian fraction (1-av)
    return (2*av*(lorentz(bl, mv-dl/2, x)+lorentz(bl, mv+dl/2, x))
            +(1-av)*(gauss(bg, mv-dg/2, x)+gauss(bg, mv+dg/2, x)))

def sextet_v(a, b, m, d, q, l23, x):
    # PseudoVoigt sextet: each line is a Lorentzian plus a Gaussian of width b
    return sum(w*a*(lorentz(b, c, x)+gauss(b, c, x)) for w, c in sextet_lines(m, d, q, l23))

def linear_fitting_lmfit(params, x, y):
    # Residual for lmfit.minimize: two doublets and two sextets subtracted from a unit baseline
    p = params.valuesdict()
    y_fit = (1-doublet(p['a1'], p['b1'], p['m1'], p['d1'], x)
             -sextet(p['a2'], p['b2'], p['m2'], p['d2'], p['q2'], p['l232'], x)
             -sextet(p['a3'], p['b3'], p['m3'], p['d3'], p['q3'], p['l233'], x)
             -doublet(p['a4'], p['b4'], p['m4'], p['d4'], x))
    return y_fit-y

Parameters such as the linewidth, isomer shift, quadrupole splitting and hyperfine field are calculated after adjusting the amplitude (*a*), full width at half maximum (*b*), centroid (*m*), line shift (*d*) and line separation (*q*). Each initial parameter can be fitted or held fixed by selecting the *"True"* or *"False"* option, respectively. A typical fit report produced by this cell looks as follows:


In [None]:
Width (sigma/sqrt(2)): 0.24 mm/s
Centroid (ISO1): 1.066 mm/s
Amplitude (a): 0.0621
Doublet: 2.896 mm/s
Area: 100.0 %
[[Fit Statistics]]
    # fitting method   = least_squares
    # function evals   = 35
    # data points      = 255
    # variables        = 4
    chi-square         = 2.8129e-04
    reduced chi-square = 1.1207e-06
    Akaike info crit   = -3489.93535
    Bayesian info crit = -3475.77029
[[Variables]]
    b:  0.33342395 +/- 0.00174860 (0.52%) (init = 0.3)
    m:  1.06583140 +/- 6.2960e-04 (0.06%) (init = 1)
    d:  2.89583004 +/- 0.00125858 (0.04%) (init = 2.9)
    a:  0.06212909 +/- 2.2884e-04 (0.37%) (init = 0.13)
[[Correlations]] (unreported correlations are < 0.100)
    C(b, a) = +0.6938

Finally, the code saves the fitted parameters and the experimental and modeled subspectra to CSV files. A final cell helps identify the phases present by comparison with a user-supplied reference database. The ML algorithm used to suggest candidate phases is based on K-Nearest Neighbors.

In [None]:
df = pd.DataFrame({
    'Ancho(mm/s)': results_df.loc[results_df['Parameter'].isin(['d1_gamma', 'd2_gamma']), 'Value'].values.tolist(),
    'IS (mm/s)': results_df.loc[results_df['Parameter'].isin(['d1_delta', 'd2_delta']), 'Value'].values.tolist(),
    'Quad Splitting (mm/s)': results_df.loc[results_df['Parameter'].isin(['d1_quad', 'd2_quad']), 'Value'].values.tolist(),
    'Bhf (T)': [0, 0]  # Ensure 'Bhf (T)' has the same length as others
})
df.to_csv(f"{full}/{file}-report.csv", index=False)

In [None]:
import pandas as pd
import numpy as np
from sklearn.neighbors import NearestNeighbors

# 1. Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

# 2. Load the reference data (database)
reference_path = '/content/drive/MyDrive/Colab-Notebooks/reference_data.csv'  # Adjust the path!
df_ref = pd.read_csv(reference_path)

# Convert ranges to their midpoint (e.g. "0.37-0.45" → 0.41)
def parse_value(value):
    if isinstance(value, str):
        s = value.strip()
        sep = s.find('-', 1)  # start at index 1 so a leading minus sign is not read as a range dash
        if sep > 0:
            return (float(s[:sep]) + float(s[sep + 1:])) / 2
    return float(value)

# Process the relevant columns
cols = ['IS (mm/s)', 'Quad Splitting (mm/s)', 'Bhf (T)']
for col in cols:
    df_ref[col] = df_ref[col].apply(parse_value)

# 3. Load the experimental data
experimental_path = f"{full}/{file}-report.csv"  # Adjust the path!
df_exp = pd.read_csv(experimental_path)

# 4. Preprocess the experimental data (handle NaN)
X_exp = df_exp[cols].fillna(0).values  # If Bhf is missing, replace NaN with 0

# 5. Fit the KNN model
# Note: the features are not rescaled, so Bhf (T) can dominate IS and QS (mm/s) in the distance
X_ref = df_ref[cols].values
model = NearestNeighbors(n_neighbors=3, metric='euclidean')
model.fit(X_ref)

# 6. Find the matches
distances, indices = model.kneighbors(X_exp)

# 7. Show the results
for i, (dist, idx) in enumerate(zip(distances, indices)):
    print(f"\nExperimental sample {i+1}:")
    for j, (d, pos) in enumerate(zip(dist, idx)):
        compound = df_ref.iloc[pos]['Compound Name']
        formula = df_ref.iloc[pos]['Chemical Formula']
        is_ref = df_ref.iloc[pos]['IS (mm/s)']
        qs_ref = df_ref.iloc[pos]['Quad Splitting (mm/s)']
        bhf_ref = df_ref.iloc[pos]['Bhf (T)']
        print(f"  Match {j+1}: {compound} ({formula})")
        print(f"    IS: {is_ref:.2f} mm/s | QS: {qs_ref:.2f} mm/s | Bhf: {bhf_ref:.1f} T")
        print(f"    Euclidean distance: {d:.2f}\n")

print("## Use this result for guidance only. Information on the composition and structure of the sample is recommended. ##")

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

Experimental sample 1:
  Match 1: Condritas-Piroxeno ( (Mg;Fe)SiO₃)
    IS: 1.17 mm/s | QS: 2.06 mm/s | Bhf: 0.0 T
    Euclidean distance: 0.11

  Match 2: Hemoglobina (deoxy) (Fe(C₃4H₃2N4O4))
    IS: 0.95 mm/s | QS: 2.25 mm/s | Bhf: 0.0 T
    Euclidean distance: 0.24

  Match 3: Siderite (FeCO₃)
    IS: 1.20 mm/s | QS: 1.90 mm/s | Bhf: 0.0 T
    Euclidean distance: 0.28


Experimental sample 2:
  Match 1: FeWO₄ (FeWO₄)
    IS: 0.95 mm/s | QS: 1.65 mm/s | Bhf: 0.0 T
    Euclidean distance: 0.45

  Match 2: Hemoglobina (deoxy) (Fe(C₃4H₃2N4O4))
    IS: 0.95 mm/s | QS: 2.25 mm/s | Bhf: 0.0 T
    Euclidean distance: 0.50

  Match 3: Nitroprusiato de sodio (Na₂[Fe(CN)₅NO])
    IS: 0.10 mm/s | QS: 1.70 mm/s | Bhf: 0.0 T
    Euclidean distance: 0.52

## Use this result for guidance only. Information on the composition and structure of the sample is recommended. ##
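To show what the nearest-neighbour lookup computes, the sketch below reproduces it with plain NumPy: the Euclidean distance from a query point to every reference entry, followed by a sort. The reference values are the ones printed in the matches above; the query point is hypothetical and the function name is illustrative.

```python
import numpy as np

# Reference entries copied from the matches printed above: (IS, QS, Bhf).
reference = {
    "Condritas-Piroxeno": (1.17, 2.06, 0.0),
    "Siderite": (1.20, 1.90, 0.0),
    "Hemoglobina (deoxy)": (0.95, 2.25, 0.0),
    "FeWO4": (0.95, 1.65, 0.0),
    "Nitroprusiato de sodio": (0.10, 1.70, 0.0),
}
names = list(reference)
X_ref = np.array([reference[n] for n in names])


def nearest_phases(query, k=3):
    """Return the k reference entries closest to the query, with distances."""
    dist = np.linalg.norm(X_ref - np.asarray(query, dtype=float), axis=1)
    order = np.argsort(dist)[:k]
    return [(names[i], float(dist[i])) for i in order]


query = (1.15, 2.00, 0.0)  # hypothetical fitted doublet, not a measured value
for name, dist in nearest_phases(query):
    print(f"{name}: {dist:.2f}")
```

Because the isomer shift and quadrupole splitting are in mm/s while the hyperfine field is in T, a database that mixes magnetic and non-magnetic phases may benefit from rescaling the columns before computing distances; whether that is appropriate depends on the reference data.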

# Some Examples

Mössbauer spectrum of a commercial $LiFePO_{4}$ active material for batteries. Residual phases were detected, such as $FePO_{4}$ and $Fe_{2}P$. The latter shows $^{57}Fe$ in two different oxidation states, *Fe(I)* and *Fe(II)*.

![image.png](attachment:5648c966-6a32-4bd0-91ba-0a8b05a3a567.png)

Mössbauer spectrum of the $LiFePO_{4}$ active material for batteries, after synthesis at lab scale. In this case, only $LiFePO_{4}$ was detected, despite the extraordinary sensitivity of the characterization technique.

![image.png](attachment:d26c793e-eaec-468b-ba77-04b8ffa4eb2f.png)

CEMS spectrum of pyroxene in Mars soil (Opportunity Mission, Sept. 2004, T = 240-260 K) [[Morris2006]](https://doi.org/10.1029/2006JE002791). The subspectra correspond to two different $^{57}Fe$ sites, M1 and M2, typically found in pyroxene [[Oshtrakh2007]](https://doi.org/10.1007/s10751-008-9646-4).

![imagen.png](attachment:9a89fd2c-4cd5-43ef-9625-4d71a793bbcf.png)


$^{57}Fe$ Mössbauer spectra of an Iguazú falls soil sample (Argentina-Brazil border)

![image.png](attachment:960c373a-2e1f-4a76-b5fb-896837f89ee3.png)

# References
[Mossbauer_Wikipedia] *Mössbauer Spectroscopy*, http://en.m.wikipedia.org/wiki/Mössbauer_spectroscopy

[Grandjean2021] Grandjean, F. and Long, G. J., *Best Practices and Protocols in Mössbauer Spectroscopy*, Chem. Mater. 2021, 33, 3878-3904

[Saccone2024a] Saccone, F. D., Release of *PyMossFit* in Github, https://github.com/fsacconeUBA/Mossbauer/releases/tag/PyMossFit-V6

[Saccone2024b] Saccone, F. D., *PyMossFit*, ResearchGate, http://dx.doi.org/10.13140/RG.2.2.20717.81127

[Kong2020] Kong, Q., Siauw, T. and Bayen, A., *Python Programming and Numerical Methods - A Guide for Engineers and Scientists*, https://pythonnumericalmethods.studentorg.berkeley.edu/notebooks/Index.html

[Ferrari2015] Ferrari, S. et al., *Structural and magnetic properties of Zn doped magnetite nanoparticles obtained by wet chemical method*, IEEE Transactions on Magnetics, vol. 51, no. 6, pp. 1-6, June 2015, Art no. 2900206

[Morris2006] Morris, R. V. et al., *Mössbauer mineralogy of rock, soil, and dust at Meridiani Planum, Mars: Opportunity’s journey across sulfate-rich outcrop, basaltic sand and dust, and hematite lag deposits*, J. Geophys. Res., 2006, 111, E12S15, https://doi.org/10.1029/2006JE002791

[Oshtrakh2007] Oshtrakh, M. I., *Determination of quadrupole splitting for 57Fe in M1 and M2 sites of both olivine and pyroxene in ordinary chondrites using Mössbauer spectroscopy with high velocity resolution*, Hyperfine Interact (2007) 177, pp. 65-71