# Introduction to *HyperSpy*

> **Multidimensional Data Analysis in Python Using [HyperSpy](https://hyperspy.org)**

Tutorial for the **DPG Frühjahrstagung SKM 2026**

> Dresden, March 8, 2026

**Table of Contents:**

- [Import packages](#Import-packages)
- [Loading files](#Loading-files)
- [Data structure / Axes handling](#Data-structure-/-Axes-handling)
- [Plot / Explore](#Plot-/-Explore)
- [Indexing](#Indexing)
- [Examples of data processing](#Examples-of-data-processing)
- [Basic model fitting](#Basic-model-fitting)

## Import packages

We import the public API (application programming interface) of `HyperSpy`. The object-oriented functionality of extensions such as `LumiSpy`, `eXSpy` or `pyxem` is directly available if they are installed. Some additional utilities require importing these packages separately.

Finally, `numpy` provides numerical operations on arrays that we will use:

In [None]:
# Use '%matplotlib widget' or '%matplotlib ipympl' in JupyterLab and '%matplotlib notebook' in Jupyter Notebook for interactive inline functionality (e.g. in a notebook or on binder)
# For pop-up window plots on your local computer, use '%matplotlib tk' or '%matplotlib qt' instead
# %matplotlib qt 
%matplotlib ipympl

import hyperspy.api as hs
import numpy as np

# Plot multiple inline figures side-by-side horizontally
hs.preferences.Plot.widget_plot_style = 'horizontal'

**LumiSpy**, **eXSpy** and **pyxem** provide dedicated signal classes.

We can check the **available signal types**:

In [None]:
hs.print_known_signal_types()

## Loading files

For saving analyses, HyperSpy has its own hdf5-based data format `.hspy`.

**RosettaSciIO** provides support for a wide range of microscopy (and spectroscopy) related [data file types](https://rosettasciio.readthedocs.io/en/latest/supported_formats/index.html)!

We will load one file that we will use during the demo, a preprocessed dataset saved in the `hspy` format:

*We assume the same file location as in the demo repository. If you downloaded the notebook and the data files individually, you might need to adapt the path.*

In [None]:
s1 = hs.load("data/nanoparticles.hspy")

To see the **parameters** that a function takes, you can **display its docstring** in Jupyter by appending a `?`:

In [None]:
hs.load?

## Data structure / Axes handling

Each HyperSpy signal object has certain attributes that contain the relevant data about the **axes, data and metadata**.

To understand the HyperSpy data structure, let's have a look at the dataset `s1`.

As **LumiSpy** is installed, the dataset is directly recognized as cathodoluminescence (CL) data and loaded as a `CLSpectrum`. (If LumiSpy is not installed, the fallback is the more generic `Signal1D`.)

The **signal class** provides certain specific routines, for example conversion to energy axis in the case of luminescence data.

Our sample dataset has **two navigation dimensions** and **one signal (spectral) dimension**:

In [None]:
s1

### Axes

The **information about the axes** is stored in the `axes_manager`. We can therefore get more details about the individual axes by displaying the **axes manager**.

HyperSpy distinguishes three types of axes:

- `UniformDataAxis` defined by the initial value `offset` and spacing `scale`
- `FunctionalDataAxis` defined by a `UniformDataAxis` and a function `expression`
- `DataAxis` defined by an array `axis`
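
The three axis types can be illustrated with plain NumPy arrays. This is only a conceptual sketch, not HyperSpy's implementation: the calibration values are hypothetical, and the wavelength-to-energy conversion is just one possible `expression`.

```python
import numpy as np

# Hypothetical calibration values, chosen only for illustration
offset, scale, size = 400.0, 0.5, 5

# UniformDataAxis: value = offset + scale * index
uniform = offset + scale * np.arange(size)

# FunctionalDataAxis: an expression evaluated on a uniform axis,
# here an approximate wavelength (nm) to energy (eV) conversion
functional = 1239.84 / uniform

# DataAxis: an explicit, ordered array of axis values
explicit = np.array([400.0, 401.0, 403.0, 406.0, 410.0])

print(uniform)
print(functional)
print(explicit)
```

Only the uniform axis has a constant spacing between neighbouring points, which is why it can be described by `offset` and `scale` alone.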

All three axes of this example are of type `UniformDataAxis`:

In [None]:
s1.axes_manager

### Data

The **actual data** (signal intensity) is stored in a multidimensional numpy array:

In [None]:
s1.data
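
One detail worth keeping in mind is the axis order: HyperSpy lists the dimensions as `(x, y|signal)`, whereas the underlying NumPy array is stored as `(y, x, signal)`. The following sketch uses a small hypothetical array (not the tutorial dataset) to show the array-side indexing.

```python
import numpy as np

# Hypothetical map: 3 rows (y), 4 columns (x), 5 spectral channels
ny, nx, n_channels = 3, 4, 5
data = np.arange(ny * nx * n_channels).reshape(ny, nx, n_channels)

# NumPy shape is (y, x, signal); HyperSpy would report this as (4, 3|5)
print(data.shape)

# Spectrum at x=1, y=2: the array itself is indexed as [y, x]
spectrum = data[2, 1]
print(spectrum)
```

For a signal built from such an array, `.inav[1, 2]` selects the same spectrum as `data[2, 1]`.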

### Metadata

For most supported file formats, the metadata is automatically parsed into **HyperSpy's metadata tree**.
It contains information about the measurement, but potentially also about post-processing.

In a separate tree, the **complete metadata from the vendor format** is read in (which follows different conventions depending on the format): `s1.original_metadata` (empty for the current example).

In [None]:
s1.metadata

## Plot / Explore

We can easily plot and explore the hyperspectral data (drag the marker in the *navigation* window to change the displayed spectrum):

Some convenient keyboard commands when exploring plots using the 2D navigator map:

- `Ctrl` + arrow keys: move the cursor (as an alternative to dragging with the mouse)
- `+`: increase the size of the marker, e.g. to make it easier to select with the mouse
- `-`: decrease the size of the marker
- `e`: add a second marker to compare two spectra
- `l`: toggle log-scale for the selected plot

*(In the following, we will use the preprocessed dataset `s1`. The sample contains methylammonium lead bromide (MAPbBr3) perovskite single crystals fabricated by Alice Dearle and measured by Jordi Ferrer Orri at Cambridge University.)*

In [None]:
s1.plot()

Plot the **average CL spectrum** of the whole map:

In [None]:
s1.mean().plot()

## Indexing

HyperSpy has a powerful numpy (Matlab) style indexing mechanism that distinguishes between navigation and signal axes:

- `.inav[x1:x2,y1:y2]`
- `.isig[s1:s2]`

The index parameters can be either:
- `int` (integer): Index in the axis array
- `float`: Value in calibrated axis units
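
The difference between the two index types can be sketched with NumPy: an integer is used as a position directly, while a float is first translated into the position of the closest axis point. The axis calibration and the helper `value_to_index` below are hypothetical and only illustrate the idea, not HyperSpy's internal lookup.

```python
import numpy as np

# Hypothetical uniform wavelength axis: 400 nm to 699.5 nm in 0.5 nm steps
offset, scale, size = 400.0, 0.5, 600
axis = offset + scale * np.arange(size)

def value_to_index(value):
    """Return the index of the axis point closest to a calibrated value."""
    return int(np.argmin(np.abs(axis - value)))

# Calibrated values (floats) are translated into array positions (ints)
i1 = value_to_index(440.0)
i2 = value_to_index(600.0)
print(i1, i2, axis[i1], axis[i2])
```

This is why `isig[440.:600.]` and `isig[440:600]` generally select different ranges: the first is in wavelength units, the second in channel numbers.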

For example, we can either plot a **subset of the map** in navigation space (selected using pixels as index):

In [None]:
s1.inav[2:23,0:20].plot()

Or, we can plot the mean spectrum in a certain spectral range (selected using wavelength units):

In [None]:
s1.isig[440.:600.].mean().plot()

### Chromatic imaging:

Indexing can also be used for color-filtered (chromatic) imaging.

First, let's plot the **panchromatic image** (integrated over wavelength):

*(the object is transposed, so that we plot the intensity over navigation instead of signal dimensions)*

In [None]:
s1.T.mean().plot(cmap='viridis')

Now, we can **plot the intensity in a selected spectral window** (color-filtered image) using indexing:

In [None]:
s1.isig[480.:550.].T.mean().plot(cmap='viridis')
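
Expressed on a bare array, both images are averages along the spectral axis, once over all channels and once over a window only. A minimal sketch with a synthetic data cube; all numbers are made up for illustration.

```python
import numpy as np

# Synthetic (y, x, wavelength) cube; all values are made up for illustration
wavelength = np.linspace(400.0, 700.0, 301)        # 1 nm steps
peak = np.exp(-((wavelength - 515.0) / 15.0) ** 2)
weights = np.array([[0.0, 1.0], [2.0, 3.0]])       # emission strength per pixel
data = weights[:, :, np.newaxis] * peak + 0.1      # constant background of 0.1

# Panchromatic image: average over the full spectral axis
panchromatic = data.mean(axis=-1)

# Color-filtered image: average only over the 480-550 nm window
window = (wavelength >= 480.0) & (wavelength <= 550.0)
filtered = data[:, :, window].mean(axis=-1)

print(panchromatic)
print(filtered)
```

Because the window is centred on the emission peak, the filtered image has a higher contrast between emitting and non-emitting pixels than the panchromatic one.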

Alternatively, we can interactively select a spectral window (color-filtered image) using **regions of interest (ROIs)**:

In [None]:
im = s1.T
im.plot()
roi1 = hs.roi.SpanROI(left=455, right=485)  # acts as a digital band-pass filter
im_roi1 = roi1.interactive(im, color="red")
im_roi1_mean = hs.interactive(im_roi1.mean,
                          event=roi1.events.changed,
                          recompute_out_event=None)
im_roi1_mean.plot(cmap='viridis')

The same functionality is available through the dedicated plot function `plot_roi_map`, where multiple ROIs can be used to filter an image:

In [None]:
hs.plot.plot_roi_map(s1, rois=2)

## Examples of data processing

### Rebinning and smoothing

The data is quite noisy, while the number of channels in the spectral dimension is high, so **rebinning** can improve the data display:

*As we want to use the non-processed data afterwards for fitting, we work with a rebinned copy of the dataset.*

In [None]:
s2 = s1.rebin(scale=[1,1,2])
s2
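
What a factor of 2 along the last (spectral) axis means can be sketched in NumPy. The sketch assumes that rebinning by an integer factor adds up neighbouring channels, so that the total intensity is preserved; the random test data is hypothetical.

```python
import numpy as np

# Hypothetical cube with shape (y, x, channels) = (2, 3, 8)
rng = np.random.default_rng(0)
data = rng.poisson(5.0, size=(2, 3, 8))

# Sum each pair of neighbouring channels: 8 channels become 4
factor = 2
ny, nx, n = data.shape
rebinned = data.reshape(ny, nx, n // factor, factor).sum(axis=-1)

print(data.shape, "->", rebinned.shape)
```

The navigation dimensions are untouched (factor 1), while each new channel collects the counts of two original channels and therefore has a better signal-to-noise ratio.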

In [None]:
s2.plot()

Additionally, HyperSpy provides three different functions for **data smoothing**:

- `smooth_lowess` (lowess smoothing)
- `smooth_savitzky_golay` (Savitzky Golay filter)
- `smooth_tv` (total variation data smoothing)

These functions can be run **interactively** to choose the right parameters, but the parameters can also be passed to the function. You can play with the parameters and get a live preview, and hit `Apply` when you are happy with the smoothed curve.

In [None]:
s2.smooth_savitzky_golay()
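
To give an idea of what the two main parameters of a Savitzky-Golay filter (window length and polynomial order) do, here is a NumPy-only sketch of the filter. It is a textbook construction under simplifying assumptions (odd window, no special edge handling), not the routine HyperSpy calls; the helper name `savgol_coefficients` and the test spectrum are hypothetical.

```python
import numpy as np

def savgol_coefficients(window_length, polynomial_order):
    """Convolution weights of a Savitzky-Golay smoothing filter."""
    half = window_length // 2
    x = np.arange(-half, half + 1)
    # Columns are x**0, x**1, ..., x**polynomial_order
    design = np.vander(x, polynomial_order + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the fitted value at the window centre
    return np.linalg.pinv(design)[0]

# Hypothetical noisy spectrum
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 201)
clean = np.exp(-(x / 0.3) ** 2)
noisy = clean + rng.normal(scale=0.05, size=x.size)

weights = savgol_coefficients(window_length=11, polynomial_order=3)
# The weights are symmetric, so convolution equals correlation here
smoothed = np.convolve(noisy, weights, mode="same")
```

A longer window smooths more strongly, while a higher polynomial order follows sharp features more closely. The first and last few points of `smoothed` are affected by the array edges in this simplified version.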

In [None]:
s2.plot()

If we want to save the cleaned dataset to reload it in the future, we would use the `hspy` format provided by **RosettaSciIO**:

In [None]:
s2.save("data/nanoparticles_smoothed.hspy")

### Signal math

We can **directly perform mathematical operations** on a `signal object`.

For example, we can simply add an offset of `20` to the whole dataset.

*We'll work with a copy of the signal object.*

In [None]:
s3 = s1.deepcopy()
s3 += 20
s3.plot()

However, we can also perform **mathematical operations between `signal objects`** if their dimensions are compatible.

As an example, we create a 1D signal object with exactly the same dimensions as our original signal and a value of 20 everywhere:

In [None]:
bg = hs.signals.Signal1D(20 * np.ones(s3.data.shape))
bg

In [None]:
bg.plot()

If we subtract that new signal object from our original one, the result is the same as for `s3 -= 20`:

*As we do not assign the result to a new object it is only used for plotting.*

In [None]:
(s3 - bg).plot()

We can also do math with a signal that has the same signal dimension but no navigation dimensions, so we can take a single pixel from `bg`:

In [None]:
(s3 - bg.inav[0,0]).plot()
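
On the level of the underlying arrays, this behaves like NumPy broadcasting: a single spectrum is applied to every navigation position. A minimal sketch with a small hypothetical array:

```python
import numpy as np

# Hypothetical cube with shape (y, x, channels)
data = np.full((2, 3, 4), 25.0)

# Background with the full shape of the dataset
bg_full = 20.0 * np.ones(data.shape)

# Background spectrum of a single pixel: shape (channels,)
bg_single = bg_full[0, 0]

# Both subtractions give the same result; the single spectrum is
# broadcast over all navigation positions
diff_full = data - bg_full
diff_single = data - bg_single

print(np.array_equal(diff_full, diff_single))
```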

### The `map` function

To perform an operation on the data at each coordinate, HyperSpy provides the `map()` function.

As a simple example, we will apply `np.max` to get the maximum intensity from each spectrum. However, the [`map` function](https://hyperspy.org/hyperspy-doc/current/user_guide/signal.html#iterating-external-functions-with-the-map-method) can be used to apply any function defined for individual datasets to a complete spectral image.

In [None]:
s1max = s1.map(np.max, inplace=False)
s1max.plot()

The map of maxima shows that our dataset contains a cosmic spike (a single bright pixel), which we can easily remove with the `spikes_removal_tool`:

In [None]:
s1.spikes_removal_tool(interactive=False, max_num_bins=6000)
s1max = s1.map(np.max, inplace=False)
s1max.plot()

However, basic mathematical functions such as `max` are directly implemented in *HyperSpy* and we can get the same result without using `map`:

In [None]:
s1max = s1.max(axis=-1)
s1max

As the resulting signal has navigation, but no signal dimensions, we have to transpose it if we want to change the colormap, as the navigator plot does not support different colormaps:

In [None]:
s1max.T.plot(cmap='viridis')
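
The equivalence between applying a function spectrum by spectrum and using a built-in reduction can be checked on a plain array. This sketch only mirrors the concept with an explicit loop over a hypothetical random cube; it does not reproduce how `map` is implemented.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.random((3, 4, 50))   # hypothetical (y, x, channels) cube

# What a per-spectrum map does conceptually: call the function once per pixel
looped = np.empty(data.shape[:2])
for iy in range(data.shape[0]):
    for ix in range(data.shape[1]):
        looped[iy, ix] = np.max(data[iy, ix])

# Equivalent built-in reduction along the spectral axis
vectorized = data.max(axis=-1)

print(np.array_equal(looped, vectorized))
```

The per-pixel approach is the more general one, as it also works for functions that have no vectorized counterpart.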

## Basic model fitting

We will start by introducing very basic fitting functionality. For more details see also the [HyperSpy demos repository](https://github.com/hyperspy/hyperspy-demos).

First, we need to **initialize the model** (using the unsmoothed data):

In [None]:
m = s1.create_model()

A HyperSpy model can be composed of several **components** (functions).

We can **check the components** of the model. The list should be empty here, but for some types of signals, such as EDS and EELS, the model is automatically initialized with components:

In [None]:
m.components

Thus, we need to **create some components** and **add them to the model**.

As the emission peak in our dataset is symmetric, we will use a single `GaussianHF` component. This function is characterized by a position `centre`, a `height`, and a width parameter `fwhm` (therefore *HF* in contrast to a Gaussian defined via area and sigma). The only start value we need to set for a successful fit is a centre wavelength `centre=515 nm`.

*Note that HyperSpy has a range of [built-in functions](https://hyperspy.org/hyperspy-doc/current/user_guide/model/model_components.html#pre-defined-model-components) covering most needs that can be added as components to a model. However, it also has an intuitive mechanism to [define custom functions](https://hyperspy.org/hyperspy-doc/current/user_guide/model/model_components.html#define-components-from-a-mathematical-expression).*
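
For reference, a Gaussian parametrized by `height`, `centre` and `fwhm` can be written out in NumPy as follows. The function name `gaussian_hf` and the parameter values are hypothetical and only serve to illustrate the parametrization and its relation to the sigma/area form.

```python
import numpy as np

def gaussian_hf(x, height, centre, fwhm):
    """Gaussian parametrized by its height and full width at half maximum."""
    return height * np.exp(-4.0 * np.log(2.0) * (x - centre) ** 2 / fwhm ** 2)

# Hypothetical parameter values
height, centre, fwhm = 100.0, 515.0, 20.0

peak_value = gaussian_hf(centre, height, centre, fwhm)
half_value = gaussian_hf(centre + fwhm / 2.0, height, centre, fwhm)

# Conversion to the sigma/area parametrization of a standard Gaussian
sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
area = height * sigma * np.sqrt(2.0 * np.pi)

print(peak_value, half_value, sigma, area)
```

At a distance of `fwhm / 2` from the centre, the function drops to half of `height`, which is what makes these parameters easy to read off a measured spectrum.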

In [None]:
# Docstring of the GaussianHF component
hs.model.components1D.GaussianHF?

In [None]:
g1 = hs.model.components1D.GaussianHF(centre=515)
## Alternative way to set the start value of centre:
# g1.centre.value = 515
m.append(g1)
## Alternatively add a list of components:
# o1 = hs.model.components1D.Offset()
# m.extend([g1,o1])
m.components

To see the parameters of our components and their default values, we can **print all parameter values**:

In [None]:
m.print_current_values()

As many pixels of the map contain only noise, we first establish improved starting values by fitting a pixel with a good signal intensity:

In [None]:
s1.axes_manager.indices = (7,7)
m.fit()

In [None]:
m.plot()

We can now print the updated parameter values at the current index:

In [None]:
m.print_current_values()

We now assign these parameters as starting values to all pixels. To then apply the fit to all the spectra in the map, we use the `multifit` command.

In the current case of a single, fairly well defined peak, we achieve a good fit without setting any boundaries.

In [None]:
m.assign_current_values_to_all()
m.multifit()

We can now plot the **model with the data**:

In [None]:
m.plot()

To plot a **map of the `height` parameter**, we convert it to a signal:

In [None]:
g1.height.as_signal().plot(cmap='viridis')
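
Conceptually, fitting the whole map means repeating the single-spectrum fit at every navigation position and collecting one value per parameter and pixel. The sketch below illustrates this under strong assumptions: the data is synthetic and noise-free, and a simple log-parabola fit (the logarithm of a Gaussian is a parabola) stands in for a real optimizer. The helpers `gaussian_hf` and `fit_log_parabola` are hypothetical and not part of HyperSpy.

```python
import numpy as np

def gaussian_hf(x, height, centre, fwhm):
    """Gaussian parametrized by its height and full width at half maximum."""
    return height * np.exp(-4.0 * np.log(2.0) * (x - centre) ** 2 / fwhm ** 2)

def fit_log_parabola(x, y):
    """Recover (height, centre, fwhm) from a noise-free, positive Gaussian."""
    x0 = x.mean()                          # shift for numerical stability
    a2, a1, a0 = np.polyfit(x - x0, np.log(y), 2)
    centre = x0 - a1 / (2.0 * a2)
    fwhm = np.sqrt(-4.0 * np.log(2.0) / a2)
    height = np.exp(a0 - a1 ** 2 / (4.0 * a2))
    return height, centre, fwhm

# Synthetic (y, x, wavelength) cube with a different peak height per pixel
wavelength = np.linspace(480.0, 550.0, 141)
true_height = np.array([[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]])
data = gaussian_hf(wavelength, true_height[:, :, np.newaxis], 515.0, 20.0)

# Fit every spectrum in turn and collect the height into a map
height_map = np.empty(true_height.shape)
for iy, ix in np.ndindex(*true_height.shape):
    height_map[iy, ix], _, _ = fit_log_parabola(wavelength, data[iy, ix])

print(height_map)
```

With real, noisy data the log-parabola trick breaks down quickly, which is why an iterative least-squares fit with sensible starting values is used in practice.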

## Now try with your own data!
