# Dimensionality reduction and blind source separation

Requires **HyperSpy v2.0** or above.

## Summary

This tutorial shows how to perform matrix factorization and blind source separation on spectra using HyperSpy. The same procedure can be used to analyse objects of higher dimensionality.

## Credits and changes

* 17/06/2010 Created by Francisco de la Peña for the EELSLab workshop at the LPS, Université Paris-Sud
* 23/8/2016 Michael Walls. Extra explanations.
* 27/7/2016 Francisco de la Peña. Updated for HyperSpy v1.0.1.
* 6/3/2016 Francisco de la Peña. Adapted from previous tutorials for the SCANDEM workshop.
* 22/08/2024 Francisco de la Peña. Shortened the tutorial by starting directly with the dataset displaying energy instability.

## Table of contents

1. [Singular value decomposition](#1.-Singular-value-decomposition)
2. [Blind source separation](#2.-Blind-source-separation)
3. [Pre-processing](#3.-Pre-processing)


## 1. Singular value decomposition

This is a powerful method for reducing noise in hyperspectral datasets. Here we begin with a few lines of matrix algebra outlining the principles of the technique, but the demo can be followed easily without a full understanding of the mathematics. As with the "getting started" demo you can run this one interactively by downloading it and saving it as a .ipynb file.

Let's start by supposing a line-spectrum $D$ that can be described by a
linear model.

$D=\left[d_{i,j}\right]_{m\times n}={\displaystyle \left[{\color{green}p_{i,j}}\right]_{m\times{\color{red}l}}\times}\left[{\color{red}{\color{blue}s_{i,j}}}\right]_{{\color{red}l}\times n}$
where $m$ is the *number of pixels* in the line scan, $n$ the *number of channels* in the spectrum and $l$ the *number of components* e.g. spectra of individual compounds.

Normally, what we actually measure is a noisy version of $D$, $D'$,

$D'={\displaystyle \left[d'_{i,j}\right]_{m\times n}=\left[{\color{green}p_{i,j}}\right]_{m\times{\color{red}l}}\times}\left[{\color{red}{\color{blue}s_{i,j}}}\right]_{{\color{red}l}\times n}+\mathrm{Noise}$


$D'$ could be factorized as follows:

$D'={\displaystyle \left[{\tilde{\color{green}p}}_{{i,j}}\right]_{m\times{\color{red}k}}\times}\left[\tilde{s}_{i,j}\right]_{{\color{red}k}\times n}$
where $k\leq\min\left(m,n\right)$.

Extra constraints are needed to fully determine the matrix factorization. When we add the orthogonality constraint we refer to this decomposition as singular value decomposition (SVD).

In our assumption of a linear model:

$D'={\displaystyle \left[{\tilde{\color{green}p}}_{{i,j}}\right]_{m\times{\color{red}l}}\times}\left[\tilde{s}_{i,j}\right]_{{\color{red}l}\times n}+
{\displaystyle \left[{\tilde{\color{green}p}}_{{i,j}}\right]_{m\times{\color{red}{k-l}}}\times}\left[\tilde{s}_{i,j}\right]_{{\color{red}{k-l}}\times n}$

With 

$D\approx{\displaystyle \left[{\tilde{\color{green}p}}_{{i,j}}\right]_{m\times{\color{red}l}}\times}\left[\tilde{s}_{i,j}\right]_{{\color{red}l}\times n}$

$\mathrm{Noise}\approx{\displaystyle \left[{\tilde{\color{green}p}}_{{i,j}}\right]_{m\times{\color{red}{k-l}}}\times}\left[\tilde{s}_{i,j}\right]_{{\color{red}{k-l}}\times n}$
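
The split above can be checked numerically. The following is a minimal NumPy sketch, not HyperSpy code: the matrix sizes, the noise level and all variable names are assumptions chosen for illustration. It builds a noisy $D'$ from a known linear model, factorizes it by SVD, and separates the first $l$ components from the remaining $k-l$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, l = 50, 100, 3              # pixels, channels, true components (assumed)

P = rng.random((m, l))            # loadings p_ij
S = rng.random((l, n))            # component spectra s_ij
D = P @ S                         # noise-free line-spectrum
noise = 0.01 * rng.standard_normal((m, n))
D_noisy = D + noise               # D'

# Thin SVD: D' = U @ diag(sv) @ Vt with k = min(m, n) components
U, sv, Vt = np.linalg.svd(D_noisy, full_matrices=False)
k = sv.size

# Orthogonality constraint: columns of U and rows of Vt are orthonormal
u_is_orthonormal = np.allclose(U.T @ U, np.eye(k))
v_is_orthonormal = np.allclose(Vt @ Vt.T, np.eye(k))

# First l components approximate D, the remaining k - l approximate the noise
signal_part = (U[:, :l] * sv[:l]) @ Vt[:l]
noise_part = (U[:, l:] * sv[l:]) @ Vt[l:]

print("orthonormal factors:", u_is_orthonormal and v_is_orthonormal)
print("first singular values:", np.round(sv[: l + 2], 3))
```

The two parts add up to $D'$ exactly, and the singular values drop sharply after the first $l$, which is what makes the truncation meaningful.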


 

We start by downloading the data for this demo, activating the matplotlib backend, importing HyperSpy and loading a dataset.   

### Set up matplotlib and import HyperSpy

**NOTE**: This notebook uses the `widget` backend (provided by `ipympl`), which displays interactive figures inside the Jupyter Notebook. For interactive data analysis in separate windows, most would prefer the `qt` backend.

In [None]:
%matplotlib widget
# or qt etc.
import hyperspy.api as hs

### Load a dataset

In [None]:
s = hs.load("datasets/CL.hspy")

This is a synthetic electron energy-loss spectroscopy dataset. The procedure, although not fully general, can easily be used as is or with minor adaptation to analyse other kinds of data including images and higher dimensional signals.

In [None]:
s.plot()

In [None]:
hs.preferences.gui()

In [None]:
s.plot()

To perform SVD in HyperSpy we use the `decomposition` method that, by default, performs SVD.

In [None]:
s.decomposition()

The result of the decomposition is stored in the `learning_results` attribute.

In [None]:
s.learning_results  # type "s.learning_results." and press Tab to list its attributes

SVD decomposes the data into so-called "components" and sorts them in order of decreasing relevance. It is often useful to estimate the dimensionality of the data by plotting the explained variance against the component index on a logarithmic y-scale. This plot is sometimes called a scree plot. It should drop quickly at first and eventually become a slowly descending line. The point at which it becomes linear is often a good estimate of the dimensionality of the data (or, equivalently, the number of components that should be retained).
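
As a rough illustration of what the scree plot measures, the sketch below computes an explained variance ratio directly from singular values with NumPy. The synthetic data (three Gaussian peaks plus Gaussian noise) and all names are assumptions, and HyperSpy's own normalisation may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, l = 60, 200, 3                      # pixels, channels, true components
channels = np.arange(n)

# Three well-separated Gaussian peaks play the role of component spectra
centres = (50, 100, 150)
S = np.array([np.exp(-0.5 * ((channels - c) / 10.0) ** 2) for c in centres])
P = rng.random((m, l))                    # random loadings
D_noisy = P @ S + 0.02 * rng.standard_normal((m, n))

# Singular values in decreasing order; their squares are proportional
# to the variance explained by each component
sv = np.linalg.svd(D_noisy, compute_uv=False)
explained_variance_ratio = sv**2 / np.sum(sv**2)

for index, ratio in enumerate(explained_variance_ratio[:6]):
    print(f"component {index}: {ratio:.2e}")
```

On a logarithmic scale the first three values stand well above the rest, which then decay slowly: this is the "elbow" one looks for in the scree plot.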

To plot the scree plot, run the `plot_explained_variance_ratio` method e.g.:


In [None]:
s.plot_explained_variance_ratio()

Unfortunately, the scree plot appears to overestimate the number of relevant components this time. By examining the dataset, we can see that the data suffers from energy instability, which is evident from the shift in the EELS features. To address this issue, we can use a simultaneously acquired low-loss spectrum to align the core-loss spectrum image.

In [None]:
ll = hs.load("datasets/LL.hspy")

In [None]:
ll.align_zero_loss_peak(also_align=[s])
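
To see why energy shifts inflate the apparent number of components, consider the following NumPy sketch. It is a toy model under stated assumptions (a single Gaussian peak, random shifts of a few channels, no noise) and does not use the tutorial's dataset; the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 40, 200                                # pixels, channels
channels = np.arange(n)
amplitudes = rng.uniform(0.5, 1.5, size=m)    # peak intensity per pixel
shifts = rng.uniform(-5.0, 5.0, size=m)       # energy drift per pixel (channels)


def peak(centre):
    """Gaussian peak of fixed width centred on the given channel."""
    return np.exp(-0.5 * ((channels - centre) / 5.0) ** 2)


# Same feature in every pixel: a single component describes the data
aligned = np.array([a * peak(100.0) for a in amplitudes])
# Same feature, but displaced by a different amount in every pixel
shifted = np.array([a * peak(100.0 + d) for a, d in zip(amplitudes, shifts)])


def n_significant(data, rel_tol=1e-3):
    """Count singular values above rel_tol times the largest one."""
    sv = np.linalg.svd(data, compute_uv=False)
    return int(np.sum(sv > rel_tol * sv[0]))


print("aligned:", n_significant(aligned))
print("shifted:", n_significant(shifted))
```

A shift is not a linear operation on the spectrum, so a linear decomposition needs extra components to describe it. Aligning the spectra removes that spurious dimensionality.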

In [None]:
s.decomposition()

In [None]:
s.plot_explained_variance_ratio()

This time the scree plot suggests that there are 5 principal components. However, because this is a synthetic dataset, we know that the correct number of components is 4.

We can store the scree plot as a `Signal1D` instance using the following method before we try to improve the decomposition:

In [None]:
scree_plot = s.get_explained_variance_ratio()

In [None]:
scree_plot.isig[:30].plot()

### Shot noise

PCA assumes Gaussian noise. However, the noise in EELS spectra is approximately Poissonian (shot noise). It is possible to approximately "normalise" the noise by using a linear transformation, which should result in a better decomposition of data with shot noise. This is done in HyperSpy as follows:

In [None]:
s.decomposition(normalize_poissonian_noise=True)
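
The idea behind this kind of normalisation can be sketched with NumPy. The example below is an assumption-laden illustration, not HyperSpy's implementation: it uses a rank-one Poisson dataset and scales each entry by the square root of a mean estimated from the row and column sums, so that the noise variance becomes roughly uniform.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 300
a = np.linspace(1.0, 10.0, m)        # per-pixel intensity (assumed)
b = np.linspace(1.0, 50.0, n)        # per-channel intensity (assumed)
mean = np.outer(a, b)                # expected counts
counts = rng.poisson(mean).astype(float)

# Estimate the mean from row and column sums, then scale by its square root
row_sums = counts.sum(axis=1)
col_sums = counts.sum(axis=0)
weights = np.sqrt(np.outer(row_sums, col_sums) / counts.sum())

noise_raw = counts - mean            # Poisson noise, variance equal to the mean
noise_scaled = noise_raw / weights   # the same noise after the linear scaling

# Compare the noise variance in a bright and a dim region of the dataset
dim = (slice(0, 50), slice(0, 50))
bright = (slice(-50, None), slice(-50, None))
raw_ratio = noise_raw[bright].var() / noise_raw[dim].var()
scaled_ratio = noise_scaled[bright].var() / noise_scaled[dim].var()

print(f"bright/dim noise variance, raw:    {raw_ratio:.1f}")
print(f"bright/dim noise variance, scaled: {scaled_ratio:.2f}")
```

Before scaling, the bright region is far noisier than the dim one. After scaling, the two have a similar noise variance, which is closer to the uniform-noise assumption that PCA relies on.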

Let's plot the scree plots of this and the previous decomposition in the same figure:

In [None]:
s.plot_explained_variance_ratio()

In [None]:
ax = hs.plot.plot_spectra([scree_plot.isig[:20],
                          s.get_explained_variance_ratio().isig[:20]],
                          legend=("Std", "Normalized"))

Let's improve the plot using some [matplotlib](http://matplotlib.org/) commands:

In [None]:
ax.set_yscale('log')
ax.lines[0].set_marker("o")
ax.lines[1].set_marker("o")
ax.figure #for details of how these commands work see the matplotlib webpage

As we can see, in the normalized decomposition the explained variance of the first four components stands out much more clearly from the remaining components than in the standard decomposition. This indicates a better decomposition and suggests that the optimal number of components is four.

### Visualise the decomposition results

The following commands can be used to plot the first $n$ principal components.
(Note: the `_=` part at the beginning is for webpage rendering purposes and is not strictly needed in the command) 

In [None]:
s

In [None]:
s.T.plot()

In [None]:
_ = s.plot_decomposition_loadings(4)
_ = s.plot_decomposition_factors(4, comp_label="")

Alternatively (and usually more conveniently) we can use the following plotting method. Use the slider or the left/right arrow keys to change the index of the components in interactive plotting mode. 

In [None]:
s.plot_decomposition_results()

### Noise reduction

A common application of PCA is noise reduction, which is achieved by dimensionality reduction. We can create a "denoised" version of the dataset by inverting the decomposition using only the number of principal components that we want to retain (in this case the first 4 components). This is done with the `get_decomposition_model` command.

In [None]:
sc = s.get_decomposition_model(4)

In [None]:
sc.plot()
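
Why the number of retained components matters can be illustrated with a small NumPy sketch. The data are synthetic and every name here is hypothetical: because the noise-free data `D` is known, the error of a truncated SVD model can be measured for different numbers of components.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, l = 60, 200, 3                  # pixels, channels, true components
D = rng.random((m, l)) @ rng.random((l, n))       # noise-free data
D_noisy = D + 0.05 * rng.standard_normal((m, n))  # what we would measure

U, sv, Vt = np.linalg.svd(D_noisy, full_matrices=False)


def truncated_model(k):
    """Rebuild the data from the first k SVD components only."""
    return (U[:, :k] * sv[:k]) @ Vt[:k]


def rms_error(estimate):
    """Root-mean-square deviation from the noise-free data."""
    return float(np.sqrt(np.mean((estimate - D) ** 2)))


error_noisy = rms_error(D_noisy)
errors = {k: rms_error(truncated_model(k)) for k in (1, l, 20)}

print(f"noisy data: {error_noisy:.4f}")
for k, err in errors.items():
    print(f"{k:2d} components: {err:.4f}")
```

Retaining too few components discards real signal, while retaining too many lets noise back in. In this toy case the error is smallest when the model uses the true number of components.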

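Let's compare the spectra at coordinates (30,30) from the original and PCA-denoised datasets. Adding the denoised model as the imaginary part of a complex signal lets us plot both together: the real part is the original data and the imaginary part is the denoised model. Navigate to (30,30) in the plot to compare them.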

In [None]:
(s + sc * 1j).plot()

Calculating and plotting the residuals at a given position can be done in a single line:

In [None]:
(s - sc).inav[30,30].plot()

## 2. Blind source separation

### Independent component analysis

As we have seen in the previous section, the principal components are a linear mixture of EELS elemental maps and spectra, but the mixing matrix is unknown. We can use blind source separation (BSS) to estimate the mixing matrix. In this case we will use independent component analysis. BSS is performed in HyperSpy using the `blind_source_separation` method that by default uses "FastICA" which is a well-tested ICA routine.

In [None]:
s.blind_source_separation(4)

The results are also stored in the `learning_results` attribute:

In [None]:
print(s.learning_results.summary())

They can be visualised with the following commands:

In [None]:
_ = s.plot_bss_loadings()
_ = s.plot_bss_factors()

Or usually more conveniently:

In [None]:
s.plot_bss_results()
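
To see why an extra criterion such as independence is needed to pin down the mixing matrix, note that a matrix factorization is not unique. The NumPy sketch below uses hypothetical names and an arbitrary rotation as the mixing matrix: it shows two different sets of factors that reproduce exactly the same data.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, l = 30, 80, 3
P = rng.random((m, l))                # loadings
S = rng.random((l, n))                # factors (component spectra)

# An arbitrary invertible mixing matrix: a 30 degree rotation about one axis
theta = np.deg2rad(30.0)
A = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])

P_mixed = P @ A                       # mixed loadings
S_mixed = np.linalg.inv(A) @ S        # correspondingly unmixed factors

same_data = np.allclose(P @ S, P_mixed @ S_mixed)
same_factors = np.allclose(S, S_mixed)

print("same data:   ", same_data)
print("same factors:", same_factors)
```

Both factorizations describe the data equally well, so the data alone cannot tell them apart. ICA selects among them by looking for factors that are as statistically independent as possible.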

### Non-negative matrix factorization

By using a different matrix factorization method, non-negative matrix factorization (NMF), we can decompose the data into "elemental" components directly. NMF replaces the orthogonality constraint in SVD with a non-negativity constraint.

In [None]:
s.decomposition(normalize_poissonian_noise=True, algorithm="NMF", output_dimension=4)

In [None]:
_ = s.plot_decomposition_loadings()
_ = s.plot_decomposition_factors()

In [None]:
s.plot_decomposition_results()
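
For intuition about the non-negativity constraint, here is a minimal NumPy sketch of NMF using the classic multiplicative update rules. It is an illustration on synthetic data under stated assumptions (exact rank-3 non-negative data, 200 iterations, hypothetical variable names); the solver that HyperSpy actually uses may differ.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, l = 40, 100, 3
V = rng.random((m, l)) @ rng.random((l, n))   # non-negative data to factorize

# Strictly positive starting guesses for loadings W and factors H
W = rng.random((m, l)) + 0.1
H = rng.random((l, n)) + 0.1
eps = 1e-12                                   # guards against division by zero

error_initial = np.linalg.norm(V - W @ H)
for _ in range(200):
    # Multiplicative updates keep every entry non-negative by construction
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)
error_final = np.linalg.norm(V - W @ H)

print(f"reconstruction error: {error_initial:.3f} -> {error_final:.3f}")
print("all entries non-negative:", bool(W.min() >= 0 and H.min() >= 0))
```

Because the factors can only add up and never cancel each other, they tend to resemble physically meaningful spectra and maps more closely than orthogonal SVD components do.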

ICA and NMF both do a good job, NMF being slightly better in this case.