**Exercise set 12**
==============


>In this exercise, we will process signals and you will learn how you
>can remove noise, obtain numerical derivatives of noisy signals, and
>correct near-infrared spectra by removing unwanted scattering effects.


**Exercise 12.1**

In this exercise, we will test out the Savitzky&ndash;Golay filter for smoothing
and numerical differentiation. We will use a test signal which has been generated
from the following analytical function,

\begin{equation}
y(t) = \sin 8t - 1.8t^2 + 0.5t^3.
\label{eq:signal}
\tag{1}\end{equation}

The signal is available in the file `Data/signal.txt`.
In addition, we will investigate a test signal which is generated from the same
analytical function, but with noise added. This signal is available in the file
`Data/signal_noise.txt`.
In `scipy`, a Savitzky&ndash;Golay filter can be applied by using
the function [`savgol_filter`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.savgol_filter.html) from `scipy.signal`.
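
As a minimal sketch of how `savgol_filter` may be called, the example below samples Eq. 1 on a grid instead of loading the data file. The grid (`t` from 0 to 4 with 401 points) and the variable names are assumptions made for illustration and need not match `Data/signal.txt`.

```python
import numpy as np
from scipy.signal import savgol_filter

# Assumed stand-in for the data file: Eq. 1 sampled on a regular grid.
t = np.linspace(0.0, 4.0, 401)
y = np.sin(8 * t) - 1.8 * t**2 + 0.5 * t**3

# Smooth with a window of 5 points and a third-order polynomial.
y_smooth = savgol_filter(y, window_length=5, polyorder=3)

# For a noise-free signal the filter should change the values very little.
max_dev = np.max(np.abs(y_smooth - y))
print(f"Largest deviation from the original: {max_dev:.2e}")
```

The window length must be larger than the polynomial order, so a window of $5$ is close to the smallest usable choice for a third-order polynomial.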


**(a)**  Create a Savitzky&ndash;Golay smoothing filter with a window of size $5$
and polynomial order of $3$. Apply this filter to the signal without noise
and compare the filtered signal with the original signal.



In [None]:
# Your code here

**Your answer to question 12.1(a):** *Double click here*

**(b)**  Create a Savitzky&ndash;Golay smoothing filter with a window of size $7$
and polynomial order of $3$. Apply this filter to the signal without noise
and compare the filtered signal with the original signal.



In [None]:
# Your code here

**Your answer to question 12.1(b):** *Double click here*

**(c)**  Create a Savitzky&ndash;Golay filter for first-order differentiation with a
window size of $5$ and polynomial order of $3$. Apply this to the signal
without noise and compare the differentiated signal with the analytical
derivative of Eq. 1. 

**Note:** To obtain the derivative, you will have to set `deriv=1` and
supply the spacing between your points using the
`delta` parameter of `savgol_filter`.
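
The role of `deriv` and `delta` can be sketched as follows. As before, Eq. 1 is sampled on an assumed grid rather than read from the data file, so the spacing `dt` and the variable names are illustrative only.

```python
import numpy as np
from scipy.signal import savgol_filter

# Assumed stand-in for the data file: Eq. 1 sampled on a regular grid.
t = np.linspace(0.0, 4.0, 401)
y = np.sin(8 * t) - 1.8 * t**2 + 0.5 * t**3
dt = t[1] - t[0]  # Spacing between the points.

# First derivative: deriv=1, and delta tells the filter the point spacing.
dy_numeric = savgol_filter(y, window_length=5, polyorder=3, deriv=1, delta=dt)

# Analytical derivative of Eq. 1 for comparison.
dy_exact = 8 * np.cos(8 * t) - 3.6 * t + 1.5 * t**2
max_err = np.max(np.abs(dy_numeric - dy_exact))
print(f"Largest error in the derivative: {max_err:.2e}")
```

Without `delta`, the filter assumes a spacing of $1$, and the derivative would be scaled incorrectly by a factor of `dt`.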

In [None]:
# Your code here

**Your answer to question 12.1(c):** *Double click here*

**(d)**  Create a Savitzky&ndash;Golay filter for first-order differentiation with a
window size of $7$ and polynomial order of $3$. Apply this to the signal
without noise and compare the differentiated signal with the analytical
derivative of Eq. 1.



In [None]:
# Your code here

**Your answer to question 12.1(d):** *Double click here*

**(e)**  Create a Savitzky&ndash;Golay filter for smoothing of the signal with noise.
Here, you have to experiment with the window size to use. Use a polynomial
order of $3$, and compare your smoothed signal with the original signal, and
the signal without noise.



In [None]:
# Your code here

**Your answer to question 12.1(e):** *Double click here*

**(f)**  Create a Savitzky&ndash;Golay filter for first-order differentiation of the signal
with noise. Here, you have to experiment with the window size to use.
Use a polynomial order of $3$ and compare your differentiated signal
with the analytical derivative of Eq. 1.


In [None]:
# Your code here

**Your answer to question 12.1(f):** *Double click here*



**Exercise 12.2**

In this exercise, we will smooth a signal by convolution with a window function.
We will attempt to smooth the signal given in Eq. 1
with added noise (data file: `Data/signal_noise.txt`). A short example
of performing convolution with `numpy`:
```python
import numpy as np

signal = np.loadtxt('Data/signal_noise.txt')[:, 1]  # Load signal.
window = np.bartlett(21)  # Create a triangular window.
window /= window.sum()  # Normalize the window.
conv = np.convolve(signal, window, mode='same')  # Calculate convolution.
```


**(a)**  Select a windowing function (a list of the windowing
functions available in `numpy` can be found [here](https://docs.scipy.org/doc/numpy/reference/routines.window.html))
and use this to smooth the signal by convolution. Compare your
smoothed signal with the raw data and the analytical expression
without noise (Eq. 1).



In [None]:
# Your code here

**Your answer to question 12.2(a):** *Double click here*

**(b)**  Create a windowing function which is "rectangular": constant for a small and
finite region, and zero otherwise. Use this function to smooth the signal, and
compare it with the raw data and Eq. 1.
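
One possible reading of a "rectangular" window is a moving average. The sketch below uses a synthetic noisy version of Eq. 1; the noise level, the seed, the grid, and the window width of $11$ are all assumptions for illustration and are not taken from `Data/signal_noise.txt`.

```python
import numpy as np

# Assumed synthetic data: Eq. 1 plus Gaussian noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0, 401)
clean = np.sin(8 * t) - 1.8 * t**2 + 0.5 * t**3
noisy = clean + rng.normal(scale=0.3, size=t.size)

# Rectangular window: constant over `width` points, normalized to sum to 1.
width = 11
window = np.ones(width) / width
smoothed = np.convolve(noisy, window, mode='same')

# Compare away from the edges, where mode='same' implicitly pads with zeros.
interior = slice(width, -width)
rms_noisy = np.sqrt(np.mean((noisy - clean)[interior] ** 2))
rms_smooth = np.sqrt(np.mean((smoothed - clean)[interior] ** 2))
print(f"RMS error before: {rms_noisy:.3f}, after: {rms_smooth:.3f}")
```

A wider window removes more noise but also flattens the oscillations of the underlying signal, so the width is a trade-off you will have to explore.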


In [None]:
# Your code here

**Your answer to question 12.2(b):** *Double click here*


**Exercise 12.3**

We will now investigate a noisy signal that contains an artificially added
trend. We will try to remove this trend and locate the peaks present in the signal.


**(a)**  Consider the signal given in `Data/peaks.txt`. 
Use a third-order polynomial to estimate the trend in the data.
Subtract this trend from the raw data and plot the resulting signal.
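
Trend removal with a polynomial fit can be sketched with `np.polyfit` and `np.polyval`. The signal below is synthetic (an assumed cubic trend, one Gaussian peak, and some noise) and only illustrates the steps; it is not the content of `Data/peaks.txt`.

```python
import numpy as np

# Assumed synthetic signal: cubic trend + one peak + noise.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 500)
trend = 0.02 * x**3 - 0.3 * x**2 + x
peak = np.exp(-((x - 5.0) ** 2) / 0.05)
signal = trend + peak + rng.normal(scale=0.05, size=x.size)

# Fit a third-order polynomial and evaluate it on the same grid.
coeffs = np.polyfit(x, signal, deg=3)
estimated_trend = np.polyval(coeffs, x)

# Subtract the estimated trend from the raw data.
detrended = signal - estimated_trend
print(f"Largest value after detrending at x = {x[np.argmax(detrended)]:.2f}")
```

Note that the polynomial is fitted to the full signal, peaks included, so the estimated trend is slightly biased by the peaks.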



In [None]:
# Your code here

**Your answer to question 12.3(a):** *Double click here*

**(b)** Smooth the signal with the trend removed (either by
convolution or by using a Savitzky&ndash;Golay
filter) and plot the smoothed signal. Identify the $6$ main peaks present
in the signal. 
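
Peaks can be identified by eye from the plot, but one possible automated approach is `find_peaks` from `scipy.signal`. This is a suggestion and not a required part of the exercise. The sketch below uses an assumed noise-free signal with three Gaussian peaks, and the `height` threshold is an illustrative choice that will have to be tuned for real data.

```python
import numpy as np
from scipy.signal import find_peaks

# Assumed synthetic signal with three well-separated peaks.
x = np.linspace(0.0, 10.0, 1001)
centers = [2.0, 5.0, 8.0]
signal = sum(np.exp(-((x - c) ** 2) / 0.02) for c in centers)

# Keep only local maxima that are at least 0.5 high.
indices, _ = find_peaks(signal, height=0.5)
peak_positions = x[indices]
print("Peak positions:", peak_positions)
```

On a noisy signal, every small wiggle is a local maximum, which is why smoothing first (and possibly using the `prominence` argument) tends to help.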

In [None]:
# Your code here

**Your answer to question 12.3(b):** *Double click here*


**Exercise 12.4**

You have measured the signal given in the file `Data/spike.txt`. Unfortunately, 
the signal contains a prominent spike that you would like to remove.

**(a)**  Process the signal by convolution. Use a window of your choice,
for instance, the [Bartlett](https://en.wikipedia.org/wiki/Window_function#Triangular_window) window.
Are you able to remove the spike?



In [None]:
# Your code here

**Your answer to question 12.4(a):** *Double click here*

**(b)**  Process the signal by implementing a *median* filter.
The median filter returns the median value of the signal in a given window.
Unfortunately, this filter cannot be expressed as a simple convolution,
and you will thus have to create a new function that processes the signal.
Create such a function and process the signal. Are you able to remove the
spike with this filter?
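
A median filter can be sketched as a sliding window over a padded copy of the signal. The function name `median_filter`, the choice to pad by repeating the edge values, and the synthetic test signal are all assumptions for illustration; other edge treatments are equally valid.

```python
import numpy as np


def median_filter(signal, width):
    """Return the running median of `signal` over an odd-sized window."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    # Repeat the edge values so that the output has the same length.
    padded = np.pad(signal, half, mode='edge')
    return np.array(
        [np.median(padded[i:i + width]) for i in range(len(signal))]
    )


# Assumed synthetic test: a sine wave with a single large spike.
signal = np.sin(np.linspace(0.0, 2 * np.pi, 200))
spiked = signal.copy()
spiked[100] += 10.0
filtered = median_filter(spiked, 5)
print(f"Value at the spike after filtering: {filtered[100]:.3f}")
```

`scipy.signal.medfilt` offers a ready-made median filter that could serve as a cross-check, although it pads the edges with zeros, so the first and last few points may differ from this sketch.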



In [None]:
# Your code here

**Your answer to question 12.4(b):** *Double click here*

**(c)**  Compare the median filter you have created with the convolution results.


In [None]:
# Your code here

**Your answer to question 12.4(c):** *Double click here*



**Exercise 12.5**

Multiplicative Scatter Correction (MSC) is one approach to remove non-linear
effects in near-infrared (NIR) spectra. Such effects may arise as a result
of scattering in a sample.
Scattering generates additional variance which is not related to the
chemically interesting information contained in the spectrum, and we would,
therefore, like to remove it.

The file `Data/nir_msc.txt` contains $222$ spectra which have been
sampled at $121$ wavelengths. Each row contains a spectrum, and each
column represents a single wavelength. You will now apply MSC to
correct these spectra.


**(a)**  Find a representative spectrum by taking the mean of the $222$ spectra.
We will refer to this representative spectrum as $f(x)$ in the following.



In [None]:
# Your code here

**Your answer to question 12.5(a):** *Double click here*

**(b)**  Correct each spectrum, $h_i(x)$, by first fitting it to a linear
equation,
\begin{equation}
h_i(x) = a_i f(x) + b_i ,
\end{equation}

and then remove the scattering effects by taking,
\begin{equation}
h_{i, \text{corrected}}(x) = \frac{h_i(x) -b_i}{a_i} .
\end{equation}
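
The two steps above can be sketched with a least-squares line fit per spectrum. The spectra below are synthetic (one assumed Gaussian band with random scale, offset, and noise) and stand in for `Data/nir_msc.txt`; the variable names are illustrative.

```python
import numpy as np

# Assumed synthetic spectra: 20 rows, 121 wavelengths.
rng = np.random.default_rng(2)
wavelengths = np.linspace(0.0, 1.0, 121)
base = np.exp(-((wavelengths - 0.5) ** 2) / 0.02)
scale = rng.uniform(0.8, 1.2, size=(20, 1))
offset = rng.uniform(-0.2, 0.2, size=(20, 1))
noise = rng.normal(scale=0.005, size=(20, 121))
spectra = scale * base + offset + noise

# Representative spectrum f(x): the mean over all spectra.
reference = spectra.mean(axis=0)

corrected = np.empty_like(spectra)
for i, spectrum in enumerate(spectra):
    # Fit h_i(x) = a_i f(x) + b_i; polyfit returns [slope, intercept].
    a_i, b_i = np.polyfit(reference, spectrum, deg=1)
    corrected[i] = (spectrum - b_i) / a_i

spread_before = spectra.std(axis=0).sum()
spread_after = corrected.std(axis=0).sum()
print(f"Spread before: {spread_before:.3f}, after: {spread_after:.3f}")
```

Here the reference spectrum plays the role of the independent variable in the fit, so each spectrum is regressed against $f(x)$ and not against the wavelength.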




In [None]:
# Your code here

**Your answer to question 12.5(b):** *Double click here*

**(c)**  Plot the corrected and uncorrected spectra.
Does this look like what you would expect?



In [None]:
# Your code here

**Your answer to question 12.5(c):** *Double click here*

**(d)**  Quantify the effect of the MSC by calculating the sum of squares, $SS_0$,
of the *centered* spectra,
\begin{equation}
SS_0 = \sum_i \sum_j (x_{ij} - \overline{x}_j)^2,
\end{equation}

where $x_{ij}$ is the intensity for spectrum $i$ at wavelength $j$, and
$\overline{x}_j$ is the mean of all spectra for wavelength $j$.
Calculate $SS_0$ for both the corrected and uncorrected spectra.
Are these values as you would expect?
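
The double sum can be written compactly with `numpy`, assuming the spectra are stored with one spectrum per row as described above. The function name `sum_of_squares_centered` and the small example matrix are made up for illustration.

```python
import numpy as np


def sum_of_squares_centered(spectra):
    """Sum of squares after subtracting the mean at each wavelength."""
    # axis=0 averages over spectra (rows), giving one mean per wavelength.
    centered = spectra - spectra.mean(axis=0)
    return np.sum(centered ** 2)


# Tiny made-up example: 2 spectra, 2 wavelengths.
example = np.array([[1.0, 2.0], [3.0, 6.0]])
ss0 = sum_of_squares_centered(example)
print(ss0)  # Column means are 2 and 4, so SS_0 = 1 + 4 + 1 + 4 = 10.
```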



In [None]:
# Your code here

**Your answer to question 12.5(d):** *Double click here*

**(e)**  A simple alternative to MSC is to "auto-scale" each spectrum,
\begin{equation}
h_{i, \text{auto-scale}}(x) = \frac{h_i(x) - \overline{h}_i}{\sigma_i},
\end{equation}

where $h_i(x)$ is the original spectrum, $\overline{h}_i$ its average, and
$\sigma_i$ the standard deviation of the spectrum. Apply this to the
original spectra, and compare with the uncorrected spectra and the MSC spectra.
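
Auto-scaling acts on each spectrum (row) separately, which the following sketch illustrates. The function name `auto_scale` and the example matrix are made up, and the use of the population standard deviation (the `numpy` default, `ddof=0`) is an assumption; the sample standard deviation would be an equally reasonable choice.

```python
import numpy as np


def auto_scale(spectra):
    """Center each spectrum on its own mean and scale by its own std."""
    # axis=1 works along each row; keepdims allows broadcasting back.
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std


# Tiny made-up example: 2 spectra, 3 wavelengths.
example = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]])
scaled = auto_scale(example)
print(scaled.mean(axis=1), scaled.std(axis=1))
```

Unlike MSC, this approach does not use a common reference spectrum, so every spectrum ends up with zero mean and unit standard deviation regardless of the others.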


In [None]:
# Your code here

**Your answer to question 12.5(e):** *Double click here*