* * *
<pre> NYU Paris            <i> Artificial intelligence - Fall 2022 </i></pre>
* * *


<h1 align="center"> Lab 9: Clustering - Blind source separation (Part 2) </h1>

<pre align="left"> October 27th 2022               <i> Author: Hicham Janati </i></pre>
* * *


##### Goals:
- Understand the power and limits of PCA, kernel PCA, and ICA
- Understand the difference between correlation and statistical independence
- Perform blind source separation on real-world data
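
To make the second goal concrete, here is a minimal sketch (not part of the original lab) of two variables that are uncorrelated yet clearly dependent. The variable names and the choice of a uniform distribution are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# x is symmetric around 0; y is a deterministic function of x,
# so x and y are statistically dependent by construction.
x = rng.uniform(-1.0, 1.0, size=100_000)
y = x ** 2

# Pearson correlation only measures *linear* dependence.
corr_xy = np.corrcoef(x, y)[0, 1]        # close to 0: uncorrelated
corr_x2y = np.corrcoef(x ** 2, y)[0, 1]  # 1 up to rounding: dependence shows after a nonlinear transform

print(f"corr(x, y)   = {corr_xy:.3f}")
print(f"corr(x^2, y) = {corr_x2y:.3f}")
```

PCA only removes correlation between components, whereas ICA looks for components that are as statistically independent as possible, which is a stronger requirement.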

In [None]:
import numpy as np
from matplotlib import pyplot as plt
from sklearn.decomposition import FastICA, PCA, KernelPCA

sr = 22100

## Part II -  Blind source separation applied to neuroscience


In the first part, we manually combined the sources with a mixing matrix and then tried to separate them. In Part II, we are given real data from 59 electromagnetic sensors placed around the head of an individual. Each sensor measures, at multiple time points, the electromagnetic field produced by electrical currents in the brain. Such data are called electroencephalography / magnetoencephalography (EEG/MEG) data.

In [None]:
import mne

data_path = mne.datasets.sample.data_path()
raw_fname = data_path / 'MEG' / 'sample' / 'sample_audvis_raw.fif'
raw = mne.io.read_raw_fif(raw_fname, verbose=False)
raw.crop(tmin=0, tmax=150, verbose=False)
raw.pick_types(eeg=True, verbose=False)
raw.load_data(verbose=False)
raw.filter(l_freq=1., h_freq=None, verbose=False)
X = raw.get_data(verbose=False)
times = raw.times
X.shape, times.shape

The `times` array contains the time coordinate of each sensor measurement, counted from the stimulus at t = 0. Note that MNE expresses `raw.times` in seconds, not milliseconds.

In [None]:
times

### Question 1
Visualize the raw time series produced by each sensor in a 59 x 1 `plt.subplots` figure. Use a large `figsize` to see them all. Use the `times` NumPy array defined above for the x-axis coordinates.
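
As a starting point, the layout asked for here could be sketched as follows. This is only an illustration on a synthetic stand-in: `X_demo` and `times_demo` are hypothetical arrays with the same layout as `X` and `times` (one row per sensor), and the figure size is an arbitrary choice.

```python
import numpy as np
from matplotlib import pyplot as plt

# Synthetic stand-in (assumption): 59 sensors, 1000 time points.
# In the lab, use the real `X` and `times` arrays instead.
rng = np.random.default_rng(0)
n_sensors, n_times = 59, 1000
times_demo = np.linspace(0.0, 10.0, n_times)
X_demo = rng.standard_normal((n_sensors, n_times))

# One row per sensor, sharing the time axis so the traces line up.
fig, axes = plt.subplots(n_sensors, 1, figsize=(12, 60), sharex=True)
for i, ax in enumerate(axes):
    ax.plot(times_demo, X_demo[i], lw=0.5)
    ax.set_ylabel(f"{i}", rotation=0, labelpad=15)  # sensor index as row label
axes[-1].set_xlabel("time")
```

Sharing the x axis keeps the 59 traces aligned in time, which makes it easier to spot events that appear on several sensors at once.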

### Question 2
ICA is used by clinicians to process sensor (EEG/MEG) data and detect artifacts that are captured by the sensors but are not of neuroscientific interest, such as heartbeats, muscle movements, and eye blinks.
Run PCA and ICA on the data with a varying number of components (5 to 15) and visualize the components. Do any of them stand out?
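
One detail that is easy to miss: scikit-learn expects arrays of shape `(n_samples, n_features)`, while the sensor data is laid out as `(n_sensors, n_times)`, so it has to be transposed before fitting. The sketch below illustrates this on synthetic data; `S_true`, `A_true` and `X_demo` are hypothetical stand-ins, and the `random_state` and `max_iter` values are arbitrary choices rather than recommendations.

```python
import numpy as np
from sklearn.decomposition import FastICA, PCA

# Synthetic stand-in (assumption): 5 non-Gaussian sources mixed into 59 sensors.
rng = np.random.default_rng(0)
n_sensors, n_times, n_components = 59, 2000, 5
S_true = rng.laplace(size=(n_components, n_times))
A_true = rng.standard_normal((n_sensors, n_components))
X_demo = A_true @ S_true  # shape (n_sensors, n_times), like `X` in the lab

# Time points are the samples and sensors are the features, hence the transpose.
pca = PCA(n_components=n_components)
S_pca = pca.fit_transform(X_demo.T)  # shape (n_times, n_components)

ica = FastICA(n_components=n_components, random_state=0, max_iter=1000)
S_ica = ica.fit_transform(X_demo.T)  # shape (n_times, n_components)

print(S_pca.shape, S_ica.shape)        # (2000, 5) (2000, 5)
print(pca.components_.shape)           # (5, 59): one row per component
print(ica.components_.shape)           # (5, 59): unmixing matrix
print(ica.mixing_.shape)               # (59, 5): estimated mixing matrix
```

Each column of `S_pca` or `S_ica` is one component's time course and can be plotted against the time axis in the same way as the raw sensor traces.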

### Question 3


The following function visualizes a component (i.e. a reconstructed source) in the original sensor space. The data it needs can be found in the `components_` attribute of the PCA/ICA object: pass one row of `components_` (one value per sensor) as `component_data`. Use it to visualize the components in the original sensor space.

In [None]:
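def plot_component_on_brain(component_data, component_number=0, ax=None):
    """Plot one component (one value per EEG sensor) as a scalp topography."""
    show = False
    if ax is None:
        # No axes supplied: create a standalone figure and display it at the end.
        _, ax = plt.subplots(1, 1, figsize=(6, 6))
        show = True
    # `raw.info` (from the `raw` object loaded above) provides the sensor positions.
    mne.viz.plot_topomap(component_data, raw.info, axes=ax, show=False)
    ax.set_title(f"ICA component {component_number}")
    if show:
        plt.show()
    return ax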

### Question 4
This data was collected on a subject who was exposed to an audio-visual stimulus at t = 0. On average, humans react to such stimuli at least 100 ms after the onset.

Using ICA with 15 components, and relying on both the temporal and the sensor-space visualizations, can you propose a plausible explanation for the source captured by some of the components?

### Question 5
Repeat this analysis with PCA. What do you observe?