# Hand motion classification using non-invasive EEG recordings
### by Cedric Simar and Antoine Passemiers
<hr/>

## Table of contents

* [1. Preprocessing](#preprocessing)
  * [1.1. Import useful libraries](#import-libraries)
  * [1.2. Load the data](#load-data)


* [2. Riemannian-based kernel trick](#kernel-trick)


* [3. Validating the models](#models)


* [4. Bibliography](#bibliography)

## Preprocessing <a class="anchor" id="preprocessing"></a>
<hr/>

### Import useful libraries <a class="anchor" id="import-libraries"></a>

In [10]:
%load_ext Cython

import os
import gc
import time
import scipy.linalg
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

### Load the data <a class="anchor" id="load-data"></a>

TODO Cedric: descriptions (electrodes, multi-labels, sampling, and so on)

<figure style="text-align:center;">
  <img src="imgs/EEG_Electrode_Numbering.jpg" style="width:450px;">
  <figcaption> Source: <a href="#bib-kaggle">Kaggle</a> </figcaption>
</figure>

In [2]:
N_PATIENTS = 12
ELECTRODE_NAMES = [
    'Fp1', 'Fp2', 'F7', 'F3', 'Fz', 'F4', 'F8', 'FC5', 'FC1', 'FC2', 'FC6',
    'T7', 'C3', 'Cz', 'C4', 'T8', 'TP9', 'CP5', 'CP1', 'CP2', 'CP6', 'TP10',
    'P7', 'P3', 'Pz', 'P4', 'P8', 'PO9', 'O1', 'Oz', 'O2', 'PO10']
EVENT_NAMES = ['HandStart', 'FirstDigitTouch', 'BothStartLoadPhase', 'LiftOff', 'Replace', 'BothReleased']


def load_dataset(subject=-1, data_dir='', with_test_data=True):
    """
    Parameters
    ----------
    subject: int, list, np.ndarray
        Either a subject id or a sequence of ids. If subject is set to -1,
        the data from all patients will be returned
    data_dir: str
        path to the folder where train/ and test/ subfolders are located
    with_test_data: bool
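        If set to True, the test series are returned along with the training series
    
    Returns
    -------
    If with_test_data: list of (training series, test series) tuples, one per
        subject, where each element of a tuple is a list of dataframes
    Otherwise: list of lists of training dataframes, one per subject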
    """
    if isinstance(subject, (int, np.integer)):
        subject_ids = list(range(1, N_PATIENTS+1)) if subject == -1 else [subject]
    else:
        subject_ids = list(subject)
    print("Loading dataset...")
    subjects_data = list()
    for subject_id in subject_ids:
        subject_train_data = list()
        for series_id in range(1, 9):
            print("\tSeries %i of subject %i" % (series_id, subject_id))
            train_data_filename = os.path.join(
                data_dir, 'train/subj%s_series%s_data.csv' % (subject_id, series_id))
            events_filename = os.path.join(
                data_dir, 'train/subj%s_series%s_events.csv' % (subject_id, series_id))
            train_series, events = pd.read_csv(train_data_filename), pd.read_csv(events_filename)
            train_series[EVENT_NAMES] = events[EVENT_NAMES]
            subject_train_data.append(train_series)
        if with_test_data:
            subject_test_data = list()
            for series_id in range(9, 11):
                test_data_filename = os.path.join(
                    data_dir, 'test/subj%s_series%s_data.csv' % (subject_id, series_id))
                subject_test_data.append(pd.read_csv(test_data_filename))
            subjects_data.append((subject_train_data, subject_test_data))
        else:
            subjects_data.append(subject_train_data)
    return subjects_data

In [3]:
dataset = load_dataset(subject=1, data_dir='data')[0]  # Keep the first entry, since only subject 1 is loaded

Loading dataset...
	Series 1 of subject 1
	Series 2 of subject 1
	Series 3 of subject 1
	Series 4 of subject 1
	Series 5 of subject 1
	Series 6 of subject 1
	Series 7 of subject 1
	Series 8 of subject 1


## Riemannian-based kernel trick <a class="anchor" id="kernel-trick"></a>

In [20]:
training_set, test_set = dataset
training_set = pd.concat(training_set)

X = np.asarray(training_set[ELECTRODE_NAMES], dtype=np.int16)
Y = np.asarray(training_set[EVENT_NAMES], dtype=np.int16)

In [21]:
%%cython

import numpy as np
cimport numpy as cnp
cnp.import_array()

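def extract_cov_matrices(cnp.int16_t[:, :] data, Py_ssize_t w):
    # data: (n_samples, n_features) signal; w: sliding-window length in samples
    cdef Py_ssize_t n_features = data.shape[1]
    cdef Py_ssize_t n_mat = data.shape[0] - w
    # np.float was removed in NumPy 1.24, so the explicit 64-bit float type is used
    cdef cnp.float64_t[:, :, :] sigmas = np.empty((n_mat, n_features, n_features), dtype=np.float64)
    cdef Py_ssize_t i, j
    with nogil:
        for i in range(n_mat):
            pass  # TODO: compute the covariance matrix of each window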
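
Until the Cython loop is filled in, the intended computation can be sketched in plain NumPy. This is only a sketch under assumptions, not the notebook's final implementation: the function name `sliding_covariances` and its `step` parameter are hypothetical, and the sketch yields `n_samples - w + 1` windows, whereas the Cython cell allocates `n_samples - w` matrices.

```python
import numpy as np


def sliding_covariances(data, w, step=1):
    """Sample covariance matrix of each length-w window of a (n_samples, n_features) array.

    `step` (hypothetical) is the stride between window starts. A stride larger
    than 1 keeps memory manageable on long recordings, since every window
    produces an n_features x n_features matrix.
    """
    data = np.asarray(data, dtype=np.float64)
    n_samples, n_features = data.shape
    starts = range(0, n_samples - w + 1, step)
    sigmas = np.empty((len(starts), n_features, n_features), dtype=np.float64)
    for k, start in enumerate(starts):
        window = data[start:start + w]
        # rowvar=False: rows are time samples, columns are electrodes
        sigmas[k] = np.cov(window, rowvar=False)
    return sigmas


# Usage on synthetic int16 data shaped like a short EEG recording
rng = np.random.default_rng(0)
fake_eeg = rng.integers(-500, 500, size=(1000, 32)).astype(np.int16)
fake_sigmas = sliding_covariances(fake_eeg, w=150, step=50)
```

Running this on the full recording of a subject with `step=1` would allocate one 32 x 32 matrix per sample, so a stride or chunked processing is probably needed in practice.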

In [22]:
extract_cov_matrices(X, 150)
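
Once one covariance matrix per window is available, the kernel from [1] compares two matrices through their projections onto the tangent space at a reference matrix: `k(C_i, C_j) = tr(log(C_ref^-1/2 C_i C_ref^-1/2) log(C_ref^-1/2 C_j C_ref^-1/2))`. The following is a minimal sketch of that formula, assuming symmetric positive-definite inputs. The names `riemannian_kernel` and `_spd_apply` are hypothetical, and the choice of `c_ref` (identity, arithmetic mean, Riemannian mean, ...) is left to the caller.

```python
import numpy as np


def _spd_apply(mat, fn):
    """Apply a scalar function to the eigenvalues of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    # Scale each eigenvector (column) by the transformed eigenvalue, then recompose
    return (eigvecs * fn(eigvals)) @ eigvecs.T


def riemannian_kernel(covs_a, covs_b, c_ref):
    """Gram matrix between two stacks of SPD matrices, using tangent vectors at c_ref."""
    inv_sqrt = _spd_apply(c_ref, lambda v: 1.0 / np.sqrt(v))

    def tangent(covs):
        # Whiten each matrix by c_ref, then take the matrix logarithm
        return np.array([_spd_apply(inv_sqrt @ c @ inv_sqrt, np.log) for c in covs])

    s_a, s_b = tangent(covs_a), tangent(covs_b)
    # tr(S_i S_j) equals the elementwise inner product because both are symmetric
    return np.einsum('iab,jab->ij', s_a, s_b)
```

The resulting Gram matrix could then be passed to any kernel method that accepts a precomputed kernel.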

## Validating the models <a class="anchor" id="models"></a>
<hr/>

## Bibliography <a class="anchor" id="bibliography"></a>
<hr/>

* <span class="anchor" id="bib-riemann">
    [1] Classification of covariance matrices using a Riemannian-based kernel for BCI applications <br>
    Alexandre Barachant, Stéphane Bonnet, Marco Congedo, Christian Jutten <br>
    https://hal.archives-ouvertes.fr/file/index/docid/820475/filename/BARACHANT_Neurocomputing_ForHal.pdf <br>
  </span>
<br>

* <span class="anchor" id="bib-kaggle">
    [2] Grasp-and-Lift EEG Detection Kaggle Competition <br>
    https://www.kaggle.com/c/grasp-and-lift-eeg-detection <br>
  </span>