<a href="https://colab.research.google.com/github/barbaractong/motor-imagery/blob/main/classificador.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install mne

Collecting mne
  Downloading mne-1.4.2-py3-none-any.whl (7.7 MB)
Installing collected packages: mne
Successfully installed mne-1.4.2


In [2]:
import gdown
import glob
import math
import mne
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.signal as signal

from numpy.fft import fft
from scipy.io import loadmat
from sklearn.model_selection import KFold

In [3]:
# download data from repository to colab - .mat files (1000Hz samples)
!gdown --folder https://drive.google.com/drive/folders/1mMD7zs-H86a7taNh72k4qblaVAbMycm3

Retrieving folder list
Processing file 1srzjwbJyJrIEESYwvbgiDBltFggs7KLr BCICIV_calib_ds1a_1000Hz.mat
Processing file 1PPxsVuseZPBIGdDGKdGOqBh2QCBytHid BCICIV_calib_ds1b_1000Hz.mat
Processing file 1t33ezNoxsR9iTk_1JBSjpsfI4UKznTw9 BCICIV_calib_ds1c_1000Hz.mat
Processing file 1-MBgaMhi3p2oLtg-Qi48RQum5_yu0N78 BCICIV_calib_ds1d_1000Hz.mat
Processing file 1gA1DHlOOi_0Tg8c9XxnLUvBajrRy9jow BCICIV_calib_ds1e_1000Hz.mat
Processing file 1B86Jdt_Z1fLf66s70i-FIG4X5Zt8GvKm BCICIV_calib_ds1f_1000Hz.mat
Processing file 1ccseyvUZ9hdjXYShLf8I_3S-ROOCUVGF BCICIV_calib_ds1g_1000Hz.mat
Retrieving folder list completed
Building directory structure
Building directory structure completed
Downloading...
From: https://drive.google.com/uc?id=1srzjwbJyJrIEESYwvbgiDBltFggs7KLr
To: /content/BCICIV_1calib_1000Hz_mat/BCICIV_calib_ds1a_1000Hz.mat
100% 225M/225M [00:01<00:00, 199MB/s]
Downloading...
From: https://drive.google.com/uc?id=1PPxsVuseZPBIGdDGKdGOqBh2QCBytHid
To: /content/BCICIV_1calib_1000Hz_mat/BCICIV_c

### Dataset information

**Calibration data**:

In the **first two runs**, arrows pointing left, right, or down were presented as visual cues on a computer screen. **Cues were displayed for a period of 4s during which the subject was instructed to perform the cued motor imagery task.** These periods were interleaved with 2s of blank screen and 2s with a fixation cross shown in the center of the screen. The fixation cross was superimposed on the cues, i.e. it was shown for 6s. These data sets are provided with complete marker information.

**Dict description:**

Data are provided in Matlab format (*.mat) containing variables:

- cnt: the continuous EEG signals, size [time x channels]. The array is stored in datatype INT16. To convert it to uV values, use cnt= 0.1*double(cnt); in Matlab.
- mrk: structure of target cue information with fields (the file of evaluation data does not contain this variable)
  - pos: vector of positions of the cue in the EEG signals given in unit sample, length #cues
  - y: vector of target classes (-1 for class one or 1 for class two), length #cues
- nfo: structure providing additional information with fields
  - fs: sampling rate,
  - clab: cell array of channel labels,
  - classes: cell array of the names of the motor imagery classes,
  - xpos: x-position of electrodes in a 2d-projection,
  - ypos: y-position of electrodes in a 2d-projection.
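As a rough illustration of how these variables look once read with `scipy.io.loadmat`, the sketch below builds a tiny synthetic file in memory and reads it back. The array contents and sizes are made up for the example; only the variable and field names (cnt, mrk.pos, mrk.y, nfo.fs) follow the description above.

```python
import io

import numpy as np
from scipy.io import loadmat, savemat

# Synthetic stand-in for one calibration file (values and sizes are made up,
# far smaller than the real recordings).
fake = {
    'cnt': np.array([[10, -20], [30, 40], [-50, 60]], dtype=np.int16),
    'mrk': {'pos': np.array([0, 2]), 'y': np.array([-1, 1])},
    'nfo': {'fs': np.array([[1000]], dtype=np.uint16)},
}

# Write to an in-memory buffer and read it back, so no file on disk is needed.
buffer = io.BytesIO()
savemat(buffer, fake)
buffer.seek(0)
mat = loadmat(buffer, struct_as_record=True)

# MATLAB structs load as 1x1 record arrays, hence the chain of [0] indices.
cue_positions = mat['mrk']['pos'][0][0][0]
cue_labels = mat['mrk']['y'][0][0][0]
fs = int(mat['nfo']['fs'][0][0][0][0])

# Python equivalent of MATLAB's cnt = 0.1*double(cnt): INT16 counts to uV.
cnt_uv = 0.1 * mat['cnt'].astype(np.float64)

print(cue_positions, cue_labels, fs)
print(cnt_uv)
```

The repeated `[0]` indexing is a consequence of how MATLAB structs and 1-D vectors are represented after loading, which is why the same pattern appears in the cells that follow.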

## Data pre-processing

Load the data downloaded from Google Drive. This project uses the 1000 Hz version of the recordings.

In [62]:
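def load_mat_file(fpath):
  # MATLAB structs are returned as NumPy record arrays
  return loadmat(fpath, struct_as_record=True)

path = '/content/BCICIV_1calib_1000Hz_mat'
dataFiles = path + '/*.mat'
files = glob.glob(dataFiles)
files.sort()  # fixed order: ds1a, ds1b, ..., ds1g

# One dict per calibration file, in the sorted order above
samples = [load_mat_file(f) for f in files]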

In this step, only the training data at the cue positions (mrk.pos) is selected: the EEG sample at each cue onset goes into the X_train matrix and the corresponding target class into the y_train vector.

In [135]:
X_train = []

In [136]:
for sample in range(0, len(samples)):
  for position in samples[sample]['mrk']['pos'][0][0][0]:
    X_train.append(samples[sample]['cnt'][position]*0.1) # Convert INT16 counts to microvolts (uV)

In [137]:
X_train = np.vstack(X_train)

In [139]:
X_train.shape

(1400, 59)
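The matrix above holds a single time sample per cue (the one at the cue onset). Since the dataset description states that each cue lasts 4 s, one possible alternative is to keep the whole window after each cue. The sketch below shows this on synthetic data; `extract_epochs`, its parameters, and the 100 Hz toy recording are assumptions made for illustration and are not part of the notebook.

```python
import numpy as np

def extract_epochs(cnt, cue_positions, fs, duration_s=4.0, scale=0.1):
    """Cut one fixed-length window per cue from a [time x channels] array.

    Returns (epochs, kept):
      epochs: array of shape (n_kept_cues, n_window_samples, n_channels),
              multiplied by `scale` (0.1 converts INT16 counts to uV).
      kept:   boolean mask over the cues; False where the window would run
              past the end of the recording.
    """
    cue_positions = np.asarray(cue_positions)
    n_window = int(round(duration_s * fs))
    kept = cue_positions + n_window <= cnt.shape[0]
    epochs = np.stack([cnt[p:p + n_window] * scale for p in cue_positions[kept]])
    return epochs, kept

# Synthetic recording: 10 s at 100 Hz with 3 channels (not the real data).
fs = 100
rng = np.random.default_rng(0)
cnt = rng.integers(-500, 500, size=(10 * fs, 3)).astype(np.int16)
cue_positions = np.array([50, 300, 800])  # the last window would overrun

epochs, kept = extract_epochs(cnt, cue_positions, fs)
print(epochs.shape, kept)
```

The returned mask can be applied to the label vector so that epochs and labels stay aligned when a cue is dropped.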

In [156]:
y_train = []

In [157]:
for sample in range(0, len(samples)):
  y_train.append(samples[sample]['mrk']['y'][0][0][0])

In [159]:
y_train = np.hstack(y_train)

In [160]:
y_train.shape

(1400,)

- Loading data parameters

In [167]:
# Checking if there is any data with a different sample rate
for s in samples:
  print(s['nfo']['fs'])

[[array([[1000]], dtype=uint16)]]
[[array([[1000]], dtype=uint16)]]
[[array([[1000]], dtype=uint16)]]
[[array([[1000]], dtype=uint16)]]
[[array([[1000]], dtype=uint16)]]
[[array([[1000]], dtype=uint16)]]
[[array([[1000]], dtype=uint16)]]


In [181]:
sample_rate = samples[0]['nfo']['fs'][0][0][0][0] # Hz

In [183]:
print(f"Sample rate: {sample_rate} Hz")

Sample rate: 1000 Hz


In [174]:
# Extract given param - Channel names
channel_id = [cid[0] for cid in samples[0]['nfo']['clab'][0][0][0]]

In [190]:
# Check the classes for all volunteers
classes_per_run = []
for s in samples:
  classes_per_run.append(s['nfo']['classes'][0][0][0])

print(classes_per_run)

[array([array(['left'], dtype='<U4'), array(['foot'], dtype='<U4')],
      dtype=object), array([array(['left'], dtype='<U4'), array(['right'], dtype='<U5')],
      dtype=object), array([array(['left'], dtype='<U4'), array(['right'], dtype='<U5')],
      dtype=object), array([array(['left'], dtype='<U4'), array(['right'], dtype='<U5')],
      dtype=object), array([array(['left'], dtype='<U4'), array(['right'], dtype='<U5')],
      dtype=object), array([array(['left'], dtype='<U4'), array(['foot'], dtype='<U4')],
      dtype=object), array([array(['left'], dtype='<U4'), array(['right'], dtype='<U5')],
      dtype=object)]
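The output above shows that the second class is 'foot' for the first and sixth files and 'right' for the others, so the same numeric label 1 does not mean the same task for every subject. A minimal sketch of turning numeric labels into class names per subject follows; `label_names` and the toy label vectors are hypothetical and only illustrate the mapping stated in the dataset description (-1 for class one, 1 for class two).

```python
import numpy as np

def label_names(y, classes):
    """Map -1 to classes[0] and 1 to classes[1] for one subject."""
    y = np.asarray(y)
    return np.where(y == -1, classes[0], classes[1])

# Toy example: two subjects with identical numeric labels but different
# class pairs, mirroring the situation in the output above.
subjects = {
    'a': (np.array([-1, 1, 1]), ['left', 'foot']),
    'b': (np.array([-1, 1, 1]), ['left', 'right']),
}

named = {
    subject_id: label_names(y, classes).tolist()
    for subject_id, (y, classes) in subjects.items()
}
print(named)
```

Keeping the subject identity next to each label makes it possible to train per subject, or to pool only subjects that share the same class pair.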


- Set the parameters

In [192]:
# Checking for the number of total samples
for s in samples:
  print(s['cnt'].shape)

(1905940, 59)
(1905940, 59)
(1905499, 59)
(1904735, 59)
(1903295, 59)
(1906080, 59)
(1906020, 59)
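The sample counts are easier to compare as durations, using duration = samples / sampling rate. The short sketch below reuses the counts printed above and the 1000 Hz rate; the variable names are chosen only for this example.

```python
import numpy as np

# Number of time samples per calibration file, copied from the output above.
n_samples = np.array([1905940, 1905940, 1905499, 1904735,
                      1903295, 1906080, 1906020])
fs = 1000  # Hz, as reported in nfo.fs

durations_min = n_samples / fs / 60.0
for n, minutes in zip(n_samples, durations_min):
    print(f"{n} samples = {minutes:.2f} min")

# Spread between the longest and shortest recording, in seconds.
spread_s = (n_samples.max() - n_samples.min()) / fs
print(f"Spread: {spread_s:.3f} s")
```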


## Run \#1 - Only 'C3', 'C4', 'Cz' channels

- Select only the channels that will be used to detect motor imagery activity (C3, Cz and C4, located over the sensorimotor cortex)

In [199]:
channels_for_model = []
labels = []
selected_channels = ['C3', 'C4', 'Cz']
for idx, c in enumerate(channel_id):
  if c in selected_channels:
    labels.append(c)
    channels_for_model.append(idx)

print(channels_for_model)
print(labels)

[26, 28, 30]
['C3', 'Cz', 'C4']
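Note that the indices come back in dataset order (C3, Cz, C4), not in the order of the requested list (C3, C4, Cz), because the loop walks over all channel names. If the column order should follow the requested list, one option is a name-to-index lookup. The helper `pick_channels` and the shortened channel list below are hypothetical and used only to show the idea.

```python
def pick_channels(channel_names, wanted):
    """Return the indices of `wanted` in `channel_names`, in the order of `wanted`."""
    index_of = {name: idx for idx, name in enumerate(channel_names)}
    missing = [name for name in wanted if name not in index_of]
    if missing:
        raise ValueError(f"Channels not found: {missing}")
    return [index_of[name] for name in wanted]

# Shortened, made-up channel list (the real recordings have 59 channels).
channel_names = ['FC3', 'C3', 'C1', 'Cz', 'C2', 'C4']
indices = pick_channels(channel_names, ['C3', 'C4', 'Cz'])
print(indices)  # [1, 5, 3]
```

Raising an error for missing names also catches typos in the channel list, which the membership test in the loop above would silently ignore.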


- Get the data from X_train matrix

In [201]:
X_train_run_one = X_train[:, channels_for_model]

In [202]:
X_train_run_one.shape

(1400, 3)