# EEG Classification - Dataset creation
updated: Sep. 01, 2018

Data: https://www.physionet.org/pn4/eegmmidb/

## 1. Data Downloads

### Warning: Executing these blocks will automatically create directories and download datasets.

In [1]:
# System
import requests
import re
import os
import pathlib
import urllib

# Essential Data Handling
import numpy as np
import pandas as pd
from math import ceil, floor

# Get Paths
from glob import glob

# EEG package
from mne import pick_types, events_from_annotations
from mne.io import read_raw_edf

import pickle
import sys

In [11]:
# just to take the dataset from physionet web site
CONTEXT = 'pn4/'
MATERIAL = 'eegmmidb/'
URL = 'https://www.physionet.org/' + CONTEXT + MATERIAL

# Change this directory according to your setting
USERDIR = './dataset/raw_data/'

page = requests.get(URL).text
FOLDERS = sorted(list(set(re.findall(r'S[0-9]+', page))))

URLS = [URL+x+'/' for x in FOLDERS]

## Data Description

Subjects performed different motor/imagery tasks while 64-channel EEG were recorded using the BCI2000 system (http://www.bci2000.org). Each subject performed 14 experimental runs: 

- two one-minute baseline runs (one with eyes open, one with eyes closed)
- three two-minute runs of each of the four following tasks:
    - 1:
        - A target appears on either the left or the right side of the screen. 
        - The subject opens and closes the corresponding fist until the target disappears. 
        - Then the subject relaxes.
    - 2:
        - A target appears on either the left or the right side of the screen. 
        - The subject imagines opening and closing the corresponding fist until the target disappears. 
        - Then the subject relaxes.
    - 3:
        - A target appears on either the top or the bottom of the screen. 
        - The subject opens and closes either both fists (if the target is on top) or both feet (if the target is on the bottom) until the target disappears. 
        - Then the subject relaxes.
    - 4:
        - A target appears on either the top or the bottom of the screen. 
        - The subject imagines opening and closing either both fists (if the target is on top) or both feet (if the target is on the bottom) until the target disappears. 
        - Then the subject relaxes.

The data are provided here in EDF+ format (containing 64 EEG signals, each sampled at 160 samples per second, and an annotation channel). 
For use with PhysioToolkit software, rdedfann generated a separate PhysioBank-compatible annotation file (with the suffix .event) for each recording. 
The .event files and the annotation channels in the corresponding .edf files contain identical data.

# Summary tasks

Remembering that:

    - Task 1 (open and close left or right fist)
    - Task 2 (imagine opening and closing left or right fist)
    - Task 3 (open and close both fists or both feet)
    - Task 4 (imagine opening and closing both fists or both feet)

we will referred to 'Task *' with the meneaning above. 

In summary, the experimental runs were:

1.  Baseline, eyes open
2.  Baseline, eyes closed
3.  Task 1 
4.  Task -2 
5.  Task --3 
6.  Task ---4 
7.  Task 1
8.  Task -2
9.  Task --3
10. Task ---4
11. Task 1
12. Task -2
13. Task --3
14. Task ---4

# Annotation

Each annotation includes one of three codes (T0, T1, or T2):

- T0 corresponds to rest
- T1 corresponds to onset of motion (real or imagined) of
    - the left fist (in runs 3, 4, 7, 8, 11, and 12)
    - both fists (in runs 5, 6, 9, 10, 13, and 14)
- T2 corresponds to onset of motion (real or imagined) of
    - the right fist (in runs 3, 4, 7, 8, 11, and 12)
    - both feet (in runs 5, 6, 9, 10, 13, and 14)
    
In the BCI2000-format versions of these files, which may be available from the contributors of this data set, these annotations are encoded as values of 0, 1, or 2 in the TargetCode state variable.

{'T0':0, 'T1':1, 'T2':2}

In our experiments we will see only :

- run_type_0:
    - append_X
- run_type_1
    - append_X_y
- run_type_2
    - append_X_y
    
and the coding is: 

- T0 corresponds to rest 
    - (2)
- T1 (real or imagined)
    - (4,  8, 12) the left fist 
    - (6, 10, 14) both fists 
- T2 (real or imagined)
    - (4,  8, 12) the right fist 
    - (6, 10, 14) both feet 

## 2. Raw Data Import

I will use a EEG data handling package named MNE (https://martinos.org/mne/stable/index.html) to import raw data and annotation for events from edf files. This package also provides essential signal analysis features, e.g. band-pass filtering. The raw data were filtered using 1Hz of high-pass filter.

In this research, there are 5 classes for the data, imagined motion of:
    - right fist, 
    - left fist, 
    - both fists, 
    - both feet,
    - rest with eyes closed.

A data (S089) from one of the 109 subjects was excluded as the record was severely corrupted.

In [12]:
# Get file paths
PATH = './dataset/raw_data/'
SUBS = glob(PATH + 'S[0-9]*')
FNAMES = sorted([x[-4:] for x in SUBS])

REMOVE = ['S088', 'S089', 'S092', 'S100']

# Remove subject 'S089' with damaged data and 'S088', 'S092', 'S100' with 128Hz sampling rate (we want 160Hz)
FNAMES = [ x for x in FNAMES if x not in REMOVE] 

emb = {'T0': 1, 'T1': 2, 'T2': 3}

In [94]:
data_type = 'Imaged'
# data_type = 'Real'
user_independent = True

In [89]:
def my_get_data(data_type, subj_num=FNAMES, epoch_sec=0.0625):
    """ Import from edf files data and targets in the shape of 3D tensor
    
        Output shape: (Trial*Channel*TimeFrames)
        
        Some edf+ files recorded at low sampling rate, 128Hz, are excluded. 
        Majority was sampled at 160Hz.
        
        epoch_sec: time interval for one segment of mashes (0.0625 is 1/16 as a fraction)
    """
    
    # Event codes mean different actions for two groups of runs
    if data_type == 'Real':
        run_type_0 = '01'.split(',')
        run_type_1 = '03,07,11'.split(',')
        run_type_2 = '05,09,13'.split(',')
    else:
        run_type_0 = '02'.split(',')
        run_type_1 = '04,08,12'.split(',')
        run_type_2 = '06,10,14'.split(',')
    
    # Initiate X, y
    X = []
    y = []
    p = []
    dim = dict()
    
    # To compute the completion rate
    count = len(subj_num)
    
    # fixed numbers
    nChan = 64 
    sfreq = 160
    sliding = epoch_sec/2 
    timeFromQue = 0.5
    timeExercise = 4.1 #secomds
    magic_number = 51
    
    run_0_segments = int(magic_number * (magic_number*timeFromQue))
    run_segments = magic_number

    # Sub-function to assign X and X, y
    def append_X(n_segments, data, event=[]):
        # Data should be changed
        '''This function generate a tensor for X and append it to the existing X'''
    
        def window(n):
            # (80) + (160 * 1/16 * n) 
            windowStart = int(timeFromQue*sfreq) + int(sfreq*sliding*n) 
            # (80) + (160 * 1/16 * (n+2))
            windowEnd = int(timeFromQue*sfreq) + int(sfreq*sliding*(n+2)) 
            
            while (windowEnd - windowStart) != sfreq*epoch_sec:
                windowEnd += int(sfreq*epoch_sec) - (windowEnd - windowStart)
                
            return [windowStart, windowEnd]
        
        new_x = []
        for n in range(n_segments):
            # print('data[:, ',window(n)[0],':',window(n)[1],'].shape = ', data[:, window(n)[0]:window(n)[1]].shape, '(',nChan,',',int(sfreq*epoch_sec),')')
            
            if data[:, window(n)[0]:window(n)[1]].shape==(nChan, int(sfreq*epoch_sec)):
                new_x.append(data[:, window(n)[0]: window(n)[1]])
                 
        return new_x
    
    def append_X_Y(p, run_type, event, old_x, old_y, old_p, data):
        '''This function seperate the type of events 
        (refer to the data descriptitons for the list of the types)
        Then assign X and Y according to the event types'''
        # Number of sliding windows

        # print('data', data.shape[1])
        n_segments = run_segments
        #n_segments = floor(data.shape[1]/(epoch_sec*sfreq*timeFromQue) - 1/epoch_sec - 1)
        # print('run_'+str(run_type),' n_segments', n_segments, 'data', data.shape)
        
        # Rest excluded
        if event[2] == emb['T0']:
            return old_x, old_y, old_p
        
        # y assignment
        if run_type == 1:
            temp_y = [1] if event[2] == emb['T1'] else [2]
        
        elif run_type == 2:
            temp_y = [3] if event[2] == emb['T1'] else [4]
            
        # print('event[2]', event[2], 'run_type', run_type, 'temp_y', temp_y)            
        
        # print('timeExercise * sfreq', timeExercise*sfreq, ' ?= 656')
        new_x = append_X(n_segments, data, event)
        new_y = old_y + temp_y*len(new_x)
        new_p = old_p + p*len(new_x)
        
        return old_x + new_x, new_y, new_p
    
    # Iterate over subj_num: S001, S002, S003, ...
    for i, subj in enumerate(subj_num):
        # print('subj', subj)

        
        # Return completion rate
        if i%((len(subj_num)//10)+1) == 0:
            print('\n')
            print('working on {}, {:.0%} completed'.format(subj, i/count))
            print('\n')
        
        old_size = np.array(y).shape[0]
        # print('subj:', subj, '| y.shape', np.array(y).shape ,'| X.shape', np.array(X).shape)

        # Get file names
        fnames = glob(os.path.join(PATH, subj, subj+'R*.edf'))
        # Hold only the files that have an even number
        fnames = sorted([name for name in fnames if name[-6:-4] in run_type_0+run_type_1+run_type_2])

        # for each of ['02', '04', '06', '08', '12', '14']
        for i, fname in enumerate(fnames):
            # print('fname', fname)
            
            # Import data into MNE raw object
            raw = read_raw_edf(fname, preload=True, verbose=False)
            
            picks = pick_types(raw.info, eeg=True)
            # print('n_times', raw.n_times)
            
            if raw.info['sfreq'] != 160:
                print('{} is sampled at 128Hz so will be excluded.'.format(subj))
                break
            
            
            # High-pass filtering
            raw.filter(l_freq=1, h_freq=None, picks=picks)

            # Get annotation
            try:
                events = events_from_annotations(raw, verbose=False)
            except:
                continue

            # Get data
            data = raw.get_data(picks=picks)

            # print('event.shape', np.array(events[0]).shape, '| data.shape', data.shape)

            # Number of this run
            which_run = fname[-6:-4]

            """ Assignment Starts """ 
            # run 1 - baseline (eye closed)
            if which_run in run_type_0:

                # Number of sliding windows
                n_segments = run_0_segments
                # n_segments = floor(data.shape[1]/(epoch_sec*sfreq*timeFromQue) - 1/epoch_sec - 1)
                # print('run_0 n_segments', n_segments, 'data', data.shape)

                # Append 0`s based on number of windows
                new_X = append_X(n_segments, data)
                X += new_X
                y.extend([0] * len(new_X))
                p.extend([subj]* len(new_X))
                # print(events[0])   

            # run 4,8,12 - imagine opening and closing left or right fist    
            elif which_run in run_type_1:

                for i, event in enumerate(events[0]):

                    X, y, p = append_X_Y([subj], run_type=1, event=event, old_x=X, old_y=y, old_p=p, data=data[:, int(event[0]) : int(event[0] + timeExercise*sfreq)])
                    # print(event)   

            # run 6,10,14 - imagine opening and closing both fists or both feet
            elif which_run in run_type_2:

                for i, event in enumerate(events[0]):      

                    X, y, p = append_X_Y([subj], run_type=2, event=event, old_x=X, old_y=y, old_p=p, data=data[:, int(event[0]) : int(event[0] + timeExercise*sfreq)])
                    # print(event)    

        print('subj:', subj, '|', np.array(y).shape[0] - old_size, '| y.shape', np.array(y).shape ,'| X.shape', np.array(X).shape, '| p.shape', np.array(p).shape)
        dim[subj] =  np.array(y).shape[0] - old_size
    print(np.array(X).shape)

    X = np.stack(X)
    y = np.array(y).reshape((-1,1))
    p = np.array(p).reshape((-1,1))
    return X, y, p, dim

In [90]:
X, y, p, dim = my_get_data(data_type, FNAMES, epoch_sec=0.0625)



working on S001, 0% completed


subj: S001 | 5890 | y.shape (5890,) | X.shape (5890, 64, 10) | p.shape (5890,)
subj: S002 | 5890 | y.shape (11780,) | X.shape (11780, 64, 10) | p.shape (11780,)
subj: S003 | 5890 | y.shape (17670,) | X.shape (17670, 64, 10) | p.shape (17670,)
subj: S004 | 5890 | y.shape (23560,) | X.shape (23560, 64, 10) | p.shape (23560,)
subj: S005 | 5890 | y.shape (29450,) | X.shape (29450, 64, 10) | p.shape (29450,)
subj: S006 | 5890 | y.shape (35340,) | X.shape (35340, 64, 10) | p.shape (35340,)
subj: S007 | 5890 | y.shape (41230,) | X.shape (41230, 64, 10) | p.shape (41230,)
subj: S008 | 5890 | y.shape (47120,) | X.shape (47120, 64, 10) | p.shape (47120,)
subj: S009 | 5890 | y.shape (53010,) | X.shape (53010, 64, 10) | p.shape (53010,)
subj: S010 | 5890 | y.shape (58900,) | X.shape (58900, 64, 10) | p.shape (58900,)
subj: S011 | 5890 | y.shape (64790,) | X.shape (64790, 64, 10) | p.shape (64790,)


working on S012, 10% completed


subj: S012 | 5890 | y.shape (706

subj: S098 | 5890 | y.shape (559550,) | X.shape (559550, 64, 10) | p.shape (559550,)
subj: S099 | 5890 | y.shape (565440,) | X.shape (565440, 64, 10) | p.shape (565440,)
subj: S101 | 5890 | y.shape (571330,) | X.shape (571330, 64, 10) | p.shape (571330,)
subj: S102 | 5890 | y.shape (577220,) | X.shape (577220, 64, 10) | p.shape (577220,)
subj: S103 | 5890 | y.shape (583110,) | X.shape (583110, 64, 10) | p.shape (583110,)


working on S104, 94% completed


subj: S104 | 5788 | y.shape (588898,) | X.shape (588898, 64, 10) | p.shape (588898,)
subj: S105 | 5890 | y.shape (594788,) | X.shape (594788, 64, 10) | p.shape (594788,)
subj: S106 | 5890 | y.shape (600678,) | X.shape (600678, 64, 10) | p.shape (600678,)
subj: S107 | 5890 | y.shape (606568,) | X.shape (606568, 64, 10) | p.shape (606568,)
subj: S108 | 5890 | y.shape (612458,) | X.shape (612458, 64, 10) | p.shape (612458,)
subj: S109 | 5890 | y.shape (618348,) | X.shape (618348, 64, 10) | p.shape (618348,)
(618348, 64, 10)


In [92]:
print(X.shape)
print(y.shape)
print(p.shape)
print(dim)

(618348, 64, 10)
(618348, 1)
(618348, 1)
{'S001': 5890, 'S002': 5890, 'S003': 5890, 'S004': 5890, 'S005': 5890, 'S006': 5890, 'S007': 5890, 'S008': 5890, 'S009': 5890, 'S010': 5890, 'S011': 5890, 'S012': 5890, 'S013': 5890, 'S014': 5890, 'S015': 5890, 'S016': 5890, 'S017': 5890, 'S018': 5890, 'S019': 5890, 'S020': 5890, 'S021': 5890, 'S022': 5890, 'S023': 5890, 'S024': 5890, 'S025': 5890, 'S026': 5890, 'S027': 5890, 'S028': 5890, 'S029': 5890, 'S030': 5890, 'S031': 5890, 'S032': 5890, 'S033': 5890, 'S034': 5890, 'S035': 5890, 'S036': 5890, 'S037': 5890, 'S038': 5890, 'S039': 5890, 'S040': 5890, 'S041': 5890, 'S042': 5890, 'S043': 5890, 'S044': 5890, 'S045': 5890, 'S046': 5890, 'S047': 5890, 'S048': 5890, 'S049': 5890, 'S050': 5890, 'S051': 5890, 'S052': 5890, 'S053': 5890, 'S054': 5890, 'S055': 5890, 'S056': 5890, 'S057': 5890, 'S058': 5890, 'S059': 5890, 'S060': 5890, 'S061': 5890, 'S062': 5890, 'S063': 5890, 'S064': 5890, 'S065': 5890, 'S066': 5890, 'S067': 5890, 'S068': 5890, 'S069'

In [95]:
pickle.dump( X , open( "./dataset/processed_data/"+data_type+"/X.p", "wb" ) , protocol=4)

In [96]:
pickle.dump( y , open( "./dataset/processed_data/"+data_type+"/y.p", "wb" ) , protocol=4)

In [97]:
pickle.dump( p , open( "./dataset/processed_data/"+data_type+"/p.p", "wb" ) , protocol=4)

In [98]:
pickle.dump( dim , open( "./dataset/processed_data/"+data_type+"/dim.p", "wb" ) , protocol=4)

In [80]:
X = pickle.load( open( "./dataset/processed_data/"+data_type+"/X.p", "rb" ) )
y = pickle.load( open( "./dataset/processed_data/"+data_type+"/y.p", "rb" ) )
p = pickle.load( open( "./dataset/processed_data/"+data_type+"/p.p", "rb" ) )
dim = pickle.load( open( "./dataset/processed_data/"+data_type+"/dim.p", "rb" ) )

In [None]:
print('X', X.shape,'y', y.shape,'p',p.shape,'dim',len(dim))

## 3. Data Preprocessing

The original goal of applying neural networks is to exclude hand-crafted algorithms & preprocessing as much as possible. I did not use any proprecessing techniques further than standardization to build an end-to-end classifer from the dataset

In [218]:
import numpy as np
from sklearn.preprocessing import OneHotEncoder, scale
from collections import defaultdict

def convert_mesh(X):
    
    mesh = np.zeros((X.shape[0], X.shape[2], 10, 11, 1))
    X = np.swapaxes(X, 1, 2)
    
    # 1st line
    mesh[:, :, 0, 4:7, 0] = X[:,:,21:24]; print('1st finished')
    
    # 2nd line
    mesh[:, :, 1, 3:8, 0] = X[:,:,24:29]; print('2nd finished')
    
    # 3rd line
    mesh[:, :, 2, 1:10, 0] = X[:,:,29:38]; print('3rd finished')
    
    # 4th line
    mesh[:, :, 3, 1:10, 0] = np.concatenate((X[:,:,38].reshape(-1, X.shape[1], 1),\
                                          X[:,:,0:7], X[:,:,39].reshape(-1, X.shape[1], 1)), axis=2)
    print('4th finished')
    
    # 5th line
    mesh[:, :, 4, 0:11, 0] = np.concatenate((X[:,:,(42, 40)],\
                                        X[:,:,7:14], X[:,:,(41, 43)]), axis=2)
    print('5th finished')
    
    # 6th line
    mesh[:, :, 5, 1:10, 0] = np.concatenate((X[:,:,44].reshape(-1, X.shape[1], 1),\
                                        X[:,:,14:21], X[:,:,45].reshape(-1, X.shape[1], 1)), axis=2)
    print('6th finished')
               
    # 7th line
    mesh[:, :, 6, 1:10, 0] = X[:,:,46:55]; print('7th finished')
    
    # 8th line
    mesh[:, :, 7, 3:8, 0] = X[:,:,55:60]; print('8th finished')
    
    # 9th line
    mesh[:, :, 8, 4:7, 0] = X[:,:,60:63]; print('9th finished')
    
    # 10th line
    mesh[:, :, 9, 5, 0] = X[:,:,63]; print('10th finished')
    
    return mesh

def create_folder(test_ratio, set_seed, user_independent):
    print('creating folders')
        
    if user_independent:
        fold_name = 'user_independent'
    else:
        fold_name = 'user_dependent'
        
    DIRNAME = './dataset/splitted_data/'+args.data_type+'/'+fold_name
    
    if not os.path.exists(os.path.dirname(DIRNAME)):
        os.makedirs(os.path.dirname(DIRNAME))
        
    DIRNAME = DIRNAME + '/test_rate_' + str(test_ratio) + '/seed_' + str(set_seed)+'/'
    
    if not os.path.exists(os.path.dirname(DIRNAME)):
        os.makedirs(os.path.dirname(DIRNAME))
        
    return DIRNAME


def split_data(X, y, p, test_ratio, set_seed, user_independent):
    # Shuffle trials
    np.random.seed(set_seed)
    if user_independent:
        trials = len(set([ele[0] for ele in p]))
    else:    
        trials = X.shape[0]
    print('trial', trials)
    shuffle_indices = np.random.permutation(trials)
    
    print('-- shaffleing X, y, p')
    if user_independent:
        # Create a dict with empty list as default value.
        d = defaultdict(list)
        # print(y.shape, np.array([ele[0] for ele in y]).shape)
        for index, e in enumerate([ele[0] for ele in y]):
            # print('index', index, 'e', e)
            d[e].append(index)
        new_indexes = []
        for i in shuffle_indices:
            new_indexes += d[i]
        X = X[new_indexes]
        y = y[new_indexes]
        p = p[new_indexes]
        train_size = 0
        for i in shuffle_indices[:int(trials*(1-test_ratio))]:
            train_size += len(d[i])
                
    else:
        X = X[shuffle_indices]
        y = y[shuffle_indices]
        p = p[shuffle_indices]
        # Test set seperation
        train_size = int(trials*(1-test_ratio)) 
    
    print('-- split X, y, p in train-test',train_size)
    
    # X_train, X_test, y_train, y_test, p_train, p_test
    return  X[:train_size,:,:], X[train_size:,:,:], y[:train_size,:], y[train_size:,:], p[:train_size,:], p[train_size:,:]
               
def prepare_data(X, y, p, test_ratio, return_mesh, set_seed, user_independent):
    
    # y encoding
    # oh = OneHotEncoder(categories='auto')
    # y = oh.fit_transform(y).toarray()
    DIRNAME = create_folder(test_ratio, set_seed, user_independent)
    print('Folder: ',DIRNAME)
    
    print('Split dataset:')
    X_train, X_test, y_train, y_test, p_train, p_test = split_data(X, y, p, test_ratio, set_seed, user_independent)
                                    
    # Z-score Normalization
    def scale_data(X):
        shape = X.shape
        for i in range(shape[0]):
            # Standardize a dataset along any axis
            # Center to the mean and component wise scale to unit variance.
            X[i,:, :] = scale(X[i,:, :])
            if i%int(shape[0]//10) == 0:
                print('{:.0%} done'.format((i+1)/shape[0]))   
        return X
    
    print('Scaling data')
    print('-- X train-test along any axis')
    X_train, X_test  = scale_data(X_train), scale_data(X_test)
    
    if return_mesh:
        print('Creating mesh')
        print('-- X train-test to mesh')
        X_train, X_test = convert_mesh(X_train), convert_mesh(X_test)
    
    return DIRNAME, X_train, X_test, y_train, y_test, p_train, p_test
    
    

In [209]:
DIRNAME, X_train, X_test, y_train, y_test, p_train, p_test = \
            prepare_data(X, y, p, test_ratio=0.2, return_mesh=True, set_seed=42, user_independent=True)

creating folders
Folder:  ./dataset/splitted_data/user_independent/test_rate_0.2/seed_42/
Split dataset:
trial 105
-- shaffleing X, y, p
-- split X, y, p in train-test 377475
Scaling data
-- X train-test along any axis
0% done
10% done
20% done
30% done
40% done
50% done
60% done
70% done
80% done
90% done
100% done
0% done
10% done
20% done
30% done
40% done
50% done
60% done
70% done
80% done
90% done
100% done
Creating mesh
-- X train-test to mesh
1st finished
2nd finished
3rd finished
4th finished
5th finished
6th finished
7th finished
8th finished
9th finished
10th finished
1st finished
2nd finished
3rd finished
4th finished
5th finished
6th finished
7th finished
8th finished
9th finished
10th finished


In [224]:
print('X_train', X_train.shape, \
      'y_train', y_train.shape, \
      'p_train', (len(set([p[0] for p in p_train]))), '\n'
      'X_test', X_test.shape,  \
      'y_test', y_test.shape,  \
      'p_test', p_test.shape)

X_train (377475, 10, 10, 11, 1) y_train (377475, 1) p_train 105 X_test (240873, 10, 10, 11, 1) y_test (240873, 1) p_test (240873, 1)


In [211]:
X_train, X_val, y_train, y_val, p_train, p_val = \
            split_data(X_train, y_train, p_train, test_ratio=0.2, set_seed=42, user_independent=True)

trial 105
-- shaffleing X, y, p
-- split X, y, p in train-test 377475


In [217]:
print('X_train', X_train.shape, \
      'y_train', y_train.shape, \
      'p_train', p_train.shape, \
      'X_test', X_test.shape,  \
      'y_test', y_test.shape,  \
      'p_test', p_test.shape)

X_train (377475, 10, 10, 11, 1) y_train (377475, 1) p_train (377475, 1) X_test (240873, 10, 10, 11, 1) y_test (240873, 1) p_test (240873, 1)


In [213]:
pickle.dump( [X_train, y_train, p_train] ,open( DIRNAME + "train.p", "wb" ) , protocol=4)

In [214]:
pickle.dump( [X_val, y_val, p_val]  ,open( DIRNAME + "val.p", "wb" ) , protocol=4)

In [215]:
pickle.dump( [X_test, y_test, p_test]  ,open( DIRNAME + "test.p", "wb" ) , protocol=4)

As the EEG recording instrument has 3D locations over the subjects\` scalp, it is essential for the model to learn from the spatial pattern as well as the temporal pattern. I transformed the data into 2D meshes that represents the locations of the electrodes so that stacked convolutional neural networks can grasp the spatial information.