Data Source: http://mindbigdata.com/opendb/index.html

**Brief Description of Data**
 - 'The version 1.03 of the open database contains 1,207,293 brain signals of 2 seconds each, captured with the stimulus of seeing a digit (from 0 to 9) and thinking about it, over the course of almost 2 years between 2014 & 2015, from a single Test Subject David Vivancos.'
 - 'All the signals have been captured using commercial EEGs (not medical grade), NeuroSky MindWave, Emotiv EPOC, Interaxon Muse & Emotiv Insight, covering a total of 19 Brain (10/20) locations.'

**Note**
 Only the NeuroSky MindWave data was used in this analysis.

In [2]:
import numpy as np

# Load columns 0-888 of every row: the digit label plus 888 EEG samples.
my_data = np.genfromtxt('data/MW.csv', delimiter=',', usecols=range(889))

print(my_data)
print(my_data.shape)

[[ 0.000e+00  1.017e+03  3.800e+01 ...  2.000e+01  3.800e+01  6.100e+01]
 [ 1.000e+00  8.890e+02  8.300e+01 ...  9.000e+00  2.000e+00 -7.000e+00]
 [ 4.000e+00  1.017e+03  1.900e+01 ... -2.300e+01 -2.300e+01 -2.000e+01]
 ...
 [ 6.000e+00  9.530e+02  4.000e+01 ...  3.800e+01  4.300e+01  5.500e+01]
 [ 3.000e+00  1.017e+03  4.000e+01 ...  7.100e+01  8.100e+01  6.700e+01]
 [ 0.000e+00  9.530e+02  3.800e+01 ... -1.300e+01 -3.000e+00  1.100e+01]]
(11842, 889)


Each row of the dataset is one trial. The first column holds the digit (0-9) the subject was thinking of, and the remaining columns hold the EEG readings recorded during that 2-second trial.

**Note** Trials in the dataset do not all contain the same number of EEG readings, so each row was capped at 889 columns (1 label plus 888 EEG readings).
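
The notebook does not show how the rows were capped. The block below is a minimal sketch of one way to do it, assuming each raw row is a list of the form `[label, reading_1, reading_2, ...]`. The helper `cap_rows` and the toy rows are hypothetical. Dropping rows that are too short is also an assumption, since the source does not say how such rows were handled.

```python
import numpy as np

def cap_rows(rows, n_cols):
    """Truncate every row to n_cols values; drop rows shorter than n_cols (assumption)."""
    kept = [row[:n_cols] for row in rows if len(row) >= n_cols]
    return np.array(kept, dtype=float)

# Toy ragged input: label first, then a varying number of readings.
rows = [
    [3, 10, 11, 12, 13, 14],  # longer than the cap, truncated
    [7, 20, 21, 22],          # exactly at the cap, kept as-is
    [1, 30, 31],              # shorter than the cap, dropped
]

capped = cap_rows(rows, n_cols=4)
print(capped.shape)
```

With `n_cols=889` the same idea would yield a rectangular array like the one loaded above.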

In [8]:
# Column 0 is the digit label; columns 1-888 are the EEG readings.
eeg_labels = my_data[:, 0]
eeg_data = my_data[:, 1:]

print(eeg_labels)
print(eeg_labels.shape)

print("")

print(eeg_data)
print(eeg_data.shape)

[0. 1. 4. ... 6. 3. 0.]
(11842,)

[[1017.   38.   48. ...   20.   38.   61.]
 [ 889.   83.   74. ...    9.    2.   -7.]
 [1017.   19.   10. ...  -23.  -23.  -20.]
 ...
 [ 953.   40.   23. ...   38.   43.   55.]
 [1017.   40.   32. ...   71.   81.   67.]
 [ 953.   38.   33. ...  -13.   -3.   11.]]
(11842, 888)


In [9]:
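import matplotlib
import matplotlib.pyplot as plt

# One bucket per digit (0-9), used to group trials for plotting.
visuals = [[] for _ in range(10)]

for current_test in my_data:
    # Each trial's readings are stored wrapped in a one-element list.
    visuals[int(current_test[0])].append([current_test[1:]])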
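
Once the trials are grouped by digit, one possible summary is the mean waveform per digit. The sketch below is not taken from the notebook. It runs on synthetic data (the `demo` array is a made-up stand-in for `my_data`) and uses boolean masks in place of the per-digit lists.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for my_data: 50 trials, label in column 0, 888 readings after it.
labels = rng.integers(0, 10, size=50)
demo = np.column_stack([labels, rng.normal(size=(50, 888))])

# Average all trials that share a digit label.
mean_trace = {}
for digit in range(10):
    trials = demo[demo[:, 0] == digit, 1:]
    if len(trials):  # a digit may be absent from a small sample
        mean_trace[digit] = trials.mean(axis=0)

for digit, trace in mean_trace.items():
    print(digit, trace.shape)
```

Each entry of `mean_trace` is a 1-D array that could be passed to `plt.plot` to compare digits visually.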

In [13]:
# Split the data: 67% for training, 33% for testing
from sklearn.model_selection import train_test_split

train_data, test_data, train_labels, test_labels = train_test_split(eeg_data, eeg_labels, test_size=0.33, random_state=42)

print(train_data.shape)
print(train_labels.shape)

print("\n")

print(test_data.shape)
print(test_labels.shape)

(7934, 888)
(7934,)


(3908, 888)
(3908,)
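
The split arrays are 2-D, `(samples, 888)`, while the CNN cell later in the notebook declares `input_shape=(1, 888, 1)`, so each batch needs the shape `(samples, 1, 888, 1)`. The notebook excerpt does not show this step. The sketch below shows one way it could be done, using zero-filled stand-in arrays (`train_demo`, `test_demo`) in place of the real data.

```python
import numpy as np

# Zero-filled stand-ins with the same shapes as train_data and test_data.
train_demo = np.zeros((7934, 888), dtype=np.float32)
test_demo = np.zeros((3908, 888), dtype=np.float32)

# Add a height axis of size 1 and a channel axis of size 1:
# (N, 888) -> (N, 1, 888, 1). The -1 lets NumPy infer N.
train_x = train_demo.reshape(-1, 1, 888, 1)
test_x = test_demo.reshape(-1, 1, 888, 1)

print(train_x.shape)
print(test_x.shape)
```

Reshaping only changes how the existing values are indexed, so no readings are added or lost.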


In [46]:
#CNN Model

from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten, MaxPooling2D

model = Sequential()

model.add(Conv2D(32, (1, 1), activation='relu', input_shape=(1, 888, 1)))
model.add(MaxPooling2D((1, 2)))
model.add(Conv2D(64, (1, 1), activation='relu'))
model.add(MaxPooling2D((1, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(1024, activation='relu'))
model.add(Dense(10))  # one output per digit; no activation, so these are raw logits

model.summary()
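
The parameter counts that `model.summary()` reports can be checked by hand from the layer definitions above. The sketch below uses the standard formulas for convolutional and dense layers with a bias term. The helper names `conv_params` and `dense_params` are hypothetical, chosen for illustration.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights per filter plus one bias, times the number of filters."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """One weight per input plus one bias, for every output unit."""
    return (in_units + 1) * out_units

conv1 = conv_params(1, 1, 1, 32)    # first Conv2D: 1x1 kernel, 1 input channel
conv2 = conv_params(1, 1, 32, 64)   # second Conv2D: 1x1 kernel, 32 input channels

# Two (1, 2) max-pools halve the width twice: 888 -> 444 -> 222.
flat = 1 * (888 // 2 // 2) * 64     # units after Flatten

dense1 = dense_params(flat, 512)

print(conv1, conv2, flat, dense1)
```

Because both convolutions use 1x1 kernels, nearly all of the parameters sit in the first dense layer.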

Model: "sequential_28"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_38 (Conv2D)           (None, 1, 888, 32)        64        
_________________________________________________________________
max_pooling2d_33 (MaxPooling (None, 1, 444, 32)        0         
_________________________________________________________________
conv2d_39 (Conv2D)           (None, 1, 444, 64)        2112      
_________________________________________________________________
max_pooling2d_34 (MaxPooling (None, 1, 222, 64)        0         
_________________________________________________________________
flatten_6 (Flatten)          (None, 14208)             0         
_________________________________________________________________
dense_12 (Dense)             (None, 512)               7275008   
_________________________________________________________________
dense_13 (Dense)             (None, 1024)            