<h1>Machine learning using neural networks to predict eye state from EEG data</h1>
<p>
The purpose of this project is to get more familiar with an EEG dataset and to learn more about data manipulation. The dataset is split into 15 columns, the last of which holds the target values (0 or 1). The target represents the state of the eye, open or closed. Each of the other 14 columns holds the reading of one EEG node, which measures the electrical activity in the region surrounding that node.<br>
The dataset is hosted in the UCI Machine Learning Repository and can be found <a href='https://archive.ics.uci.edu/ml/datasets/EEG+Eye+State'>here</a>.
</p>

<h2>The nodes and their locations</h2>
<p>The nodes used in this dataset are the red-filled nodes shown below in Figure 1.</p>
<img src='data/eeg_eye_state_map.png' alt='eeg_eye_state_map'><br>Figure 1<br>Source: https://www.researchgate.net/figure/Surface-map-of-EEG-electrode-locations_fig34_230864997

<h2>The purpose of the node locations</h2>
<p>The nodes are placed uniformly on the scalp in order to get maximum coverage of the brain with the most uniform data. The brain is composed of different regions, namely:
    <ul>
        <li>Occipital lobe: Visual area, sight, image recognition</li>
        <li>Temporal lobe: Association area, short-term memory, emotion</li>
        <li>Parietal lobe: Motor function, voluntary muscles</li>
        <li>Frontal lobe: Broca's area, muscles of speech</li>
        <li>Temporal lobe: Auditory area, hearing</li>
        <li>[name]: Emotional area, pain, hunger, 'fight or flight'</li>
        <li>Parietal lobe: Sensory association</li>
        <li>Frontal lobe: Olfactory area, smelling</li>
        <li>Parietal lobe: Sensory area, sensation from skin and muscles</li>
        <li>Parietal lobe: Somatosensory association, evaluation of weight, temperature, texture (object recognition)</li>
        <li>Temporal lobe: Wernicke's area, written and spoken language comprehension</li>
        <li>Cerebral cortex: Motor function, eye movement and orientation</li>
        <li>Frontal lobe: Higher mental functions, concentration, planning, judgement, emotional expression, creativity, inhibition</li>
        <li>Cerebellum: Motor functions, balance, equilibrium, posture, coordination of movement</li>
    </ul>
    These regions are shown in Figure 2.
</p>
<img src='data/anatomy-function-brain-areas-basics-large.jpg' alt='Brain location map'><br>
Figure 2<br>Source: https://www.dana.org/wp-content/uploads/2019/08/anatomy-function-brain-areas-basics-large.jpg
<p>Comparing the node locations used in this experiment with the location and function of each lobe shows why these particular locations were picked:
    <ul>
        <li>AF3/AF4: Frontal lobe - Higher mental functions</li>
        <li>F3/F4: Frontal lobe - Higher mental functions</li>
        <li>F7/F8: Inferior frontal/Broca's area - Functions of speech</li>
        <li>FC5/FC6: Cerebral cortex - Motor functions, <strong>eye movement/orientation</strong></li>
        <li>T7/T8: Temporal lobe - Short-term memory/emotion</li>
        <li>P7/P8: Posterior temporal - Hearing, speech, <strong>visual</strong> recognition</li>
        <li>O1/O2: Occipital lobe - <strong>Visual, sight, image recognition</strong></li>
    </ul>
</p>

<h2>Code: EEG eye-state prediction</h2>

In [1]:
# import libraries
import numpy as np
import pandas as pd
from scipy.io.arff import loadarff
from scipy import stats

from keras.utils import normalize
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.optimizers import Adam

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler


Using TensorFlow backend.


In [2]:
# Load in the data frame
raw_data = loadarff('data/EEG Eye State.arff')
df_data = pd.DataFrame(raw_data[0])
# df_data.head()
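
<p>One detail worth checking after loading: <code>loadarff</code> returns nominal ARFF attributes as byte strings, so the target column arrives as <code>b'0'</code>/<code>b'1'</code> rather than as integers. The sketch below decodes such a column and counts the samples per class. It uses a small made-up array instead of the real file; the name <code>raw_targets</code> and its values are illustrative assumptions, not part of the notebook.</p>

```python
import numpy as np

# Hypothetical stand-in for the target column as loadarff would return it
# (byte strings, not integers). The values are made up for illustration.
raw_targets = np.array([b'0', b'1', b'1', b'0', b'0'], dtype='S1')

# Convert the byte strings to integer class labels
targets = raw_targets.astype(int)

# Count how many samples fall into each class to check the balance
counts = np.bincount(targets, minlength=2)

print(targets)
print(counts)
```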

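<p>To pre-process the data, I first split it into two sets: the data and the targets. Next, I normalized the data by z-score so that every channel has zero mean and unit variance, which makes the values more manageable and easier to learn from.<br>The targets here are 1 or 0, indicating whether the eye is closed (1) or open (0), following the UCI dataset description.</p>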

In [3]:
# Preprocess Data
# split into data and targets
data_targets=df_data.iloc[:,-1]
data=df_data.iloc[:,:-1]

# data normalization with sklearn z-score
scaler = StandardScaler()
scaler.fit(data)
data = scaler.transform(data)

# one-hot encode the targets: 0 -> [1, 0], 1 -> [0, 1]
data_targets=np_utils.to_categorical(data_targets, num_classes=2)
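
<p>To make the two pre-processing steps concrete, here is a minimal sketch of what z-score scaling and one-hot encoding compute, written in plain NumPy on a toy array. The array values and the names <code>X</code>, <code>labels</code>, <code>X_scaled</code> and <code>one_hot</code> are made up for illustration; the notebook itself uses <code>StandardScaler</code> and <code>to_categorical</code>.</p>

```python
import numpy as np

# Toy data: 4 samples, 2 channels (made-up values)
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0]])
labels = np.array([0, 1, 1, 0])

# z-score: subtract each column's mean and divide by its standard deviation.
# np.std defaults to the population std (ddof=0), as StandardScaler does.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_scaled = (X - mean) / std

# One-hot encoding: row i of the identity matrix encodes class i
one_hot = np.eye(2)[labels]

print(X_scaled)
print(one_hot)
```

<p>A common variation is to compute the mean and standard deviation from the training rows only and reuse them for the test rows, so that no test-set statistics influence training.</p>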

<p>To split the data into training and test sets I use train_test_split(), which shuffles the data while splitting it. 80% of the data is used for training, and the remaining 20% serves as both the validation set during training and the test set for the final evaluation.</p>

In [4]:
# split the data
xTrain, xTest, yTrain, yTest = train_test_split(data, data_targets, test_size = 0.2)
# print(xTrain, yTrain)
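
<p>The shuffle-and-cut idea behind the split can be sketched in a few lines of NumPy. This is an illustration of the principle, not the internals of <code>train_test_split</code>; the fixed seed and the names <code>indices</code>, <code>train_idx</code> and <code>test_idx</code> are assumptions made for the example. The row count of 14980 is the sum of the training and validation sizes reported in the training log.</p>

```python
import numpy as np

n_samples = 14980  # 11984 training + 2996 validation rows

# Fixed seed so the shuffle is repeatable (an assumption for this sketch)
rng = np.random.default_rng(seed=0)

# Shuffle the row indices, then take the first 20% as the test set
indices = rng.permutation(n_samples)
n_test = n_samples * 20 // 100
test_idx = indices[:n_test]
train_idx = indices[n_test:]

print(len(train_idx), len(test_idx))
```

<p>With <code>train_test_split</code> itself, passing a <code>random_state</code> value fixes the shuffle in the same way, which makes accuracy figures comparable between runs.</p>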

<h2>Basic classification</h2>
<p>This is the creation of the model.<br>
The model is made of 4 Dense layers: the first receives the 14 EEG channels, the two in the middle are hidden layers, and the last outputs the two classes. The goal here is to make a network that can accurately predict the state of the eye, being 0 or 1.<br>
One configuration that was tried (noted accuracy: 91.69%):<br>
<code>model.add(Dense(500, activation='relu', input_dim=14))</code><br>
<code>model.add(Dense(200, activation='relu'))</code><br>
<code>model.add(Dense(150, activation='tanh'))</code><br>
<code>model.add(Dense(2, activation='sigmoid'))</code><br>
The layers [200, 150, 50, 2] with a batch size of 32 and 50 epochs give an accuracy of 90.8%.</p>

In [20]:
# create model
model = Sequential()
model.add(Dense(500, activation='relu', input_dim=14))
model.add(Dense(200, activation='relu'))
model.add(Dense(200, activation='tanh'))
model.add(Dense(2, activation='sigmoid'))
print(model.summary())

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x=xTrain, y=yTrain,epochs=50,batch_size=32, validation_data=(xTest,yTest))

score, acc = model.evaluate(x=xTest, y=yTest)
# print(model.predict(xTest))

print('Accuracy: ',100*(acc))

Model: "sequential_16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_61 (Dense)             (None, 500)               7500      
_________________________________________________________________
dense_62 (Dense)             (None, 200)               100200    
_________________________________________________________________
dense_63 (Dense)             (None, 200)               40200     
_________________________________________________________________
dense_64 (Dense)             (None, 2)                 402       
Total params: 148,302
Trainable params: 148,302
Non-trainable params: 0
_________________________________________________________________
None
Train on 11984 samples, validate on 2996 samples
Epoch 1/50
...
Epoch 50/50
Accuracy:  90.20360708236694
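
<p>The parameter counts in the model summary above can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. A short sketch of that arithmetic, with the layer widths taken from the model definition (the names <code>layer_sizes</code>, <code>params</code> and <code>total</code> are illustrative):</p>

```python
# Layer widths from the model above: 14 inputs -> 500 -> 200 -> 200 -> 2
layer_sizes = [14, 500, 200, 200, 2]

# Each Dense layer has (inputs * units) weights plus one bias per unit
params = [n_in * n_out + n_out
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total = sum(params)

print(params)
print(total)
```

<p>The per-layer values and the total agree with the "Param #" column and the "Total params" line of the summary.</p>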


<h2>CNN-based classification</h2>

<h2>Sources</h2>
<ul>
    <li>https://www.dana.org/article/neuroanatomy-the-basics/</li>
</ul>