## Overview
This script preprocesses the Excel spreadsheet of Motor Imagery, Thermal, Electrical, and Finger Tapping data. It then categorizes these data into the following groups for the applied machine learning part of the project:
- Motor: Motor Imagery & Finger Tapping Data
- Sensory: Thermal & Electrical Data
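
As a minimal sketch of this grouping, the toy example below maps a `Cond` column onto the two groups. The sensory label spellings (`'Thermal'`, `'Electrical'`) and the `toy` DataFrame are assumptions made for illustration; only `'FingerTapping'` and `'MotorImagery'` are taken from the notebook code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the sensory label spellings are assumed,
# only 'FingerTapping' and 'MotorImagery' appear in the notebook.
toy = pd.DataFrame(
    {"Cond": ["FingerTapping", "MotorImagery", "Thermal", "Electrical"]}
)

# Conditions that belong to the Motor group; everything else is Sensory
motor_conditions = ["FingerTapping", "MotorImagery"]
toy["Type"] = np.where(toy["Cond"].isin(motor_conditions), "Motor", "Sensory")

print(toy)
```

Note that any condition label not in `motor_conditions`, including a misspelled one, falls into the Sensory group, so it may be worth checking the unique values of `Cond` first.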

In [2]:
# Import required libraries
import pandas as pd 
import numpy as np

In [3]:
# Read each sheet of the Excel spreadsheet into its own DataFrame
fNIRS_tstats = pd.read_excel('../data/ds005776.xlsx', sheet_name="tstats")
fNIRS_bstats = pd.read_excel('../data/ds005776.xlsx', sheet_name="bstats")

In [4]:
# Fill all missing NA cells with a scalar value of 0
fNIRS_tstats = fNIRS_tstats.fillna(0)
fNIRS_bstats = fNIRS_bstats.fillna(0)

In [5]:
# Create a new 'Type' column: 'Motor' for the motor conditions, 'Sensory' otherwise
motor_conditions = ['FingerTapping', 'MotorImagery']

fNIRS_tstats['Type'] = np.where(fNIRS_tstats['Cond'].isin(motor_conditions), 'Motor', 'Sensory')
fNIRS_bstats['Type'] = np.where(fNIRS_bstats['Cond'].isin(motor_conditions), 'Motor', 'Sensory')

In [None]:
# Use 'Type' as the index of each DataFrame
fNIRS_tstats.set_index('Type', inplace=True)
fNIRS_bstats.set_index('Type', inplace=True)

In [7]:
# Convert each DataFrame to a NumPy array (the 'Type' index is not included in the array)
fNIRS_tstats_ndarray = fNIRS_tstats.to_numpy()
fNIRS_bstats_ndarray = fNIRS_bstats.to_numpy()

### Machine Learning Models
- This script takes an N-dimensional NumPy array and trains ML models on that data.
- The arrays are split into training and testing sets and fed into an ML model.
#### Models
- Linear classifier (SVM with a linear kernel)
- LDA
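
The split-then-train workflow described above can be sketched on synthetic data as follows. This is an illustration under stated assumptions, not the notebook's method: the data comes from `make_classification`, and the `StandardScaler` step and the `stratify` argument are additions that the cells below do not use.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the fNIRS feature matrix (assumption)
X_demo, y_demo = make_classification(n_samples=200, n_features=20, random_state=0)

# Hold out 30% of the rows, keeping class proportions equal in both sets
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, stratify=y_demo, random_state=42
)

# Scale features using training statistics only, then fit a linear SVM
clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=5))
clf.fit(X_tr, y_tr)

demo_accuracy = clf.score(X_te, y_te)
print("Demo accuracy:", demo_accuracy)
```

Wrapping the scaler and the classifier in one pipeline keeps test-set statistics out of the fitting step, which matters for scale-sensitive models such as SVMs.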

In [12]:
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [17]:
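# Features: every column from position 5 onward of the b-statistics array
X = fNIRS_bstats_ndarray[:, 5:]
# Labels: 1 for 'Motor' rows, 0 for 'Sensory' rows.
# They are read from the t-statistics table, which assumes both sheets list rows in the same order.
y = (fNIRS_tstats.index == 'Motor').astype(int)

# Split into training and testing sets; everything up to this point is shared by all models below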
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [None]:
# Create and train the SVM model
model = SVC(kernel='linear', C=5) # Only line that needs to change
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
print("Accuracy:", accuracy_score(y_test, y_pred))

Accuracy: 0.6805555555555556
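
An accuracy figure is easier to interpret next to a majority-class baseline. The sketch below shows one way to compute such a baseline with `DummyClassifier`. It runs on synthetic, deliberately imbalanced data (an assumption), so its output says nothing about the fNIRS dataset itself.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic, mildly imbalanced stand-in data (assumption)
X_demo, y_demo = make_classification(
    n_samples=200, n_features=20, weights=[0.6, 0.4], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42
)

# Always predicts the most frequent class seen during training
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_tr, y_tr)

baseline_accuracy = accuracy_score(y_te, baseline.predict(X_te))
print("Baseline accuracy:", baseline_accuracy)
```

In the notebook, the same baseline could be fitted on `X_train` and `y_train` and scored on `X_test` and `y_test`.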


## Models
- Focus: LDA, Linear Classifier
- Later: Neural Network 

In [15]:
# Train an SVM with a polynomial kernel
model = SVC(kernel='poly', C=5) # Only line that needs to change
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
print("Accuracy:", accuracy_score(y_test, y_pred))

Accuracy: 0.6435185185185185


In [None]:
# Train an SVM with an RBF kernel
model = SVC(kernel='rbf', C=5) # Only line that needs to change
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
print("Accuracy:", accuracy_score(y_test, y_pred))

Accuracy: 0.6435185185185185


In [20]:
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Train an LDA Model
lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)

# Make predictions
y_pred = lda.predict(X_test)

# Evaluate the model
print("Accuracy:", accuracy_score(y_test, y_pred))

Accuracy: 0.6944444444444444
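
The four accuracies above come from a single 70/30 split and differ by only a few percentage points, so the ranking could change with a different split. One way to get a steadier comparison is k-fold cross-validation. The sketch below illustrates this on synthetic data; the `candidates` dictionary, the 5-fold setting, and the data are all assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the fNIRS feature matrix (assumption)
X_demo, y_demo = make_classification(n_samples=200, n_features=20, random_state=0)

# The same four model configurations used in the cells above
candidates = {
    "SVM (linear)": SVC(kernel="linear", C=5),
    "SVM (poly)": SVC(kernel="poly", C=5),
    "SVM (rbf)": SVC(kernel="rbf", C=5),
    "LDA": LinearDiscriminantAnalysis(),
}

# Five stratified folds, shuffled with a fixed seed for repeatability
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

cv_results = {}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, X_demo, y_demo, cv=cv, scoring="accuracy")
    cv_results[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the mean and standard deviation across folds makes it easier to judge whether a gap between two models is larger than the split-to-split variation.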
