# Epileptic Seizure Detection

Group Number: 9

| Member Name        | Roll Number |
|--------------------|-------------|
| Umang Paliwal     | 20          |
| Pranav Pathe      | 47          |
| Kunal Dewangan    | 41          |
| Syed Rafey        | 58          |


# Introduction
<span style="color:red; font-weight:bold">What is epilepsy?</span>

>Epilepsy is a neurological condition in which sudden bursts of abnormal electrical activity in the brain cause seizures (sometimes called fits), which appear as uncontrolled shaking or a sudden stiffening of the body. Several treatments exist for epilepsy, such as medication with anti-epileptic drugs.

>To diagnose and study the condition, an electroencephalogram (EEG) is used to record the brain's electrical activity.

# Epileptic Seizure Recognition Data Set

*   Data Set Characteristics: Multivariate.
*   Number of Instances: 11500
*   Number of Attributes : 179

>Characteristics of the attributes: Each original recording covers 23.6 seconds of brain activity, and the corresponding time series is sampled into 4097 data points. Each data point is the EEG value at a different point in time.

>Every 4097-point recording is divided into 23 chunks, and the chunks are shuffled. Each chunk holds 178 data points (the feature columns of the dataset) and covers roughly one second. With 500 recordings, this gives 23 * 500 = 11500 chunks, which are the rows of the dataset.
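
The chunking arithmetic can be checked with a short sketch. It assumes 500 source recordings (the factor used in 23 * 500); the constant and variable names are illustrative and do not come from the dataset.

```python
# Illustrative check of the chunking arithmetic; all names are hypothetical.
SAMPLES_PER_RECORDING = 4097   # data points in one 23.6 s recording
CHUNKS_PER_RECORDING = 23      # each chunk covers roughly one second
N_RECORDINGS = 500             # assumed number of source recordings

points_per_chunk = SAMPLES_PER_RECORDING // CHUNKS_PER_RECORDING  # feature columns per row
leftover_points = SAMPLES_PER_RECORDING % CHUNKS_PER_RECORDING    # points that do not fill a chunk
total_rows = N_RECORDINGS * CHUNKS_PER_RECORDING                  # rows in the final dataset

# Split one dummy recording into equally sized chunks
recording = list(range(SAMPLES_PER_RECORDING))
chunks = [
    recording[i * points_per_chunk:(i + 1) * points_per_chunk]
    for i in range(CHUNKS_PER_RECORDING)
]

print(points_per_chunk, leftover_points, total_rows)
```

The integer division shows why each row has 178 feature columns, and the remainder shows that a few points per recording do not fit into a full chunk.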

>The last column of the dataset, `y`, holds the label, which ranges from 1 to 5:

>* 1 - Recording of seizure activity.
>* 2 - EEG recorded from the location of the tumor.
>* 3 - EEG recorded from a healthy area of the brain.
>* 4 - Eyes closed: the patient's eyes were closed while the EEG was recorded.
>* 5 - Eyes open: the patient's eyes were open while the EEG was recorded.

>For this task only two classes matter: recordings in which a seizure occurred (label '1') and recordings in which it did not (labels '2', '3', '4' and '5', which are treated as a single class).

>Our task is to build a machine learning model that predicts, from the given dataset, whether a seizure occurred.
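
A minimal sketch of this relabelling, assuming the five original classes are equally represented with 2300 rows each (11500 / 5). The list `labels` is a hypothetical stand-in for the real `y` column.

```python
from collections import Counter

# Hypothetical label column: 2300 rows for each of the five original classes
labels = [c for c in (1, 2, 3, 4, 5) for _ in range(2300)]

# Class 1 (seizure) stays 1; classes 2 to 5 collapse into 0 (no seizure)
binary = [1 if c == 1 else 0 for c in labels]

counts = Counter(binary)
seizure_share = counts[1] / len(binary)
print(counts, seizure_share)
```

Under this assumption the binary problem has four times as many non-seizure rows as seizure rows, which is the imbalance the notebook deals with later.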

link to dataset : http://archive.ics.uci.edu/ml/datasets/Epileptic+Seizure+Recognition

>References: [1] Andrzejak RG, Lehnertz K, Mormann F, Rieke C, David P, Elger CE. Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state. Phys Rev E Stat Nonlin Soft Matter Phys. 2001 Dec;64(6 Pt 1):061907. doi: 10.1103/PhysRevE.64.061907. Epub 2001 Nov 20. PMID: 11736210.




In [None]:
import os
import math
import imblearn
import logging
import warnings
import statistics
import numpy as np
import pandas as pd
import seaborn as sns
from collections import Counter
from sklearn import svm
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import roc_auc_score , accuracy_score , precision_score, recall_score ,confusion_matrix
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from numpy import genfromtxt
from mpl_toolkits.mplot3d import Axes3D
from imblearn.combine import SMOTEENN
%matplotlib inline




from warnings import simplefilter
simplefilter(action='ignore', category=FutureWarning)
warnings.filterwarnings('ignore')

In [None]:
data = pd.read_csv('../input/epileptic-seizure-recognition/Epileptic Seizure Recognition.csv')

In [None]:
data.head()

In [None]:
data.describe()
#finding out the summary of the dataset

In [None]:
data.shape

In [None]:
data.info()

In [None]:
data.describe(include=object)

In [None]:
null_values = data.isnull().sum()
null_values.to_numpy() #as we can see that there are no null values present on the dataset

In [None]:
data_1 = data.copy()

In [None]:
data_1.drop(['Unnamed','y'],axis=1,inplace=True)

In [None]:
data_1 

In [None]:
data['y'].value_counts()


In [None]:
#visualizing the only categorical column present in the dataset.
values = data['y'].value_counts()
plt.figure(figsize=(7,7))
values.plot(kind='pie',fontsize=17, autopct='%.2f')
plt.legend(loc="best")
plt.show()
#the five classes are equally represented, so the original labels are balanced.

In [None]:
# plot five of the features in stacked subplots that share their axes
fig, axs = plt.subplots(5, sharex=True, sharey=True)
fig.set_size_inches(18, 24)
labels = ["X15","X30","X45","X60","X75"]
colors = ["r","g","b",'y',"k"]
fig.suptitle('Visual representation of different features when stacked independently', fontsize = 20)
# loop over axes; the row index is used on the x-axis because column 0 holds string identifiers
for i,ax in enumerate(axs):
  ax.plot(data.index,data[labels[i]],color=colors[i],label=labels[i])
  ax.legend(loc="upper right")

plt.xlabel('observation (row index)', fontsize = 20)
plt.show()

The wave patterns look broadly similar across these features, with some visible differences between X15 and X75. It is hard to tell them apart by visual inspection when they are plotted separately.

In [None]:
#plt.figure(figsize=(10,10))
#this gives a general idea of how the waves behave
#fig, axs = plt.subplots(1, sharex=True, sharey=True)
plt.rcParams["figure.figsize"] = (20, 10)
data.loc[:,::25].plot()  #every 25th column
plt.title("Visual representation of different features when plotted against each other")
plt.xlabel("observation (row index)")
plt.ylabel("EEG value")
plt.show()

Here all the selected waves are plotted on the same chart to give an idea of how their amplitudes differ from each other. Most of them overlap.

# Finding out the Correlation Matrix

In [None]:
corr = data_1.corr()
ax = sns.heatmap(
    corr, 
    vmin=-1, vmax=1, center=0,
    cmap=sns.diverging_palette(20, 220, n=200),
    square=True
)

We could not find any strong correlation between the different features: apart from the diagonal, the matrix shows no very dark blue cells, which would indicate high correlation.

# Solving the class imbalance problem

In [None]:
data_2 = data.drop(["Unnamed"],axis=1).copy()

In [None]:
data_2["Output"]= data_2.y == 1  #True for seizure rows; y still holds the labels 1-5 at this point

In [None]:
data_2["Output"] = data_2["Output"].astype(int)

In [None]:
data_2.y.value_counts()

In [None]:
data_2['y'] = data_2['y'].replace([2,3,4,5],0)

In [None]:
data_2.y.value_counts() #we can see there is a major class imbalance problem in our dataset

In [None]:
plt.figure(figsize=(10,6),dpi=100)
sns.despine(left=True)
sns.scatterplot(x='X1', y='X2', hue = 'y', data=data_2)
plt.show()
#we can see the clear class imbalance problem present here

In [None]:
data_2.head()

In [None]:
data_2.y.value_counts()

In [None]:
X  = data_2.drop(['Output','y'], axis=1)
y = data_2['y']

Here we use the SMOTEENN technique to reduce the class imbalance in our dataset. The output variable contains far more samples of one class than of the other. This is a problem because machine learning and AI algorithms tend to become biased towards the class with the higher presence. To counter this bias, we resample the data so that the two classes of the response variable are balanced.
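
The effect of this bias can be illustrated with a small sketch. It assumes an 80/20 class ratio (four non-seizure classes to one seizure class); the label lists are made up for illustration.

```python
# Hypothetical labels with an assumed 80/20 class ratio
y_true = [0] * 800 + [1] * 200

# A "model" that ignores its input and always predicts the majority class
y_pred = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(t == 1 for t in y_true)
print(accuracy, recall)
```

Such a baseline scores high on accuracy while detecting no seizures at all, which is why accuracy alone can be misleading on imbalanced data.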

>**SMOTE: Synthetic Minority Over-sampling Technique**

>**ENN: Edited Nearest Neighbors**

>SMOTEENN is a combination of these two techniques. SMOTE oversamples the minority class by generating synthetic samples, while ENN cleans the data by removing noisy and borderline samples. This hybrid approach aims to balance the classes without keeping the ambiguous points that oversampling alone can create near the class boundary.
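
The two ideas can be sketched in plain Python. This is a simplified illustration, not the imbalanced-learn implementation: `smote_like_sample` and `enn_keep` are hypothetical helper names, and the ENN variant shown uses a majority vote over the nearest neighbours.

```python
import math
import random

def smote_like_sample(minority, k=2, rng=random):
    """Create one synthetic point between a minority point and one of its k nearest minority neighbours."""
    base = rng.choice(minority)
    # Sort the other minority points by Euclidean distance to the base point
    others = sorted((p for p in minority if p is not base),
                    key=lambda p: math.dist(base, p))
    neighbour = rng.choice(others[:k])
    lam = rng.random()  # interpolation weight in [0, 1)
    return [b + lam * (n - b) for b, n in zip(base, neighbour)]

def enn_keep(points, labels, k=3):
    """Return indices of points whose label matches the majority of their k nearest neighbours."""
    keep = []
    for i, (p, y) in enumerate(zip(points, labels)):
        neighbours = sorted((j for j in range(len(points)) if j != i),
                            key=lambda j: math.dist(p, points[j]))[:k]
        votes = sum(labels[j] == y for j in neighbours)
        if votes > k / 2:
            keep.append(i)
    return keep

# SMOTE step: synthetic points lie on segments between existing minority points
random.seed(0)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = [smote_like_sample(minority) for _ in range(5)]

# ENN step: the class-1 point sitting inside the class-0 cluster is dropped
points = [[0, 0], [0, 1], [1, 0], [0.1, 0.1], [5, 5], [5, 6], [6, 5]]
labels = [0, 0, 0, 1, 1, 1, 1]
kept = enn_keep(points, labels)
print(synthetic)
print(kept)
```

In the notebook both steps are handled by `SMOTEENN.fit_resample`; the sketch only shows what each step does to the data.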

In [None]:
counter = Counter(y)
#class counts before resampling
print('Before',counter)
# resampling the full dataset using SMOTE + ENN (the train/validation/test split happens afterwards)
smenn = SMOTEENN()
X_train1, y_train1 = smenn.fit_resample(X, y)

counter = Counter(y_train1)
print('After',counter)

## Model Building

In [None]:
#train_test_split produces only two parts at a time, so we first keep 60% for training and set 40% aside
X_train, X_test, y_train, y_test = train_test_split(X_train1,y_train1,test_size=0.4,random_state=42)

#now we split that 40% in half to get the validation and test sets (60/20/20 overall)
X_val, X_test, y_val, y_test = train_test_split(X_test,y_test,test_size=0.5,random_state=42)

Since this is a binary classification problem, we can use the following models : 

* Logistic Regression
* K Nearest Neighbors
* Support Vector Machine
* Stochastic Gradient Descent
* Naive Bayes
* Decision Trees
* Random Forest
* Extra-Trees
* Gradient Boosting
* XGBoost
* Neural Networks

In [None]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

#scale the features: fit the scaler on the training set only, then apply it to the validation and test sets
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)
print(X_train.shape)
print(y_train)

In [None]:
print("The shape of the training set is :{}".format(X_train.shape))
print("The shape of the testing set is :{}".format(X_test.shape))
print("The shape of the validation set is :{}".format(X_val.shape))

# Logistic Regression

In [None]:
# import the class
from sklearn.linear_model import LogisticRegression
from sklearn import metrics

# scikit-learn expects 1-D label arrays, so flatten them instead of reshaping to a column
y_train_re = y_train.values.ravel()
y_val_re = y_val.values.ravel()

In [None]:
# train the model on the training set
logreg = LogisticRegression(solver = 'liblinear')
logreg.fit(X_train, y_train_re)
# make predictions on the validation set
y_pred = logreg.predict(X_val)

# compare actual response values (y_val) with predicted response values (y_pred)
print("The accuracy score of the model on the validation data is:{}.".format(metrics.accuracy_score(y_val_re, y_pred)*100))

#finding the confusion matrix (rows: true labels, columns: predicted labels)
Myconfusion = metrics.confusion_matrix(y_val_re, y_pred)
print("This is the required confusion matrix of the model:\n{}.".format(Myconfusion))

In [None]:
from sklearn.metrics import classification_report
print(classification_report(y_val,y_pred))

In [None]:
# calculate the FPR and TPR for all thresholds of the classification
y_pred = logreg.predict(X_val)
# score with the positive-class probability so the curve has more than one threshold
y_score = logreg.predict_proba(X_val)[:, 1]
logit_fpr, logit_tpr, thresholds = metrics.roc_curve(y_val, y_score)
logit_auc = metrics.roc_auc_score(y_val, y_score)

# K-nearest neighbors

In [None]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

knn = KNeighborsClassifier(n_neighbors= 100)
knn.fit(X_train,y_train)

#predict class labels for the validation set with the K-NN model
y_pred = knn.predict(X_val)
#Evaluation
accuracy = metrics.accuracy_score(y_val, y_pred) * 100
print("Accuracy with K-NN: {0:.2f}%".format(accuracy))
print(classification_report(y_val,y_pred))


In [None]:
# calculate the FPR and TPR for all thresholds of the classification
y_pred = knn.predict(X_val)
# score with the positive-class probability so the curve has more than one threshold
y_score = knn.predict_proba(X_val)[:, 1]
knn_fpr, knn_tpr, thresholds = metrics.roc_curve(y_val, y_score)
knn_auc = metrics.roc_auc_score(y_val, y_score)

# Support Vector Machine

In [None]:
from sklearn.svm import SVC

svm = SVC(gamma='auto', kernel='linear', probability=True)
svm.fit(X_train, y_train) 
y_pred = svm.predict(X_val)

#Evaluation
accuracy = metrics.accuracy_score(y_val, y_pred) * 100
print("Accuracy with SVM: {0:.2f}%".format(accuracy))

In [None]:
# calculate the FPR and TPR for all thresholds of the classification
y_pred = svm.predict(X_val)
# use the predicted probability of the positive class as the score
probs = svm.predict_proba(X_val)[:, 1]
svm_fpr, svm_tpr, thresholds = metrics.roc_curve(y_val, probs)
svm_auc = metrics.roc_auc_score(y_val, probs)

# Stochastic Gradient Descent

In [None]:
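from sklearn.linear_model import SGDClassifier
#loss='log_loss' gives logistic regression; scikit-learn versions before 1.1 call it 'log'
stochg = SGDClassifier(loss='log_loss',alpha=0.1,random_state=42)
stochg.fit(X_train,y_train)

#predict class labels for the validation set with the SGD model
y_pred = stochg.predict(X_val)
print("Stochastic Gradient Descent")
print('Accuracy with stochg is:{0:.2f}%.'.format(metrics.accuracy_score(y_val, y_pred) * 100))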

In [None]:
# calculate the FPR and TPR for all thresholds of the classification
y_pred = stochg.predict(X_val)
# score with the positive-class probability so the curve has more than one threshold
y_score = stochg.predict_proba(X_val)[:, 1]
stochg_fpr, stochg_tpr, thresholds = metrics.roc_curve(y_val, y_score)
stochg_auc = metrics.roc_auc_score(y_val, y_score)

# Naive Bayes Classifier

In [None]:
from sklearn.naive_bayes import GaussianNB
naive = GaussianNB()
naive.fit(X_train,y_train)

#predict class labels for the validation set with the Naive Bayes model
y_pred = naive.predict(X_val)
print("Naive Bayes")
print('Accuracy with naive is:{0:.2f}%.'.format(metrics.accuracy_score(y_val, y_pred) * 100))

In [None]:
# calculate the FPR and TPR for all thresholds of the classification
y_pred = naive.predict(X_val)
# score with the positive-class probability so the curve has more than one threshold
y_score = naive.predict_proba(X_val)[:, 1]
naive_fpr, naive_tpr, thresholds = metrics.roc_curve(y_val, y_score)
naive_auc = metrics.roc_auc_score(y_val, y_score)

# Decision Trees (DTs)

In [None]:
#now checking the accuracy of the decision tree classifier
from sklearn.tree import DecisionTreeClassifier
tree_eeg = DecisionTreeClassifier()
#fit() returns the fitted estimator; importing the class directly means the name `tree` no longer shadows the sklearn module
tree = tree_eeg.fit(X_train,y_train)
#predicting
y_pred = tree.predict(X_val) 
#Evaluating the model
accuracy = metrics.accuracy_score(y_val, y_pred)* 100
#print  the accuracy
print("Accuracy of the model by using the decision tree algorithm : {0:.2f}%".format(accuracy))

In [None]:
# calculate the FPR and TPR for all thresholds of the classification
y_pred = tree.predict(X_val)
# score with the positive-class probability so the curve has more than one threshold
y_score = tree.predict_proba(X_val)[:, 1]
tree_fpr, tree_tpr, thresholds = metrics.roc_curve(y_val, y_score)
tree_auc = metrics.roc_auc_score(y_val, y_score)

# Random Forest Classifier

In [None]:
from sklearn.ensemble import RandomForestClassifier
random = RandomForestClassifier(max_depth=10,random_state=69)
random.fit(X_train,y_train)

#predicting
y_pred = random.predict(X_val) 
#Evaluating the model
precision = metrics.accuracy_score(y_pred,y_val)* 100
#print  the accuracy
print("Accuracy of the model by using the random algorithm : {0:.2f}%".format(precision))


In [None]:
# calculate the FPR and TPR for all thresholds of the classification
y_pred = random.predict(X_val)
# score with the positive-class probability so the curve has more than one threshold
y_score = random.predict_proba(X_val)[:, 1]
random_fpr, random_tpr, thresholds = metrics.roc_curve(y_val, y_score)
random_auc = metrics.roc_auc_score(y_val, y_score)

# Extra-trees classifier
#### (Extremely Randomized Trees)

In [None]:
#In Extra Trees, both the features and the thresholds for splitting are selected randomly. 
#Unlike Random Forests, which search for the best split among the randomly selected features, 
#Extra Trees select a split point at random within the range of feature values without optimizing it.
from sklearn.ensemble import ExtraTreesClassifier
extra = ExtraTreesClassifier(bootstrap=False,criterion="entropy",max_features=1.0,
                             min_samples_leaf=3,min_samples_split=20,n_estimators=100)

extra.fit(X_train,y_train)

print("Extra Trees Classifier")
#predicting
y_pred = extra.predict(X_val) 
#Evaluating the model
precision = metrics.accuracy_score(y_pred,y_val)* 100
#print  the accuracy
print("Accuracy of the model by using the extra algorithm : {0:.2f}%".format(precision))


In [None]:
# calculate the FPR and TPR for all thresholds of the classification
y_pred = extra.predict(X_val)
# score with the positive-class probability so the curve has more than one threshold
y_score = extra.predict_proba(X_val)[:, 1]
extra_fpr, extra_tpr, thresholds = metrics.roc_curve(y_val, y_score)
extra_auc = metrics.roc_auc_score(y_val, y_score)

# Gradient Boosting for classification.

In [None]:
from sklearn.ensemble import GradientBoostingClassifier
gradient = GradientBoostingClassifier(
    n_estimators=100,learning_rate=1.0,max_depth=6,random_state=69)

gradient.fit(X_train,y_train)

print("Gradient Boosting for classification.")
#predicting
y_pred = gradient.predict(X_val) 
#Evaluating the model
precision = metrics.accuracy_score(y_pred,y_val)* 100
#print  the accuracy
print("Accuracy of the model by using the gradient algorithm : {0:.2f}%".format(precision))



In [None]:
# calculate the FPR and TPR for all thresholds of the classification
y_pred = gradient.predict(X_val)
# score with the positive-class probability so the curve has more than one threshold
y_score = gradient.predict_proba(X_val)[:, 1]
gradient_fpr, gradient_tpr, thresholds = metrics.roc_curve(y_val, y_score)
gradient_auc = metrics.roc_auc_score(y_val, y_score)

# XGBoost

In [None]:
#Extreme Gradient Boosting - employs tree pruning, regularisation techniques, highly scalable
from xgboost import XGBClassifier
import xgboost as xgb
xgbc = XGBClassifier()

xgbc.fit(X_train,y_train)

print("XGBoost")
#predicting
y_pred = xgbc.predict(X_val) 
#Evaluating the model
precision = metrics.accuracy_score(y_pred,y_val)* 100
#print  the accuracy
print("Accuracy of the model by using the xgbc algorithm : {0:.2f}%".format(precision))

In [None]:
# calculate the FPR and TPR for all thresholds of the classification
y_pred = xgbc.predict(X_val)
# score with the positive-class probability so the curve has more than one threshold
y_score = xgbc.predict_proba(X_val)[:, 1]
xgbc_fpr, xgbc_tpr, thresholds = metrics.roc_curve(y_val, y_score)
xgbc_auc = metrics.roc_auc_score(y_val, y_score)

# Plotting the ROC Curve
* True Positive Rate (TPR) = True Positives / (True Positives + False Negatives)
* False Positive Rate (FPR) = False Positives / (False Positives + True Negatives)
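
The two definitions can be turned into a small threshold sweep. This is a plain-Python sketch with made-up labels and scores; `roc_points` is a hypothetical helper, not the scikit-learn implementation.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) pairs, one per distinct score used as a threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Lowering the threshold step by step marks more samples as positive
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(p and t == 1 for p, t in zip(predicted, y_true))
        fp = sum(p and t == 0 for p, t in zip(predicted, y_true))
        points.append((fp / negatives, tp / positives))
    return points

# Made-up labels and predicted probabilities for the positive class
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
# Area under the curve by the trapezoidal rule
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(points, points[1:]))
print(points, auc)
```

Each distinct score contributes one point to the curve, so a model scored with continuous probabilities traces a full curve, while hard 0/1 predictions yield a single intermediate point.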

In [None]:
#ROC - Receiver Operating Characteristic - CURVE
plt.title('ROC Curve')
plt.plot([0, 1], [0, 1], linestyle='--')
plt.plot(logit_fpr, logit_tpr, 'c', marker='.', label = 'logit = %0.3f' % logit_auc )
plt.plot(svm_fpr, svm_tpr, 'b', marker='.', label = 'SVM = %0.3f' % svm_auc )
plt.plot(knn_fpr, knn_tpr, 'g', marker='.', label = 'K-NN = %0.3f' % knn_auc)
plt.plot(stochg_fpr, stochg_tpr, 'y', marker='.', label = 'stochg = %0.3f' % stochg_auc)
plt.plot(naive_fpr, naive_tpr, 'm', marker='.', label = 'naive = %0.3f' % naive_auc)
plt.plot(tree_fpr, tree_tpr, 'r', marker='.',label = 'DECISION TREE = %.3f' % tree_auc)
plt.plot(random_fpr, random_tpr, 'k', marker='.',label = 'Random Forest = %.3f' % random_auc)
plt.plot(extra_fpr, extra_tpr, 'C4', marker='.',label = 'Extra tree = %.3f' % extra_auc)
plt.plot(gradient_fpr, gradient_tpr, 'C1', marker='.',label = 'Gradient = %.3f' % gradient_auc)
plt.plot(xgbc_fpr, xgbc_tpr, 'C2', marker='.',label = 'XGBoost = %.3f' % xgbc_auc)
plt.legend(loc = 'lower right')
plt.ylabel('True Positive Rate')
plt.xlabel('False Positive Rate')
plt.show()

# Deep Learning

In [None]:
from tensorflow.keras.layers import Dense, BatchNormalization,Dropout,LSTM,Activation,Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import regularizers
from tensorflow.keras.layers import Conv2D,MaxPooling2D
from tensorflow.keras import callbacks

Building the model


1.   Initialising the ANN
2.   Defining by adding layers
3.   Compiling the ANN
4.   Train the ANN





In [None]:
X_train.shape

In [None]:
early_stopping = callbacks.EarlyStopping(
    min_delta=0.001,
    patience=20,
    restore_best_weights=True
)

#initialising the nn
model = Sequential()

#layers
model.add(Dense(units=32,kernel_initializer='uniform',activation='relu',input_dim=178))

model.add(Dense(units=64,kernel_initializer='uniform',activation='relu'))
model.add(Dense(units=32,kernel_initializer='uniform',activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(units=32,kernel_initializer='uniform',activation='relu'))
model.add(Dense(units=16,kernel_initializer='uniform',activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=1,kernel_initializer='uniform',activation='sigmoid'))

#finding out the summary of the model
model.summary()


In [None]:
#compiling the ann
model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])


In [None]:
#training the model
model_train = model.fit(X_train,y_train,batch_size=32,epochs=500,callbacks=[early_stopping],validation_split=0.2)

In [None]:
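predictions = model.predict(X_val)
#threshold the sigmoid outputs at 0.5 to get class labels
pred_labels = (predictions > 0.5).astype(int).ravel()
score = accuracy_score(y_val,pred_labels)
conf_mx = confusion_matrix(y_val, pred_labels)
print("Accuracy of the neural network on the validation data: {0:.2f}%".format(score*100))
print(conf_mx)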

# Model selection

In [None]:
#we have seen how the different models performed on the validation data; now we select the best performing one
#the best performing model that we got is XGBoost, with a validation accuracy of 98.34%
#finally we evaluate it on the test set, which was not used for training or model selection
print("XGBoost")
#predicting
y_pred = xgbc.predict(X_test) 
#Evaluating the model
accuracy = metrics.accuracy_score(y_test, y_pred)* 100
#print  the accuracy
print("Accuracy of the model by using the xgbc algorithm : {0:.2f}%".format(accuracy))

# The model reaches an accuracy of 98.54% on the test set, i.e. it labels 98.54% of the test samples correctly as seizure or non-seizure.