### **Electroencephalography (EEG) is a method to record an electrogram of the spontaneous electrical activity of the brain.**

This is a dataset of EEG brainwave data that has been processed with our original strategy of statistical extraction (described in the accompanying paper).
The data was collected from two people (one male, one female) for 3 minutes per state: positive, neutral, and negative. We used a Muse EEG headband, which recorded the TP9, AF7, AF8, and TP10 EEG placements via dry electrodes. Six minutes of resting neutral data was also recorded. The stimuli used to evoke the emotions are listed alongside the dataset.

In [16]:
import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score
from sklearn.impute import SimpleImputer

In [2]:
data = pd.read_csv('emotions.csv')

In [3]:
data.head()

| | # mean_0_a | mean_1_a | mean_2_a | mean_3_a | mean_4_a | mean_d_0_a | mean_d_1_a | mean_d_2_a | mean_d_3_a | mean_d_4_a | ... | fft_741_b | fft_742_b | fft_743_b | fft_744_b | fft_745_b | fft_746_b | fft_747_b | fft_748_b | fft_749_b | label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4.62 | 30.3 | -356.0 | 15.6 | 26.3 | 1.07 | 0.411 | -15.7 | 2.06 | 3.15 | ... | 23.5 | 20.3 | 20.3 | 23.5 | -215.0 | 280.0 | -162.0 | -162.0 | 280.0 | NEGATIVE |
| 1 | 28.8 | 33.1 | 32.0 | 25.8 | 22.8 | 6.55 | 1.68 | 2.88 | 3.83 | -4.82 | ... | -23.3 | -21.8 | -21.8 | -23.3 | 182.0 | 2.57 | -31.6 | -31.6 | 2.57 | NEUTRAL |
| 2 | 8.9 | 29.4 | -416.0 | 16.7 | 23.7 | 79.9 | 3.36 | 90.2 | 89.9 | 2.03 | ... | 462.0 | -233.0 | -233.0 | 462.0 | -267.0 | 281.0 | -148.0 | -148.0 | 281.0 | POSITIVE |
| 3 | 14.9 | 31.6 | -143.0 | 19.8 | 24.3 | -0.584 | -0.284 | 8.82 | 2.3 | -1.97 | ... | 299.0 | -243.0 | -243.0 | 299.0 | 132.0 | -12.4 | 9.53 | 9.53 | -12.4 | POSITIVE |
| 4 | 28.3 | 31.3 | 45.2 | 27.3 | 24.5 | 34.8 | -5.79 | 3.06 | 41.4 | 5.52 | ... | 12.0 | 38.1 | 38.1 | 12.0 | 119.0 | -17.6 | 23.9 | 23.9 | -17.6 | NEUTRAL |


In [4]:
X = data.iloc[:, :-1]
y = data.iloc[:, -1]
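
Positional slicing works here because `label` is the last column. As a hedged alternative, the sketch below selects the label by name and checks the class counts before modeling. The `split_features_labels` helper and the `toy` frame are hypothetical. The toy values are copied from the first rows shown above and only stand in for `emotions.csv`.

```python
import pandas as pd


def split_features_labels(frame: pd.DataFrame, label_col: str = "label"):
    """Return (X, y), selecting the label column by name rather than position."""
    X = frame.drop(columns=[label_col])
    y = frame[label_col]
    return X, y


# Toy stand-in for emotions.csv (illustration only).
toy = pd.DataFrame({
    "mean_0_a": [4.62, 28.8, 8.9, 14.9],
    "fft_749_b": [280.0, 2.57, 281.0, -12.4],
    "label": ["NEGATIVE", "NEUTRAL", "POSITIVE", "POSITIVE"],
})

X_toy, y_toy = split_features_labels(toy)
class_counts = y_toy.value_counts()  # rows per emotion label
print(X_toy.shape)
print(class_counts)
```

On the real data, a strongly uneven `value_counts()` would be a reason to pass `stratify=y` when splitting.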

### Preprocess the data

In [5]:
# Handle missing values
imputer = SimpleImputer(strategy='mean')
X = imputer.fit_transform(X)


In [6]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [7]:
# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
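
In the cells above, the imputer is fit on all rows before the split, so test-set statistics can influence the imputed values. One way to avoid this is to put the imputer, the scaler, and the classifier in a single `Pipeline`, so that every step is fit on the training rows only. The sketch below is a minimal illustration on synthetic data, not the EEG dataset. The names `X_demo`, `y_demo`, and `pipe` are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 60 samples, 5 features, 3 classes.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(60, 5))
y_demo = np.repeat(["NEGATIVE", "NEUTRAL", "POSITIVE"], 20)
X_demo[y_demo == "POSITIVE"] += 3.0  # shift classes apart so they are separable
X_demo[y_demo == "NEGATIVE"] -= 3.0
X_demo[0, 0] = np.nan                # one missing value to exercise the imputer

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Each step is fit on X_tr only; X_te is only ever transformed.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=42)),
])
pipe.fit(X_tr, y_tr)
demo_accuracy = pipe.score(X_te, y_te)
print(f"Demo accuracy: {demo_accuracy}")
```

Tree-based models such as random forests are insensitive to feature scaling, so the scaling step is kept here only to mirror the cells above.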

### Train the Model

In [9]:

# Train a random forest of 100 trees; the fixed seed makes the run reproducible
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
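
With this many feature columns, it can be useful to see which ones the forest relies on. A fitted `RandomForestClassifier` exposes impurity-based scores in `feature_importances_`. The sketch below is a minimal illustration on synthetic data in which only column 0 carries the class signal. The names `X_demo`, `y_demo`, and `forest` are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: the label depends on feature 0 only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 4))
y_demo = np.where(X_demo[:, 0] > 0, "POSITIVE", "NEGATIVE")

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_demo, y_demo)

# One impurity-based score per column; the scores sum to 1.
importances = forest.feature_importances_
ranking = np.argsort(importances)[::-1]  # column indices, most important first
print(ranking)
print(importances[ranking])
```

For the notebook's own `model`, the same attribute could be paired with the feature column names of `data` to list the top-ranked features.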


In [10]:
# Evaluate the model
y_pred = model.predict(X_test)
print(f'Accuracy: {accuracy_score(y_test, y_pred)}')
print(f'Classification Report:\n{classification_report(y_test, y_pred)}')

Accuracy: 0.9882903981264637
Classification Report:
              precision    recall  f1-score   support

    NEGATIVE       0.97      0.99      0.98       143
     NEUTRAL       1.00      1.00      1.00       148
    POSITIVE       0.99      0.97      0.98       136

    accuracy                           0.99       427
   macro avg       0.99      0.99      0.99       427
weighted avg       0.99      0.99      0.99       427
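
The report shows a few errors for NEGATIVE and POSITIVE, while NEUTRAL is classified perfectly. A confusion matrix shows which classes are mistaken for which. The sketch below uses small hand-written label lists (hypothetical, not the model's predictions) to show the layout: rows are true labels and columns are predicted labels.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

labels = ["NEGATIVE", "NEUTRAL", "POSITIVE"]

# Hand-written example labels, for illustration only.
y_true_demo = ["NEGATIVE", "NEGATIVE", "NEUTRAL", "NEUTRAL", "POSITIVE", "POSITIVE"]
y_pred_demo = ["NEGATIVE", "POSITIVE", "NEUTRAL", "NEUTRAL", "POSITIVE", "NEGATIVE"]

# Passing labels= fixes the row/column order of the matrix.
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=labels)

# Wrap in a DataFrame so rows (true) and columns (predicted) are named.
cm_table = pd.DataFrame(cm, index=labels, columns=labels)
print(cm_table)
```

In the notebook itself, the same call would take `y_test` and `y_pred` in place of the demo lists.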



### Separate the data based on labels

In [11]:
# Select the rows for each emotion label (the label is the last column, named 'label')
negative_data = data[data['label'] == 'NEGATIVE']
neutral_data = data[data['label'] == 'NEUTRAL']
positive_data = data[data['label'] == 'POSITIVE']


In [12]:
negative_data.to_csv('negative.csv', index=False)
neutral_data.to_csv('neutral.csv', index=False)
positive_data.to_csv('positive.csv', index=False)
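
The per-label split and export above can also be written as one loop with `groupby`, which avoids repeating the filter for each class. The sketch below is a minimal illustration under stated assumptions: the `write_per_label` helper and the `toy` frame are hypothetical, and the files go to a temporary directory instead of the working directory.

```python
import tempfile
from pathlib import Path

import pandas as pd


def write_per_label(frame: pd.DataFrame, out_dir, label_col: str = "label"):
    """Write one CSV per label value and return {label: path}."""
    out_dir = Path(out_dir)
    paths = {}
    for label, group in frame.groupby(label_col):
        path = out_dir / f"{str(label).lower()}.csv"  # e.g. negative.csv
        group.to_csv(path, index=False)
        paths[label] = path
    return paths


# Toy stand-in for the emotions data (illustration only).
toy = pd.DataFrame({
    "mean_0_a": [4.62, 28.8, 8.9, 14.9],
    "label": ["NEGATIVE", "NEUTRAL", "POSITIVE", "POSITIVE"],
})

with tempfile.TemporaryDirectory() as tmp:
    written = write_per_label(toy, tmp)
    # Read each file back to confirm how many rows were written per label.
    row_counts = {label: len(pd.read_csv(path)) for label, path in written.items()}

print(row_counts)
```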