# EEG Emotion Classification using Logistic Regression
## MLDC-Based Assignment (DEAP Dataset)

**Dataset:** DEAP (s01.dat)  
**Model:** Logistic Regression  
**Emotion:** Valence (High / Low)


## MLDC Step 1: Problem Definition
Classify a participant's emotional state as high or low valence from EEG signals, using Logistic Regression and following the Machine Learning Development Cycle (MLDC).

## MLDC Step 2: Data Collection
The EEG data comes from the DEAP dataset. Its preprocessed Python release contains 32-channel EEG recordings downsampled to 128 Hz. This notebook uses the file for a single participant, `s01.dat`.

In [None]:
import pickle
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix


## MLDC Step 3: Load Dataset

In [None]:
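# The preprocessed DEAP files were pickled under Python 2, hence encoding='latin1'.
with open('s01.dat', 'rb') as f:
    subject = pickle.load(f, encoding='latin1')

# Keep the first 32 channels (EEG); the remaining channels hold peripheral signals.
eeg_data = subject['data'][:, :32, :]
labels = subject['labels']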

print(eeg_data.shape)
print(labels.shape)
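
For readers who do not have `s01.dat` at hand, the layout of the loaded arrays can be sketched with a synthetic stand-in. The shapes below follow the preprocessed DEAP release (40 trials, 40 channels, 8064 samples per trial, and 4 ratings per trial). The names `fake_subject`, `fake_eeg` and `fake_labels` are hypothetical, and the values are placeholders, not real recordings.

```python
import numpy as np

# Synthetic stand-in with the layout of one preprocessed DEAP subject file.
# Assumption: 40 trials x 40 channels x 8064 samples, and 40 x 4 ratings.
rng = np.random.default_rng(0)
fake_subject = {
    "data": np.zeros((40, 40, 8064), dtype=np.float32),  # placeholder signal
    "labels": rng.uniform(1, 9, size=(40, 4)),           # ratings on a 1-9 scale
}

fake_eeg = fake_subject["data"][:, :32, :]  # first 32 channels are EEG
fake_labels = fake_subject["labels"]

n_trials, n_channels, n_samples = fake_eeg.shape
print(fake_eeg.shape, fake_labels.shape)
print("seconds per trial at 128 Hz:", n_samples / 128)
```

At 128 Hz, 8064 samples correspond to 63 seconds per trial.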

## MLDC Step 4: Data Preprocessing
Each trial's EEG signal is segmented into non-overlapping 1-second windows (128 samples at 128 Hz).

In [None]:
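def sliding_window(eeg, window_size=128):
    """Split a (channels, samples) array into non-overlapping windows."""
    windows = []
    # The step equals window_size, so consecutive windows do not overlap.
    for start in range(0, eeg.shape[1] - window_size + 1, window_size):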
        windows.append(eeg[:, start:start + window_size])
    return windows
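
As a quick check of the windowing, the sketch below applies the same function to a synthetic trial of 32 channels and 8064 samples and compares it with an equivalent `reshape`. The array `demo_trial` is a hypothetical placeholder, not DEAP data.

```python
import numpy as np

def sliding_window(eeg, window_size=128):
    """Split a (channels, samples) array into non-overlapping windows."""
    windows = []
    for start in range(0, eeg.shape[1] - window_size + 1, window_size):
        windows.append(eeg[:, start:start + window_size])
    return windows

# Synthetic trial: each sample holds its own flat index, so slices are easy to verify.
demo_trial = np.arange(32 * 8064, dtype=float).reshape(32, 8064)
demo_windows = sliding_window(demo_trial)
print(len(demo_windows), demo_windows[0].shape)

# Equivalent result without a Python loop: (channels, windows, samples)
# rearranged to (windows, channels, samples).
stacked = demo_trial.reshape(32, -1, 128).transpose(1, 0, 2)
print(np.array_equal(stacked, np.stack(demo_windows)))
```

With 8064 samples and a window of 128, each trial yields 8064 / 128 = 63 windows.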

## MLDC Step 5: Feature Engineering
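Three statistical features (mean, standard deviation, and energy) are extracted per channel from each EEG window, giving 3 × 32 = 96 features per window. Every window inherits the binary valence label of its trial: a rating of 5 or above counts as high valence (1), anything lower as low valence (0).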

In [None]:
X_features = []
y_labels = []

for trial in range(eeg_data.shape[0]):
    eeg_trial = eeg_data[trial]
    windows = sliding_window(eeg_trial)

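    # Column 0 of the labels holds the valence rating (scale 1 to 9).
    valence = labels[trial][0]
    label = 1 if valence >= 5 else 0  # 1 = high valence, 0 = low valence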

    for w in windows:
        mean = np.mean(w, axis=1)
        std = np.std(w, axis=1)
        energy = np.sum(w ** 2, axis=1)

        features = np.concatenate([mean, std, energy])
        X_features.append(features)
        y_labels.append(label)

X = np.array(X_features)
y = np.array(y_labels)

print(X.shape, y.shape)
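
The same three features can also be computed without the inner loop. The following is a minimal sketch, assuming a trial array of shape (channels, samples). The helper `window_features` and the array `feature_demo_trial` are hypothetical names introduced for illustration, and the data is random.

```python
import numpy as np

def window_features(eeg_trial, window_size=128):
    """Return an (n_windows, 3 * n_channels) matrix of mean, std and energy."""
    n_channels, n_samples = eeg_trial.shape
    n_windows = n_samples // window_size
    # Drop trailing samples that do not fill a whole window.
    trimmed = eeg_trial[:, :n_windows * window_size]
    # Rearrange to (n_windows, n_channels, window_size).
    w = trimmed.reshape(n_channels, n_windows, window_size).transpose(1, 0, 2)
    mean = w.mean(axis=2)
    std = w.std(axis=2)
    energy = (w ** 2).sum(axis=2)
    # Same column order as the loop version: all means, all stds, all energies.
    return np.concatenate([mean, std, energy], axis=1)

rng = np.random.default_rng(0)
feature_demo_trial = rng.standard_normal((32, 8064))
demo_features = window_features(feature_demo_trial)
print(demo_features.shape)
```

For one trial this gives 63 rows of 96 features, so the 40 trials of one subject produce 40 × 63 = 2520 rows in `X`.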

## MLDC Step 6: Train-Test Split

In [None]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
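
Because the split above is made over windows, windows cut from the same trial can end up in both the training and the test set, which may inflate the reported accuracy. One possible alternative, sketched here on synthetic data, is to split by trial with scikit-learn's `GroupShuffleSplit`. The arrays `X_grp`, `y_grp` and `trial_ids` are hypothetical; in the notebook, the trial index would have to be recorded alongside each window during feature extraction. Note that `GroupShuffleSplit` does not stratify by class.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Synthetic stand-in: 40 trials, 63 windows per trial, 96 features per window.
n_trials, windows_per_trial, n_features = 40, 63, 96
rng = np.random.default_rng(0)
X_grp = rng.standard_normal((n_trials * windows_per_trial, n_features))
trial_labels = np.array([0, 1] * (n_trials // 2))       # made-up trial labels
y_grp = np.repeat(trial_labels, windows_per_trial)      # one label per window
trial_ids = np.repeat(np.arange(n_trials), windows_per_trial)

# Hold out roughly 20% of the trials (not 20% of the windows).
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(splitter.split(X_grp, y_grp, groups=trial_ids))

shared_trials = set(trial_ids[train_idx]) & set(trial_ids[test_idx])
print("trials present in both sets:", len(shared_trials))
```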

## MLDC Step 7: Feature Scaling

In [None]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
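
The scaler is fitted on the training set only, and the test set is transformed with the training statistics. The sketch below, on random placeholder data with hypothetical names (`train_demo`, `test_demo`), shows that this is the same as a z-score computed by hand from the training mean and standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
train_demo = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
test_demo = rng.normal(loc=5.0, scale=2.0, size=(50, 3))

# Fit on the training data only, then reuse the fitted statistics on the test data.
demo_scaler = StandardScaler().fit(train_demo)
scaled_test = demo_scaler.transform(test_demo)

# Manual equivalent: z-score with the training mean and population std (ddof=0).
mu = train_demo.mean(axis=0)
sigma = train_demo.std(axis=0)
manual_test = (test_demo - mu) / sigma

print(np.allclose(scaled_test, manual_test))
```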

## MLDC Step 8: Model Training

In [None]:
model = LogisticRegression(max_iter=3000, class_weight='balanced')
model.fit(X_train, y_train)
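
With `class_weight='balanced'`, scikit-learn weights each class by `n_samples / (n_classes * count_of_class)`, so the rarer class counts more in the loss. The sketch below reproduces those weights for a made-up label vector; `y_imbalanced` is a hypothetical example, not the notebook's labels.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Made-up, imbalanced labels: 30 low-valence and 70 high-valence windows.
y_imbalanced = np.array([0] * 30 + [1] * 70)
classes = np.unique(y_imbalanced)

# Weights as scikit-learn computes them for class_weight='balanced'.
balanced_weights = compute_class_weight(
    class_weight="balanced", classes=classes, y=y_imbalanced
)

# The same weights from the formula n_samples / (n_classes * class_count).
manual_weights = len(y_imbalanced) / (len(classes) * np.bincount(y_imbalanced))

print(dict(zip(classes.tolist(), balanced_weights.tolist())))
print(np.allclose(balanced_weights, manual_weights))
```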

## MLDC Step 9: Prediction

In [None]:
y_pred = model.predict(X_test)

## MLDC Step 10: Evaluation

In [None]:
print('Accuracy:', accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
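
An accuracy figure is easier to interpret next to a trivial baseline. As a sketch, the snippet below scores a classifier that always predicts the most frequent training class, using scikit-learn's `DummyClassifier`. The label vectors `y_train_demo` and `y_test_demo` are made up for illustration; in the notebook, `y_train` and `y_test` would take their place.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

# Made-up labels with a 60/40 class ratio in both sets.
y_train_demo = np.array([1] * 60 + [0] * 40)
y_test_demo = np.array([1] * 15 + [0] * 10)

# The baseline ignores the features, so a dummy feature column is enough.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(np.zeros((len(y_train_demo), 1)), y_train_demo)
baseline_pred = baseline.predict(np.zeros((len(y_test_demo), 1)))

baseline_acc = accuracy_score(y_test_demo, baseline_pred)
print("Majority-class baseline accuracy:", baseline_acc)
```

A model is only informative to the extent that it beats this baseline on the same test set.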

In [None]:
cm = confusion_matrix(y_test, y_pred)
plt.imshow(cm)
plt.title('Confusion Matrix')
plt.colorbar()
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()
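
The numbers in the classification report can be read directly off the confusion matrix. In scikit-learn's layout, rows are actual classes and columns are predicted classes, so for binary labels the flattened matrix is (TN, FP, FN, TP). The sketch below uses made-up predictions (`y_true_demo`, `y_pred_demo`) rather than the notebook's results.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up ground truth and predictions for ten windows.
y_true_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_pred_demo = np.array([0, 0, 1, 1, 1, 1, 1, 1, 0, 1])

cm_demo = confusion_matrix(y_true_demo, y_pred_demo)
tn, fp, fn, tp = cm_demo.ravel()

precision = tp / (tp + fp)   # share of predicted high-valence that is correct
recall = tp / (tp + fn)      # share of actual high-valence that is found
accuracy = (tp + tn) / cm_demo.sum()

print(cm_demo)
print("precision:", precision, "recall:", recall, "accuracy:", accuracy)
```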