## MLDC Mapping (This Project)
Each step below is **implemented** in this notebook.

1. **Problem Definition** → Emotion detection from EEG
2. **Data Collection** → Load EEG CSVs from `dataset/`
3. **Data Processing** → Missing values + scaling
4. **EDA** → Shape, samples, statistics
5. **Feature Engineering** → Raw EEG channels (baseline)
6. **Model Selection** → Linear + Logistic Regression
7. **Deployment** → Exported to web UI in this project


## 1) Problem Definition
We want to predict:
- **Intensity score** (0–10) using Linear Regression
- **High vs Low emotion** using Logistic Regression


## 2) Imports


In [None]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import mean_squared_error, accuracy_score


## 3) Data Loading
We load a single EEG file with **19 channels**.


In [None]:
data = pd.read_csv("../dataset/s00.csv", header=None)
data.columns = [f"ch{i+1}" for i in range(19)]
data.head()
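Assigning 19 column names fails with a pandas `ValueError` if the file does not contain exactly 19 columns. As a minimal sketch, assuming a tiny in-memory CSV as a stand-in for `s00.csv` (the names `csv_text`, `demo`, and `expected_channels` are illustrative and not part of this project), the column count can be checked explicitly before renaming:

```python
import io

import pandas as pd

# Hypothetical stand-in for an EEG file: 2 rows, 3 channels, no header row
csv_text = "0.1,0.2,0.3\n0.4,0.5,0.6\n"
demo = pd.read_csv(io.StringIO(csv_text), header=None)

expected_channels = 3  # would be 19 for the file loaded in this notebook
if demo.shape[1] != expected_channels:
    raise ValueError(
        f"expected {expected_channels} channels, found {demo.shape[1]}"
    )

# Name the columns ch1..chN based on the verified column count
demo.columns = [f"ch{i + 1}" for i in range(demo.shape[1])]
print(demo.columns.tolist())
```

The explicit check gives a clearer error message than the one raised by the column assignment itself.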


## 4) EDA (Basic Exploration)
Check the dataset shape and the per-channel mean and standard deviation.


In [None]:
data.shape


In [None]:
# Keep only the mean and std rows of the summary table
data.describe().loc[["mean", "std"]]
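Because the preprocessing step fills missing values, it can help to count them per channel during exploration. The sketch below assumes a small hand-made DataFrame (`demo`, with three illustrative channels) rather than the real EEG file:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 4 samples, 3 channels
demo = pd.DataFrame(
    np.arange(12, dtype=float).reshape(4, 3),
    columns=["ch1", "ch2", "ch3"],
)
# Knock out a few values to simulate missing readings
demo.loc[1, "ch2"] = np.nan
demo.loc[3, "ch2"] = np.nan
demo.loc[0, "ch3"] = np.nan

# Number of missing values in each channel
missing_per_channel = demo.isna().sum()
print(missing_per_channel)
```

On the notebook's own data, the same call would be `data.isna().sum()`.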


## 5) Preprocessing
- Fill missing values with each channel's mean
- Scale features to zero mean and unit variance (StandardScaler)


In [None]:
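# Replace NaNs with the mean of their channel (column)
data = data.fillna(data.mean())
X = data.values
# Standardize each channel; note the scaler is fit on all rows, before any split
scaler = StandardScaler()
X = scaler.fit_transform(X)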
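The cell above fits the scaler on every row, so rows that later end up in the test set contribute to the scaling statistics. A common alternative, sketched here on random stand-in data (`X_demo` and the other names are illustrative, not part of this notebook), is to split first and fit the scaler on the training rows only:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the EEG matrix: 200 samples, 19 channels
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(200, 19))

# Split first, so the test rows stay unseen during fitting
X_tr, X_te = train_test_split(X_demo, test_size=0.2, random_state=42)

demo_scaler = StandardScaler()
X_tr_s = demo_scaler.fit_transform(X_tr)  # learn mean/std from training rows
X_te_s = demo_scaler.transform(X_te)      # reuse those statistics on test rows

print(X_tr_s.shape, X_te_s.shape)
```

For a demo with synthetic labels the difference is small, but with real labels this ordering keeps the test-set evaluation honest.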


## 6) Feature Selection / Creation
For simplicity, we use **raw scaled channels** as features.


## 7) Create Synthetic Labels (Demo Only)
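- `y_cont`: continuous 0–10 score (a random linear combination of the scaled channels, min-max scaled)
- `y_bin`: 0 = Low, 1 = High (split at the median of `y_cont`)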


In [None]:
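# Seed the generator so the synthetic labels are reproducible
rng = np.random.default_rng(42)
weights = rng.standard_normal(19)
# Project the scaled channels onto the random weights, then min-max scale to 0–10
scores = X @ weights
y_cont = (scores - scores.min()) / (scores.max() - scores.min()) * 10
y_bin = (y_cont > np.median(y_cont)).astype(int)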
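A quick sanity check on labels built this way is to confirm the score range and the class balance. The sketch below repeats the same construction on random stand-in data (the `_demo` names are illustrative) with an even number of samples, where a median split yields two equal classes:

```python
import numpy as np

# Hypothetical stand-in for the scaled EEG matrix: 500 samples, 19 channels
rng = np.random.default_rng(42)
X_demo = rng.standard_normal((500, 19))
weights_demo = rng.standard_normal(19)

# Same label construction as in the cell above
scores_demo = X_demo @ weights_demo
y_cont_demo = (
    (scores_demo - scores_demo.min())
    / (scores_demo.max() - scores_demo.min())
    * 10
)
y_bin_demo = (y_cont_demo > np.median(y_cont_demo)).astype(int)

print("score range:", y_cont_demo.min(), y_cont_demo.max())
print("class counts (Low, High):", np.bincount(y_bin_demo))
```

Splitting at the median keeps the two classes balanced, so plain accuracy is a reasonable metric for the binary task.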


## 8) Train/Test Split


In [None]:
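# Regression split (continuous intensity target)
X_train, X_test, y_train, y_test = train_test_split(
    X, y_cont, test_size=0.2, random_state=42
)
# Classification split; the same random_state selects the same rows as above
X_train_b, X_test_b, y_train_b, y_test_b = train_test_split(
    X, y_bin, test_size=0.2, random_state=42
)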


## 9) Train Linear Regression (Intensity)


In [None]:
lin = LinearRegression()
lin.fit(X_train, y_train)
pred = lin.predict(X_test)
print("Linear Regression MSE:", mean_squared_error(y_test, pred))
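MSE is in squared score units, so RMSE and R² are often easier to read alongside it. The sketch below uses random stand-in data with a target built the same way as the synthetic intensity score (the `_demo` names are illustrative). Because that target is an exact linear function of the features, a linear model is expected to recover it almost perfectly, which is worth keeping in mind when reading the MSE above:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 300 samples, 19 channels
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((300, 19))
scores_demo = X_demo @ rng.standard_normal(19)
# Min-max scale to 0-10, as done for the synthetic intensity score
y_demo = (
    (scores_demo - scores_demo.min())
    / (scores_demo.max() - scores_demo.min())
    * 10
)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
pred_demo = LinearRegression().fit(X_tr, y_tr).predict(X_te)

mse_demo = mean_squared_error(y_te, pred_demo)
rmse_demo = np.sqrt(mse_demo)        # same units as the 0-10 score
r2_demo = r2_score(y_te, pred_demo)  # fraction of variance explained
print(f"MSE={mse_demo:.3e}  RMSE={rmse_demo:.3e}  R^2={r2_demo:.6f}")
```

A near-zero error here reflects how the labels were generated, not how well EEG channels predict real emotion intensity.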


## 10) Train Logistic Regression (High vs Low)


In [None]:
log = LogisticRegression(max_iter=1000)
log.fit(X_train_b, y_train_b)
pred_b = log.predict(X_test_b)
print("Logistic Regression Accuracy:", accuracy_score(y_test_b, pred_b))
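Accuracy alone does not show which class the model confuses. As a minimal sketch on random stand-in data (the names `X_demo`, `y_demo`, `clf`, and `cm` are illustrative), a confusion matrix breaks the test predictions down by true and predicted class:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 400 samples, 19 channels, linear binary labels
rng = np.random.default_rng(1)
X_demo = rng.standard_normal((400, 19))
y_demo = (X_demo @ rng.standard_normal(19) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Rows are true classes (0 = Low, 1 = High), columns are predicted classes
cm = confusion_matrix(y_te, clf.predict(X_te), labels=[0, 1])
print(cm)
```

On the notebook's own variables, the equivalent call would be `confusion_matrix(y_test_b, pred_b)`.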
