# **Deliverable 1 – Topic & Dataset Description (WESAD Dataset)**
---
## **1. Introduction**
This notebook documents Deliverable 1 of the ML Project. The selected topic is **Stress Detection Classification** using the **WESAD (Wearable Stress and Affect Detection)** dataset.

The notebook includes:
- Dataset description
- Explanation of the file structure
- Loading the actual `.pkl` physiological signals
- Basic preprocessing

Dataset Path Used:
`C:/Users/dell/OneDrive/Desktop/WESAD DS/WESAD/`


## **2. Dataset Overview**
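The WESAD dataset contains multimodal physiological signals collected from 15 subjects (S2–S17, with S12 excluded), recorded with a chest-worn RespiBAN device and a wrist-worn Empatica E4.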

**Key components inside each subject folder (e.g., S2/):**
- `S2.pkl` → Main file used for ML. Contains the synchronized time-series sensor data and labels.
- `*_E4_Data.zip` → Raw wrist (Empatica E4) data (not needed for ML here).
- `*_quest.csv` → Study protocol timing and self-report questionnaire answers.
- `*_respiban.txt` → Raw chest (RespiBAN) data.
- `*_readme.txt` → Subject-specific information and notes.

**We will use only the `.pkl` files**, because each one contains the synchronized signals and labels for a subject:
- Chest data: ACC, ECG, EMG, Temp, Resp, EDA
- Wrist data: ACC, TEMP, BVP, EDA
- Labels (0 = Not defined / transient, 1 = Baseline, 2 = Stress, 3 = Amusement, 4 = Meditation; 5, 6, and 7 are to be ignored)


## **3. Load Dataset (.pkl file)**
We select **Subject S2** as an example. The same code works for all subjects.

**Jupyter Notebook Shortcut Note:**
- `A` → insert cell above
- `B` → insert cell below
- `M` → change to Markdown
- `Y` → change to Code
- `Shift + Enter` → run cell
- `Ctrl + S` → save


In [4]:
import pickle
import pandas as pd
import numpy as np

data_path = r"C:\Users\dell\OneDrive\Desktop\WESAD DS\WESAD\S2\S2.pkl"

with open(data_path, 'rb') as f:
    data = pickle.load(f, encoding='latin1')

print(type(data))
print(data.keys())


<class 'dict'>
dict_keys(['signal', 'label', 'subject'])
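Since the same loading code applies to every subject, it can be wrapped in a small helper. The sketch below is only one way to do it: `load_subject`, `load_all`, and `SUBJECT_IDS` are hypothetical names, and the folder layout `S<id>/S<id>.pkl` is assumed from the S2 example above.

```python
import pickle
from pathlib import Path

# Subject IDs assumed to be present: S2..S17 without S12 (15 subjects).
SUBJECT_IDS = [i for i in range(2, 18) if i != 12]


def load_subject(root, subject_id):
    """Load one subject's .pkl file (hypothetical helper)."""
    path = Path(root) / f"S{subject_id}" / f"S{subject_id}.pkl"
    with open(path, "rb") as f:
        # Same encoding as in the cell above.
        return pickle.load(f, encoding="latin1")


def load_all(root, subject_ids=SUBJECT_IDS):
    """Return {subject_id: data_dict}, skipping subjects whose file is missing."""
    loaded = {}
    for sid in subject_ids:
        try:
            loaded[sid] = load_subject(root, sid)
        except FileNotFoundError:
            continue
    return loaded
```

The `.pkl` files hold full-rate signals, so holding all subjects in memory at once may be costly; calling `load_subject` inside a loop and keeping only extracted features is likely the safer pattern.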


## **4. Explore Signals**
WESAD signals are grouped by device:
- `data['signal']['chest']`
- `data['signal']['wrist']`

Sampling rates differ by device and sensor:
- Chest (RespiBAN): all signals at 700 Hz
- Wrist (Empatica E4): ACC at 32 Hz, BVP at 64 Hz, EDA and TEMP at 4 Hz
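Because each sensor has its own rate, the number of samples per signal differs even though all signals cover the same recording. A minimal sketch of converting sample counts to durations follows; the per-sensor wrist rates (ACC 32 Hz, BVP 64 Hz, EDA 4 Hz, TEMP 4 Hz) are taken from the dataset documentation, the constant and function names are illustrative, and the arrays are synthetic stand-ins rather than real WESAD values.

```python
import numpy as np

# Sampling rates in Hz (chest: RespiBAN, wrist: Empatica E4).
CHEST_FS = 700
WRIST_FS = {"ACC": 32, "BVP": 64, "EDA": 4, "TEMP": 4}


def duration_seconds(signal, fs):
    """Recording length in seconds for an array with one row per sample."""
    return len(signal) / fs


# Synthetic stand-ins for 10 seconds of data (not real WESAD values).
fake_chest_ecg = np.zeros((10 * CHEST_FS, 1))
fake_wrist_bvp = np.zeros((10 * WRIST_FS["BVP"], 1))

print(duration_seconds(fake_chest_ecg, CHEST_FS))         # 10.0
print(duration_seconds(fake_wrist_bvp, WRIST_FS["BVP"]))  # 10.0
```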


In [5]:

chest = data['signal']['chest']
wrist = data['signal']['wrist']

print(list(chest.keys()))
print(list(wrist.keys()))


['ACC', 'ECG', 'EMG', 'EDA', 'Temp', 'Resp']
['ACC', 'BVP', 'EDA', 'TEMP']
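Beyond the key names, it helps to see the shape of every array in the nested dictionary. The helper below is a hypothetical sketch (`summarize` is not part of WESAD or any library) that walks a nested dict and reports array shapes; it is demonstrated on a tiny synthetic dict, not on real data.

```python
import numpy as np


def summarize(obj, indent=0):
    """Return text lines describing nested dict keys and array shapes."""
    lines = []
    pad = "  " * indent
    for key, value in obj.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}/")
            lines.extend(summarize(value, indent + 1))
        elif isinstance(value, np.ndarray):
            lines.append(f"{pad}{key}: shape={value.shape}, dtype={value.dtype}")
        else:
            lines.append(f"{pad}{key}: {value!r}")
    return lines


# Tiny synthetic example that mimics the layout of the loaded dict.
fake = {
    "signal": {"chest": {"ECG": np.zeros((7, 1))}},
    "label": np.zeros(7, dtype=np.int32),
    "subject": "S2",
}
print("\n".join(summarize(fake)))
```

On the real data, `print("\n".join(summarize(data)))` would list every chest and wrist signal with its sample count, which makes the different sampling rates visible at a glance.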


## **5. Convert Chest Data to DataFrame**
This gives a clean table for basic exploration and preprocessing.


In [6]:
df = pd.DataFrame({
    'acc_x': chest['ACC'][:, 0],
    'acc_y': chest['ACC'][:, 1],
    'acc_z': chest['ACC'][:, 2],
    'eda': chest['EDA'].flatten(),
    'ecg': chest['ECG'].flatten(),
    'emg': chest['EMG'].flatten(),
    'resp': chest['Resp'].flatten(),
    'temp': chest['Temp'].flatten(),
})

df.head()


| | acc_x | acc_y | acc_z | eda | ecg | emg | resp | temp |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.9554 | -0.222 | -0.558 | 5.250549 | 0.021423 | -0.00444 | -1.148987 | 30.120758 |
| 1 | 0.9258 | -0.2216 | -0.5538 | 5.267334 | 0.020325 | 0.004349 | -1.124573 | 30.129517 |
| 2 | 0.9082 | -0.2196 | -0.5392 | 5.243301 | 0.016525 | 0.005173 | -1.152039 | 30.138214 |
| 3 | 0.8974 | -0.2102 | -0.5122 | 5.249405 | 0.016708 | 0.007187 | -1.158142 | 30.129517 |
| 4 | 0.8882 | -0.2036 | -0.4824 | 5.286407 | 0.011673 | -0.015152 | -1.161194 | 30.130951 |


## **6. Labels**
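Labels mark the experimental condition of each sample. The label array is sampled at 700 Hz, so it aligns one-to-one with the chest signals:
- `0` → Not defined / transient
- `1` → Baseline
- `2` → Stress
- `3` → Amusement
- `4` → Meditation
- `5`, `6`, `7` → To be ignored

The three classes relevant to stress detection are `1`, `2`, and `3`.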


In [7]:

labels = data['label']
np.unique(labels)


array([0, 1, 2, 3, 4, 6, 7], dtype=int32)
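The output contains more label values than the three classes of interest, so samples from the other conditions need to be filtered out before training. A minimal sketch, assuming the label array is aligned sample-by-sample with the chest signals (both at 700 Hz); `keep_study_conditions` and `LABEL_NAMES` are illustrative names, and the arrays here are synthetic.

```python
import numpy as np

# Label IDs kept for three-class stress detection.
LABEL_NAMES = {1: "baseline", 2: "stress", 3: "amusement"}


def keep_study_conditions(signal, labels, keep=(1, 2, 3)):
    """Drop samples whose label is not in `keep`; signal and labels stay aligned."""
    mask = np.isin(labels, keep)
    return signal[mask], labels[mask]


# Synthetic example: one value per sample, same length as the labels.
labels_demo = np.array([0, 1, 1, 2, 4, 3, 7])
signal_demo = np.arange(len(labels_demo), dtype=float)

x, y = keep_study_conditions(signal_demo, labels_demo)
print(x)  # [1. 2. 3. 5.]
print(y)  # [1 1 2 3]
```

The same boolean mask can be applied to the chest DataFrame, for example `df[np.isin(labels, (1, 2, 3))]`, because `df` has one row per label.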

## **7. Basic Preprocessing**
Since WESAD signals are high-frequency, they usually require:
- Downsampling
- Sliding window segmentation
- Normalization

Here we demonstrate **simple normalization** (z-scoring each column to zero mean and unit variance) just to complete Deliverable 1.


In [8]:
df_norm = (df - df.mean()) / (df.std() + 1e-8)
df_norm.head()

| | acc_x | acc_y | acc_z | eda | ecg | emg | resp | temp |
|---|---|---|---|---|---|---|---|---|
| 0 | 1.150861 | -1.38394 | -0.450708 | 3.005561 | 0.131175 | -0.173997 | -0.41261 | -0.590052 |
| 1 | 0.981831 | -1.377707 | -0.436719 | 3.019114 | 0.124045 | 0.580307 | -0.404243 | -0.583215 |
| 2 | 0.881326 | -1.346542 | -0.388091 | 2.999708 | 0.099385 | 0.651023 | -0.413656 | -0.576426 |
| 3 | 0.819653 | -1.200062 | -0.298162 | 3.004637 | 0.100574 | 0.823885 | -0.415748 | -0.583215 |
| 4 | 0.767116 | -1.097215 | -0.198906 | 3.034516 | 0.067893 | -1.093305 | -0.416794 | -0.582095 |
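The other two steps listed in this section, downsampling and sliding-window segmentation, are not part of this deliverable, but they can be sketched roughly as follows. This is an illustration under stated assumptions, not a tuned pipeline: `downsample_mean` uses plain block averaging (a filter-based method such as `scipy.signal.decimate` may be preferable in practice), each window is assigned its majority label, and the function names, factor, window, and step values are all hypothetical.

```python
import numpy as np


def downsample_mean(x, factor):
    """Downsample a 1-D signal by averaging non-overlapping blocks of `factor` samples."""
    n = (len(x) // factor) * factor  # drop the incomplete tail block
    return x[:n].reshape(-1, factor).mean(axis=1)


def sliding_windows(x, labels, window, step):
    """Cut x into windows of `window` samples every `step` samples.

    Returns (segments, segment_labels), where each segment label is the
    most frequent label inside that window.
    """
    segments, seg_labels = [], []
    for start in range(0, len(x) - window + 1, step):
        stop = start + window
        segments.append(x[start:stop])
        values, counts = np.unique(labels[start:stop], return_counts=True)
        seg_labels.append(values[np.argmax(counts)])
    return np.array(segments), np.array(seg_labels)


# Synthetic example: 20 samples, first half label 1, second half label 2.
x_demo = np.arange(20, dtype=float)
labels_demo = np.array([1] * 10 + [2] * 10)

factor = 2
x_ds = downsample_mean(x_demo, factor)
labels_ds = labels_demo[::factor][: len(x_ds)]  # keep labels aligned with x_ds

segments, seg_labels = sliding_windows(x_ds, labels_ds, window=4, step=2)
print(segments.shape)  # (4, 4)
print(seg_labels)      # [1 1 2 2]
```

For real chest signals, the window and step would be expressed in seconds and multiplied by the (downsampled) sampling rate to obtain sample counts.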


## **8. Summary**
- Selected **Stress Detection using the WESAD dataset** as the topic.
- Loaded the `.pkl` file for subject S2 from the dataset path.
- Extracted the chest and wrist signals.
- Created a DataFrame from the chest signals.
- Explored the labels.
- Applied basic preprocessing (z-score normalization).

This completes **Deliverable 1**.
