# Dataset description

## Dataset info

Link: https://www.kaggle.com/datasets/ziya07/psychological-state-identification-dataset

## Problem description

This project uses a dataset of physiological, behavioral, and environmental measurements collected from biosensors to study students' psychological states during educational activities. The dataset is intended as a basis for machine learning models that analyze stress, emotional engagement, and focus in real time. Key features include heart rate variability, EEG power bands, environmental factors such as noise and light, and behavioral metrics such as focus duration. By offering insight into well-being and engagement, the dataset aims to support mental health interventions and help optimize learning experiences through biosensor technology.

### Column description
| No.| Column      | Description | Type |
| -- | ----------- | ----------- | -- |
| 1. | ID | Unique identifier for each participant | Integer|
| 2. | Time | Timestamp indicating when the data was recorded| Datetime|
| 3. | HRV (ms) | Heart rate variability; indicates stress and relaxation states. | Float |
| 4. | GSR (μS) | Galvanic skin response; reflects stress through changes in skin conductivity. | Float |
| 5. | EEG Power Bands | Brain activity in the Delta, Alpha, and Beta bands. | List of floats (stored as a string) |
| 6. | Blood Pressure (mmHg) | Cardiovascular response, recorded as systolic/diastolic (e.g. 114/79). | String |
| 7. | Oxygen Saturation (%) | Indicates oxygen levels in the blood. |Float |
| 8. | Heart Rate (BPM) | Shows physical or emotional excitement. |Integer |
| 9. | Ambient Noise (dB) | Noise intensity during educational activities. |Float |
| 10. | Cognitive Load | Low, Moderate, or High; reflects mental effort. | Categorical |
| 11. | Mood State | Happy, Neutral, Sad, or Anxious; represents emotional condition. | Categorical |
| 12. | Psychological State | Stressed, Relaxed, Focused, or Anxious; inferred from biosensor data. | Categorical |
| 13. | Respiration Rate (BPM) | Measures breathing activity (breaths per minute). | Integer |
| 14. | Skin Temp (°C) | Indicates stress or comfort levels. | Float |
| 15. | Focus Duration (s) | Time spent in sustained attention on a task. |Integer |
| 16. | Task Type | Lecture, Group Discussion, Assignment, Exam. |Categorical |
| 17. | Age |Participant age. |Integer |
| 18. | Gender | Male, Female |Binary |
| 19. | Educational Level | High School, Undergraduate, Postgraduate |Categorical |
| 20. | Study Major |Science, Arts, Engineering |Categorical |
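
The types listed above do not all arrive automatically when the CSV is read with pandas: timestamps and categorical columns load as plain strings. The following is a minimal sketch of how a few of them could be cast, using three records from the start of the dataset and only a handful of columns. Treating `Cognitive Load` as an ordered categorical is an assumption made here for illustration, not something the dataset documentation states.

```python
import pandas as pd

# Three records from the start of the dataset; only a few columns are kept.
sample = pd.DataFrame({
    "Time": ["2024-01-01 00:00:00", "2024-01-01 00:00:01", "2024-01-01 00:00:02"],
    "Cognitive Load": ["Low", "Low", "High"],
    "Task Type": ["Exam", "Assignment", "Group Discussion"],
    "Heart Rate (BPM)": [98, 70, 91],
})

# Timestamps are read as strings unless they are parsed explicitly.
sample["Time"] = pd.to_datetime(sample["Time"], format="%Y-%m-%d %H:%M:%S")

# Assumption: Low < Moderate < High is a meaningful order for this column.
load_dtype = pd.CategoricalDtype(["Low", "Moderate", "High"], ordered=True)
sample["Cognitive Load"] = sample["Cognitive Load"].astype(load_dtype)

# Task Type has no inherent order, so an unordered categorical is enough.
sample["Task Type"] = sample["Task Type"].astype("category")

print(sample.dtypes)
```

With an ordered categorical, comparisons such as `sample["Cognitive Load"] >= "Moderate"` become possible, which may be convenient when filtering by mental effort.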

# Dependencies loading

In [1]:
#libraries
# data analysis and wrangling
import pandas as pd
import numpy as np
import random as rnd

# visualization
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

# machine learning
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# dataset split
from sklearn.model_selection import train_test_split

# missing values imputing
from sklearn.impute import KNNImputer

# fix the random seeds for reproducibility (NumPy and the standard library)
np.random.seed(2024)
rnd.seed(2024)

# Data loading

In [8]:
data = pd.read_csv('../data/01_raw/psychological_state_dataset.csv')
data.head()

Unnamed: 0,ID,Time,HRV (ms),GSR (μS),EEG Power Bands,Blood Pressure (mmHg),Oxygen Saturation (%),Heart Rate (BPM),Ambient Noise (dB),Cognitive Load,Mood State,Psychological State,Respiration Rate (BPM),Skin Temp (°C),Focus Duration (s),Task Type,Age,Gender,Educational Level,Study Major
0,1,2024-01-01 00:00:00,33.039739,1.031806,"[0.7583653347946298, 1.423247998317594, 0.6157...",114/79,98.433312,98,56.863054,Low,Anxious,Stressed,21,34.566484,27,Exam,22,Female,Postgraduate,Engineering
1,2,2024-01-01 00:00:01,49.914651,1.340983,"[0.5520419333516282, 1.858065835142619, 0.3766...",113/86,98.944505,70,45.34343,Low,Neutral,Stressed,21,35.358593,282,Assignment,23,Male,Undergraduate,Arts
2,3,2024-01-01 00:00:02,67.894401,1.006014,"[1.0261365005886114, 1.3504934190994182, 2.308...",124/74,95.990753,91,50.029264,High,Sad,Relaxed,17,34.359495,50,Group Discussion,18,Female,Postgraduate,Arts
3,4,2024-01-01 00:00:03,34.705373,0.84927,"[1.6075723109471591, 1.6619672129812242, 0.344...",120/73,98.173643,95,60.802104,Low,Neutral,Anxious,12,34.802638,223,Exam,28,Female,High School,Engineering
4,5,2024-01-01 00:00:04,52.896549,0.879084,"[1.055003922514022, 0.7643319894343756, 1.0745...",111/80,96.225051,65,40.696384,High,Anxious,Stressed,14,35.869862,201,Group Discussion,24,Female,High School,Engineering
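
The preview shows two compound columns: `EEG Power Bands` holds a bracketed list and `Blood Pressure (mmHg)` holds a `systolic/diastolic` pair, so neither can be used as a numeric feature directly. Below is a minimal sketch of one way to expand them. The helper `split_compound_columns` and the new column names are hypothetical, and the EEG values are made up because the preview truncates the real lists. The sketch also assumes each list contains exactly three values in the order Delta, Alpha, Beta, which should be checked against the full data.

```python
import ast

import pandas as pd

# Illustrative rows only: the EEG values are invented, the blood pressure
# strings match the format shown in the preview.
raw = pd.DataFrame({
    "EEG Power Bands": ["[0.75, 1.42, 0.61]", "[0.55, 1.85, 0.37]"],
    "Blood Pressure (mmHg)": ["114/79", "113/86"],
})


def split_compound_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with EEG bands and blood pressure expanded into numeric columns."""
    out = df.copy()

    # Parse each bracketed string into a Python list of floats.
    bands = out["EEG Power Bands"].apply(ast.literal_eval)

    # Assumption: three values per row, ordered Delta, Alpha, Beta.
    band_names = ["EEG Delta", "EEG Alpha", "EEG Beta"]
    band_df = pd.DataFrame(bands.tolist(), index=out.index, columns=band_names)
    out = out.join(band_df)

    # "114/79" becomes systolic 114 and diastolic 79.
    pressure = out["Blood Pressure (mmHg)"].str.split("/", expand=True).astype(int)
    out["Systolic BP (mmHg)"] = pressure[0]
    out["Diastolic BP (mmHg)"] = pressure[1]

    return out.drop(columns=["EEG Power Bands", "Blood Pressure (mmHg)"])


parsed = split_compound_columns(raw)
print(parsed)
```

`ast.literal_eval` is used instead of `eval` because it only accepts Python literals, which is safer for text read from a file. Applied to the full frame, the call would be `split_compound_columns(data)`.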
