Ktin06/IntroML

🧠 Stress Level Prediction via Stacking Ensemble (IT3190 - HUST)

📌 Project Overview

This project is a supervised learning solution that predicts a user's Stress_Level from daily behavioral and physiological indicators. Using a dataset of 55,000 samples and 18 features (including sleep duration, caffeine intake, caloric consumption, daily steps, and workout metrics), the models map raw daily habits directly to stress levels rather than relying on medical assumptions.

To improve predictive performance and reduce the risk of overfitting, the system implements a 2-Tier Stacking Ensemble architecture:

  • Tier 1 (Base Models): Trains three independent regressors from different algorithmic families: XGBoost (Boosting), Random Forest (Bagging), and SVR (kernel-based Support Vector Regression).
  • Tier 2 (Meta-Model): Uses Ridge Regression to learn how to weight the Tier 1 predictions, so that the errors of one base model can be offset by the others when producing the final Stress_Level.
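
The two tiers listed above can be sketched with scikit-learn's `StackingRegressor`. This is a minimal illustration under stated assumptions, not the project's actual training code: the data is synthetic (`make_regression` with 18 features), the hyperparameters are arbitrary, and `GradientBoostingRegressor` is used as a stand-in for XGBoost so the block runs with scikit-learn alone. Swapping in `xgboost.XGBRegressor` for the `"boosting"` entry should be possible where that package is installed.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the real dataset (assumption: 18 numeric features).
X, y = make_regression(n_samples=500, n_features=18, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Tier 1: three base regressors from different algorithm families.
base_models = [
    # Stand-in for XGBoost (assumption made to avoid an extra dependency).
    ("boosting", GradientBoostingRegressor(n_estimators=100, random_state=0)),
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    # SVR is sensitive to feature scale, so it gets its own scaler.
    ("svr", make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))),
]

# Tier 2: Ridge is fitted on out-of-fold predictions of the base models
# (cv=5), so the meta-model does not see predictions made on training rows.
stack = StackingRegressor(
    estimators=base_models,
    final_estimator=Ridge(alpha=1.0),
    cv=5,
)
stack.fit(X_train, y_train)

# One learned weight per base model, in the order of `base_models`.
meta_weights = stack.final_estimator_.coef_
test_r2 = stack.score(X_test, y_test)

for (name, _), weight in zip(base_models, meta_weights):
    print(f"{name}: weight = {weight:.3f}")
print(f"Stacked R^2 on held-out data: {test_r2:.3f}")
```

Fitting the meta-model on cross-validated predictions, rather than on predictions the base models make for rows they were trained on, is the usual way a stacking setup limits overfitting at the second tier.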

🛠️ Tech Stack & Core Libraries

  • Language: Python
  • Machine Learning: Scikit-learn, XGBoost
  • Data Processing & Visualization: Pandas, NumPy, Matplotlib, Seaborn
  • Deployment / Interactive Demo: Streamlit / Gradio

👥 Team Roles & Responsibilities (4-Member Group)

  • Member 1 (Team Leader / Data Engineer): Exploratory Data Analysis (EDA), automated preprocessing pipelines (StandardScaler), handling missing values, and version control management.
  • Member 2 (ML Engineer - Tier 1): Train/Test splitting, K-Fold Cross-Validation setup, and hyperparameter tuning (GridSearchCV) for the three Base Models.
  • Member 3 (ML Engineer - Tier 2): Meta-feature extraction, Tier 2 Ridge Regression configuration, evaluation of the stacked model against the base-model baselines using $RMSE$, $MAE$, and $R^2$, and lead presenter.
  • Member 4 (Full-stack & UI Engineer): Model serialization (.pkl packaging), building the interactive web application demo, and designing the academic slide deck.
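
The responsibilities above map onto a fairly standard scikit-learn workflow. The sketch below is a hypothetical walk-through for a single base model, not the team's actual code: the data is synthetic with artificially injected missing values, the median imputation strategy and the parameter grid are illustrative assumptions, and the model is serialized to an in-memory byte string rather than a `.pkl` file.

```python
import pickle

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with about 2% missing values (assumption).
X, y = make_regression(n_samples=400, n_features=18, noise=5.0, random_state=42)
rng = np.random.default_rng(42)
X[rng.random(X.shape) < 0.02] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Preprocessing: fill missing values, then standardize features.
# Keeping these steps inside the pipeline means they are re-fitted on each
# training fold, which avoids leaking validation data into the scaler.
pipeline = Pipeline(
    [
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("model", RandomForestRegressor(random_state=42)),
    ]
)

# Hyperparameter tuning with K-Fold cross-validation (illustrative grid).
param_grid = {
    "model__n_estimators": [50, 100],
    "model__max_depth": [None, 8],
}
search = GridSearchCV(
    pipeline,
    param_grid,
    cv=KFold(n_splits=5, shuffle=True, random_state=42),
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

# Evaluation on the held-out test split.
y_pred = search.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
mae = float(mean_absolute_error(y_test, y_pred))
r2 = float(r2_score(y_test, y_pred))
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R^2={r2:.3f}")

# Serialization: pickle the whole fitted pipeline so the demo app can load
# preprocessing and model together. pickle.dump(obj, file) would write the
# same bytes to a .pkl file.
blob = pickle.dumps(search.best_estimator_)
restored = pickle.loads(blob)
```

Pickling the full pipeline instead of the bare model keeps the imputer and scaler statistics attached to the estimator, so the web demo can pass raw feature values straight to `restored.predict`.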

About

Stress level prediction system based on daily lifestyle metrics using a 2-tier Stacking Ensemble architecture (XGBoost, Random Forest, SVR, and Ridge Regression). Course project for IT3190 - Introduction to Machine Learning and Data Mining | HUST.
