Multi-Class-Prediction-of-Obesity-Risk

This project extends the models we previously developed for a Kaggle competition, in which we placed within the top 5%. It rebuilds that work with a production-ready pipeline, applying best practices learned in class MGSC-695-076.

Meet the Team

Product Manager - Aasna
Machine Learning Engineer - Arham
ML Ops - Krishan
Data Engineer - Yash
Cloud SME - Nandani
Business Analyst - Mahrukh

Branches:

Main: For Final Product [Owner - Team]
Experiments: For ML Experiments and tracking [Owners - Arham, Krishan]
ArchDevelopment: For CI/CD [Owner - Nandani]
Streamlit: For Front End [Owner - Nandani]
Data Engineering: For Kafka Streaming [Owner - Yash]
Backup: For Backup [Owners - Aasna, Mahrukh]

Project Phases

Phase 1: Planning and Design

During this initial phase, the team establishes the foundation of the project. The Product Manager sets the project's vision and milestones, while the Machine Learning Engineer and Business Analyst research technical feasibility and market requirements, respectively. The Data Engineer and Cloud SME lay the groundwork for data handling and cloud infrastructure, ensuring all systems align with the project’s technical needs. Access planning and design documentation.

Phase 2: Data Preparation and Infrastructure Setup

In this phase, the team focuses on setting up the necessary infrastructure and preparing the data for analysis. The Data Engineer builds data ingestion pipelines, while the Cloud SME ensures the cloud setup is optimized for scalability and security. Review the infrastructure setup.

Phase 3: Feature Engineering and Model Prototyping

Feature engineering and initial model prototyping are conducted. The Machine Learning Engineer explores and selects features that will effectively predict obesity risk, while developing initial models to test their efficacy. Access feature engineering and prototyping details.
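As an illustration of the kind of feature this phase might produce, the minimal sketch below derives body mass index (BMI = weight in kg divided by height in metres squared) from two raw fields. The field names `Height` and `Weight`, their units, and the helper `add_bmi` are assumptions made for this example; they are not taken from the project's notebooks.

```python
from typing import Dict, List


def add_bmi(records: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Return copies of the records with a derived 'BMI' field.

    Assumes (hypothetically) that each record has 'Height' in metres
    and 'Weight' in kilograms.
    """
    enriched = []
    for row in records:
        height_m = row["Height"]
        weight_kg = row["Weight"]
        if height_m <= 0:
            raise ValueError("Height must be positive")
        new_row = dict(row)  # copy so the input is not mutated
        new_row["BMI"] = weight_kg / (height_m ** 2)
        enriched.append(new_row)
    return enriched


# Hypothetical sample rows, for illustration only.
sample = [{"Height": 1.70, "Weight": 65.0}, {"Height": 1.60, "Weight": 90.0}]
for row in add_bmi(sample):
    print(round(row["BMI"], 2))
```

A derived ratio like this is a common candidate feature for tree-based models, although whether it actually helps here would need to be checked in the experiments.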

Phase 4: Model Refinement and Experimentation

This phase is critical for refining the models through extensive experimentation and tuning. The team iterates on models, optimizing their performance through advanced analytical techniques and continuous testing. Explore model refinement experiments.
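The tuning loop in this phase can be pictured with a small, dependency-free sketch. It uses plain random search as a stand-in for the optimizer the project lists (Optuna), and a toy objective in place of a real cross-validated score; the search space, the function names, and the parameter names `learning_rate` and `subsample` are all hypothetical.

```python
import random
from typing import Callable, Dict, Tuple

Params = Dict[str, float]


def random_search(
    objective: Callable[[Params], float],
    space: Dict[str, Tuple[float, float]],
    n_trials: int,
    seed: int = 0,
) -> Tuple[Params, float]:
    """Sample each parameter uniformly from its range and keep the best trial."""
    rng = random.Random(seed)
    best_params: Params = {}
    best_score = float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params: Params) -> float:
    """Stand-in for a validation score; it peaks at 1.0 by construction."""
    return 1.0 - (params["learning_rate"] - 0.1) ** 2 - (params["subsample"] - 0.8) ** 2


# Hypothetical search space, for illustration only.
space = {"learning_rate": (0.01, 0.3), "subsample": (0.5, 1.0)}
best_params, best_score = random_search(toy_objective, space, n_trials=50, seed=42)
print(best_params, best_score)
```

In a real experiment the objective would train a model and return a validation metric, and each trial would typically be logged to the experiment tracker so that runs stay comparable.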

Phase 5: Deployment Preparation and Testing

Deployment preparation involves finalizing the model, setting up continuous integration/continuous deployment (CI/CD) pipelines, and ensuring all systems are robust and secure. The team conducts final stress tests to ensure the infrastructure is ready for a smooth transition to production. Details can be found in the 11-Docker folder.

Phase 6: Model Deployment and Monitoring

The model is deployed to a production environment. This phase includes rigorous monitoring of the model’s performance and quick resolution of any issues. The team focuses on ensuring the model operates efficiently and effectively. Monitor deployment and operations.
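One simple monitoring signal, sketched below under stated assumptions, is how far the distribution of predicted classes in a recent window has moved from a reference window. The sketch uses total variation distance (0 means identical shares, 1 means no overlap). The class labels, the helper names, and any alert threshold are hypothetical and are not part of the project's deployment code.

```python
from collections import Counter
from typing import Dict, Sequence


def class_shares(labels: Sequence[str]) -> Dict[str, float]:
    """Return the fraction of predictions that fall in each class."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def total_variation(reference: Sequence[str], live: Sequence[str]) -> float:
    """Total variation distance between two predicted-class distributions."""
    ref = class_shares(reference)
    cur = class_shares(live)
    classes = set(ref) | set(cur)
    return 0.5 * sum(abs(ref.get(c, 0.0) - cur.get(c, 0.0)) for c in classes)


# Hypothetical prediction windows, for illustration only.
reference_window = ["Normal", "Normal", "Obese", "Obese"]
live_window = ["Normal", "Obese", "Obese", "Obese"]
drift = total_variation(reference_window, live_window)
print(drift)
```

A check like this only flags a shift in what the model predicts; it does not say whether the predictions are still correct, so it would complement rather than replace accuracy tracking on labelled data.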

Technologies Used

Data Analysis/Model Training: Python, Jupyter Notebooks
Experiment Tracking: MLflow
Model Building: PyCaret, LightGBM, XGBoost, CatBoost
Hyperparameter Optimization: Optuna
Containerization: Docker
Real-Time Data Streaming: Kafka
Version Control and CI/CD: Git, GitHub Actions
Cloud Deployment: Azure Machine Learning, Azure Blob Storage
User Interface: Streamlit
Dependency and Environment Management: Poetry

Steps:

Step 1: EDA

Step 1: EDA [Owner to Update Step]

WIP

Business Case

Our solution targets healthcare providers for early identification of at-risk patients, public health officials for data-driven policy making, and insurance companies for premium adjustment based on individual risk. The economic impact includes significant healthcare cost savings and revenue generation from tailored wellness programs.

Acknowledgements

This project is an effort by the team to tackle the global health crisis of obesity by employing advanced data science and machine learning techniques, aiming to make a significant impact in the healthcare sector.

