This project applies machine learning models to predict BMI categories based on individual physical attributes, utilizing MLflow for experiment tracking and model management, with integration into DagsHub for collaborative data science workflows. It showcases the power of MLflow in enhancing model lifecycle management and reproducibility.

Abhi0323/BMI-Prediction-with-MLflow-DagsHub

BMI Prediction with XGBoost, MLflow, and DAGsHub

Overview

This project showcases the development of a BMI category prediction model using the XGBoost algorithm. It demonstrates the use of MLflow for experiment tracking and model management, alongside DAGsHub for collaboration and version control in a machine learning context. Through this README, we'll explore key code features, the role of MLflow and DAGsHub, and provide visual insights into the workflow.

Project Highlights

  • XGBoost for Classification: Uses an XGBoost classifier to predict BMI categories from height, weight, and gender data.

  • MLflow Experiment Tracking: Leverages MLflow to track experiments, log model parameters, and store model metrics, enhancing reproducibility and insight into model performance.

  • Hyperparameter Optimization: Demonstrates how to perform hyperparameter tuning to find the optimal model configuration.

  • DAGsHub for Collaboration: Uses DAGsHub to share the project, allowing others to view, contribute to, and replicate experiments and model training processes.

How MLflow Powers the Project

MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. In this project, MLflow is instrumental for several reasons:

  • Tracking Experiments: Every model run, with its parameters and metrics, is logged for comparison and analysis. This makes understanding model improvements over time straightforward.
  • Model Versioning: MLflow's Model Registry versions the model, which makes it easier to promote a model from development to staging and production.

Collaborating with DAGsHub

DAGsHub complements MLflow by hosting the ML project, including its code, data, and an MLflow tracking server. Key features used include:

  • Version Control for Data Science: Beyond just code, DAGsHub allows for the versioning of datasets and ML models, ensuring that every aspect of the project is reproducible.

  • Experiment Sharing: The integration with MLflow means experiments are easily shared and viewed on DAGsHub, fostering collaboration among data scientists.
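
One way to point MLflow at a DAGsHub-hosted tracking server is through environment variables, sketched below. The helper name `dagshub_tracking_uri` is hypothetical, and the `.mlflow` URL pattern is an assumption based on how DAGsHub typically exposes tracking servers; confirm the exact URI on the repository's DAGsHub page before relying on it.

```python
import os


def dagshub_tracking_uri(owner: str, repo: str) -> str:
    """Build the MLflow tracking URI for a DAGsHub repository (assumed URL pattern)."""
    return f"https://dagshub.com/{owner}/{repo}.mlflow"


uri = dagshub_tracking_uri("Abhi0323", "BMI-Prediction-with-MLflow-DagsHub")

# MLflow picks up the tracking URI from this environment variable.
os.environ["MLFLOW_TRACKING_URI"] = uri

# Writing to the server also needs credentials, which MLflow reads from
# MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD. Set those outside
# the code (for example in your shell) rather than hard-coding a token here.
print(uri)
```

Keeping the URI in the environment means the same training script can log locally or to the shared server without code changes.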

The Core of the Project

At the heart of the project is the predictive model built with XGBoost. Here's a brief overview of the model training process:

  • Data Preprocessing: Includes encoding categorical variables and splitting the dataset.

  • Model Training: Involves configuring the XGBoost classifier with hyperparameters like learning rate and max depth, fitting the model to the training data, and evaluating its performance on the test set.

  • Logging with MLflow: Each training run's details, including parameters and metrics, are logged using MLflow for easy tracking and comparison.

Getting Involved

Interested in contributing or experimenting with the project? Here’s how you can get involved:

  • Explore the Project on DAGsHub: Visit the project's DAGsHub page to view the code, datasets, and ML experiments.

  • Link: https://dagshub.com/Abhi0323/BMI-Prediction-with-MLflow-DagsHub

  • Run the Experiments: Clone the project and follow the setup instructions to run your own experiments. Your findings and improvements can help evolve the project further.

Conclusion

This BMI prediction project exemplifies the synergy between machine learning, experiment tracking with MLflow, and collaborative version control with DAGsHub. It demonstrates not just the technical steps required to train and manage a machine learning model but also highlights the importance of reproducibility, collaboration, and open science in the field of data science.
