Early stage diabetes risk prediction using several supervised machine learning algorithms.

AIAnytime/Early-Stage-Diabetes-Risk-Prediction

Early Stage Diabetes Risk Prediction

An end-to-end machine learning pipeline for early-stage diabetes risk prediction, using several supervised learning algorithms: Decision Tree, Logistic Regression, and KNN.

The notebook covers all the steps of a data science life cycle, which looks like this:

[Image: tdsp-lifecycle2]

Data 👇

The data is available from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Early+stage+diabetes+risk+prediction+dataset.

Steps performed in the Notebook: 👇

  • Exploratory Data Analysis: Basic data analysis and visualization, with preprocessing.
  • Feature Engineering: Finding the best features using several techniques such as SelectKBest, Recursive Feature Elimination, and a ranking mechanism. I also removed outliers.
  • Machine Learning Models: I used Logistic Regression, a Decision Tree Classifier, and KNN, and compared them using several evaluation techniques to find the best model.
  • Evaluation: Evaluation of the models using a confusion matrix, classification report, etc.
  • Interpretation: Used LIME (amazing for interpreting models), ELI5, etc.
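The feature selection and model comparison steps can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn is installed, uses synthetic data from `make_classification` in place of the UCI dataset, and the choice of `f_classif`, `k_features=10`, and 5-fold cross-validation are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the UCI dataset), used only so the sketch runs.
X, y = make_classification(
    n_samples=500, n_features=16, n_informative=6, random_state=42
)

# The three model families named in this README; hyperparameters are assumptions.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}


def compare_models(X, y, k_features=10, folds=5):
    """Return the mean cross-validated accuracy for each model."""
    scores = {}
    for name, model in models.items():
        # Scaling and feature selection live inside the pipeline so they are
        # re-fitted on each training fold and never see the held-out fold.
        pipeline = make_pipeline(
            StandardScaler(),
            SelectKBest(score_func=f_classif, k=k_features),
            model,
        )
        fold_scores = cross_val_score(pipeline, X, y, cv=folds, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


scores = compare_models(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

The numbers printed here come from synthetic data and say nothing about the real dataset; swap in your own feature matrix and labels to reproduce the comparison.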

Open the notebook in Jupyter Notebook, JupyterLab, or Google Colab to execute all the steps and re-train the models for your own purposes.

Evaluation and Interpretation 👇

All the models performed well, but the Decision Tree achieved the best accuracy and showed promising results in the evaluation steps as well. The Decision Tree Classifier reached 94% accuracy, ahead of Logistic Regression (90%) and KNN (88%). However, the Decision Tree seems to overfit, and "Age" has a strong correlation with the target. I therefore chose Logistic Regression over the Decision Tree for this task.
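One common way to check for overfitting is to compare accuracy on the training split with accuracy on a held-out split. The sketch below is an assumption about how such a check could look, not the notebook's own code; it uses synthetic data with some label noise, and the 75/25 split and `max_depth=3` are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with 10% label noise (NOT the UCI dataset).
X, y = make_classification(
    n_samples=500, n_features=16, n_informative=6, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def train_test_gap(model):
    """Fit on the training split and return (train_acc, test_acc, gap)."""
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    return train_acc, test_acc, train_acc - test_acc


candidates = {
    "Decision Tree (unrestricted)": DecisionTreeClassifier(random_state=0),
    "Decision Tree (max_depth=3)": DecisionTreeClassifier(max_depth=3, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

report = {name: train_test_gap(model) for name, model in candidates.items()}
for name, (train_acc, test_acc, gap) in report.items():
    print(f"{name}: train={train_acc:.3f} test={test_acc:.3f} gap={gap:.3f}")
```

An unrestricted tree memorizes the training split, so a large gap between its training and test accuracy is a sign of overfitting. Limiting the depth or preferring a simpler model such as Logistic Regression are two typical responses.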

[Image: decision-tree-plot]

See the confusion matrix and classification report below:

[Images: confusion matrix and classification report]
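For reference, the headline numbers in a classification report can be derived directly from the four cells of a binary confusion matrix. The sketch below uses only the Python standard library; the counts are made up for illustration and are not the notebook's results.

```python
def classification_metrics(tn, fp, fn, tp):
    """Compute accuracy, precision, recall, and F1 from confusion matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, chosen only to demonstrate the arithmetic.
metrics = classification_metrics(tn=40, fp=5, fn=3, tp=52)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a screening task like this one, recall on the positive class is worth watching alongside accuracy, since a false negative means a missed at-risk patient.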

Deployment 👇

I built an awesome application. Find it here: http://predict-smart.herokuapp.com

You can use the saved models, which are stored as pickle files, to build any kind of application, e.g. web apps, serverless REST APIs, mobile apps, etc. Connect with me for more on this.
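Saving and loading a pickled model could look roughly like the sketch below. The helper names `save_model` and `load_model`, the file name `model.pkl`, and the dictionary standing in for a fitted estimator are all hypothetical; the actual file names in this repository may differ.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize any picklable object (e.g. a fitted estimator) to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a pickled object. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in object used so the example runs without any trained model.
stand_in_model = {
    "name": "logistic_regression",
    "note": "stand-in for a fitted estimator",
}

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"  # hypothetical file name
    save_model(stand_in_model, model_path)
    restored = load_model(model_path)

print(restored["name"])
```

A fitted scikit-learn estimator can be passed to the same helpers, after which `restored.predict(...)` can be called from your application. Unpickling can execute arbitrary code, so load only pickle files from sources you trust, and use the same library versions that were used to save the model where possible.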

Find me here 👇

If you have any feedback, please connect:

Thanks 😇
