This repository contains code for building a K-Nearest Neighbors (KNN) model to predict diabetes based on patient data. Includes data cleaning, hyperparameter tuning, and evaluation metrics.

amangupta143/Diabetes-Prediction-KNN

Diabetes Prediction KNN Model and Web App

This repository provides a K-Nearest Neighbors (KNN) model for diabetes prediction from patient data, saved as trained_model.sav, along with a Streamlit web application, Diabetes_Prediction_Web_App.py, for interactive risk assessment.

Table of Contents

  • Objective
  • Data
  • Features
  • Model
  • Benefits
  • Getting Started (Jupyter Notebook)
  • Using the Web App
  • Contributing
  • Licence

Objective

This repository aims to:

  • Develop a KNN model to predict diabetes based on patient data.
  • Provide a clear and well-structured approach to data exploration, cleaning, preprocessing, model selection, and evaluation using a Jupyter Notebook.
  • Serve as a learning resource for anyone interested in KNN for diabetes prediction and machine learning for healthcare applications.

Data

The project uses publicly available patient data, accessible on Kaggle: Pima Indians Diabetes Database.

The dataset consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on.

Features

  • Pregnancies : Number of times the patient has been pregnant
  • Glucose : Plasma glucose concentration at 2 hours in an oral glucose tolerance test
  • BloodPressure : Diastolic blood pressure (mm Hg)
  • SkinThickness : Triceps skin fold thickness (mm)
  • Insulin : 2-hour serum insulin (mu U/ml)
  • BMI : Body Mass Index (weight in kg / (height in m)^2)
  • Age : Age (years)
  • DiabetesPedigreeFunction : Score for the likelihood of diabetes based on family history
  • Outcome : 0 (doesn't have diabetes) or 1 (has diabetes)
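In the Pima Indians Diabetes Database, a value of 0 in Glucose, BloodPressure, SkinThickness, Insulin, or BMI stands for a missing measurement rather than a real reading. The sketch below shows one common way to handle this: treat those zeros as missing and fill them with the column median. It is only an illustration; the helper name clean_zeros and the sample rows are assumptions, and the notebook may clean the data differently.

```python
import numpy as np
import pandas as pd

# Columns where 0 is physiologically implausible and marks a missing value.
ZERO_AS_MISSING = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]


def clean_zeros(df):
    """Return a copy with implausible zeros replaced by the column median.

    Hypothetical helper for illustration; not taken from the notebook.
    """
    cleaned = df.copy()
    # Mark zeros as missing so they do not distort the median.
    cleaned[ZERO_AS_MISSING] = cleaned[ZERO_AS_MISSING].replace(0, np.nan)
    # Fill each column with the median of its observed values.
    medians = cleaned[ZERO_AS_MISSING].median()
    cleaned[ZERO_AS_MISSING] = cleaned[ZERO_AS_MISSING].fillna(medians)
    return cleaned


# Made-up rows that only mimic the dataset's layout.
sample = pd.DataFrame({
    "Pregnancies": [6, 0, 8],
    "Glucose": [148.0, 0.0, 183.0],
    "BloodPressure": [72.0, 66.0, 0.0],
    "SkinThickness": [35.0, 29.0, 0.0],
    "Insulin": [0.0, 94.0, 168.0],
    "BMI": [33.6, 0.0, 23.3],
    "Outcome": [1, 0, 1],
})

cleaned_sample = clean_zeros(sample)
print(cleaned_sample)
```

Pregnancies and Outcome are left untouched, because 0 is a valid value in both columns.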

Model

This repository focuses on building and evaluating a KNN model. KNN classifies data points based on their similarity (distance) to labeled data points in the training set. The model considers the "k" nearest neighbors of a new data point (patient) and predicts the class (diabetic or non-diabetic) based on the majority vote of those neighbors.
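The majority-vote idea can be sketched in a few lines of plain Python. This is a minimal illustration of the technique, not the repository's implementation (the install step suggests the notebook relies on scikit-learn). The function name knn_predict, the two-feature layout (glucose, BMI), and the toy values are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbors."""
    # Euclidean distance from the query to every labeled training point.
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy training set: (glucose, BMI) pairs with made-up values.
train_X = [
    (85.0, 22.0), (90.0, 24.5), (95.0, 26.0),     # labeled non-diabetic (0)
    (160.0, 35.0), (170.0, 38.5), (180.0, 33.0),  # labeled diabetic (1)
]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (165.0, 36.0), k=3))  # → 1
print(knn_predict(train_X, train_y, (88.0, 23.0), k=3))   # → 0
```

Because the vote depends on raw distances, features with large numeric ranges dominate the result, which is why features are usually scaled before fitting a KNN model.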

Benefits

This diabetes prediction model offers several advantages:

  • Early Detection: Identify people at high risk for diabetes, enabling earlier intervention.
  • Improved Management: Help healthcare professionals tailor treatment plans based on risk.
  • Reduced Costs: Early detection can potentially lower healthcare costs associated with diabetes.
  • Increased Awareness: Raise awareness about diabetes risk factors and encourage healthier habits.

This model is not meant to be used as the sole basis for a diagnosis, but it can be a valuable tool for risk assessment.

Getting Started (Jupyter Notebook)

  1. Clone this repository:
    git clone https://github.com/amangupta143/Diabetes-Prediction-KNN.git
  2. Install required dependencies:
    pip install pandas numpy matplotlib seaborn scikit-learn
  3. Open the analysis notebook:
    jupyter notebook Diabetes-Prediction-Model.ipynb
    

Running the Notebook

  • Open a terminal or command prompt and navigate to the directory containing the notebook and the "diabetes.csv" file.
  • Start Jupyter Notebook: jupyter notebook
  • In the Jupyter Notebook interface, open the Diabetes-Prediction-Model.ipynb file.
  • Run each code cell (block of code) by pressing Shift + Enter. The output of the code will be displayed below the cell.

Using the Web App

This project also includes a web application built with Streamlit. Here's how to use it:

  1. Ensure you have Python and Streamlit installed (Streamlit can be installed with pip install streamlit).
  2. Open a terminal or command prompt and navigate to the project directory.
  3. Run the web application:
    streamlit run Diabetes_Prediction_Web_App.py
  4. A web interface will open in your default browser, allowing you to enter patient data and receive a diabetes risk prediction.
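For readers curious about what happens between entering the values and seeing a prediction, here is a rough sketch of the prediction step. It rests on several assumptions that the README does not confirm: that trained_model.sav is a pickle file, that the model exposes a scikit-learn style predict method, and that features are passed in the column order of the Kaggle CSV. The names load_model, predict_outcome, and FEATURE_ORDER are hypothetical and may not match Diabetes_Prediction_Web_App.py.

```python
import pickle

# Assumed feature order (the column order of the Kaggle CSV).
FEATURE_ORDER = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


def load_model(path="trained_model.sav"):
    """Load the saved model, assuming it was written with pickle."""
    # Only unpickle files you trust: pickle can execute arbitrary code.
    with open(path, "rb") as model_file:
        return pickle.load(model_file)


def predict_outcome(model, patient):
    """Turn a dict of patient values into a readable prediction."""
    # A single-row, two-dimensional input, as scikit-learn style models expect.
    row = [[float(patient[name]) for name in FEATURE_ORDER]]
    label = int(model.predict(row)[0])
    return "diabetic" if label == 1 else "non-diabetic"
```

A call would look like predict_outcome(load_model(), patient), where patient is a dict with one entry per name in FEATURE_ORDER.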

Contributing

I welcome contributions to this repository! If you have ideas for improvement, bug fixes, or want to explore different aspects of the model, feel free to create a pull request.

Licence

This project is licensed under the MIT License.

Happy coding! 🚀
