Diabetes Prediction using KNN Classification

Project Overview

This project uses the K-Nearest Neighbors algorithm to predict whether a person is likely to have diabetes based on health-related features such as glucose level, BMI, blood pressure, insulin, age, and other medical measurements.

The main goal of this project is to understand how KNN classification works and how feature scaling affects distance-based machine learning models.
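
To make the scaling point concrete, here is a minimal sketch using only the standard library. The three people and their two feature values are made up for illustration, and `standardize` is a hypothetical helper that mimics what StandardScaler does (zero mean, unit variance per column). It shows the nearest neighbor of the same query point changing once the features are put on a common scale.

```python
import math
from statistics import mean, pstdev

# Made-up (Glucose, DiabetesPedigreeFunction) values, for illustration only.
people = {
    "query": (150.0, 0.20),
    "a": (152.0, 1.20),  # similar glucose, very different pedigree value
    "b": (145.0, 0.25),  # less similar glucose, similar pedigree value
}


def euclidean(p, q):
    """Plain Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def standardize(rows):
    """Rescale every column to zero mean and unit variance (z-scores)."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return [tuple((x - m) / s for x, m, s in zip(row, means, stds)) for row in rows]


names = list(people)
scaled = dict(zip(names, standardize([people[n] for n in names])))

# Nearest neighbor of the query among "a" and "b", before and after scaling.
raw_nearest = min(["a", "b"], key=lambda n: euclidean(people["query"], people[n]))
scaled_nearest = min(["a", "b"], key=lambda n: euclidean(scaled["query"], scaled[n]))

print("nearest on raw values:   ", raw_nearest)
print("nearest on scaled values:", scaled_nearest)
```

On the raw values the glucose column dominates the distance because its numbers are much larger, so the small pedigree column barely counts. After standardizing, both columns contribute on the same footing.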

Dataset

The dataset used in this project is the Pima Indians Diabetes Dataset from Kaggle.

Target column: Outcome
- 0 = Not Diabetic
- 1 = Diabetic

Features

The dataset contains eight input features:

- Pregnancies
- Glucose
- BloodPressure
- SkinThickness
- Insulin
- BMI
- DiabetesPedigreeFunction
- Age

The ninth column, Outcome, is the target and is not used as an input feature.
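
A short sketch of how the columns might be separated with Pandas. The two rows are made-up values standing in for the real file, and the variable names (`FEATURE_COLUMNS`, `TARGET_COLUMN`, `X`, `y`) are assumptions, not necessarily what the notebook uses.

```python
import pandas as pd

FEATURE_COLUMNS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]
TARGET_COLUMN = "Outcome"

# Two made-up rows; in the project this frame would come from
# pd.read_csv on the dataset file instead.
df = pd.DataFrame(
    [
        [2, 120, 70, 25, 80, 28.5, 0.35, 33, 0],
        [5, 165, 82, 30, 130, 34.1, 0.60, 51, 1],
    ],
    columns=FEATURE_COLUMNS + [TARGET_COLUMN],
)

X = df[FEATURE_COLUMNS]  # inputs only, Outcome excluded
y = df[TARGET_COLUMN]    # 0 = Not Diabetic, 1 = Diabetic

print(X.shape, y.tolist())
```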

Technologies Used

- Python
- Pandas
- Scikit-learn
- StandardScaler
- K-Nearest Neighbors
- Streamlit
- Joblib

Project Workflow

- Loaded the dataset
- Separated input features and target column
- Split the data into training and testing sets
- Applied feature scaling using StandardScaler
- Built a KNN classification model
- Tested different K values
- Selected the best K value
- Evaluated the final model using accuracy, confusion matrix, and classification report
- Built a simple Streamlit UI for prediction
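
The modelling steps above can be sketched roughly as follows. This is not the notebook's exact code: synthetic data from `make_classification` stands in for the real dataset, and the 80/20 split, the odd K values from 1 to 21, the 5-fold cross-validation used to pick K, and the use of a scikit-learn pipeline (instead of a separately fitted scaler) are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the dataset: 8 numeric features, binary target.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, random_state=42
)

# Hold out a test set; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Try several K values, scoring each with cross-validation on the training
# data only. The pipeline refits the scaler inside every fold, so no
# information from validation rows leaks into the scaling.
cv_scores = {}
for k in range(1, 22, 2):
    candidate = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    cv_scores[k] = cross_val_score(candidate, X_train, y_train, cv=5).mean()

best_k = max(cv_scores, key=cv_scores.get)

# Refit on the full training set with the chosen K, then score once on test.
final_model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=best_k))
final_model.fit(X_train, y_train)
test_accuracy = final_model.score(X_test, y_test)

print(f"best K: {best_k}, test accuracy: {test_accuracy:.3f}")
```

Choosing K on the training data and touching the test set only once keeps the final accuracy an honest estimate. Picking K by test accuracy directly is simpler but tends to give an optimistic number.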

Model Performance

The KNN model was evaluated using:

- Accuracy Score
- Confusion Matrix
- Precision
- Recall
- F1-score
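
As an illustration of how these metrics relate to each other, the sketch below computes them with scikit-learn on ten hand-written labels. The labels are invented for the example and are not results from this project.

```python
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Invented labels: 1 = Diabetic, 0 = Not Diabetic.
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# Rows are true classes, columns are predicted classes:
# [[true negatives, false positives],
#  [false negatives, true positives]]
cm = confusion_matrix(y_true, y_pred)

accuracy = accuracy_score(y_true, y_pred)    # correct predictions / all predictions
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(cm)
print(accuracy, precision, recall, f1)
print(classification_report(y_true, y_pred, target_names=["Not Diabetic", "Diabetic"]))
```

For a screening task like this one, recall on the Diabetic class is often worth watching alongside accuracy, since a false negative means a missed case.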

The model achieved good performance for a beginner-level classification project after testing different K values.

Streamlit App

A simple Streamlit web app was created where users can enter health-related values and get a prediction.

To run the app:

streamlit run app.py
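
At prediction time the app needs the same fitted scaler and model that were produced during training. The sketch below shows one way that hand-off could work with Joblib. It is an assumption about the approach, not a copy of `app.py`: the training rows are random numbers, the files are written to a temporary directory, and `predict_outcome` and `FEATURE_ORDER` are hypothetical names. The Streamlit input widgets are left out so the example runs without a browser.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

FEATURE_ORDER = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]

# Random stand-in training data (not realistic health values).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, len(FEATURE_ORDER)))
y_train = (X_train[:, 1] > 0).astype(int)

# Training side: fit the scaler, fit KNN on the scaled rows, save both.
scaler = StandardScaler().fit(X_train)
model = KNeighborsClassifier(n_neighbors=5).fit(scaler.transform(X_train), y_train)

with tempfile.TemporaryDirectory() as tmp:
    scaler_path = Path(tmp) / "scaler.pkl"
    model_path = Path(tmp) / "knn_diabetes_model.pkl"
    joblib.dump(scaler, scaler_path)
    joblib.dump(model, model_path)

    # App side: load the saved objects back.
    loaded_scaler = joblib.load(scaler_path)
    loaded_model = joblib.load(model_path)


def predict_outcome(values, scaler, model):
    """Scale one person's values with the saved scaler, then classify.

    `values` maps each feature name to a number, e.g. collected from form inputs.
    """
    row = np.array([[values[name] for name in FEATURE_ORDER]], dtype=float)
    return int(model.predict(scaler.transform(row))[0])


example = dict(zip(FEATURE_ORDER, X_train[0]))
prediction = predict_outcome(example, loaded_scaler, loaded_model)
print("Diabetic" if prediction == 1 else "Not Diabetic")
```

The important detail is that user input goes through the saved scaler before it reaches the model. Feeding raw values to a KNN model trained on scaled data would distort every distance it computes.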
