✒️ About Project

✍️ Diabetes Prediction Project

🗒️Project Overview

This project tackles a binary classification task: predicting diabetes using the Pima Indians Diabetes Dataset. The goal is to accurately identify individuals who are likely to have diabetes, aiding early diagnosis and preventive healthcare.

Hugging Face Link: Click

🗒️Dataset

Source: Pima Indians Diabetes Dataset
Features: Numeric and categorical features including:
- Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, Age
Target: Outcome (0 = non-diabetic, 1 = diabetic)
Total samples: 768
Imbalance: Fewer positive cases (268) than negative (500)
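
A quick way to confirm the class imbalance is to count the values of the target column. The sketch below is illustrative only: `class_balance` is a hypothetical helper, the demo frame is made up, and the CSV file name in the comment is an assumption about where the data lives.

```python
import pandas as pd


def class_balance(df: pd.DataFrame, target: str = "Outcome") -> pd.DataFrame:
    """Return per-class counts and shares for the target column."""
    counts = df[target].value_counts().sort_index()
    return pd.DataFrame({"count": counts, "share": counts / counts.sum()})


# Tiny made-up frame, not the real data. For the real dataset, load the CSV
# first, e.g. df = pd.read_csv("diabetes_pima_indians.csv") (assumed file name).
demo = pd.DataFrame({"Outcome": [0, 0, 0, 1, 0, 1]})
balance = class_balance(demo)
print(balance)
```

On the real data, the "share" column makes the imbalance between Outcome=0 and Outcome=1 visible at a glance.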

🗒️Data Preprocessing

Missing and zero values were imputed with the median.
Outliers in numeric features were clipped to the 1%–99% quantile range.
Numeric features were scaled using StandardScaler.
Categorical features were encoded using OneHotEncoder.
Pipelines were used to integrate preprocessing and ensure reproducibility.
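
The numeric steps above could be wired together roughly as follows. This is a sketch under assumptions, not the project's actual code: `QuantileClipper` and `zeros_to_nan` are hypothetical helpers, the choice of which columns treat zero as missing is a guess, and the demo rows are made up. Categorical encoding is left out because the listed columns are all numeric.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler


def zeros_to_nan(X):
    """Hypothetical helper: mark zeros as missing so the imputer can fill them."""
    X = np.asarray(X, dtype=float).copy()
    X[X == 0] = np.nan
    return X


class QuantileClipper(TransformerMixin, BaseEstimator):
    """Hypothetical helper: clip each column to quantiles learned during fit."""

    def __init__(self, lower=0.01, upper=0.99):
        self.lower = lower
        self.upper = upper

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        self.low_ = np.quantile(X, self.lower, axis=0)
        self.high_ = np.quantile(X, self.upper, axis=0)
        return self

    def transform(self, X):
        return np.clip(np.asarray(X, dtype=float), self.low_, self.high_)


def numeric_steps(zero_is_missing):
    """Build impute -> clip -> scale, optionally converting zeros to NaN first."""
    steps = []
    if zero_is_missing:
        steps.append(("zeros_to_nan", FunctionTransformer(zeros_to_nan)))
    steps += [
        ("impute", SimpleImputer(strategy="median")),
        ("clip", QuantileClipper(0.01, 0.99)),
        ("scale", StandardScaler()),
    ]
    return Pipeline(steps)


# Assumption: zero is implausible for these columns but valid for the others.
ZERO_AS_MISSING = ["Glucose", "BMI"]
OTHER_NUMERIC = ["Pregnancies", "Age"]

preprocessor = ColumnTransformer([
    ("zero_cols", numeric_steps(True), ZERO_AS_MISSING),
    ("other", numeric_steps(False), OTHER_NUMERIC),
])

# Made-up rows, only to show the transformer running end to end.
demo = pd.DataFrame({
    "Pregnancies": [0, 2, 5, 1],
    "Glucose": [148, 0, 183, 89],
    "BMI": [33.6, 26.6, 0.0, 28.1],
    "Age": [50, 31, 32, 21],
})
transformed = preprocessor.fit_transform(demo)
print(transformed.shape)
```

Learning the clipping bounds inside `fit` keeps the quantiles tied to the training data, so the same bounds are reused at prediction time.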

🗒️Model Selection

Primary model: Logistic Regression
Alternative models evaluated: SVC, KNN
Evaluation metric: Recall for the positive class (Outcome=1) was prioritized, because missing a diabetic case (a false negative) is the costlier error in a medical setting.
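
A comparison of the three candidates on positive-class recall might look like the sketch below. It uses synthetic stand-in data from `make_classification` rather than the real dataset, and default hyperparameters, both of which are assumptions; the numbers it prints say nothing about the project's actual results.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in with a roughly 65/35 class split (not the real data).
X, y = make_classification(
    n_samples=768, n_features=8, weights=[0.65, 0.35], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Candidate models with default settings (assumed, not taken from the project).
candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "svc": SVC(),
    "knn": KNeighborsClassifier(),
}

recalls = {}
for name, model in candidates.items():
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(X_train, y_train)
    # pos_label=1 restricts recall to the diabetic class.
    recalls[name] = recall_score(y_test, pipe.predict(X_test), pos_label=1)

print(recalls)
```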

🗒️Model Training and Evaluation

The pipeline is built around a ColumnTransformer that holds the preprocessing steps.
Models were trained on a stratified train-test split, which preserves the class ratio in both subsets.
5-fold cross-validation was applied to assess model robustness.
Metrics recorded: Accuracy, Precision, Recall, F1-score (positive class).
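
The split and cross-validation steps could be sketched as below. The data is synthetic, and the 80/20 split ratio, random seeds, and model settings are assumptions chosen only to make the example run.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (not the real dataset).
X, y = make_classification(
    n_samples=768, n_features=8, weights=[0.65, 0.35], random_state=42
)

# stratify=y keeps the positive rate nearly identical in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
train_rate, test_rate = y_train.mean(), y_test.mean()

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
metrics = ["accuracy", "precision", "recall", "f1"]

# For binary targets these scorers report the positive class (label 1).
scores = cross_validate(model, X_train, y_train, cv=cv, scoring=metrics)
summary = {m: scores[f"test_{m}"].mean() for m in metrics}

print(train_rate, test_rate)
print(summary)
```

Running cross-validation on the training portion only leaves the test split untouched for a final check.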

🗒️Key Findings

Logistic Regression with class_weight="balanced" achieved the highest recall for diabetic cases.
KNN showed a higher F1 but lower recall.
Weighted metrics can be misleading due to dataset imbalance; positive-class recall is the primary metric.
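
Two of the points above can be illustrated with small numbers. The sketch first computes the weights that class_weight="balanced" implies for the class counts quoted in the Dataset section, then shows how a weighted recall can look healthy while positive-class recall is poor. The predictions are made up for illustration.

```python
import numpy as np
from sklearn.metrics import recall_score
from sklearn.utils.class_weight import compute_class_weight

# Class counts from the Dataset section: 500 negative, 268 positive.
y_all = np.array([0] * 500 + [1] * 268)

# "balanced" uses n_samples / (n_classes * class_count),
# so the rarer positive class receives the larger weight.
weights = compute_class_weight(
    class_weight="balanced", classes=np.array([0, 1]), y=y_all
)

# Made-up predictions: every negative is right, only 1 of 3 positives is caught.
y_true = [0] * 7 + [1] * 3
y_pred = [0] * 7 + [1, 0, 0]

weighted_recall = recall_score(y_true, y_pred, average="weighted")
positive_recall = recall_score(y_true, y_pred, pos_label=1)

print(weights)
print(weighted_recall, positive_recall)
```

Here the weighted recall is 0.8 while recall on the positive class is only 1/3, which is why the positive-class figure is the one to watch.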

🗒️Deployment

The final pipeline is saved as a .pkl file.
The front-end can pass input as a dictionary or a DataFrame.
Column order is flexible, but column names must match those the pipeline expects.
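
A minimal sketch of the save, load, and predict flow is shown below. It does not use the project's real pipeline: a stand-in model is fitted on two made-up rows so the example runs on its own, `to_frame` is a hypothetical helper, and the file name `demo_pipeline.pkl` is a placeholder. Reordering columns by name before predicting is one way to get the column-order flexibility described above.

```python
import tempfile
from pathlib import Path

import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


def to_frame(payload):
    """Accept a dict or DataFrame; return a frame with columns in training order."""
    frame = payload if isinstance(payload, pd.DataFrame) else pd.DataFrame([payload])
    missing = [c for c in FEATURES if c not in frame.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return frame[FEATURES]


# Stand-in pipeline fitted on two made-up rows, only to keep the sketch runnable.
train = pd.DataFrame(
    [
        [1, 85, 66, 29, 94, 26.6, 0.351, 31],
        [8, 183, 64, 23, 168, 33.3, 0.672, 52],
    ],
    columns=FEATURES,
)
pipe = make_pipeline(StandardScaler(), LogisticRegression()).fit(train, [0, 1])

# Round-trip through a .pkl file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "demo_pipeline.pkl")
    joblib.dump(pipe, path)
    loaded = joblib.load(path)

# Keys arrive in a different order than FEATURES; to_frame reorders them by name.
payload = {
    "Age": 52, "BMI": 33.3, "Glucose": 183, "Pregnancies": 8,
    "Insulin": 168, "SkinThickness": 23, "BloodPressure": 64,
    "DiabetesPedigreeFunction": 0.672,
}
prediction = int(loaded.predict(to_frame(payload))[0])
print(prediction)
```

Only load .pkl files from sources you trust, since unpickling can execute arbitrary code.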

🗒️Technologies & Libraries

Python 3.x
Pandas, NumPy
scikit-learn
Matplotlib / Seaborn (EDA & visualization)
Joblib / Pickle (pipeline serialization)

✍️Project Setup & Run Instructions

This guide explains how to set up the Python environment and run the project.

1. Create Python Environment

Create a new virtual environment for this project:

python -m venv aiml_env

2. Activate Environment

Activate the newly created environment (Windows):

aiml_env\Scripts\activate

On macOS or Linux, use:

source aiml_env/bin/activate

3. Install Dependencies

Install all required packages from requirements.txt:

pip install -r requirements.txt

4. Run the application

Start the project by running:

python app.py

🗒️References

Pima Indians Diabetes Dataset – Kaggle
scikit-learn documentation: Pipeline & ColumnTransformer

If you run into any issues, feel free to report them:
Name: MD Rohan Mulla
🎓University: Rabindra Maitree University
📨E-mail: mdrohanislam444@gmail.com
Facebook: https://www.facebook.com/MullaRohan
