sanidhyajadaun/Predictive-System-on-Pima-Indians-Diabetes-Data

Predictive System on Pima Indians Diabetes Data

This project focuses on developing a predictive system for analyzing the Pima Indians Diabetes dataset and predicting the likelihood of individuals having diabetes. The goal is to leverage machine learning techniques to assist in early detection and proactive management of diabetes.

Project Overview

This project builds a machine learning system that predicts diabetes outcomes from the Pima Indians Diabetes dataset. A Support Vector Machine (SVM) is the primary classification algorithm.

The project covers data analysis, visualization, feature preprocessing, and model training. The trained model is then used to make predictions on new, unseen data, and its performance is evaluated with an accuracy score and a confusion matrix.
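The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the data is synthetic (random numbers standing in for the eight features), and the linear kernel, the 80/20 split, and the `random_state` values are assumptions chosen only to make the example reproducible.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real data: 200 rows, 8 numeric features,
# and a binary label. Swap in the actual dataset when running for real.
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Hold out 20% of the rows, keeping the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=2
)

# Standardize features before the SVM, since SVMs are sensitive to scale.
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
model.fit(X_train, y_train)

# Evaluate on the held-out rows.
y_pred = model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

# Predict for a single new, unseen record (values are made up).
new_record = rng.normal(size=(1, 8))
new_prediction = int(model.predict(new_record)[0])

print(f"test accuracy: {test_accuracy:.3f}")
print(cm)
print("prediction for new record:", new_prediction)
```

Wrapping the scaler and the classifier in one pipeline means the scaling statistics are learned from the training split only and reapplied automatically at prediction time.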

Installation

To run the project locally, follow these steps:

  1. Clone the repository: git clone https://github.com/sanidhyajadaun/Predictive-System-on-Pima-Indians-Diabetes-Data
  2. Install the required dependencies: pip install -r requirements.txt
  3. Run the project: python main.py (or open main.ipynb in Jupyter if you are working from the notebook)

Usage

Once the project is set up, you can explore and analyze the Pima Indians Diabetes dataset in Jupyter Notebook or any Python IDE. The main code file (main.ipynb) contains the full pipeline: data preprocessing, model training, and evaluation. Feel free to modify and experiment with the code as needed.

Dataset

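The Pima Indians Diabetes dataset used in this project contains health-related attributes for individuals, such as glucose level, blood pressure, BMI, and age, together with a binary label indicating whether the person has diabetes. The dataset is available at link.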
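One cleaning step that often comes up with this dataset is that some measurements are recorded as 0 where a real value is physiologically implausible (for example glucose or BMI), so the zeros act as missing-value placeholders. The sketch below shows one possible way to handle that with Pandas. The column names follow the commonly distributed CSV and are an assumption about this repository's file; the rows are invented for illustration, and median imputation is just one option, not necessarily what the notebook does.

```python
import numpy as np
import pandas as pd

# Tiny hand-made sample. Column names are assumed to match the common
# CSV release of the dataset; the values are illustrative only.
df = pd.DataFrame({
    "Glucose": [148, 85, 0, 89],
    "BloodPressure": [72, 66, 64, 0],
    "BMI": [33.6, 26.6, 23.3, 28.1],
    "Outcome": [1, 0, 1, 0],
})

# Columns where a zero cannot be a real measurement.
zero_as_missing = ["Glucose", "BloodPressure", "BMI"]

cleaned = df.copy()
# Mark the placeholder zeros as missing.
cleaned[zero_as_missing] = cleaned[zero_as_missing].replace(0, np.nan)
missing_per_column = cleaned[zero_as_missing].isna().sum()

# Fill each missing value with the median of its own column.
cleaned[zero_as_missing] = cleaned[zero_as_missing].fillna(
    cleaned[zero_as_missing].median()
)

print(missing_per_column)
print(cleaned)
```

In a real train/test workflow the medians would normally be computed on the training split only, so that no information from the test rows leaks into preprocessing.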

Features

  • Data preprocessing and cleaning
  • Exploratory data analysis (EDA)
  • Feature engineering and selection
  • Machine learning modeling using Support Vector Machines (SVM)
  • Model evaluation using accuracy score and confusion matrix
  • Data visualization using Matplotlib and Seaborn
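To make the two evaluation metrics in the list above concrete, here is a small standard-library sketch of how a binary confusion matrix and the accuracy score relate. The function names and the label lists are hypothetical and exist only for this illustration; the project itself uses Scikit-learn for evaluation.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (actual, predicted) pairs for binary labels 0 and 1."""
    counts = Counter(zip(y_true, y_pred))
    return {
        "tn": counts[(0, 0)],  # non-diabetic, predicted non-diabetic
        "fp": counts[(0, 1)],  # non-diabetic, predicted diabetic
        "fn": counts[(1, 0)],  # diabetic, predicted non-diabetic
        "tp": counts[(1, 1)],  # diabetic, predicted diabetic
    }


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    c = confusion_counts(y_true, y_pred)
    return (c["tp"] + c["tn"]) / sum(c.values())


# Made-up labels for eight individuals.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0, 1, 0]

print(confusion_counts(y_true, y_pred))  # {'tn': 3, 'fp': 1, 'fn': 1, 'tp': 3}
print(accuracy(y_true, y_pred))          # 0.75
```

Accuracy collapses the four counts into one number, so the confusion matrix is worth reading alongside it: in a screening setting a false negative and a false positive usually do not carry the same cost.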

Tech Stack

The tech stack for this project includes:

  • Python
  • Pandas
  • Matplotlib
  • Seaborn
  • Scikit-learn (SVM)

Contributing

Contributions to this project are welcome. Feel free to open an issue to report a bug or request a feature. You can also fork the repository, make your changes, and submit a pull request.
