juniorcl/diabetes-prediction

Diabetes Prediction

A data science project to predict diabetes

Dataset Information


This dataset is the Pima Indians Diabetes Database, available on Kaggle. It is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on the diagnostic measurements it contains. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

Attributes

  1. Pregnancies | Number of times pregnant

  2. Glucose | Plasma glucose concentration at 2 hours in an oral glucose tolerance test

  3. BloodPressure | Diastolic blood pressure (mm Hg)

  4. SkinThickness | Triceps skin fold thickness (mm)

  5. Insulin | 2-Hour serum insulin (mu U/ml)

  6. BMI | Body mass index (weight in kg/(height in m)^2)

  7. DiabetesPedigreeFunction | Diabetes pedigree function

  8. Age | Age (years)

  9. Outcome | Class variable (0 or 1); 268 of 768 are 1, the others are 0
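The attribute list above can be sketched as a typed record, together with the class balance implied by the stated counts (268 positive out of 768). This is a minimal illustration, not code from the project: the `PatientRecord` class and the `class_balance` helper are hypothetical names, the Python types are assumptions, and the field name `BMI` for attribute 6 is assumed from the Kaggle column naming.

```python
from dataclasses import dataclass, fields


@dataclass
class PatientRecord:
    """One row of the dataset (hypothetical container; types are assumed)."""
    Pregnancies: int                  # number of times pregnant
    Glucose: float                    # plasma glucose, 2-hour oral tolerance test
    BloodPressure: float              # diastolic blood pressure (mm Hg)
    SkinThickness: float              # triceps skin fold thickness (mm)
    Insulin: float                    # 2-hour serum insulin (mu U/ml)
    BMI: float                        # assumed column name for body mass index
    DiabetesPedigreeFunction: float   # diabetes pedigree function
    Age: int                          # age in years
    Outcome: int                      # class variable: 0 or 1


def class_balance(positives: int, total: int) -> float:
    """Return the fraction of records whose Outcome is 1."""
    if total <= 0 or not 0 <= positives <= total:
        raise ValueError("expected 0 <= positives <= total and total > 0")
    return positives / total


# Column names in the order listed in the Attributes section.
COLUMNS = [f.name for f in fields(PatientRecord)]

# Counts taken from the Outcome description: 268 of 768 are 1.
POSITIVE_RATE = class_balance(268, 768)

if __name__ == "__main__":
    print(COLUMNS)
    print(f"Positive class share: {POSITIVE_RATE:.1%}")
```

Roughly one record in three is positive, so the classes are moderately imbalanced. That is worth keeping in mind when choosing evaluation metrics, since plain accuracy can look good for a model that mostly predicts 0.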

About

This project is part of the Machine Health project. It aims to create an app that predicts certain diseases using data science studies. WARNING: The app is for study purposes only. It is not a substitute for seeing a doctor.

Methodology

This project is based on the Cross-Industry Standard Process for Data Mining (CRISP-DM). A data science project is often pictured as linear: data preparation, modeling, evaluation, and deployment. Under CRISP-DM, however, the project takes a circular form: even after it reaches Deployment, it can start again at Business Understanding. How might this help?

It helps keep the data scientist from getting stuck on one specific step and wasting time on it. Once the whole project is complete, the data scientist can return to the first step and go through every step again. The main goal, therefore, is to repeat the cycle as many times as needed.
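The circular flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not part of the project: the six phase names are the standard CRISP-DM phases, while `next_phase` and `walk` are hypothetical helper names. The sketch also simplifies the process to a single loop and ignores the back-and-forth moves CRISP-DM allows between neighbouring phases.

```python
from itertools import cycle, islice

# The six standard CRISP-DM phases, in order.
CRISP_DM_PHASES = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]


def next_phase(current: str) -> str:
    """Return the phase after `current`, wrapping from the last to the first."""
    index = CRISP_DM_PHASES.index(current)
    return CRISP_DM_PHASES[(index + 1) % len(CRISP_DM_PHASES)]


def walk(n_steps: int, start: str = "Business Understanding") -> list:
    """Return the first `n_steps` phases visited when cycling from `start`."""
    start_index = CRISP_DM_PHASES.index(start)
    return list(islice(cycle(CRISP_DM_PHASES), start_index, start_index + n_steps))


if __name__ == "__main__":
    # After Deployment, the project starts again at Business Understanding.
    print(next_phase("Deployment"))
    # Seven steps cover one full cycle plus the start of the next.
    for phase in walk(7):
        print(phase)
```

Modelling the phases as a ring rather than a list with an end makes the point of the methodology explicit: finishing Deployment is a reason to revisit the business question, not a stopping condition.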

License

This project is licensed under the MIT License - see the LICENSE file for details.
