The goal of this repository is to clean the diabetes dataset from 130 US hospitals.

rischanlab/Cleaning_diabetes_130_US_hospital_dataset

Clean diabetes dataset from 130 US hospitals

The dataset represents 10 years (1999-2008) of clinical care at 130 US hospitals. It contains 101766 records of diabetes patients, each described by 50 features covering patient and hospital outcomes.

After cleaning, we have 98052 rows and 21 columns. See my comments inside clean_diab_dataset.py for how I cleaned the data.
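
For a rough idea of what this kind of cleaning can look like, here is a minimal sketch. It is not the code from clean_diab_dataset.py, and it is not expected to reproduce the exact row and column counts above. It assumes missing values are encoded as "?" in the raw CSV. The function name clean_rows, the 0.5 threshold, and the sample rows are hypothetical, chosen only for illustration.

```python
import csv
import io

MISSING = "?"  # assumption: missing values are encoded as "?" in the raw CSV


def clean_rows(rows, max_missing_frac=0.5):
    """Drop sparse columns, then drop rows that still contain missing values.

    rows: list of dicts, as produced by csv.DictReader.
    max_missing_frac: hypothetical threshold; columns with a larger share
    of missing values are removed entirely.
    """
    if not rows:
        return []
    columns = list(rows[0].keys())
    n = len(rows)
    # Keep only columns whose share of missing values is within the threshold.
    kept = [
        col for col in columns
        if sum(row[col] == MISSING for row in rows) / n <= max_missing_frac
    ]
    cleaned = []
    for row in rows:
        slim = {col: row[col] for col in kept}
        # Discard rows that still have a missing value in a kept column.
        if MISSING not in slim.values():
            cleaned.append(slim)
    return cleaned


# Tiny made-up sample that only mimics the layout of the raw file.
sample_csv = """encounter_id,race,gender,weight
1,Caucasian,Female,?
2,?,Male,?
3,AfricanAmerican,Male,?
4,Caucasian,Male,[75-100)
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
cleaned = clean_rows(rows)
print(len(cleaned), "rows,", len(cleaned[0]), "columns")
```

In this sample the weight column is mostly missing, so it is dropped as a whole, and the one row with an unknown race is removed afterwards. Dropping sparse columns first avoids throwing away nearly every row because of a single poorly populated feature.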

Dataset source: https://archive.ics.uci.edu/ml/datasets/Diabetes+130-US+hospitals+for+years+1999-2008
