An ensemble model for predicting diabetes onset using NHANES Data

semerj/NHANES-diabetes

An Ensemble Model for Predicting the Onset of Diabetes using NHANES Data

By John Semerdjian & Spencer Frank

Code

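Our models are contained in the NHANES.ipynb notebook. To run the notebook, create a virtual environment and install the required modules. The commands below use mkvirtualenv and workon, which are provided by virtualenvwrapper, so that tool must already be installed.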

# create a virtual environment, "nhanes"
$ mkvirtualenv --python=/usr/local/bin/python3 nhanes
$ workon nhanes

# install required modules
$ pip install -r requirements.txt

# download/merge data
$ python ./bootstrap.py

# start ipython notebook
$ ipython notebook

Video & Report

You can find our report here.

Abstract

Prediction of disease onset from patient survey and lifestyle data is quickly becoming an important tool for diagnosing a disease before it progresses. In this study, data from the National Health and Nutrition Examination Survey (NHANES) questionnaire is used to predict the onset of diabetes. An ensemble model that combines the output of several classification algorithms was developed to predict the onset of diabetes based on 16 features. The ensemble model achieved an AUC of 0.834, indicating high discriminative performance.
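
To make the two ideas in the abstract concrete (combining the outputs of several classifiers, and scoring the result with AUC), here is a minimal sketch in plain Python. It is not the code from NHANES.ipynb. The combination rule (a simple average of predicted probabilities), the function names average_ensemble and roc_auc, and the labels and scores are all assumptions made up for illustration; the notebook may combine its models differently.

```python
from itertools import product


def average_ensemble(prob_lists):
    """Average per-sample positive-class probabilities across models.

    Assumption: simple unweighted averaging; the notebook's actual
    combination rule may differ.
    """
    n_models = len(prob_lists)
    return [sum(sample) / n_models for sample in zip(*prob_lists)]


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Made-up labels and model outputs, for illustration only (not NHANES data).
y_true = [0, 0, 1, 1, 0, 1]
model_a = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
model_b = [0.2, 0.3, 0.60, 0.9, 0.5, 0.4]
model_c = [0.3, 0.2, 0.55, 0.7, 0.1, 0.6]

ensemble_scores = average_ensemble([model_a, model_b, model_c])
ensemble_auc = roc_auc(y_true, ensemble_scores)
print(f"ensemble AUC: {ensemble_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of positive and negative cases, which is the scale on which the reported 0.834 should be read. The pairwise comparison above is quadratic in the number of samples, so it only suits small examples; on real data a library routine such as scikit-learn's roc_auc_score would be the usual choice.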

Features and Descriptions

Additional Variables
