Heart Risk Key Indicators EDA and Machine Learning Modeling

This repository contains a data analysis of key heart risk indicators, based on the 2020 annual CDC survey of about 400k adults and their health status. The data comes from the Personal Key Indicators of Heart Disease dataset on Kaggle.

The main goal of this project is to detect a person's heart risk from information about their physical and mental health. The problem I am solving is therefore a binary classification one.

Developing the model involved several steps, including exploratory data analysis, model selection, validation and interpretability. The following list includes a Jupyter Notebook for each step:

This project lays the foundation for deploying the model on a web server using Flask. The following Jupyter Notebook describes how to do so.
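As a rough illustration of what such a deployment could look like, here is a minimal sketch of a Flask app that wraps a trained model. It is not taken from the notebook: the create_app helper, the /predict endpoint, the response keys and the 0.5 threshold are all hypothetical, and the model and vectorizer are assumed to follow the scikit-learn interface (transform and predict_proba).

```python
from flask import Flask, jsonify, request


def create_app(dict_vectorizer, model):
    """Build a Flask app that serves heart-risk predictions.

    Assumption: both objects follow the scikit-learn interface, i.e.
    dict_vectorizer.transform(list_of_dicts) and model.predict_proba(X).
    """
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])  # hypothetical endpoint name
    def predict():
        # One person's features, sent as a JSON object.
        person = request.get_json()
        X = dict_vectorizer.transform([person])
        # Probability of the positive class for the single row.
        risk = float(model.predict_proba(X)[0][1])
        return jsonify({
            "heart_risk_probability": risk,
            "heart_risk": risk >= 0.5,  # illustrative threshold, not tuned
        })

    return app
```

If the bin files produced by train.py are pickles (an assumption), the model and the dict_vectorizer could be loaded with pickle.load and passed to create_app before calling app.run().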

It is also useful to have a script that can train a single model or all models at once. The train.py script serves this purpose.

To train all models at once, run the following command:

python train.py --model all

To train a single model instead, set the model parameter to logistic, random_forrest or xgboost. When training finishes, you will have a set of bin files containing the machine learning models and a dict_vectorizer, which transforms inputs into the format the models expect.
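The --model option described above could be handled with argparse along the following lines. This is only a sketch of the command-line handling, not the actual contents of train.py: the function names parse_args and models_to_train are hypothetical, and defaulting to all is an assumption.

```python
import argparse

# Model names as listed in this README; "all" selects every model.
MODEL_CHOICES = ["all", "logistic", "random_forrest", "xgboost"]


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Train heart-risk models.")
    parser.add_argument(
        "--model",
        choices=MODEL_CHOICES,
        default="all",  # assumption: train everything when no model is given
        help="which model to train",
    )
    return parser.parse_args(argv)


def models_to_train(choice):
    """Expand the --model value into the list of models to train."""
    if choice == "all":
        return MODEL_CHOICES[1:]
    return [choice]
```

A script built this way would call parse_args(), loop over models_to_train(args.model), and train and save each model in turn. Restricting the flag with choices makes argparse reject a misspelled model name with a clear error.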

Folders and files:

.ipynb_checkpoints
Data_Exploration.ipynb
LICENSE
Model_Selection.ipynb
Model_Use.ipynb
README.md
heart_2020_cleaned.csv
project_1-LogisticRegression.ipynb
project_1-Trees.ipynb
project_1-XGBoost.ipynb
train.py
