Statistical Learning Project, Data Science @ UniPD. Prediction of Coronary Artery Disease using Statistical Learning Models

marcouderzo/CoronaryHeartDisease-Prediction

🫀 Coronary Heart Disease Prediction Project 🫀

Statistical Learning Project, Data Science @ UniPD

Authors: Marco Uderzo, Francesco Vo

Context

Cardiovascular diseases (CVDs) are the leading cause of death globally, taking an estimated 17.9 million lives each year, or 31% of all deaths worldwide. Four out of five CVD deaths are due to heart attacks and strokes, and one-third of these deaths occur prematurely in people under 70 years of age.

Coronary Heart Disease (also called Coronary Artery Disease) is common and frequently fatal, and this dataset contains 11 features that can be used to predict it. People with cardiovascular disease, or at high cardiovascular risk, need early detection, and statistical learning models can be of great help here.

Dataset Source

This dataset combines five heart disease datasets that were previously available only independently. They are merged over their 11 common features, which makes it the largest heart disease dataset available so far for research purposes. The five datasets used for its curation are:

  • Cleveland: 303 observations
  • Hungarian: 294 observations
  • Switzerland: 123 observations
  • Long Beach VA: 200 observations
  • Statlog (Heart) Dataset: 270 observations

Observations:

  • Total: 1190 observations

  • Duplicated: 272 observations

  • Final dataset: 918 observations
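
The counts above can be checked with a few lines of Python. This is a minimal sketch, not the authors' preprocessing code: the `drop_duplicate_rows` helper and the toy rows are hypothetical and only illustrate what removing duplicated observations means once the sources are stacked together.

```python
# Observation counts per source dataset, as listed in this README.
source_counts = {
    "Cleveland": 303,
    "Hungarian": 294,
    "Switzerland": 123,
    "Long Beach VA": 200,
    "Statlog (Heart)": 270,
}
duplicated = 272  # duplicated observations reported in this README

total = sum(source_counts.values())
final = total - duplicated
print(f"total={total}, final={final}")


def drop_duplicate_rows(rows):
    """Hypothetical helper: keep the first occurrence of each row, in order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Made-up toy rows (not taken from the real dataset), only to show the idea.
toy_rows = [(54, "M", 1), (61, "F", 0), (54, "M", 1)]
toy_unique = drop_duplicate_rows(toy_rows)
print(len(toy_rows) - len(toy_unique), "duplicate row(s) removed")
```

The same subtraction explains the list above: the five sources sum to the stated total, and removing the duplicated observations leaves the final dataset size.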
