Classification of Heart Disease using a variety of supervised learning classifiers

rrasheed/Heart-Disease-ML-Classifer

Heart_Disease_ML1

Authors: Rayhaan Rasheed, Solomon Mekonnen, Sam Aboagye

Date: 11/26/2018

Data:

The data used in this project is the Heart Disease Dataset collected by Robert Detrano, M.D., Ph.D. at the V.A. Medical Center, Long Beach and Cleveland Clinic Foundation. The full database was pulled from the UCI Machine Learning Repository, maintained by the University of California, Irvine.

There are 76 attributes in total, but published experiments typically use only 14 of them:

  1. Age
  2. Sex
  3. Chest Pain Rating
  4. Resting Blood Pressure
  5. Serum Cholesterol in mg/dl
  6. Blood Sugar Level While Fasting
  7. Resting EKG
  8. Maximum Heart Rate Achieved
  9. Exercise Induced Angina
  10. ST Depression Induced by Exercise Relative to Rest
  11. Slope of the Peak Exercise ST Segment
  12. Number of Major Vessels Colored by Fluoroscopy
  13. Thal (Normal, Fixed Defect, or Reversible Defect)
  14. Class (target)

Link to the Data: https://github.com/rrasheed/Heart_Disease_ML1/blob/master/Heart.csv

Overview:

This project evaluates and compares different classifiers on the Heart Disease database. Instead of treating this as a multi-class classification problem, the target values are recoded as 0 or 1: any target value greater than or equal to 1 becomes 1 in the new target column, and 0 stays 0. The reason is to focus on the fundamental question of whether a patient shows any sign of heart disease (one-vs-all).
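The recoding described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name `binarize_target` and the sample values are hypothetical, and the original target is assumed to be an integer from 0 to 4, as in the UCI dataset.

```python
def binarize_target(values):
    """Map 0 to 0 and any value greater than or equal to 1 to 1."""
    return [1 if v >= 1 else 0 for v in values]


# Hypothetical sample of original multi-class targets (0 = no disease).
original = [0, 1, 2, 3, 4, 0]
binary = binarize_target(original)
print(binary)  # [0, 1, 1, 1, 1, 0]
```

With the data in a pandas DataFrame, the same recoding is usually written as a vectorized comparison on the target column followed by a cast to integer.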

The classifiers used: Logistic Regression, Random Forest, & Support Vector Machine
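A comparison of these three classifiers might look like the sketch below. It assumes scikit-learn and uses synthetic data from `make_classification` as a stand-in for the 13 features and binary target, so the scores say nothing about the real dataset. The hyperparameters, the 5-fold cross-validation, and the feature scaling are illustrative choices, not the settings used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 13 features and a binary target (NOT the real data).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Logistic regression and the SVM are scale-sensitive, so they get a scaler.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Mean accuracy over 5 cross-validation folds for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Putting the scaler inside the pipeline means it is refit on each training fold, which avoids leaking information from the held-out fold into the preprocessing.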

References

[1] Benjamin, E. J., et al. (2018). Correction to: Heart Disease and Stroke Statistics—2018 Update: A Report From the American Heart Association. Circulation, 137(12), 67-492. doi:10.1161/cir.0000000000000573

[2] Whitley, D., & Watson, J. P. (2005). Complexity Theory and the No Free Lunch Theorem. Search Methodologies, 317-339. doi:10.1007/0-387-28356-0_11

[3] Bethel, G. B., Rajinikanth, T., & Raju, S. V. (2016). A Knowledge driven Approach for Efficient Analysis of Heart Disease Dataset. International Journal of Computer Applications, 147(9), 39-46. doi:10.5120/ijca2016911187

[4] Detrano, R. (1988). Heart Disease Data Set. V.A. Medical Center, Long Beach, and Cleveland Clinic Foundation. Retrieved from https://archive.ics.uci.edu/ml/datasets/Heart+Disease

Acknowledgement

We would like to acknowledge Dr. Yuxio Huang, whose code was very useful in building and evaluating the classifiers.
