
The Heart Disease data set comes from the UCI Machine Learning Repository, hosted by the Center for Machine Learning and Intelligent Systems at the University of California, Irvine.
The dataset can be found here:

The Heart Disease data set is available from four different databases. This project uses the data from the Cleveland database. The original database contains 76 attributes, but the Cleveland database contains a preprocessed version with the following 14 features:

AGE: age in years

SEX: 1 = male, 0 = female

CP: chest pain type
    1: Typical angina
    2: Atypical angina
    3: Non-anginal pain
    4: Asymptomatic

TRESTBPS: resting blood pressure (in mm Hg on admission to the hospital)

CHOL: serum cholesterol in mg/dl

FBS: fasting blood sugar > 120 mg/dl (1 = true; 0 = false)

RESTECG: resting electrocardiographic results

THALACH: maximum heart rate achieved

EXANG: exercise induced angina (1 = yes; 0 = no)

OLDPEAK: ST depression induced by exercise relative to rest

SLOPE: the slope of the peak exercise ST segment
    1: upsloping
    2: flat
    3: downsloping

CA: number of major vessels (0-3) colored by fluoroscopy

THAL: 3 = normal; 6 = fixed defect; 7 = reversible defect

NUM: diagnosis of heart disease (angiographic disease status)
    0: < 50% diameter narrowing
    1: > 50% diameter narrowing
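The coded features in the list above can be turned into readable labels with a small lookup table. The following is a minimal sketch, not part of this project's code. It assumes the columns appear in the same order as the list above, with the diagnosis last, and the lowercase column names (including `num` for the diagnosis) and the `decode_row` helper are hypothetical names chosen for illustration.

```python
# Assumed column order: the order of the feature list above, diagnosis last.
# The lowercase names are illustrative, not taken from this project.
COLUMNS = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num",
]

# Labels for the coded features, copied from the descriptions above.
CODE_LABELS = {
    "sex": {0: "female", 1: "male"},
    "cp": {
        1: "typical angina",
        2: "atypical angina",
        3: "non-anginal pain",
        4: "asymptomatic",
    },
    "fbs": {0: "<= 120 mg/dl", 1: "> 120 mg/dl"},
    "exang": {0: "no", 1: "yes"},
    "slope": {1: "upsloping", 2: "flat", 3: "downsloping"},
    "thal": {3: "normal", 6: "fixed defect", 7: "reversible defect"},
}


def decode_row(values):
    """Pair raw values with column names and replace known codes with labels.

    Values that cannot be parsed as numbers, or codes with no known label,
    are kept unchanged.
    """
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    decoded = {}
    for name, raw in zip(COLUMNS, values):
        labels = CODE_LABELS.get(name)
        if labels is None:
            decoded[name] = raw
            continue
        try:
            code = int(float(raw))
        except (TypeError, ValueError):
            decoded[name] = raw  # not numeric: leave as-is
            continue
        decoded[name] = labels.get(code, raw)
    return decoded


# Illustrative row only; the values are made up to exercise the lookup.
example = decode_row([63, 1, 1, 145, 233, 1, 2, 150, 0, 2.3, 3, 0, 6, 0])
print(example["cp"], "/", example["thal"])
```

Keeping the labels in one dictionary means the raw numeric codes stay available for modeling while reports and plots can use the readable names.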

Note that the diagnosis values in the original database range from 0 through 4, with 1 through 4 measuring the severity of a positive diagnosis. For research purposes, it is standard practice to predict only a binary value for this feature: 0 for no disease, and 1 for any positive diagnosis (original values 1 through 4).
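Collapsing the original diagnosis into a binary target can be sketched as follows. This is an illustration of the mapping described above, not necessarily how this project implements it, and the function name `binarize_diagnosis` is hypothetical.

```python
def binarize_diagnosis(num):
    """Map an original diagnosis value (0-4) to a binary target.

    0 stays 0 (no disease); 1 through 4 all become 1 (disease present).
    """
    value = int(float(num))
    if not 0 <= value <= 4:
        raise ValueError(f"diagnosis must be between 0 and 4, got {num!r}")
    return 1 if value > 0 else 0


# Illustrative values only, not taken from the data set.
original = [0, 2, 1, 0, 4, 3]
binary = [binarize_diagnosis(v) for v in original]
print(binary)  # [0, 1, 1, 0, 1, 1]
```

Rejecting values outside 0 through 4 makes unexpected entries fail loudly rather than being silently counted as a positive diagnosis.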


