Machine Learning Classification Model Based on the Decision Tree Algorithm Using the UCI Heart Disease Dataset
This dataset supports two main tasks. The first is to predict, from a patient's attributes, whether that person has heart disease. The second is exploratory: to analyze the dataset for insights that help in understanding the problem.
This is a multivariate dataset, meaning each record is described by several separate variables. There are 606 rows and 16 columns in total.
Column Descriptions:
- age: age of the patient in years
- sex: Male/Female
- cp: chest pain type (typical angina, atypical angina, non-anginal, asymptomatic)
- trestbps: resting blood pressure (in mm Hg on admission to the hospital)
- chol: serum cholesterol in mg/dl
- fbs: whether fasting blood sugar is above 120 mg/dl
- restecg: resting electrocardiographic results (normal, stt abnormality, lv hypertrophy)
- thalach: maximum heart rate achieved
- exang: exercise-induced angina (True/False)
- oldpeak: ST depression induced by exercise relative to rest
- slope: the slope of the peak exercise ST segment
- ca: number of major vessels (0-3) colored by fluoroscopy
- thal: normal, fixed defect, or reversible defect
- target: the predicted attribute
A decision tree is a supervised learning algorithm that uses a tree-like structure to make decisions based on input data. It divides data into branches and assigns outcomes to leaf nodes.
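How a tree decides where to branch can be illustrated with a small sketch. It assumes Gini impurity as the splitting criterion (the default in scikit-learn, though this document does not say which criterion the model uses) and a single numeric attribute. The helper names `gini` and `best_threshold` and the toy values are made up for illustration and are not taken from the dataset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) of the best split 'value <= threshold'.

    Returns (None, impurity of the unsplit node) if no split improves on it.
    """
    best = (None, gini(labels))
    # The largest value is skipped because it would leave the right branch empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        # Impurity of the two branches, weighted by how many samples each holds.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up toy data for illustration only (not real patient records).
ages = [35, 42, 50, 58, 63, 67]
has_disease = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_threshold(ages, has_disease)
print(threshold, impurity)
```

A full tree repeats this search over every attribute at every node and stops when a node is pure or a depth limit is reached; the leaf then predicts the majority class of the samples that ended up in it.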
Decision trees are used for both classification and regression tasks. The resulting models are fast to evaluate and easy to interpret, because each prediction can be traced along a single path from the root to a leaf.
Language: Python 3.11.4
Library: sklearn
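
As a rough sketch of how such a classifier might be trained with sklearn, the example below fits a `DecisionTreeClassifier` on synthetic stand-in data. The data, the made-up labelling rule, the choice of three columns, and the settings `max_depth=3` and `test_size=0.25` are all assumptions for illustration; they are not the project's actual data or configuration. With the real dataset, categorical columns such as cp or thal would first need to be converted to numbers, since sklearn trees expect numeric input.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the real dataset): three numeric columns
# named after the attributes described above.
rng = np.random.default_rng(0)
n = 200
age = rng.integers(29, 78, size=n)
chol = rng.integers(120, 400, size=n)
thalach = rng.integers(70, 200, size=n)
X = np.column_stack([age, chol, thalach])

# Arbitrary made-up rule that generates labels for the demo only.
y = ((age > 55) & (thalach < 150)).astype(int)

# Hold out a quarter of the rows to estimate accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A shallow tree stays readable; max_depth=3 is an illustrative choice.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```

Limiting the depth is one common way to reduce overfitting; a suitable value for the real data would have to be chosen by validation.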