# Project Code : PRCP-1016- Heart Disease Prediction

## Business Case: Create a model that predicts potential heart disease in people using machine learning algorithms.

### Importing Libraries

In [None]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline 

import warnings
warnings.filterwarnings('ignore')

### Loading Data Set

In [None]:
label = pd.read_csv('label_data.csv')

In [None]:
val = pd.read_csv('heart disease.csv')

In [None]:
merge = pd.merge(val, label, on = 'patient_id')

In [None]:
merge

In [None]:
# Copying data set to keep as Back-up

new_merge = merge.copy()

In [None]:
new_merge
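
A plain `pd.merge` on `patient_id` drops rows silently when the two files disagree on IDs, and duplicates rows when an ID repeats. Below is a minimal sketch of a stricter merge. The two small frames and their values are made up for illustration and stand in for the real CSV files.

```python
import pandas as pd

# Tiny stand-ins for the feature and label files (values are made up).
features = pd.DataFrame({"patient_id": ["a1", "b2", "c3"], "age": [45, 61, 52]})
labels = pd.DataFrame({"patient_id": ["a1", "b2", "c3"],
                       "heart_disease_present": [0, 1, 0]})

# validate="one_to_one" raises MergeError if an ID repeats in either frame;
# how="outer" with indicator=True exposes IDs present in only one frame.
checked = pd.merge(features, labels, on="patient_id", how="outer",
                   validate="one_to_one", indicator=True)
unmatched = checked[checked["_merge"] != "both"]
print(len(unmatched))
```

If `unmatched` is empty, the `_merge` column can be dropped and the result used like the inner merge above.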

### Domain Analysis

 1. **patient_id :** Unique ID assigned to each patient.
 2. **slope_of_peak_exercise_st_segment :** The slope of the ST segment at peak exercise, read from an electrocardiogram; it indicates the quality of blood flow to the heart.
 3. **thal :** Result of the thallium stress test, which measures blood flow to the heart.
 4. **resting_blood_pressure :** The patient's blood pressure at rest.
 5. **chest_pain_type :** The type of chest pain the patient experiences, coded from 1 to 4.
 6. **num_major_vessels :** The number of major vessels (0 - 3) labelled by fluoroscopy.
 7. **fasting_blood_sugar_gt_120_mg_per_dl :** Whether the patient's fasting blood sugar level is greater than 120 mg/dL. The value is either 1 (Yes) or 0 (No).
 8. **resting_ekg_results :** Resting electrocardiographic results, coded from 0 to 2.
 9. **serum_cholesterol_mg_per_dl :** The amount of cholesterol detected in the serum, in mg/dL.
 10. **oldpeak_eq_st_depression :** The ST depression induced by exercise relative to the resting state, a measure of abnormality in the electrocardiogram.
 11. **sex :** The sex of the patient: '0' for female patients and '1' for male patients.
 12. **age :** Age of the patient in years.
 13. **max_heart_rate_achieved :** The maximum heart rate achieved by the patient (beats per minute).
 14. **exercise_induced_angina :** Whether exercise induced chest pain (0 - Not induced; 1 - Induced).
 15. **heart_disease_present :** Target variable: whether heart disease is present in the patient.

### Basic Checks and Statistical Analysis

In [None]:
merge

In [None]:
merge.shape

In [None]:
merge.info()

**Observation:**
- No Null values
- 2 Categorical/text data, 1 float data type, 12 int data type

In [None]:
merge.describe()

**Observation:**
- No Constant data
- No Corrupt data
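
The "no constant data" observation can also be checked directly instead of being read off `describe()`. This is a small sketch on a toy frame; the helper name `constant_columns` is hypothetical.

```python
import pandas as pd

def constant_columns(df: pd.DataFrame) -> list:
    """Return the names of columns that hold a single distinct value."""
    counts = df.nunique(dropna=False)
    return counts[counts <= 1].index.tolist()

# Toy frame: "flag" never changes, "age" does.
toy = pd.DataFrame({"age": [45, 61, 52], "flag": [1, 1, 1]})
print(constant_columns(toy))
```

Applied to `merge`, an empty list would confirm the observation.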

### Exploratory Data Analysis

#### Univariate Analysis - Sweetviz Report

In [None]:
# Installation

!pip install sweetviz

In [None]:
import sweetviz as sv

report = sv.analyze(merge)
report.show_html()

**File path -** file:///C:/Users/shaik%20sadiq/SWEETVIZ_REPORT.html

##### Histogram plot with KDE

In [None]:
plt.figure(figsize = (20, 25), facecolor = 'white')
plotnum = 1
for column in merge:
    plt.subplot(5, 3, plotnum)
    sns.histplot(merge[column], kde = True)
    plotnum = plotnum + 1
plt.tight_layout()
plt.show()

#### Bivariate Analysis

##### Histogram plot

In [None]:
plt.figure(figsize = (20, 25), facecolor = 'white') 
plotnumber = 1      

for column in merge:         
    if plotnumber <= 15 :   # the 5 x 3 grid holds at most 15 subplots
        ax = plt.subplot(5, 3, plotnumber)
        sns.histplot(x = merge[column], hue = merge['heart_disease_present'])
        plt.xlabel(column, fontsize = 20)                    
        plt.ylabel('Count', fontsize = 20)   # histogram bars show counts per class
    plotnumber += 1                
plt.tight_layout()
plt.show()

##### Scatter Plot

In [None]:
plt.figure(figsize=(15,15))
plotnumber=1
for column in merge:
    plt.subplot(5, 3, plotnumber)
    sns.scatterplot(x=column,y='heart_disease_present',data=merge)
    plotnumber=plotnumber+1
plt.tight_layout()
plt.show()

#### Multivariate Analysis

##### Pair plot

In [None]:
sns.pairplot(merge)
plt.show()

### Data Preprocessing

##### Checking for Duplicate Data

In [None]:
merge.duplicated().sum()

**Observation:**
- No duplicate data

##### Checking Missing values

In [None]:
merge.isnull().sum()

**Observation:**
- No missing values

##### Converting Categorical Data into Numerical Data

In [None]:
merge.thal.unique()

In [None]:
# Manual Encoding

merge.thal = merge.thal.map({'normal':0, 'reversible_defect':1, 'fixed_defect':2})
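
`Series.map` with a dict returns `NaN` for any value that is missing from the dict, so a misspelled or unexpected category would introduce missing values after the null check has already passed. Here is a minimal sketch of a guard, shown on a toy series rather than the project's column.

```python
import pandas as pd

thal_codes = {"normal": 0, "reversible_defect": 1, "fixed_defect": 2}
thal = pd.Series(["normal", "fixed_defect", "reversible_defect", "normal"])

encoded = thal.map(thal_codes)
# Any category absent from thal_codes would show up here as NaN.
unmapped = thal[encoded.isna()].unique()
assert len(unmapped) == 0, f"unmapped thal values: {unmapped}"
encoded = encoded.astype(int)
print(encoded.tolist())
```

The integer codes also impose an order on categories that have none. One-hot encoding (for example with `pd.get_dummies`) is a common alternative when that matters for the model.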

##### Dropping the Unwanted Column 'patient_id'

In [None]:
merge = merge.drop(['patient_id'], axis = 1)

In [None]:
merge

##### Checking for Outliers

In [None]:
plotnum = 1 
plt.figure(figsize = (20, 25), facecolor = 'white')

for column in merge:
    plt.subplot(5, 3, plotnum)
    sns.boxplot(x = merge[column])
    
    plt.xlabel(column, fontsize = 20)
    plt.ylabel('Count', fontsize = 20)
    
    plotnum = plotnum + 1
plt.tight_layout()

**Observation:**
- Outliers are present in:
1. resting_blood_pressure
2. chest_pain_type
3. num_major_vessels
4. serum_cholesterol_mg_per_dl
5. oldpeak_eq_st_depression

##### Handling Outliers

In [None]:
# resting_blood_pressure

from scipy import stats

IQR = stats.iqr(merge.resting_blood_pressure, interpolation = 'midpoint')
Q1_rbp = merge.resting_blood_pressure.quantile(0.25)
Q3_rbp = merge.resting_blood_pressure.quantile(0.75)

lower_limit_rbp = Q1_rbp - 1.5 * IQR
upper_limit_rbp = Q3_rbp + 1.5 * IQR
lower_limit_rbp, upper_limit_rbp

merge = merge[(merge.resting_blood_pressure < upper_limit_rbp)]
merge

In [None]:
# Rechecking for Outliers

sns.boxplot(x = merge['resting_blood_pressure'])

In [None]:
# chest_pain_type

from scipy import stats

IQR = stats.iqr(merge.chest_pain_type, interpolation = 'midpoint')
Q1_rbp = merge.chest_pain_type.quantile(0.25)
Q3_rbp = merge.chest_pain_type.quantile(0.75)

lower_limit_rbp = Q1_rbp - 1.5 * IQR
upper_limit_rbp = Q3_rbp + 1.5 * IQR
lower_limit_rbp, upper_limit_rbp

merge = merge[(merge.chest_pain_type < upper_limit_rbp)]
merge

In [None]:
# Rechecking for Outliers

sns.boxplot(x = merge['chest_pain_type'])

In [None]:
# num_major_vessels

from scipy import stats

IQR = stats.iqr(merge.num_major_vessels, interpolation = 'midpoint')
Q1_rbp = merge.num_major_vessels.quantile(0.25)
Q3_rbp = merge.num_major_vessels.quantile(0.75)

lower_limit_rbp = Q1_rbp - 1.5 * IQR
upper_limit_rbp = Q3_rbp + 1.5 * IQR
lower_limit_rbp, upper_limit_rbp

merge = merge[(merge.num_major_vessels < upper_limit_rbp)]
merge

In [None]:
# Rechecking for Outliers

sns.boxplot(x = merge['num_major_vessels'])

In [None]:
# serum_cholesterol_mg_per_dl

from scipy import stats

IQR = stats.iqr(merge.serum_cholesterol_mg_per_dl, interpolation = 'midpoint')
Q1_rbp = merge.serum_cholesterol_mg_per_dl.quantile(0.25)
Q3_rbp = merge.serum_cholesterol_mg_per_dl.quantile(0.75)

lower_limit_rbp = Q1_rbp - 1.5 * IQR
upper_limit_rbp = Q3_rbp + 1.5 * IQR
lower_limit_rbp, upper_limit_rbp

merge = merge[(merge.serum_cholesterol_mg_per_dl < upper_limit_rbp)]
merge

In [None]:
# Rechecking for Outliers

sns.boxplot(x = merge['serum_cholesterol_mg_per_dl'])

In [None]:
# oldpeak_eq_st_depression

from scipy import stats

IQR = stats.iqr(merge.oldpeak_eq_st_depression, interpolation = 'midpoint')
Q1_rbp = merge.oldpeak_eq_st_depression.quantile(0.25)
Q3_rbp = merge.oldpeak_eq_st_depression.quantile(0.75)

lower_limit_rbp = Q1_rbp - 1.5 * IQR
upper_limit_rbp = Q3_rbp + 1.5 * IQR
lower_limit_rbp, upper_limit_rbp

merge = merge[(merge.oldpeak_eq_st_depression < upper_limit_rbp)]
merge

In [None]:
# Rechecking for Outliers

sns.boxplot(x = merge['oldpeak_eq_st_depression'])
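
The five outlier cells above repeat the same steps, with the `_rbp` variable names carried over from the first column. Below is a sketch of a similar upper-fence filter written once as a function. The name `drop_upper_outliers` is hypothetical. The quartiles and the IQR come from the same `quantile` call so that they agree.

```python
import pandas as pd

def drop_upper_outliers(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Keep rows whose value in `column` is at or below Q3 + k * IQR."""
    q1, q3 = df[column].quantile([0.25, 0.75])
    upper = q3 + k * (q3 - q1)
    # Values exactly on the fence are kept; only values above it are dropped.
    return df[df[column] <= upper]

# Toy data with one obvious high value.
toy = pd.DataFrame({"resting_blood_pressure": [110, 120, 125, 130, 140, 200]})
print(drop_upper_outliers(toy, "resting_blood_pressure"))
```

With such a helper, the five columns could be processed in a short loop. Whether coded columns such as `chest_pain_type` should be filtered this way at all is a separate modelling decision.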

### Feature Engineering

In [None]:
merge.corr()

In [None]:
plt.figure(figsize = (15,15))
sns.heatmap(merge.corr(), annot = True)

**Observation**
- No highly correlated data
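
Reading a 14 x 14 heatmap by eye is error-prone, so the observation can be backed by listing any pair above a cut-off. This is a sketch on a toy frame; the 0.8 threshold and the helper name `correlated_pairs` are assumptions, not values taken from the notebook.

```python
import numpy as np
import pandas as pd

def correlated_pairs(df: pd.DataFrame, threshold: float = 0.8) -> pd.Series:
    """Return column pairs whose absolute Pearson correlation exceeds threshold."""
    corr = df.corr().abs()
    # Keep the strict upper triangle: each pair once, diagonal excluded.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    pairs = upper.stack()
    return pairs[pairs > threshold]

# Toy frame: "b" is almost a multiple of "a"; "c" is unrelated.
toy = pd.DataFrame({"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.5], "c": [4, 1, 3, 2]})
print(correlated_pairs(toy))
```

An empty result for `merge` at the chosen threshold would support the observation above.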

##### Scaling of Data Set

In [None]:
merge

In [None]:
merge.columns

##### MinMax Scaler

In [None]:
from sklearn.preprocessing import MinMaxScaler
 
scale = MinMaxScaler()
merge[['resting_blood_pressure', 'serum_cholesterol_mg_per_dl', 'max_heart_rate_achieved']] = scale.fit_transform(merge[['resting_blood_pressure', 'serum_cholesterol_mg_per_dl', 'max_heart_rate_achieved']])
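
Fitting `MinMaxScaler` on the whole table lets the minimum and maximum of rows that later land in the test split shape the training features. A common alternative is to put the scaler inside a `Pipeline` so that it is fitted on the training rows only. The sketch below uses synthetic data rather than the project's table.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
X_demo = rng.normal(loc=130, scale=15, size=(200, 3))   # synthetic stand-in features
y_demo = (X_demo[:, 0] + rng.normal(scale=10, size=200) > 130).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=40)

# The scaler's min/max are learned from X_tr only; X_te is transformed with them.
pipe = make_pipeline(MinMaxScaler(), LogisticRegression())
pipe.fit(X_tr, y_tr)
print(pipe.score(X_te, y_te))
```

The notebook scales only three columns. A `ColumnTransformer` could restrict the scaler to those columns inside the same pipeline.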

In [None]:
merge

### Model Creation

##### Creating Training And Testing Data

In [None]:
X = merge[['slope_of_peak_exercise_st_segment','thal', 'resting_blood_pressure',
       'chest_pain_type','num_major_vessels',
       'fasting_blood_sugar_gt_120_mg_per_dl','resting_ekg_results',
       'serum_cholesterol_mg_per_dl','oldpeak_eq_st_depression','sex','age',
       'max_heart_rate_achieved','exercise_induced_angina']]

# Select the target as a 1-D Series; scikit-learn estimators expect y of shape (n_samples,)
y = merge['heart_disease_present']

In [None]:
X

In [None]:
y

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 40)
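
With a small data set, a purely random split can leave the test set with a different share of positive cases than the training set. Passing `stratify` to `train_test_split` keeps the class ratio the same in both parts. Here is a sketch with synthetic labels (the 60/40 balance is illustrative only).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels with a 60/40 class balance (illustrative only).
y_demo = np.array([0] * 60 + [1] * 40)
X_demo = np.arange(100).reshape(-1, 1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=40, stratify=y_demo
)
# Share of positive cases in each part.
print(y_tr.mean(), y_te.mean())
```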

In [None]:
X_train.shape

In [None]:
y_train.shape

In [None]:
X_test.shape

In [None]:
y_test.shape

In [None]:
y_test

#### 1. LOGISTIC REGRESSION ALGORITHM

In [None]:
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression()

lr.fit(X_train, y_train)

In [None]:
y_pred_LR = lr.predict(X_test)

In [None]:
y_pred_LR

##### Model Evaluation for Logistic Regression Algorithm

In [None]:
from sklearn.metrics import confusion_matrix, accuracy_score, recall_score, precision_score, f1_score, classification_report

In [None]:
# Confusion Matrix
cm_LR = confusion_matrix(y_test, y_pred_LR)

# Accuracy Score
acc_LR = accuracy_score(y_test, y_pred_LR)

# Recall
recall_LR = recall_score(y_test, y_pred_LR)

# Precision
precision_LR = precision_score(y_test, y_pred_LR)

# f1 Score
f1_LR = f1_score(y_test, y_pred_LR)

# Classification Report
cr_LR = classification_report(y_test, y_pred_LR)

In [None]:
print(cm_LR)
print(acc_LR)
print(recall_LR)
print(precision_LR)
print(f1_LR)
print(cr_LR)
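
Accuracy, recall, precision and F1 are all derived from the four cells of the confusion matrix. The sketch below makes that arithmetic explicit, using made-up labels (not the project's predictions).

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_hat = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 1])

# For binary labels {0, 1}, ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of the predicted positives, how many were right
recall = tp / (tp + fn)      # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f1)
```

In a screening setting, recall is the number to watch, since a false negative means a missed case of heart disease.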

#### 2. SUPPORT VECTOR MACHINE ALGORITHM

In [None]:
from sklearn.svm import SVC

svclassifier = SVC(kernel = 'rbf', C = 30, gamma = 'auto') 
svclassifier.fit(X_train, y_train)

In [None]:
y_predSVM = svclassifier.predict(X_test)

In [None]:
y_predSVM

##### Model Evaluation for SVM

In [None]:
from sklearn.metrics import confusion_matrix, accuracy_score, recall_score, precision_score, f1_score, classification_report

In [None]:
# Confusion Matrix
cm_SVM = confusion_matrix(y_test, y_predSVM)

# Accuracy Score
acc_SVM = accuracy_score(y_test, y_predSVM)

# Recall
recall_SVM = recall_score(y_test, y_predSVM)

# Precision
precision_SVM = precision_score(y_test, y_predSVM)

# f1 Score
f1_SVM = f1_score(y_test, y_predSVM)

# Classification Report
cr_SVM = classification_report(y_test, y_predSVM)

In [None]:
print(cm_SVM)
print(acc_SVM)
print(recall_SVM)
print(precision_SVM)
print(f1_SVM)
print(cr_SVM)

#### 3. NAIVE-BAYES ALGORITHM

In [None]:
from sklearn.naive_bayes import GaussianNB

gnb = GaussianNB()
gnb.fit(X_train, y_train)

In [None]:
y_pred_NB = gnb.predict(X_test)

In [None]:
y_pred_NB

##### Model Evaluation for Naive-Bayes Algorithm 

In [None]:
from sklearn.metrics import confusion_matrix, accuracy_score, recall_score, precision_score, f1_score, classification_report

In [None]:
# Confusion Matrix
cm_NB = confusion_matrix(y_test, y_pred_NB)

# Accuracy Score
acc_NB = accuracy_score(y_test, y_pred_NB)

# Recall
recall_NB = recall_score(y_test, y_pred_NB)

# Precision
precision_NB = precision_score(y_test, y_pred_NB)

# f1 Score
f1_NB = f1_score(y_test, y_pred_NB)

# Classification Report
cr_NB = classification_report(y_test, y_pred_NB)

In [None]:
print(cm_NB)
print(acc_NB)
print(recall_NB)
print(precision_NB)
print(f1_NB)
print(cr_NB)

#### 4. XGBOOST ALGORITHM

In [None]:
from xgboost import XGBClassifier

xgb_r = XGBClassifier()
xgb_r.fit(X_train, y_train)

y_pred_XGB = xgb_r.predict(X_test)

In [None]:
y_pred_XGB

##### Model Evaluation for XGBoost Algorithm

In [None]:
from sklearn.metrics import confusion_matrix, accuracy_score, recall_score, precision_score, f1_score, classification_report

In [None]:
# Confusion Matrix
cm_XGB = confusion_matrix(y_test, y_pred_XGB)

# Accuracy Score
acc_XGB = accuracy_score(y_test, y_pred_XGB)

# Recall
recall_XGB = recall_score(y_test, y_pred_XGB)

# Precision
precision_XGB = precision_score(y_test, y_pred_XGB)

# f1 Score
f1_XGB = f1_score(y_test, y_pred_XGB)

# Classification Report
cr_XGB = classification_report(y_test, y_pred_XGB)

In [None]:
print(cm_XGB)
print(acc_XGB)
print(recall_XGB)
print(precision_XGB)
print(f1_XGB)
print(cr_XGB)

#### 5. DECISION TREE ALGORITHM

In [None]:
from sklearn.tree import DecisionTreeClassifier

DT = DecisionTreeClassifier(max_depth = 5, criterion = 'entropy')

# Fit on the training split only, so the test rows stay unseen until evaluation
m = DT.fit(X_train, y_train)

In [None]:
y_pred_DT = m.predict(X_test)
y_pred_DT
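
Test-set scores are only meaningful when the model has never seen the test rows. The sketch below uses synthetic data from `make_classification` (an assumption, not the project's data). It fits one tree on the training split and another on all rows, then scores both on the same test split.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with some label noise (flip_y) so the task is not trivial.
X_demo, y_demo = make_classification(n_samples=200, n_features=13,
                                     flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=40)

honest = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
leaky = DecisionTreeClassifier(random_state=0).fit(X_demo, y_demo)   # has seen X_te

acc_honest = accuracy_score(y_te, honest.predict(X_te))
acc_leaky = accuracy_score(y_te, leaky.predict(X_te))
print(acc_honest, acc_leaky)
```

The leaky tree has memorised the test rows, so its score says little about how it would behave on new patients.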

##### Model Evaluation for Decision Tree Algorithm

In [None]:
from sklearn.metrics import confusion_matrix, accuracy_score, recall_score, precision_score, f1_score, classification_report

In [None]:
# Confusion Matrix
cm_DT = confusion_matrix(y_test, y_pred_DT)

# Accuracy Score
acc_DT = accuracy_score(y_test, y_pred_DT)

# Recall
recall_DT = recall_score(y_test, y_pred_DT)

# Precision
precision_DT = precision_score(y_test, y_pred_DT)

# f1 Score
f1_DT = f1_score(y_test, y_pred_DT)

# Classification Report
cr_DT = classification_report(y_test, y_pred_DT)

In [None]:
print(cm_DT)
print(acc_DT)
print(recall_DT)
print(precision_DT)
print(f1_DT)
print(cr_DT)

##### Hyperparameter Tuning 

In [None]:
from sklearn.model_selection import GridSearchCV

In [None]:
params_DT = {
    "criterion" : ("gini", "entropy"),        
    "splitter" : ("best", "random"),         
    "max_depth" : (list(range(1, 20))),      
    "min_samples_split" : [2, 3, 4],         
    "min_samples_leaf" : list(range(1, 20)),   
}

clf_DT = DecisionTreeClassifier(random_state = 3)    
cv_DT = GridSearchCV(clf_DT, params_DT, scoring = "f1", n_jobs = -1, verbose = 5, cv = 3)

In [None]:
cv_DT.fit(X_train, y_train)         
best_params_DT = cv_DT.best_params_        
print(f"Best parameters: {best_params_DT}") 

**Observation:**
- Best parameters: `{'criterion': 'gini', 'max_depth': 1, 'min_samples_leaf': 1, 'min_samples_split': 2, 'splitter': 'best'}`

In [None]:
cv_DT.best_params_

In [None]:
cv_DT.best_score_

In [None]:
# Passing best parameter to decision tree

new_DT = DecisionTreeClassifier( criterion = 'gini', max_depth = 1, min_samples_leaf = 1, min_samples_split = 2, splitter = 'best')

In [None]:
new_DT.fit(X_train, y_train)

In [None]:
new_y_pred_DT = new_DT.predict(X_test)
new_y_pred_DT
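
Copying the winning parameters by hand works, but the fitted search object already holds a model refitted with them (`refit=True` is the default in `GridSearchCV`). The sketch below shows this on synthetic data with a deliberately small grid; the grid values are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X_demo, y_demo = make_classification(n_samples=200, n_features=13, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=40)

grid = {"criterion": ["gini", "entropy"], "max_depth": [1, 3, 5]}
search = GridSearchCV(DecisionTreeClassifier(random_state=3), grid, scoring="f1", cv=3)
search.fit(X_tr, y_tr)

# best_estimator_ is already refitted on all of X_tr with best_params_.
tuned_tree = search.best_estimator_
print(search.best_params_, tuned_tree.score(X_te, y_te))
```

Using `best_estimator_` avoids transcription mistakes and keeps the `random_state` of the searched estimator.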

##### Model Evaluation for New Decision Tree Algorithm after Hyperparameter tuning

In [None]:
# Confusion Matrix
new_cm_DT = confusion_matrix(y_test, new_y_pred_DT)

# Accuracy Score
new_acc_DT = accuracy_score(y_test, new_y_pred_DT)

# Recall
new_recall_DT = recall_score(y_test, new_y_pred_DT)

# Precision
new_precision_DT = precision_score(y_test, new_y_pred_DT)

# f1 Score
new_f1_DT = f1_score(y_test, new_y_pred_DT)

# Classification Report
new_cr_DT = classification_report(y_test, new_y_pred_DT)

In [None]:
print(new_cm_DT)
print(new_acc_DT)
print(new_recall_DT)
print(new_precision_DT)
print(new_f1_DT)
print(new_cr_DT)

#### 6. RANDOM FOREST ALGORITHM

In [None]:
from sklearn.ensemble import RandomForestClassifier   

RF = RandomForestClassifier(n_estimators = 100)   

RF.fit(X_train, y_train)

In [None]:
y_pred_RF = RF.predict(X_test)
y_pred_RF

##### Model Evaluation for Random Forest

In [None]:
# Confusion Matrix
cm_RF = confusion_matrix(y_test, y_pred_RF)

# Accuracy Score
acc_RF = accuracy_score(y_test, y_pred_RF)

# Recall
recall_RF = recall_score(y_test, y_pred_RF)

# Precision
precision_RF = precision_score(y_test, y_pred_RF)

# f1 Score
f1_RF = f1_score(y_test, y_pred_RF)

# Classification Report
cr_RF = classification_report(y_test, y_pred_RF)

In [None]:
print(cm_RF)
print(acc_RF)
print(recall_RF)
print(precision_RF)
print(f1_RF)
print(cr_RF)

##### Hyperparameter Tuning 

In [None]:
from sklearn.model_selection import RandomizedSearchCV

n_estimators = [int(x) for x in np.linspace(start = 200, stop = 2000, num = 10)] 

max_features = ['sqrt', 'log2']   # 'auto' (same as 'sqrt' for classifiers) was removed in scikit-learn 1.3
max_depth = [int(x) for x in np.linspace(10, 110, num = 11)] 
max_depth.append(None)
min_samples_split = [2, 5, 10] 
min_samples_leaf = [1, 2, 4]
bootstrap = [True, False]

In [None]:
random_grid = {'n_estimators': n_estimators, 'max_features': max_features,
               'max_depth': max_depth, 'min_samples_split': min_samples_split,
               'min_samples_leaf': min_samples_leaf, 'bootstrap': bootstrap}

clf_RF = RandomForestClassifier(random_state = 42) 

cv_RF = RandomizedSearchCV(estimator = clf_RF, scoring = 'f1', param_distributions = random_grid, n_iter = 100, cv = 3, 
        verbose = 2, random_state = 42, n_jobs = -1)

In [None]:
cv_RF.fit(X_train, y_train) 

best_params_RF = cv_RF.best_params_ 
print(f"Best parameters: {best_params_RF}") 

In [None]:
clf_RF = RandomForestClassifier(n_estimators = 600, min_samples_split = 2, min_samples_leaf = 4, max_features = 'sqrt', max_depth = 60, bootstrap = False) 
clf_RF.fit(X_train, y_train) 

In [None]:
new_y_pred_RF = clf_RF.predict(X_test)
new_y_pred_RF

##### Model Evaluation for New Random Forest Algorithm after Hyperparameter tuning

In [None]:
# Confusion Matrix
new_cm_RF = confusion_matrix(y_test, new_y_pred_RF)

# Accuracy Score
new_acc_RF = accuracy_score(y_test, new_y_pred_RF)

# Recall
new_recall_RF = recall_score(y_test, new_y_pred_RF)

# Precision
new_precision_RF = precision_score(y_test, new_y_pred_RF)

# f1 Score
new_f1_RF = f1_score(y_test, new_y_pred_RF)

# Classification Report
new_cr_RF = classification_report(y_test, new_y_pred_RF)

In [None]:
print(new_cm_RF)
print(new_acc_RF)
print(new_recall_RF)
print(new_precision_RF)
print(new_f1_RF)
print(new_cr_RF)

#### 7. K-NEAREST NEIGHBOURS ALGORITHM (K-NN)

In [None]:
from sklearn.neighbors import KNeighborsClassifier

model = KNeighborsClassifier(n_neighbors = 12)
model.fit(X_train, y_train)

In [None]:
y_pred_KNN = model.predict(X_test)
y_pred_KNN
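
The value `n_neighbors = 12` is fixed by hand above. One common way to choose it is to compare cross-validated scores over a range of candidates. This is a sketch on synthetic data; the range 1 to 20 and the five folds are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X_demo, y_demo = make_classification(n_samples=200, n_features=13, random_state=0)

candidate_k = range(1, 21)
# Mean 5-fold F1 score for each candidate number of neighbours.
mean_f1 = [
    cross_val_score(KNeighborsClassifier(n_neighbors=k), X_demo, y_demo,
                    cv=5, scoring="f1").mean()
    for k in candidate_k
]
best_k = candidate_k[int(np.argmax(mean_f1))]
print(best_k)
```

In the notebook this search would run on `X_train` and `y_train` only, leaving the test split for the final evaluation.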

##### Model Evaluation for K-NN

In [None]:
# Confusion Matrix
cm_KNN = confusion_matrix(y_test, y_pred_KNN)

# Accuracy Score
acc_KNN = accuracy_score(y_test, y_pred_KNN)

# Recall
recall_KNN = recall_score(y_test, y_pred_KNN)

# Precision
precision_KNN = precision_score(y_test, y_pred_KNN)

# f1 Score
f1_KNN = f1_score(y_test, y_pred_KNN)

# Classification Report
cr_KNN = classification_report(y_test, y_pred_KNN)

In [None]:
print(cm_KNN)
print(acc_KNN)
print(recall_KNN)
print(precision_KNN)
print(f1_KNN)
print(cr_KNN)

### Displaying the Accuracy Score for the Algorithms

In [None]:
scores = [acc_LR, acc_SVM, acc_NB, acc_XGB, acc_DT, new_acc_DT, acc_RF, new_acc_RF, acc_KNN]
algorithms = ["Logistic Regression","Support Vector Machine","Naive Bayes","XGBoost","Decision Tree","Decision Tree_HT","Random Forest","Random Forest_HT","K-Nearest Neighbors"]    

# accuracy_score returns a fraction in [0, 1], so format it as a percentage
for name, score in zip(algorithms, scores): 
    print(f"The accuracy score achieved using {name} is: {score:.2%}")


### Displaying the f1 Score for the Algorithms

In [None]:
scores = [f1_LR, f1_SVM, f1_NB, f1_XGB, f1_DT, new_f1_DT, f1_RF, new_f1_RF, f1_KNN]
algorithms = ["Logistic Regression","Support Vector Machine","Naive Bayes","XGBoost","Decision Tree","Decision Tree_HT","Random Forest","Random Forest_HT","K-Nearest Neighbors"]    

# f1_score is a value in [0, 1], not a percentage
for name, score in zip(algorithms, scores): 
    print(f"The f1 score achieved using {name} is: {score:.2f}")

### Tabulate the Results for all the Algorithms

In [None]:
!pip install tabulate

In [None]:
from tabulate import tabulate

In [None]:
data = [["LR", acc_LR, recall_LR, precision_LR, f1_LR], 
        ["SVM", acc_SVM, recall_SVM, precision_SVM, f1_SVM],
        ["Naive-Bayes", acc_NB, recall_NB, precision_NB, f1_NB],
        ["XGBoost", acc_XGB, recall_XGB, precision_XGB, f1_XGB],
        ["Decision Tree", acc_DT, recall_DT, precision_DT, f1_DT],
        ["Decision Tree_HT", new_acc_DT, new_recall_DT, new_precision_DT, new_f1_DT],
        ["Random Forest", acc_RF, recall_RF, precision_RF, f1_RF],
        ["Random Forest_HT", new_acc_RF, new_recall_RF, new_precision_RF, new_f1_RF],
        ["K-NN", acc_KNN, recall_KNN, precision_KNN, f1_KNN]]

col_names = ["Algorithm", "Accuracy Score", "Recall Score", "Precision Score", "F1 Score"]

print(tabulate(data, headers = col_names, tablefmt = "fancy_grid"))


### Conclusion

After comparing multiple supervised classification models for heart disease prediction, the Decision Tree classifier achieved the highest accuracy (93.55%) and F1-score (0.92) on the test split. SVM showed the highest recall, but the Decision Tree gave the best balance between precision and recall, so it was selected as the final model. The Decision Tree scores are comparable with those of the other models only when the tree is fitted on the training split alone; a tree fitted on the full data set has already seen the test rows, which inflates its test scores.

- **Highest Accuracy:**
Decision Tree – 93.55%

- **Highest F1-Score:**
Decision Tree – 0.92
(Best balance between precision and recall)

- **Perfect Precision (1.0):**
Every case the Decision Tree predicted as positive was a true positive (no false positives).

- **Highest Recall:**
SVM – 92.86%
(Important in medical diagnosis to reduce false negatives)