<h1 style="color:red">Heart Disease Prediction System – Machine Learning Cycle</h1>

# Machine Learning Cycle

### The four phases of a Machine Learning Cycle are:

### Training Phase

    Build the Model using Training Data

### Testing Phase

     Evaluate the performance of Model using Testing Data

### Application Phase

     Deploy the Model in the Real-world, to predict Real-time unseen Data

### Feedback Phase

    Take Feedback from the Users and Domain Experts to improve the Model


<h1 style="color:red">Executing Machine Learning Cycle Using a Single File</h1>

### In Sha Allah, we will follow these steps to execute the Machine Learning Cycle using a single file:

#### Step 1: Import Libraries

#### Step 2: Load Sample Data

#### Step 3: Understand and Pre-process Sample Data
    
    Step 3.1: Understand Sample Data
    
    Step 3.2: Pre-process Sample Data

#### Step 4: Feature Extraction 

#### Step 5: Label Encoding (Input and Output are converted into Numeric Representation)

    Step 5.1: Train the Label Encoder

    Step 5.2: Label Encode the Output

    Step 5.3: Label Encode the Input 

#### Step 6: Execute the Training Phase

    Step 6.1: Splitting Sample Data into Training Data and Testing Data 

    Step 6.2: Splitting Input Vectors and Outputs / Labels of Training Data

    Step 6.3: Train the Support Vector Classifier

    Step 6.4: Save the Trained Model

#### Step 7: Execute the Testing Phase 

    Step 7.1: Splitting Input Vectors and Outputs/Labels of Testing Data
    
    Step 7.2: Load the Saved Model
    
    Step 7.3: Evaluate the Performance of Trained Model

        Step 7.3.1: Make Predictions with the Trained Model on Testing Data

    Step 7.4: Calculate the Accuracy Score

#### Step 8: Execute the Application Phase 

    Step 8.1: Take Input from User 

    Step 8.2: Convert User Input into Feature Vector (Exactly Same as Feature Vectors of Sample Data)

    Step 8.3: Label Encoding of Feature Vector (Exactly Same as Label Encoded Feature Vectors of Sample Data)

    Step 8.4: Load the Saved Model

    Step 8.5: Model Prediction

         Step 8.5.1: Apply Model on the Label Encoded Feature Vector of unseen instance and return Prediction to the User

#### Step 9: Execute the Feedback Phase 

#### Step 10: Improve the Model based on Feedback

# Step 1: Import Libraries

In [3]:
pip install numpy pandas scikit-learn prettytable astropy


Note: you may need to restart the kernel to use updated packages.


In [28]:
# Import Libraries

import numpy as np
import pandas as pd
import pickle

from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score

from prettytable import PrettyTable   
from astropy.table import Table, Column

# Step 2: Load Sample Data

In [29]:
# Load Sample Data

''' 
*---------------------- LOAD_SAMPLE_DATA ------------------------*
|     Function: read_csv()                                       |
|             Purpose: Read a dataset in CSV file format         |
|     Arguments:                                                 |
|             path: Path to dataset file                         |
|             dataset: Dataset file name                         |
|     Return:                                                    |
|             dataset: Dataset in DataFrame format               |
*----------------------------------------------------------------*
'''
 
sample_data = pd.read_csv("heart-disease-sample-data.csv")

print("\n\nSample Data:")
print("============\n")
pd.set_option("display.max_rows", None, "display.max_columns", None)
print(sample_data)



Sample Data:

    age  sex  trestbps  chol  target
0    63    1       145   233       1
1    37    1       130   250       1
2    41    0       130   204       1
3    56    1       120   236       1
4    57    0       120   354       1
5    57    1       140   192       1
6    56    0       140   294       1
7    44    1       120   263       1
8    52    1       172   199       1
9    57    1       150   168       1
10   54    1       140   239       1
11   48    0       130   275       1
12   49    1       130   266       1
13   64    1       110   211       1
14   58    0       150   283       1
15   50    0       120   219       1
16   58    0       120   340       1
17   66    0       150   226       1
18   43    1       150   247       1
19   69    0       140   239       1
20   59    1       135   234       1
21   44    1       130   233       1
22   42    1       140   226       1
23   61    1       150   243       1
24   40    1       140   199       1
25   71    0       160

# Step 3: Understand and Pre-process Sample Data

## Step 3.1: Understand Sample Data

In [30]:
# Understand Sample Data

print("\n\nAttributes in Sample Data:")
print("==========================\n")

print(sample_data.columns)

print("\n\nNumber of Instances in Sample Data:",sample_data["age"].count())
print("========================================\n")



Attributes in Sample Data:

Index(['age', 'sex', 'trestbps', 'chol', 'target'], dtype='object')


Number of Instances in Sample Data: 100



## Step 3.2: Pre-process Sample Data
    o	Sample Data is already Preprocessed
    o	No Preprocessing needs to be Performed 
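
The claim that the sample data needs no pre-processing can be checked rather than assumed. The following is a minimal sketch, assuming the same five columns as the sample data; `summarize_quality` is a hypothetical helper name, and the three rows are copied from the first rows printed in Step 2.

```python
import pandas as pd

def summarize_quality(df: pd.DataFrame) -> dict:
    """Collect simple checks that support (or refute) 'already pre-processed'."""
    return {
        # Total number of missing cells across all columns
        "missing_values": int(df.isna().sum().sum()),
        # Rows that are exact duplicates of an earlier row
        "duplicate_rows": int(df.duplicated().sum()),
        # Columns that would still need encoding before training
        "non_numeric_columns": [
            c for c in df.columns if not pd.api.types.is_numeric_dtype(df[c])
        ],
    }

# Stand-in rows (first three rows of the sample data shown in Step 2)
demo = pd.DataFrame({
    "age": [63, 37, 41],
    "sex": [1, 1, 0],
    "trestbps": [145, 130, 130],
    "chol": [233, 250, 204],
    "target": [1, 1, 1],
})

report = summarize_quality(demo)
print(report)
```

Running the same helper on `sample_data` would show whether all three checks come back empty for the full file.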

# Step 4: Feature Extraction
    o	Features are already Extracted
    o	No Feature Extraction needs to be Performed

# Step 5: Label Encoding the Sample Data (Input and Output are converted into Numeric Representation)

## Step 5.1: Train the Label Encoder

## Step 5.2: Label Encode the Output

## Step 5.3: Label Encode the Input

The sample data is already in numeric form, so no label encoding is applied.
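
For reference, this is roughly what Steps 5.1 to 5.3 would look like if a column held text instead of numbers. It is a sketch only: the `raw_sex` values are hypothetical, since the real sample data already stores `sex` as 0/1.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical text values; the real sample data already uses 0 and 1
raw_sex = ["male", "female", "male", "female"]

# Step 5.1: train the label encoder (it learns the sorted set of classes)
sex_encoder = LabelEncoder()
sex_encoder.fit(raw_sex)

# Steps 5.2 / 5.3: replace each text value with its integer code
encoded_sex = sex_encoder.transform(raw_sex)

print(list(sex_encoder.classes_))  # ['female', 'male']
print(list(encoded_sex))           # [1, 0, 1, 0]
```

The same fitted encoder would have to be reused in the Application Phase so that user input receives identical codes.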

# Step 6: Execute the Training Phase 

## Step 6.1: Splitting Sample Data into Training Data and Testing Data

In [60]:
# Splitting Sample Data into Training Data and Testing Data

''' 
*------------------- SPLIT_SAMPLE_DATA ---------------------*
|        Function: train_test_split()                       |
|              Purpose: Split arrays or matrices into       |
|                       random train and test subsets       |
|        Arguments:                                         |
|              arrays: sequence of indexables               |
|              test_size: float or int                      |
|        Return:                                            |
|              splitting: list                              |
*-----------------------------------------------------------*
'''

training_data, testing_data = train_test_split( sample_data , test_size=0.2 , random_state=0 , shuffle = True)

# Save the Training and Testing Data into CSV File 

training_data.to_csv(r'training-data.csv', index = False, header = True)
testing_data.to_csv(r'testing-data.csv', index = False, header = True)

# print Training and Testing Data

print("\n\nTraining Data:")
print("==============\n")
pd.set_option("display.max_rows", None, "display.max_columns", None)
print(training_data)
print("\n\nTesting Data:")
print("==============\n")
pd.set_option("display.max_rows", None, "display.max_columns", None)
print(testing_data)



Training Data:

    age  sex  trestbps  chol  target
43   53    0       130   264       1
62   64    1       140   335       0
3    56    1       120   236       1
71   60    1       130   253       0
45   52    1       120   325       1
48   53    0       128   216       1
6    56    0       140   294       1
99   56    1       125   249       0
82   67    1       125   254       0
76   58    1       128   216       0
60   40    1       110   167       0
80   59    1       170   326       0
90   52    1       128   255       0
68   58    1       112   230       0
51   67    1       120   229       0
27   51    1       110   175       1
18   43    1       150   247       1
56   48    1       110   229       0
63   43    1       120   177       0
74   41    1       110   172       0
1    37    1       130   250       1
61   60    1       117   230       0
42   45    1       104   208       1
41   48    1       130   245       1
4    57    0       120   354       1
15   50    0       1
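
With only 100 instances, a purely random split can leave the two classes unevenly represented in the test set. One option, sketched here on hypothetical data rather than the real file, is the `stratify` argument of `train_test_split`, which keeps the class proportions the same in both subsets.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: 20 rows with a perfectly balanced target column
demo = pd.DataFrame({"age": list(range(40, 60)), "target": [1] * 10 + [0] * 10})

# stratify=... asks for the same class proportions in train and test
train_df, test_df = train_test_split(
    demo, test_size=0.2, random_state=0, shuffle=True, stratify=demo["target"]
)

print(len(train_df), len(test_df))
print(test_df["target"].value_counts().to_dict())
```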

## Step 6.2: Splitting Input Vectors and Outputs / Labels of Training Data

In [61]:
# Splitting Input Vectors and Outputs / Labels of Training Data

'''
*---------------- SPLIT_INPUT_VECTORS_AND_LABELS --------------*
|        Function: iloc()                                      |
|            Purpose: Splitting Input Vector and Labels        |
|        Arguments:                                            |
|            Attribute: Name or Location Attribute to Split    |
|        Return:                                               |
|            Attribute: Split Attributes                       |
*--------------------------------------------------------------*
'''

print("\n\nInputs Vectors (Feature Vectors) of Training Data:")
print("==================================================\n")
input_vector_train = training_data.iloc[: , :-1]
print(input_vector_train)

print("\n\nOutputs/Labels of Training Data:")
print("================================\n")
print("  Disease")
output_label_train = training_data.iloc[: ,-1]
print(output_label_train)



Inputs Vectors (Feature Vectors) of Training Data:

    age  sex  trestbps  chol
43   53    0       130   264
62   64    1       140   335
3    56    1       120   236
71   60    1       130   253
45   52    1       120   325
48   53    0       128   216
6    56    0       140   294
99   56    1       125   249
82   67    1       125   254
76   58    1       128   216
60   40    1       110   167
80   59    1       170   326
90   52    1       128   255
68   58    1       112   230
51   67    1       120   229
27   51    1       110   175
18   43    1       150   247
56   48    1       110   229
63   43    1       120   177
74   41    1       110   172
1    37    1       130   250
61   60    1       117   230
42   45    1       104   208
41   48    1       130   245
4    57    0       120   354
15   50    0       120   219
17   66    0       150   226
40   51    0       140   308
38   65    0       155   269
5    57    1       140   192
91   59    1       110   239
59   60    1      

## Step 6.3: Train the Support Vector Classifier

In [62]:
# Train the Support Vector Classifier

''' 
*--------------- TRAIN_SUPPORT_VECTOR_CLASSIFIER ------------------*
|       Function: svm.SVC()                                        |
|           Purpose: Train the Algorithm on Training Data          |
|       Arguments:                                                 |
|           Training Data: Provide Training Data to the Model      |
|       Return:                                                    |
|           Parameter: Model return the Training Parameters        |
*------------------------------------------------------------------*
'''

print("\n\nTraining the Support Vector Classifier on Training Data")
print("========================================================\n")
print("\nParameters and their values:")
print("============================\n")
svc_model = svm.SVC(gamma='auto',random_state=0)
svc_model.fit(input_vector_train,np.ravel(output_label_train))
print(svc_model)



Training the Support Vector Classifier on Training Data


Parameters and their values:

SVC(gamma='auto', random_state=0)


## Step 6.4: Save the Trained Model

In [63]:
# Save the Trained Model

''' 
*--------------------- SAVE_THE_TRAINED_MODEL ---------------------*
|        Function: dump()                                          |
|             Purpose: Save the Trained Model on your Hard Disk    |
|        Arguments:                                                |
|             Model: Model Objects                                 |
|        Return:                                                   |
|             File: Trained Model will be Saved on Hard Disk       |
*------------------------------------------------------------------* 
'''

# Save the Model in a Pkl File

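# The with-block closes the file handle once the model has been written
with open('svc_trained_model_heart_disease.pkl', 'wb') as model_file:
    pickle.dump(svc_model, model_file)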

# Step 7: Execute the Testing Phase 

## Step 7.1: Splitting Input Vectors and Outputs/Labels of Testing Data

In [64]:
# Splitting Input Vectors and Outputs/Labels of Testing Data

'''
*---------------- SPLIT_INPUT_VECTORS_AND_LABELS --------------*
|        Function: iloc()                                      |
|            Purpose: Splitting Input Vector and Labels        |
|        Arguments:                                            |
|            Attribute: Name or Location Attribute to Split    |
|        Return:                                               |
|            Attribute: Split Attributes                       |
*--------------------------------------------------------------*
'''

print("\n\nInputs Vectors (Feature Vectors) of Testing Data:")
print("=================================================\n")
input_vector_test = testing_data.iloc[: , :-1]
print(input_vector_test)

print("\n\nOutputs/Labels of Testing Data:")
print("==============================\n")
print("  Disease")
output_label_test = testing_data.iloc[: ,-1]
print(output_label_test)



Inputs Vectors (Feature Vectors) of Testing Data:

    age  sex  trestbps  chol
26   59    1       150   212
86   60    1       125   258
2    41    0       130   204
55   56    1       130   256
75   51    0       130   305
93   49    1       120   188
16   58    0       120   340
73   50    1       140   233
54   53    1       140   203
95   57    1       128   229
53   63    1       130   254
92   60    0       150   258
78   60    1       145   282
13   64    1       110   211
7    44    1       120   263
30   41    0       105   198
22   42    1       140   226
24   40    1       140   199
33   54    1       125   273
8    52    1       172   199


Outputs/Labels of Testing Data:

  Disease
26    1
86    0
2     1
55    0
75    0
93    0
16    1
73    0
54    0
95    0
53    0
92    0
78    0
13    1
7     1
30    1
22    1
24    1
33    1
8     1
Name: target, dtype: int64


## Step 7.2: Load the Saved Model

In [65]:
# Load the Saved Model

''' 
*------------------- LOAD_SAVED_MODEL --------------------------*
|         Function: load()                                      |
|               Purpose: Method to Load Previously Saved Model  |
|         Arguments:                                            |
|               Model: Trained Model                            |
|         Return:                                               |
|               File: Saved Model will be Loaded in Memory      |
*---------------------------------------------------------------*
'''

# Load the Saved Model

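# The with-block closes the file handle once the model has been read
with open('svc_trained_model_heart_disease.pkl', 'rb') as model_file:
    model = pickle.load(model_file)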

## Step 7.3: Evaluate the Machine Learning Model
### Step 7.3.1: Make Predictions with the Trained Model on Testing Data

In [66]:
# Evaluate the Machine Learning Model

''' 
*--------------------- EVALUATE_MACHINE_LEARNING_MODEL ----------------------*
|       Function: predict()                                                  |
|             Purpose: Make a Prediction using Algorithm on Test Data        |
|       Arguments:                                                           |
|            Testing Data: Provide Test data to the Trained Model            |
|       Return:                                                              |
|            Predictions: Model return Predictions                           |
*----------------------------------------------------------------------------* 
'''

# Provide Test data to the Trained Model

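model_predictions = model.predict(input_vector_test)

# Work on an independent copy so that adding a column does not
# trigger pandas' SettingWithCopyWarning on the split DataFrame
testing_data = testing_data.copy(deep=True)
testing_data["Predictions"] = model_predictions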

# Save the Predictions into CSV File

testing_data.to_csv(r'model-predictions_heart_disease.csv', index = False, header = True)

model_predictions = testing_data
print("\n\nPredictions Returned by svc_trained_model:")
print("==========================================\n")
print(model_predictions)



Predictions Returned by svc_trained_model:

    age  sex  trestbps  chol  target  Predictions
26   59    1       150   212       1            0
86   60    1       125   258       0            0
2    41    0       130   204       1            0
55   56    1       130   256       0            0
75   51    0       130   305       0            0
93   49    1       120   188       0            0
16   58    0       120   340       1            0
73   50    1       140   233       0            1
54   53    1       140   203       0            0
95   57    1       128   229       0            0
53   63    1       130   254       0            0
92   60    0       150   258       0            0
78   60    1       145   282       0            1
13   64    1       110   211       1            0
7    44    1       120   263       1            0
30   41    0       105   198       1            0
22   42    1       140   226       1            0
24   40    1       140   199       1            0
33  

## Step 7.4: Calculate the Accuracy Score

In [67]:
# Calculate the Accuracy Score

''' 
*------------------------- CALCULATE_ACCURACY_SCORE -------------------*
|          Function: accuracy_score()                                  |
|                Purpose: Evaluate the algorithm on Testing data       |
|          Arguments:                                                  |
|                Prediction: Predicted values                          |
|                Label: Actual values                                  |
|          Return:                                                     |
|                Accuracy: Accuracy Score                              |
*----------------------------------------------------------------------*
'''

# Calculate the Accuracy

model_accuracy_score = accuracy_score(model_predictions["target"],model_predictions["Predictions"])

print("\n\nAccuracy Score:")
print("===============\n")
print(round(model_accuracy_score,2))



Accuracy Score:

0.4
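
Accuracy alone does not show which kind of error the model makes. The sketch below builds a confusion matrix from the 18 prediction rows that are visible in the Step 7.3.1 output above (the last two rows are cut off in this copy, so they are left out). The variable names are illustrative.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# 'target' and 'Predictions' columns of the 18 rows visible in Step 7.3.1
y_true = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

# Rows of the matrix are actual classes, columns are predicted classes
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
accuracy = accuracy_score(y_true, y_pred)
recall_disease = tp / (tp + fn)  # share of actual disease cases that were found

print("TN, FP, FN, TP:", int(tn), int(fp), int(fn), int(tp))  # 8 2 8 0
print("Recall for class 1:", float(recall_disease))           # 0.0
```

On these visible rows, none of the actual heart-disease cases is predicted as class 1, which is a more serious problem for a screening system than the accuracy figure suggests.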


# Step 8: Execute the Application Phase

## Step 8.1: Take Input from User

In [71]:
# Take Input from User

''' 
*---------------- TAKE_USER_INPUT ----------------*
'''

age_input = input("\nPlease enter your age here : ")
gender_input = input("\nPlease enter your Gender here (1 for Male, 0 for Female) : ")
trestbps_input = input("\nPlease enter your resting blood pressure : ")
chol_input = input("\nPlease enter cholesterol number : ")


Please enter your age here : 33

Please enter your Gender here (1 for Male, 0 for Female) : 1

Please enter your resting blood pressure : 120

Please enter cholesterol number : 220
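
`input()` returns text, so a typing mistake would only surface later, inside the model call. A small validation step can catch it earlier. This is a sketch: `parse_int_in_range` is a hypothetical helper, and the ranges are illustrative assumptions, not clinical limits.

```python
def parse_int_in_range(text, low, high, name):
    """Convert raw input text to int and check that it lies in [low, high]."""
    value = int(text.strip())  # raises ValueError for non-numeric text
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value

# Hypothetical plausibility ranges, chosen only for illustration
age = parse_int_in_range("33", 1, 120, "age")
sex = parse_int_in_range("1", 0, 1, "sex")
trestbps = parse_int_in_range("120", 50, 250, "trestbps")
chol = parse_int_in_range("220", 50, 700, "chol")

print(age, sex, trestbps, chol)  # 33 1 120 220
```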


## Step 8.2: Convert User Input into Feature Vector (Exactly Same as Feature Vectors of Sample Data)

In [77]:
# Convert User Input into Feature Vector

user_input = pd.DataFrame({ 'age': [age_input],'sex': [gender_input],'trestbps': [trestbps_input],'chol': [chol_input]})

# input() returns strings, so cast every feature to int to match the training data
user_input = user_input.astype(int)

print("\n\nUser Input Feature Vector:")
print("==========================\n")
print(user_input)



User Input Feature Vector:

   age  sex  trestbps  chol
0   33    1       120   220


## Step 8.3: Label Encoding of Feature Vector (Exactly Same as Label Encoded Feature Vectors of Sample Data)

The user enters numeric values, so no label encoding is applied here either.

## Step 8.4: Load the Saved Model

In [78]:
# Load the Saved Model

''' 
*----------------------- LOAD_SAVED_MODEL --------------------------*
|         Function: load()                                          |
|             Purpose: Method to Load Previously Saved Model        |
|         Arguments:                                                |
|               Model: Trained Model                                |
|         Return:                                                   |
|               File: Saved Model will be Loaded in Memory          |
*-------------------------------------------------------------------*
'''

# Load the Saved Model

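# The with-block closes the file handle once the model has been read
with open('svc_trained_model_heart_disease.pkl', 'rb') as model_file:
    model = pickle.load(model_file)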

## Step 8.5: Model Prediction
### Step 8.5.1: Apply Model on the Label Encoded Feature Vector of unseen instance and return Prediction to the User

In [80]:
# Prediction of Unseen Instance

''' 
*----------------------------  MODEL_PREDICTION --------------------------*
|           Function: predict()                                           |
|                 Purpose: Use Trained Model to Predict the Output        |
|                          of Unseen Instances                            |
|           Arguments:                                                    |
|                 User Data: Label Encoded Feature Vector of              |
|                            Unseen Instances                             |
|           Return:                                                       |
|                 Prediction: Have Heart Disease or No Heart Disease      |
*-------------------------------------------------------------------------*
'''

# Make a Prediction on Unseen Data

# predict() returns an array with one label per row; take the single label
predicted_label = model.predict(user_input)[0]

if predicted_label == 1:
    prediction = "Have Heart Disease"
else:
    prediction = "No Heart Disease"

# Add the Prediction in a Pretty Table

pretty_table = PrettyTable()
pretty_table.add_column("       ** Prediction **       ",[prediction])
print(pretty_table)

+--------------------------------+
|        ** Prediction **        |
+--------------------------------+
|        No Heart Disease        |
+--------------------------------+
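
For repeated use, Steps 8.2 to 8.5 can be wrapped in one function. The sketch below assumes the four feature names used in this notebook; `predict_heart_disease` and the `AlwaysNegative` stand-in model are hypothetical and exist only to keep the example runnable without the pickle file.

```python
import pandas as pd

FEATURES = ["age", "sex", "trestbps", "chol"]
LABELS = {0: "No Heart Disease", 1: "Have Heart Disease"}

def predict_heart_disease(model, age, sex, trestbps, chol):
    """Build a one-row feature vector and map the predicted label to text."""
    row = pd.DataFrame([[age, sex, trestbps, chol]], columns=FEATURES)
    label = int(model.predict(row)[0])
    return LABELS[label]

class AlwaysNegative:
    """Hypothetical stand-in for the loaded SVC; always predicts class 0."""
    def predict(self, X):
        return [0] * len(X)

result = predict_heart_disease(AlwaysNegative(), 33, 1, 120, 220)
print(result)  # No Heart Disease
```

In the notebook, the model loaded in Step 8.4 would be passed in place of the stand-in.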


# Step 9: Execute the Feedback Phase
## A Two-Step Process
### Step 01: After some time, take Feedback from
    o	Domain Experts and Users on the deployed Heart Disease Prediction System
### Step 02: Make a List of Possible Improvements based on Feedback received
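
Feedback is easier to act on when it is stored next to the prediction it refers to. The following is one possible sketch that uses only the standard library; the field names, the `record_feedback` helper, and the example expert label are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime, timezone

FIELDS = ["timestamp", "age", "sex", "trestbps", "chol",
          "predicted", "expert_label", "comment"]

def record_feedback(stream, features, predicted, expert_label, comment=""):
    """Append one feedback row to a text stream; write the header if it is empty."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    if stream.tell() == 0:
        writer.writeheader()
    writer.writerow({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **features,
        "predicted": predicted,
        "expert_label": expert_label,
        "comment": comment,
    })

# In-memory stream for the demo; a file opened for appending would work the same way
log = io.StringIO()
record_feedback(
    log,
    {"age": 33, "sex": 1, "trestbps": 120, "chol": 220},
    predicted=0,
    expert_label=1,  # hypothetical expert judgement, not real data
    comment="hypothetical example",
)

rows = list(csv.DictReader(io.StringIO(log.getvalue())))
print(rows[0]["predicted"], rows[0]["expert_label"])  # 0 1
```

Rows where `predicted` and `expert_label` disagree are natural candidates for the list of possible improvements.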

# Step 10: Improve Model based on Feedback
### There is Always Room for Improvement
### Based on Feedback from Domain Experts and Users
    o	Improve your Model
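
One improvement worth testing, given the low accuracy in Step 7.4, is feature scaling. An RBF-kernel SVC is sensitive to feature magnitudes, and `chol` and `trestbps` are far larger than `sex`. The sketch below compares an unscaled and a scaled classifier with 5-fold cross-validation. The data and the labelling rule are synthetic and hypothetical, so the scores say nothing about the real dataset; only the comparison pattern is meant to carry over.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 100

# Hypothetical synthetic features on roughly the same scales as the sample data
X = np.column_stack([
    rng.integers(30, 75, n),    # age
    rng.integers(0, 2, n),      # sex
    rng.integers(100, 180, n),  # trestbps
    rng.integers(150, 350, n),  # chol
])
# Hypothetical labelling rule, used only so that there is something to learn
y = (X[:, 0] > 52).astype(int)

# Same classifier as in Step 6.3, with and without standardized inputs
baseline = SVC(gamma="auto", random_state=0)
scaled = make_pipeline(StandardScaler(), SVC(gamma="auto", random_state=0))

baseline_scores = cross_val_score(baseline, X, y, cv=5)
scaled_scores = cross_val_score(scaled, X, y, cv=5)

print("Unscaled mean accuracy:", round(float(baseline_scores.mean()), 2))
print("Scaled mean accuracy:  ", round(float(scaled_scores.mean()), 2))
```

Because the scaler sits inside the pipeline, it is fitted on the training folds only, and the same pipeline object could be pickled in Step 6.4 in place of the bare classifier.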