### Importing Necessary Modules
This stage of the model-building process uses four modules: "joblib", "sklearn", "pandas" and "time". The "joblib" module loads the saved machine learning models, "pandas" reads in the data, and "sklearn" provides the accuracy, precision and recall metrics used to compare the selected models and pick the best one. The "time" module measures how long each model takes to make predictions (its latency).

In [1]:
import joblib
import pandas as pd
from sklearn.metrics import accuracy_score, precision_score, recall_score
from time import time

### Read in the Data
The validation set's features and labels are used to compare the models, and the test set's features and labels are used to check the chosen model's performance on unseen data. Both are read in with the pandas "read_csv" function.

In [2]:
validation_features = pd.read_csv(r'...\...\...\titanic_EDA\Split_data\validation_features.csv')

validation_labels = pd.read_csv(r'...\...\...\titanic_EDA\Split_data\validation_labels.csv')

test_features = pd.read_csv(r'...\...\...\titanic_EDA\Split_data\test_features.csv')

test_labels = pd.read_csv(r'...\...\...\titanic_EDA\Split_data\test_labels.csv')

### Loading the Models
The Logistic Regression and Random Forest models that were created and saved previously are loaded into the notebook with "joblib.load()", which restores them with their stored hyperparameter settings. They are placed in a dictionary as key-value pairs, keyed by model name.

In [3]:
models = {}

for model in ['LR', 'RF']:
    models[model] = joblib.load(r'...\...\...\titanic_EDA\Models\{}_model.pkl'.format(model))

In [4]:
models

{'LR': LogisticRegression(C=10),
 'RF': RandomForestClassifier(max_depth=4, n_estimators=50)}
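The loading loop above builds each file name from the model's short name. A minimal sketch of the same idea with "pathlib", which keeps the path handling independent of the operating system, might look like the following. The helper name `model_paths` and the directory `"Models"` are hypothetical placeholders, not the notebook's real location.

```python
from pathlib import Path


def model_paths(model_dir, names=("LR", "RF")):
    """Map each model name to its expected pickle file inside model_dir."""
    model_dir = Path(model_dir)
    return {name: model_dir / f"{name}_model.pkl" for name in names}


# "Models" is a placeholder directory used only for illustration.
paths = model_paths("Models")

# Names whose files are absent, so a missing file can be reported
# before any attempt is made to load it.
missing = [name for name, path in paths.items() if not path.exists()]
```

Each path in `paths` could then be passed to "joblib.load()" in place of the formatted string, after checking that `missing` is empty.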

### Evaluation Function
The function below assesses a model by calculating its accuracy, precision and recall, and by timing how long it takes to make predictions. It prints the results on one line in a readable format.

In [5]:
def evaluate_model(name, model, features, labels):
    # Time only the prediction call so the latency excludes the metric calculations
    start = time()
    prediction = model.predict(features)
    end = time()
    latency_ms = round((end - start) * 1000, 1)
    # Compare the predictions with the true labels, rounded to 3 decimal places
    accuracy = round(accuracy_score(labels, prediction), 3)
    precision = round(precision_score(labels, prediction), 3)
    recall = round(recall_score(labels, prediction), 3)
    print('{} -- Accuracy: {} / Precision: {} / Recall: {} / Latency: {}ms'.format(
        name, accuracy, precision, recall, latency_ms))
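To make the three metrics concrete, here is a minimal pure-Python sketch of what they compute for binary labels (1 = positive, 0 = negative). The helper names `confusion_counts` and `basic_metrics` and the toy data are hypothetical and only for illustration; the notebook itself relies on the "sklearn" functions.

```python
def confusion_counts(labels, predictions):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(labels, predictions):
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def basic_metrics(labels, predictions):
    """Return (accuracy, precision, recall), using 0.0 when a ratio is undefined."""
    tp, fp, fn, tn = confusion_counts(labels, predictions)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything predicted positive, how much was really positive
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything really positive, how much was found
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy data, not taken from the Titanic set
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_predictions = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = basic_metrics(toy_labels, toy_predictions)
```

In this toy case there are 3 true positives, 3 true negatives, 1 false positive and 1 false negative, so all three metrics come out at 0.75.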

### Evaluating Validation Set
The "evaluate_model" function created above is now used to judge the performance of both models on the validation set's features and labels. The output shows that the Random Forest model is slightly better in accuracy (0.792 vs 0.787) and precision (0.667 vs 0.654), while recall is identical (0.63). However, it takes longer than the Logistic Regression model to make predictions (11.0ms vs 3.0ms). Since there is no real-time constraint in this case, the Random Forest model is treated as the better option and is the one evaluated on the test set.

In [6]:
for name, model in models.items():
    evaluate_model(name, model, validation_features, validation_labels)

LR -- Accuracy: 0.787 / Precision: 0.654 / Recall: 0.63 / Latency: 3.0ms
RF -- Accuracy: 0.792 / Precision: 0.667 / Recall: 0.63 / Latency: 11.0ms
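The latencies printed above each come from a single timed call, so they can shift from run to run. One way to get a steadier figure, sketched here under the assumption that repeating the prediction is acceptable, is to time several calls and take the median. The helper `median_latency_ms` is hypothetical and not part of the original notebook.

```python
from statistics import median
from time import perf_counter


def median_latency_ms(predict, features, repeats=20):
    """Call predict(features) several times and return the median time in ms."""
    timings = []
    for _ in range(repeats):
        start = perf_counter()
        predict(features)
        timings.append((perf_counter() - start) * 1000)
    return median(timings)


# Stand-in workload so the sketch runs without a trained model
demo_ms = median_latency_ms(sorted, list(range(1000)))
```

With the notebook's objects, the call would take a form such as `median_latency_ms(models['RF'].predict, validation_features)`.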


### Evaluating Test Set
Now that the model with the best performance on the validation set has been chosen, it is time to see how well it does on the test set's features and labels. The model makes predictions on "test_features", and those predictions are compared with the actual outcomes in "test_labels" to show how well the model performs on unseen data.

In [7]:
evaluate_model('Random Forest', models['RF'], test_features, test_labels)

Random Forest -- Accuracy: 0.799 / Precision: 0.87 / Recall: 0.618 / Latency: 11.0ms
