# XGBoost

### Plain English summary
Machine learning algorithms (such as XGBoost) were devised to deal with enormous and complex datasets, with the approach that the more data that you can throw at them, the better, and let the algorithms work it out themselves.

However, this approach can make it difficult to tell a coherent story about how the models work, the relationships that they have found, and how they have made their predictions.

Our machine learning work has taken on an additional focus: to make our work as explainable as possible, both in terms of being able to explain how the models have arrived at their outcome, and in the ease with which we can disseminate our work to a wider audience. To have explainable models we want a balance between model complexity and model accuracy, so that we can explain our models whilst maintaining model performance.

In this notebook we create a model to predict if a patient should receive thrombolysis using just a single input feature, chosen as the feature that gave the model its best performance. The single feature that gave the best model performance was "Arrival-to-scan time". Fixing this feature in the model, we repeated the process to choose the next single feature to add to the model. The best single feature to include next was "Stroke type". We repeated this process, choosing the next feature to add to the model, until 25 features were included (the limit of 25 features was set to keep computation time manageable).

We found that a model with eight features is able to provide 99% of the accuracy obtained when all 84 features are used, and that these eight features are also independent of each other (refer to section Check correlation between selected features to confirm this).

When disseminating the initial 8 feature model outputs to clinicians we observed that, when they were discussing whether a particular patient was suitable to receive thrombolysis, they would often discuss the patient's age. Patient age was the 10th feature to be selected by this process. We decided to extend the list of selected features to include the 9th and 10th selected features: onset during sleep and patient age. This model provided >99% of the accuracy obtained when all 84 features are used. These ten features are also largely independent of each other (refer to section Check correlation between selected features to confirm this).

This does not mean that these are the 10 most important features: another feature that is highly correlated with a selected feature may be just as important, but it no longer needs to be included in the model once the selected feature is present.

We will train future models using these ten features.

NOTE: This experiment was performed using data where time from onset to arrival, and time from arrival to scan, were rounded to the nearest 5 minutes. When more precise data is used, feature order varies slightly after feature 8.

### Model and data
XGBoost models were trained on stratified k-fold cross-validation data. The full dataset contains 84 features that describe the patient (in terms of their clinical characteristics, the stroke pathway, and the stroke team that they attended). Features to be included in the model were sequentially selected as the single best feature to add to the model in terms of performance from the area under the receiver operating characteristic (ROC AUC) curve. When included, the hospital feature is included as a one-hot encoded feature.

### Aims
* Select up to 25 features (from the full set of 84 features) using forward feature selection. Features are selected sequentially (using the greedy approach), choosing the feature that leads to the most improvement in ROC AUC score.
* Decide on the number of features to include in future models.
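
The greedy forward selection described in these aims can be sketched as below. This is only an illustration under stated assumptions: it uses synthetic data from scikit-learn's `make_classification` rather than the stroke dataset, a `LogisticRegression` stand-in rather than XGBoost (to keep it fast), and a hypothetical helper named `forward_select`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score


def forward_select(X, y, max_features, n_splits=5, seed=42):
    """Greedy forward selection (hypothetical helper): repeatedly add the
    feature that gives the highest mean cross-validated ROC AUC."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    selected, history = [], []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        scores = {}
        for candidate in remaining:
            # Score the already-selected features plus this candidate
            columns = selected + [candidate]
            model = LogisticRegression(max_iter=1000)
            scores[candidate] = cross_val_score(
                model, X[:, columns], y, cv=cv, scoring='roc_auc').mean()
        # Keep the candidate with the best score and fix it in the model
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
        history.append(scores[best])
    return selected, history


# Synthetic stand-in data (assumption: not the stroke dataset)
X, y = make_classification(n_samples=400, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)
selected, history = forward_select(X, y, max_features=3)
for rank, (feature, auc) in enumerate(zip(selected, history), start=1):
    print(f'{rank}: feature {feature}, ROC AUC {auc:0.3f}')
```

Because each step only looks one feature ahead, the greedy approach is not guaranteed to find the best possible subset of a given size; it trades that guarantee for a much smaller number of model fits.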

### Observations
Ten features are able to provide a ROC AUC of 0.919 out of a maximum of 0.922. These features are also largely independent of each other.

Our best models with 1, 2, 10 & 84 features had a ROC AUC of 0.715, 0.792, 0.919 & 0.922, respectively.
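
As a quick check of the arithmetic behind these observations, the snippet below expresses each ROC AUC as a fraction of the value obtained with all 84 features. The numbers are those quoted above; the variable names are only illustrative.

```python
# ROC AUC values quoted in the observations above
roc_auc_by_n_features = {1: 0.715, 2: 0.792, 10: 0.919, 84: 0.922}
full_model_auc = roc_auc_by_n_features[84]

# Fraction of the full-model ROC AUC retained by each smaller model
retained = {n: auc / full_model_auc
            for n, auc in roc_auc_by_n_features.items()}

for n, fraction in retained.items():
    print(f'{n:>2} features: {fraction:0.1%} of full-model ROC AUC')
```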

## Import libraries

In [None]:
# Turn warnings off to keep notebook tidy
import warnings
warnings.filterwarnings("ignore")

import os
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import scipy

from xgboost import XGBClassifier
from sklearn.metrics import auc
from sklearn.metrics import roc_curve

import json

from dataclasses import dataclass

import seaborn as sns

from sklearn.metrics import roc_auc_score

from sklearn.metrics import confusion_matrix

from matplotlib.lines import Line2D

import pickle
import shap

from os.path import exists

import math

import importlib
# Import local package
from utils import waterfall
# Force package to be reloaded
importlib.reload(waterfall);

import time

Report the time duration to run notebook

In [None]:
start_time = time.time()

## Set up paths and filenames

In [None]:
@dataclass(frozen=True)
class Paths:
    '''Singleton object for storing paths to data and database.'''

  #  data_path: str = '../'
  #  data_filename: str = 'SAMueL ssnap extract v2.csv'
  #  data_save_path: str = './'
  #  data_save_filename: str = 'reformatted_data.csv'
  #  database_filename: str = 'samuel.db'
  #  notebook: str = '01'
  #  kfold_folder: str = 'data/kfold_5fold/'

    data_read_path: str = '../data/'
    data_read_filename: str = '02_reformatted_data_ml_230612.csv'
 #   data_save_path: str = './kfold_5fold'
#    data_save_filename: str = 'train.csv'
    notebook: str = '230620_'
    model_text: str = 'xgb_all_data_5_features'

paths = Paths()

## Import data

Data has previously been split into 5 stratified k-fold splits.

In [None]:
filename = paths.data_read_path + paths.data_read_filename
data = pd.read_csv(filename)

In [None]:
class_names = data['discharge_disability'].unique()
class_names = np.sort(class_names)
n_classes = len(class_names)

# Check whether these columns are present in the dataset
for column in ['weekday', 'discharge_destination']:
    if column in data.columns:
        print(column)

Get list of features

In [None]:
features = list(data)
print(f"There are {len(features)} features")

We want to use onset to thrombolysis time in the model, so define a function to calculate this feature.

In [None]:
def calculate_onset_to_thrombolysis(row):
    # Set default value of onset to thrombolysis of -100 (no thrombolysis given)
    onset_to_thrombolysis = -100
    # Set value if thrombolysis given
    if  row['scan_to_thrombolysis_time'] != -100:
        onset_to_thrombolysis = (row['onset_to_arrival_time'] + 
        row['arrival_to_scan_time'] + row['scan_to_thrombolysis_time'])
    return onset_to_thrombolysis

In [None]:
# Calculate onset to thrombolysis (but set to -100 if no thrombolysis given)
data['onset_to_thrombolysis_time'] = data.apply(calculate_onset_to_thrombolysis, axis=1)
data.drop(['scan_to_thrombolysis_time', 'arrival_to_scan_time',
        'onset_to_arrival_time'], axis=1, inplace=True)
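
The row-wise `apply` above can be slow on large datasets. A vectorised equivalent might look like the sketch below, which uses made-up values in a toy DataFrame with the same column names and the same -100 sentinel for "no thrombolysis given".

```python
import numpy as np
import pandas as pd

# Toy data (assumption: made-up values, same column names as above)
toy = pd.DataFrame({
    'onset_to_arrival_time': [60, 90, 120],
    'arrival_to_scan_time': [20, 30, 15],
    'scan_to_thrombolysis_time': [25, -100, 40],
})

# Sum the three intervals where thrombolysis was given, else keep -100
given = toy['scan_to_thrombolysis_time'] != -100
total = (toy['onset_to_arrival_time'] + toy['arrival_to_scan_time']
         + toy['scan_to_thrombolysis_time'])
toy['onset_to_thrombolysis_time'] = np.where(given, total, -100)

print(toy['onset_to_thrombolysis_time'].tolist())
```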

Only include the selected 5 features

In [None]:
selected_features = ['prior_disability','stroke_severity','stroke_team',
                     'onset_to_thrombolysis_time','age']
selected_features.append('discharge_disability')
data = data[selected_features]

## One hot the categorical features

Convert some categorical features to one hot encoded features.

Define a function

In [None]:
def convert_feature_to_one_hot(df, feature_name, prefix):
    """
    df [dataframe]: training or test dataset
    feature_name [str]: feature to convert to one hot encoding
    prefix [str]: string to use on new feature
    """

    # One hot encode a feature
    df_feature = pd.get_dummies(
        df[feature_name], prefix = prefix)
    df = pd.concat([df, df_feature], axis=1)
    df.drop(feature_name, axis=1, inplace=True)

    return df

Set up two lists for the one hot encoding:

* A list of the feature names that are categorical and are to be converted using one hot encoding.
* A list of the prefixes to use for these features.

In [None]:
features_to_one_hot = ["stroke_team"]
list_prefix = ["team"]

For each feature in the list, convert it to one hot encoded features.

In [None]:
for feature, prefix in zip(features_to_one_hot, list_prefix):
    data = convert_feature_to_one_hot(data, feature, prefix)
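
To make the effect of the conversion concrete, here is a small sketch on a toy DataFrame (the team names and ages are invented). It repeats the same three steps as `convert_feature_to_one_hot`: encode, append, then drop the original column.

```python
import pandas as pd

# Toy data (assumption: invented team names and ages)
toy = pd.DataFrame({'age': [72, 65, 80],
                    'stroke_team': ['A', 'B', 'A']})

# Same steps as convert_feature_to_one_hot: encode, append, drop original
encoded = pd.get_dummies(toy['stroke_team'], prefix='team')
toy_ohe = pd.concat([toy, encoded], axis=1).drop('stroke_team', axis=1)

print(list(toy_ohe.columns))
```

Each distinct team becomes its own column, so the number of model inputs grows with the number of stroke teams in the data.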

Get X and y

In [None]:
X_data = data.drop('discharge_disability', axis=1)
y_data = data['discharge_disability']

Get list of features in dataset, post one hot encoding.

In [None]:
features_ohe = list(X_data)

## Fit XGBoost model

Train model with all data

In [None]:
filename = f"{paths.notebook}{paths.model_text}.p"

# Check if exists
file_exists = exists(filename)

if file_exists:
    # load model
    with open(filename, 'rb') as filehandler:
        model = pickle.load(filehandler)
else:        

    # Define model
    model = XGBClassifier(verbosity = 0, seed=42, learning_rate=0.5)

    # Fit model
    model.fit(X_data, y_data)

    # Save model
    with open(filename, 'wb') as filehandler:
        pickle.dump(model, filehandler)

# Get predicted probabilities
y_probs = model.predict_proba(X_data)
y_pred = model.predict(X_data)

# Calculate error
y_error = y_data - y_pred

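Show accuracy (predicted class identical to observed class), and the proportion of predictions that are within one class of the observed class.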

In [None]:
accuracy = np.mean(y_error==0)
print (f'Accuracy: {accuracy:0.2f}')

error_within_one = np.mean(np.abs(y_error)<=1)
print (f'Error within 1: {error_within_one:0.2f}')
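
The two measures above can be illustrated with a handful of invented values. The "within one" measure only makes sense because the outcome classes are ordered, so the numerical difference between predicted and observed class is meaningful.

```python
import numpy as np

# Toy observed and predicted classes (assumption: invented values)
y_true = np.array([0, 1, 2, 3, 6])
y_hat = np.array([0, 2, 2, 5, 6])
error = y_true - y_hat

exact = np.mean(error == 0)               # predicted class matches observed
within_one = np.mean(np.abs(error) <= 1)  # off by at most one class

print(f'Accuracy: {exact:0.2f}, error within 1: {within_one:0.2f}')
```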

## Feature importance

In [None]:
# Get and store feature importances
feature_importance = model.feature_importances_

# Store in DataFrame
feature_importance_df = pd.DataFrame(data = feature_importance, index=features_ohe)
feature_importance_df.columns = ['importance']

# Sort by importance (weight)
feature_importance_df.sort_values(by='importance', 
                                  ascending=False, inplace=True)

# Save
#feature_importance_df.to_csv(f'output/{notebook}_{model_type}_feature_importance.csv')

# Display top 25
feature_importance_df.head(25)

Create a bar chart for the XGBoost feature importance values

In [None]:
# Set up figure
fig = plt.figure(figsize=(8,8))
ax = fig.add_subplot(111)

# Get labels and values
labels = feature_importance_df.index.values[0:25]
pos = np.arange(len(labels))
val = feature_importance_df['importance'].values[0:25]

# Plot
ax.bar(pos, val)
ax.set_ylabel('Feature importance')
ax.set_xticks(np.arange(len(labels)))
ax.set_xticklabels(labels)

# Rotate the tick labels and set their alignment.
plt.setp(ax.get_xticklabels(), rotation=90, ha="right",
         rotation_mode="anchor")

plt.tight_layout()
#plt.savefig(f'output/{notebook}_{model_type}_feature_weights_bar.jpg', dpi=300)
plt.show()

Same data in another display

In [None]:
n_show = 20
indices = np.argsort(feature_importance)
indices = indices[-n_show:]
features = X_data.columns
plt.title('Feature Importances')
plt.barh(range(len(indices)), feature_importance[indices], color='g', align='center')
plt.yticks(range(len(indices)), [features[i] for i in indices])
plt.xlabel('Relative Importance')
plt.show()

## SHAP values
SHAP values give the contribution that each feature makes to the model's prediction, per instance. A SHAP value is returned for each feature, for each instance.

We will use the shap library: https://shap.readthedocs.io/en/latest/index.html

'Raw' SHAP values from an XGBoost model are in log odds units. As this is a multiclass model, a SHAP value is returned for each feature, for each instance, for each class.
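
The sketch below shows how raw SHAP values relate to class probabilities. It assumes the usual softmax objective for a multiclass XGBoost classifier, and all numbers are hypothetical: for each class, the base value plus the sum of that class's SHAP values gives the raw model output, and a softmax across classes gives the probabilities.

```python
import numpy as np

# Hypothetical values for one instance: 3 features x 3 classes
base_values = np.array([0.2, -0.1, -0.4])   # expected raw output per class
shap_row = np.array([[0.5, -0.2, -0.3],
                     [0.1, 0.3, -0.4],
                     [-0.2, 0.1, 0.1]])

# Additivity: raw output per class = base value + sum of SHAP values
raw_output = base_values + shap_row.sum(axis=0)

# Softmax turns the per-class raw outputs into probabilities
exp_output = np.exp(raw_output - raw_output.max())
probabilities = exp_output / exp_output.sum()

print(raw_output, probabilities)
```

Because the softmax is non-linear, SHAP values in raw units cannot simply be read as changes in probability; they are additive only on the raw scale.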

## Get SHAP values
TreeExplainer is a fast and exact method to estimate SHAP values for tree models and ensembles of trees. Using this we can calculate the SHAP values.

Either load from pickle (if file exists), or calculate.

In [None]:
filename = (f'{paths.notebook}{paths.model_text}_shap_values_extended.p')
# Check if exists
file_exists = exists(filename)

if file_exists:

    # Load shap values
    with open(filename, 'rb') as filehandler:
        shap_values_extended = pickle.load(filehandler)
        shap_values = shap_values_extended.values

    # Load explainer
    explainer_filename = (f'{paths.notebook}{paths.model_text}_shap_explainer.p')
    with open(explainer_filename, 'rb') as filehandler:
        explainer = pickle.load(filehandler)
else:


    # Set up explainer using the model and feature values from training set
    explainer = shap.TreeExplainer(model, X_data)

    # Get (and store) Shapley values along with base and feature values
    shap_values_extended = explainer(X_data)

    # SHAP values are returned for each class, giving a 3D array
    # (instances x features x classes)
    shap_values = shap_values_extended.values

    explainer_filename = (f'{paths.notebook}{paths.model_text}_shap_explainer.p')

    # Save explainer using pickle
    with open(explainer_filename, 'wb') as filehandler:
        pickle.dump(explainer, filehandler)
        
    # Save shap values extended using pickle
    with open(filename, 'wb') as filehandler:
        pickle.dump(shap_values_extended, filehandler)
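
The load-or-calculate pattern used for both the model and the SHAP values could be wrapped in a small helper. The sketch below is one possible form; `load_or_compute` is a hypothetical name, and the stand-in computation just returns a dictionary.

```python
import os
import pickle
import tempfile


def load_or_compute(filename, compute):
    """Return the pickled object in `filename` if it exists; otherwise call
    `compute()`, pickle the result to `filename`, and return it."""
    if os.path.exists(filename):
        with open(filename, 'rb') as filehandler:
            return pickle.load(filehandler)
    result = compute()
    with open(filename, 'wb') as filehandler:
        pickle.dump(result, filehandler)
    return result


# Usage with a stand-in computation (assumption: placeholder for SHAP values)
with tempfile.TemporaryDirectory() as folder:
    cache_file = os.path.join(folder, 'example_cache.p')
    calls = []

    def expensive():
        calls.append(1)
        return {'answer': 42}

    first = load_or_compute(cache_file, expensive)    # computed and saved
    second = load_or_compute(cache_file, expensive)   # loaded from disk
    print(first == second, len(calls))
```

Note that a cache of this kind is not invalidated when the data or model settings change, so the pickle file needs to be deleted to force a recalculation.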

Convert the 3D numpy array (shap_values, with dimensions instances x features x classes) into the format required by shap.summary_plot: a list of 7 arrays (shap_values_list), one per class.

Only include the features wanted for the plot.

In [None]:
st_display = 0
end_display = 4

shap_values_list = []
for i in range(n_classes):
    shap_values_list.append(shap_values[:,st_display:end_display,i])
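
The reshaping step can be checked on a stand-in array. The sketch below uses random numbers (not model output) with the same axis order, and also computes the mean absolute SHAP value per feature and class, which, as we understand it, is the quantity summarised by the multiclass bar plot.

```python
import numpy as np

# Stand-in SHAP array (assumption: random values, not model output)
rng = np.random.default_rng(0)
n_instances, n_features, n_classes_toy = 5, 4, 7
toy_shap = rng.normal(size=(n_instances, n_features, n_classes_toy))

# Split the class axis into a list of 2D (instances x features) arrays
toy_list = [toy_shap[:, :, c] for c in range(n_classes_toy)]

# Mean absolute SHAP value per feature and class
mean_abs = np.abs(toy_shap).mean(axis=0)   # shape: (features, classes)

print(len(toy_list), toy_list[0].shape, mean_abs.shape)
```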

In [None]:
fig, ax = plt.subplots(1,1)
#fig.legend(loc=4)
ax = shap.summary_plot(shap_values_list, X_data.iloc[:,st_display:end_display].values, 
                       plot_type="bar", 
                       class_names=model.classes_, 
                       feature_names = X_data.iloc[:,st_display:end_display].columns, 
                       class_inds="original",
                       show=False)
#fig.legend(loc=4)
#ax.legend(loc=4)
plt.tight_layout()

In [None]:
st_display = 4
end_display = 50

shap_values_list = []
for i in range(n_classes):
    shap_values_list.append(shap_values[:,st_display:end_display,i])

In [None]:
fig, ax = plt.subplots(1,1)
#fig.legend(loc=4)
ax = shap.summary_plot(shap_values_list, X_data.iloc[:,st_display:end_display].values, 
                       plot_type="bar", 
                       class_names=model.classes_, 
                       feature_names = X_data.iloc[:,st_display:end_display].columns, 
                       class_inds="original",
                       show=False)
#fig.legend(loc=4)
#ax.legend(loc=4)
plt.tight_layout()

We can also look at each class individually. Here each subplot shows the results for one class.

In [None]:
st_display = 0
end_display = 4

shap_values_list = []
for i in range(n_classes):
    shap_values_list.append(shap_values[:,st_display:end_display,i])

In [None]:
fig = plt.figure(figsize=(10,50))
for i in range(n_classes):  
    ax = fig.add_subplot(1,n_classes,i+1)
    ax = shap.summary_plot(shap_values_list[i], 
                           X_data.iloc[:,st_display:end_display].values, 
                           feature_names=X_data.iloc[:,st_display:end_display].columns, 
                           show=False, auto_size_plot=False)
    #ax.title(f"Class {model.classes_[i]}")

plt.tight_layout()

In [None]:
fig, axes = plt.subplots(nrows=3, ncols=3, figsize=(10,10))
axes = axes.ravel()
for i in range(n_classes):
    # summary_plot draws on the current axes, so select the subplot first
    plt.sca(axes[i])
    shap.summary_plot(shap_values_list[i],
                      X_data.iloc[:,st_display:end_display].values,
                      feature_names=X_data.iloc[:,st_display:end_display].columns,
                      show=False, plot_size=None)
    axes[i].set_title(f"Class {model.classes_[i]}")

# Hide any unused subplots
for ax in axes[n_classes:]:
    ax.axis('off')

plt.tight_layout()

In [None]:
fig, axes = plt.subplots(nrows=3, ncols=3,figsize=(15,15))
row=0
col=0
for i in range(n_classes):  #(1):
    shap.summary_plot(shap_values_list[i], X_data.iloc[:,st_display:end_display].values, 
                      feature_names=X_data.iloc[:,st_display:end_display].columns, show=False)#matplotlib=True)
    f = plt.gcf()
    #ax = axes[0,0]#row,col]
    #ax = f
   # col+=1
    #if col==3:
   #     col=0
   #     row+=1
    #ax.title(f"Class {model.classes_[i]}")

plt.tight_layout()

In [None]:
fig, axes = plt.subplots(nrows=3,
                         ncols=3)
axes = axes.ravel()

fig.suptitle('Summary plot for all classes', fontsize=15)

for count in range(n_classes):
    # summary_plot draws on the current axes, so select the subplot first
    plt.sca(axes[count])
    # The SHAP values only cover the displayed features, so slice X_data too
    shap.summary_plot(shap_values_list[count],
                      X_data.iloc[:,st_display:end_display].values,
                      feature_names=X_data.iloc[:,st_display:end_display].columns,
                      show=False, plot_size=None)

# Visual properties of figure
dimension = 5 * 5
fig.set_figheight(dimension)
fig.set_figwidth(dimension)
fig.tight_layout(pad=2)
plt.show()

## SHAP dependency plot

The partial dependence plot (PDP or PD plot for short) shows the marginal effect that one or two features have on the predicted outcome of a machine learning model (J. H. Friedman 2001 [3]). A partial dependence plot can show whether the relationship between the target and a feature is linear, monotonic or more complex. The partial dependence plot is a global method: it considers all instances and describes the global relationship between a feature and the predicted outcome. An assumption of the PDP is that the plotted features are not correlated with the other features. If this assumption is violated, the averages calculated for the partial dependence plot will include data points that are very unlikely or even impossible.

A SHAP dependence plot is a scatter plot that shows the effect a single feature has on the predictions made by the model. In a house-price model, for instance, such a plot might show the predicted property value increasing significantly when the average number of rooms per dwelling is higher than 6. Each dot is a single prediction (row) from the dataset. The x-axis is the actual value of the feature from the dataset. The y-axis is the SHAP value for that feature, which represents how much knowing that feature's value changes the output of the model for that sample's prediction. The color corresponds to a second feature that may have an interaction effect with the feature we are plotting (by default this second feature is chosen automatically). If an interaction effect is present between this other feature and the feature we are plotting, it will show up as a distinct vertical pattern of coloring.
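
To make the partial dependence calculation concrete, here is a minimal sketch. It is separate from the SHAP dependence plots drawn in the cells that follow, and everything in it is illustrative: `partial_dependence` is a hypothetical helper, and the stand-in model is a simple linear function rather than XGBoost.

```python
import numpy as np


def partial_dependence(predict, X, feature_index, grid):
    """Average prediction when one feature is forced to each grid value
    for every instance (hypothetical helper)."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_index] = value
        averages.append(predict(X_mod).mean())
    return np.array(averages)


# Stand-in model (assumption): output = 2 * x0 + x1
def toy_predict(X):
    return 2 * X[:, 0] + X[:, 1]


X_toy = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]])
grid = np.array([0.0, 1.0, 2.0])
pd_values = partial_dependence(toy_predict, X_toy, 0, grid)

print(pd_values)
```

Forcing every instance to take each grid value is what creates unlikely or impossible data points when the features are correlated: in the toy data the two features rise together, yet the calculation still pairs the lowest value of the first feature with the highest value of the second.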

In [None]:
X_data.columns

In [None]:
shap_values_list = []
for i in range(n_classes):
    shap_values_list.append(shap_values[:,:,i])

In [None]:
# If we pass a numpy array instead of a data frame then we
# need to pass the feature names in separately
shap.dependence_plot(ind=features_ohe[0], 
                     interaction_index=features_ohe[1], 
                     shap_values=shap_values_list[0], features=X_data.values, 
                     feature_names=X_data.columns)

In [None]:
max_display = 4

# Create a matrix of subplots per class. Each showing the relationship between
# each combination of features on the SHAP value.
for c in range(n_classes):
    # setup matrix of subplots
    fig, axes = plt.subplots(nrows=max_display,
                             ncols=max_display)
    axes = axes.ravel()

    # Set overall title
    fig.suptitle(f'Class {model.classes_[c]}', fontsize=30)

    # Initialise subplot counter
    count = 0

    # Loop through the features to display
    for i in range(max_display):
        # Loop through the features to display
        for j in range(max_display):
            # Create the plot. Pass the axes
            shap.dependence_plot(ind=features_ohe[i], 
                                interaction_index=features_ohe[j], 
                                shap_values=shap_values_list[c], 
                                features=X_data.values, 
                                feature_names=X_data.columns,
                                show=False, ax=axes[count])
            
            # Add line as shap=0
            axes[count].plot([-1, X_data[features_ohe[i]].max()+1],
                             [0,0],c='0.5')
            
            # Increase subplot counter
            count+=1
    
    # Change font size for each subplot
    for ax in axes:
        ax.set_xlabel(ax.get_xlabel(), fontsize=15)
        ax.set_ylabel(ax.get_ylabel(), fontsize=15)
        ax.tick_params(axis='both',which='major',labelsize=15)

    # Visual properties of figure
    dimension = 5 * 5
    fig.set_figheight(dimension)
    fig.set_figwidth(dimension)
    fig.tight_layout(pad=2)
    plt.show()

## SHAP Force plot

Force plot gives us the explainability of a single model prediction. In this plot we can see how features contributed to the model’s prediction for a specific observation. It is very convenient to use for error analysis or for a deep understanding of a particular case.

In [None]:
row = 90

In [None]:
shap.initjs()

In [None]:
shap.force_plot(explainer.expected_value[0], shap_values_list[0][row], 
                X_data.values[row], feature_names = X_data.columns)

## SHAP waterfall plot

A waterfall plot is another local analysis of a single instance's prediction. We use the same instance as for the force plot above (the value of row set earlier):

In [None]:
X_data.columns.tolist()

In [None]:
plot_class = 0
shap.waterfall_plot(shap.Explanation(values=shap_values_list[plot_class][row], 
                                        base_values=explainer.expected_value[plot_class], data=X_data.iloc[row],  
                                        feature_names=X_data.columns.tolist()))

In [None]:
plot_class = 1
shap.waterfall_plot(shap.Explanation(values=shap_values_list[plot_class][row], 
                                        base_values=explainer.expected_value[plot_class], data=X_data.iloc[row],  
                                        feature_names=X_data.columns.tolist()))
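
The numbers behind a waterfall plot can be sketched as a running total. The example below uses hypothetical SHAP values for one instance and one class: starting from the base value, contributions are added from the smallest to the largest absolute value (which is how we read the plot from bottom to top), ending at the model's raw output for that instance.

```python
import numpy as np

# Hypothetical SHAP values for one instance and one class
feature_names = ['prior_disability', 'stroke_severity', 'age']
values = np.array([0.30, -0.80, 0.10])
base_value = -0.50

# Order features from smallest to largest absolute contribution
order = np.argsort(np.abs(values))
running_total = base_value + np.cumsum(values[order])

for i, total in zip(order, running_total):
    print(f'{feature_names[i]:<18} {values[i]:+.2f} -> {total:+.2f}')
```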

In [None]:
# Create a grid of subplots for one instance, with one waterfall plot per
# class.
fig, axes = plt.subplots(nrows=3,
                         ncols=3)
axes = axes.ravel()

# Set overall title
fig.suptitle(f'Waterfall for instance {row}', fontsize=30)

for c in range(n_classes):
    # waterfall_plot draws on the current axes, so select the subplot first
    plt.sca(axes[c])
    shap.waterfall_plot(shap.Explanation(values=shap_values_list[c][row], 
                                        base_values=explainer.expected_value[c], data=X_data.iloc[row],  
                                        feature_names=X_data.columns.tolist()), show=False)
    
# Change font size for each subplot
for ax in axes:
    ax.set_xlabel(ax.get_xlabel(), fontsize=15)
    ax.set_ylabel(ax.get_ylabel(), fontsize=15)
    ax.tick_params(axis='both',which='major',labelsize=15)

# Visual properties of figure
dimension = 5 * 5
fig.set_figheight(dimension)
fig.set_figwidth(dimension)
fig.tight_layout(pad=2)
plt.show()




In [None]:
# Create a grid of subplots for one instance, with one waterfall plot per
# class.
fig = plt.figure()

# Set overall title
fig.suptitle(f'Waterfall for instance {row} '
             f'(observed discharge mRS = {y_data.iloc[row]})', 
             fontsize=30)

# Initialise subplot counter
count = 1

for c in range(n_classes):
    ax = fig.add_subplot(3,3,count)
    shap.waterfall_plot(shap.Explanation(values=shap_values_list[c][row], 
                                        base_values=explainer.expected_value[c], 
                                        data=X_data.iloc[row],  
                                        feature_names=X_data.columns.tolist()), 
                                        show=False)
    ax.set_title(f"Predicted discharge mRS {c}")
    # Increase subplot counter
    count+=1
    
plt.gcf().set_size_inches(20,15)
plt.tight_layout()
plt.show()


In [None]:
def create_waterfall_multiclass_grid(row, y_data_row, n_classes, shap_values_list_row,
                                     base_values, data, feature_names):
    """
    Plot a grid of waterfall plots for one instance, one subplot per class.

    row [int]: index of the instance (used in the title)
    y_data_row [int]: observed discharge mRS for the instance
    n_classes [int]: number of classes
    shap_values_list_row [list]: SHAP values for the instance, one array per class
    base_values [array]: expected value per class
    data [series]: feature values for the instance
    feature_names [list]: feature names
    """

    fig = plt.figure()

    # Set overall title
    fig.suptitle(f'Waterfall for instance {row} '
                f'(observed discharge mRS = {y_data_row})', 
                fontsize=30)

    # Initialise subplot counter
    count = 1

    for c in range(n_classes):
        ax = fig.add_subplot(4,2,count)
        shap.waterfall_plot(shap.Explanation(values=shap_values_list_row[c], 
                                            base_values=base_values[c], 
                                            data=data,  
                                            feature_names=feature_names), 
                                            show=False)
        ax.set_title(f"Predicted discharge mRS {c}")
        # Increase subplot counter
        count+=1
        
    plt.gcf().set_size_inches(20,15)
    plt.tight_layout()
    plt.show()

In [None]:
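row = 1
# Collect this instance's SHAP values for each class
shap_values_row = [class_values[row] for class_values in shap_values_list]
create_waterfall_multiclass_grid(row, y_data.iloc[row], n_classes, 
                                 shap_values_row,
                                 explainer.expected_value, X_data.iloc[row], 
                                 X_data.columns.tolist())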

In [None]:
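row = 0
# Collect this instance's SHAP values for each class
shap_values_row = [class_values[row] for class_values in shap_values_list]
create_waterfall_multiclass_grid(row, y_data.iloc[row], n_classes, 
                                 shap_values_row,
                                 explainer.expected_value, X_data.iloc[row], 
                                 X_data.columns.tolist())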

In [None]:
# Select one instance (row) and one class (plot_class) from the Explanation
waterfall.waterfall(shap_values_extended[row, :, plot_class], 
                    show=False, y_reverse=True, rank_absolute=False, 
                    raw_ascending=False)

In [None]:
shap_values_extended[row].base_values[plot_class]

In [None]:
shap_values_extended.shape

In [None]:
shap_values_extended[row].values[:,plot_class].shape

In [None]:
shap_values_extended[row]

In [None]:
end_time = time.time()

print(f'Time taken: {end_time - start_time}')