# LECTURE: Machine Learning with SageMaker
---
## Overview
The purpose of this lecture is to run the built-in XGBoost algorithm to classify the breast cancer dataset, evaluate the model, tune its hyperparameters, and deploy it.
  

### Install and import required libraries

In [None]:
# If you have an error with role creation, try to upgrade boto3
#%pip install --upgrade boto3
#! pip install -U numpy
#! pip install -U pandas

! conda upgrade pandas

Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: \ 

Updating pandas is constricted by 

anaconda -> requires pandas==1.0.5=py38h959d312_0

If you are sure you want an update of your package either try `conda update --all` or install a specific version of the package you want using `conda install <pkg>=<version>`

done

## Package Plan ##

  environment location: /opt/anaconda3

  added / updated specs:
    - pandas


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    backports.functools_lru_cache-1.6.4|     pyhd3eb1b0_0           9 KB
    backports.tempfile-1.0     |     pyhd3eb1b0_1          11 KB
    cctools-949.0.1            |      h9abeeb2_23          20 KB
    cctools_osx-64-949.0.1     |      hc7db

In [None]:
import pandas as pd
import numpy as np
import boto3
import urllib.request, json, os, sagemaker
from sagemaker import get_execution_role
from time import gmtime, strftime
from sagemaker.predictor import csv_serializer
from sagemaker.tuner import (
    IntegerParameter,
    CategoricalParameter,
    ContinuousParameter,
    HyperparameterTuner,
)
import seaborn as sns
import matplotlib.pyplot as plt

Get the current AWS region and define the S3 key prefix

In [None]:
my_region = boto3.session.Session().region_name
prefix = 'sagemaker/MLI-DEMO-xgboost-dm'

print("Region: {}".format(my_region))

Region: us-west-2


Create the boto3 S3 resource object

In [None]:
s3 = boto3.resource('s3')

### Create bucket for model artifacts

Create a bucket if you do not already have one.

First, set the bucket name:

In [None]:
bucket_name = 'bah-bucket-sagemaker-course-2023'

AWS enforces the following rules for naming buckets:

### Bucket naming rules
    
- Bucket names must be between 3 (min) and 63 (max) characters long.

- Bucket names can consist only of lowercase letters, numbers, dots (.), and hyphens (-).

- Bucket names must begin and end with a letter or number.

- Bucket names must not contain two adjacent periods.

- Bucket names must not be formatted as an IP address (for example, 192.168.5.4).

- Bucket names must not start with the prefix xn--.

- Bucket names must not end with the suffix -s3alias.

- Bucket names must be unique across all AWS accounts in all the AWS Regions within a partition.

- A bucket name cannot be used by another AWS account in the same partition until the bucket is deleted.

- Buckets used with Amazon S3 Transfer Acceleration can't have dots (.) in their names.
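
The syntactic rules above can be checked locally before calling S3. The following is a minimal sketch using only the Python standard library; `is_valid_bucket_name` is a hypothetical helper, not part of boto3. It cannot verify the uniqueness rules, which only S3 itself can check, and it ignores the Transfer Acceleration rule.

```python
import ipaddress
import re

# Lowercase letters, digits, dots and hyphens; 3 to 63 characters in total;
# must begin and end with a letter or digit.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name):
    """Check a bucket name against the syntactic naming rules (hypothetical helper)."""
    if not _BUCKET_RE.fullmatch(name):
        return False
    if ".." in name:  # no two adjacent periods
        return False
    if name.startswith("xn--") or name.endswith("-s3alias"):
        return False
    try:
        ipaddress.IPv4Address(name)  # names formatted as IP addresses are rejected
        return False
    except ValueError:
        return True


for candidate in ["bah-bucket-sagemaker-course-2023", "192.168.5.4", "My-Bucket"]:
    print(candidate, is_valid_bucket_name(candidate))
```

Running such a check first gives a clearer error message than waiting for `create_bucket` to fail.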

In [None]:
try:
    if s3.Bucket(bucket_name).creation_date:
        print('Bucket already exists!')
    else:
        # us-east-1 is the default region and must not be passed as a LocationConstraint
        if my_region == 'us-east-1':
            s3.create_bucket(Bucket=bucket_name)
        else:
            s3.create_bucket(Bucket=bucket_name, CreateBucketConfiguration={'LocationConstraint': my_region})
        print('S3 bucket created successfully')
except Exception as e:
    print('S3 error: ', e)

Bucket already exists!


NOTE: You can also create the bucket directly in the Amazon S3 console.

### Read data from S3.

You can download your data from any online location. If your dataset is stored locally, upload it to S3 first and then read it here.

NOTE

We will read the data from a previously created bucket that stores the datasets for this course.

**More about dataset:**

Breast cancer is the most common cancer among women in the world. It accounts for 25% of all cancer cases and affected over 2.1 million people in 2015 alone. It starts when cells in the breast begin to grow out of control. These cells usually form tumors that can be seen via X-ray or felt as lumps in the breast area.

The key challenge in its detection is how to classify tumors as malignant (cancerous) or benign (non-cancerous).

In [None]:
# This dataset can be found on Kaggle as well: https://www.kaggle.com/datasets/yasserh/breast-cancer-dataset
dataset_name = 'breast-cancer.csv'
bucket_data_name = 'bah-data'
data_location = 's3://{}/{}'.format(bucket_data_name, dataset_name)

# Reading s3:// paths with pandas requires the s3fs package to be installed
data = pd.read_csv(data_location)

In [None]:
data.head()

Unnamed: 0,id,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,...,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
0,842302,M,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,842517,M,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,84300903,M,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758
3,84348301,M,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,...,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173
4,84358402,M,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,...,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678


## Prepare dataset

#### Check for missing data in all columns

In [None]:
data.isnull().any()

id                         False
diagnosis                  False
radius_mean                False
texture_mean               False
perimeter_mean             False
area_mean                  False
smoothness_mean            False
compactness_mean           False
concavity_mean             False
concave points_mean        False
symmetry_mean              False
fractal_dimension_mean     False
radius_se                  False
texture_se                 False
perimeter_se               False
area_se                    False
smoothness_se              False
compactness_se             False
concavity_se               False
concave points_se          False
symmetry_se                False
fractal_dimension_se       False
radius_worst               False
texture_worst              False
perimeter_worst            False
area_worst                 False
smoothness_worst           False
compactness_worst          False
concavity_worst            False
concave points_worst       False
symmetry_w

No column has missing data. Great.

In [None]:
data.dtypes

id                           int64
diagnosis                   object
radius_mean                float64
texture_mean               float64
perimeter_mean             float64
area_mean                  float64
smoothness_mean            float64
compactness_mean           float64
concavity_mean             float64
concave points_mean        float64
symmetry_mean              float64
fractal_dimension_mean     float64
radius_se                  float64
texture_se                 float64
perimeter_se               float64
area_se                    float64
smoothness_se              float64
compactness_se             float64
concavity_se               float64
concave points_se          float64
symmetry_se                float64
fractal_dimension_se       float64
radius_worst               float64
texture_worst              float64
perimeter_worst            float64
area_worst                 float64
smoothness_worst           float64
compactness_worst          float64
concavity_worst     

In [None]:
data.nunique()

id                         569
diagnosis                    2
radius_mean                456
texture_mean               479
perimeter_mean             522
area_mean                  539
smoothness_mean            474
compactness_mean           537
concavity_mean             537
concave points_mean        542
symmetry_mean              432
fractal_dimension_mean     499
radius_se                  540
texture_se                 519
perimeter_se               533
area_se                    528
smoothness_se              547
compactness_se             541
concavity_se               533
concave points_se          507
symmetry_se                498
fractal_dimension_se       545
radius_worst               457
texture_worst              511
perimeter_worst            514
area_worst                 544
smoothness_worst           411
compactness_worst          529
concavity_worst            539
concave points_worst       492
symmetry_worst             500
fractal_dimension_worst    535
dtype: i

Check the distribution of the label column

In [None]:
data['diagnosis'].value_counts()

B    357
M    212
Name: diagnosis, dtype: int64

Drop the `id` column, which is only a record identifier

In [None]:
data.drop('id', axis=1, inplace=True)

Convert the target column into a numerical representation using `LabelEncoder` from sklearn (`B` becomes 0 and `M` becomes 1)

In [None]:
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
label = le.fit_transform(data['diagnosis'])

data.drop('diagnosis', axis=1, inplace=True)
data["label"] = label
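
To make the encoding explicit, here is a small sketch in plain Python that mimics what `LabelEncoder` does: it assigns integer codes to the unique classes in sorted order. `encode_labels` is a hypothetical helper written for illustration only, not the sklearn implementation.

```python
def encode_labels(values):
    """Map each distinct class to an integer code, in sorted class order."""
    classes = sorted(set(values))
    mapping = {cls: idx for idx, cls in enumerate(classes)}
    return [mapping[v] for v in values], mapping


codes, mapping = encode_labels(["M", "M", "B", "M", "B"])
print(mapping)  # {'B': 0, 'M': 1}
print(codes)    # [1, 1, 0, 1, 0]
```

Because `B` sorts before `M`, benign samples get label 0 and malignant samples get label 1, which is the order assumed later when the classification report names the classes.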

In [None]:
data.head()

Unnamed: 0,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,fractal_dimension_mean,...,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst,label
0,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,0.07871,...,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189,1
1,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,0.05667,...,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902,1
2,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,0.05999,...,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758,1
3,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,0.09744,...,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173,1
4,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,0.05883,...,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678,1


### Upload train/validation data to S3

First, shuffle the dataset and split it into train (70%), validation (20%), and test (10%) sets. The train and validation sets are then uploaded to your S3 bucket.

In [None]:
train_data, validation_data, test_data = np.split(
    data.sample(frac=1, random_state=1729),
    [int(0.7 * len(data)), int(0.9 * len(data))],
)


print(train_data.shape, test_data.shape, validation_data.shape)

(398, 31) (57, 31) (114, 31)
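
The split sizes follow directly from the two cut points passed to `np.split`. The sketch below reproduces the arithmetic with the standard library only; `split_indices` is a hypothetical helper, and its shuffle is not the same as the pandas `sample` call above, so only the sizes match, not the row assignment.

```python
import random


def split_indices(n_rows, cuts=(0.7, 0.9), seed=1729):
    """Shuffle row indices and cut them at the given fractions of n_rows."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    first_cut = int(cuts[0] * n_rows)
    second_cut = int(cuts[1] * n_rows)
    return indices[:first_cut], indices[first_cut:second_cut], indices[second_cut:]


train_idx, valid_idx, test_idx = split_indices(569)
print(len(train_idx), len(valid_idx), len(test_idx))  # 398 114 57
```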


The SageMaker built-in XGBoost algorithm expects CSV input without a header row and with the label in the first column. The code below moves the label column to the front, writes the data to a CSV file without header and index, and uploads the file to the S3 bucket.

Prepare the training data and upload it to the S3 bucket from which the XGBoost algorithm will read it

In [None]:
label_column = train_data['label']
train_data = train_data.drop(['label'], axis=1)
train_data = pd.concat([label_column, train_data], axis=1)

train_data.to_csv('train.csv', index=False, header=False)

boto3.Session().resource('s3').Bucket(bucket_name).Object(os.path.join(prefix, 'train/train.csv')).upload_file('train.csv')
s3_input_train = sagemaker.TrainingInput(s3_data='s3://{}/{}/train'.format(bucket_name, prefix), content_type='csv')
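
As an illustration of the file layout only (label first, no header row), here is a minimal sketch that uses the standard `csv` module instead of pandas. `to_xgboost_csv` and the sample rows are hypothetical and exist only for this example.

```python
import csv
import io


def to_xgboost_csv(rows, label_key="label"):
    """Serialize dict rows as header-less CSV with the label in the first column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        features = [value for key, value in row.items() if key != label_key]
        writer.writerow([row[label_key], *features])
    return buffer.getvalue()


sample_rows = [
    {"radius_mean": 17.99, "texture_mean": 10.38, "label": 1},
    {"radius_mean": 13.64, "texture_mean": 15.6, "label": 0},
]
print(to_xgboost_csv(sample_rows))
```

Each output line starts with the label, followed by the feature values in their original column order.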

In [None]:
train_data.head()

Unnamed: 0,label,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,...,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
240,0,13.64,15.6,87.38,575.3,0.09423,0.0663,0.04705,0.03731,0.1717,...,14.85,19.05,94.11,683.4,0.1278,0.1291,0.1533,0.09222,0.253,0.0651
337,1,18.77,21.43,122.9,1092.0,0.09116,0.1402,0.106,0.0609,0.1953,...,24.54,34.37,161.1,1873.0,0.1498,0.4827,0.4634,0.2048,0.3679,0.0987
65,1,14.78,23.94,97.4,668.3,0.1172,0.1479,0.1267,0.09029,0.1953,...,17.31,33.39,114.6,925.1,0.1648,0.3416,0.3024,0.1614,0.3321,0.08911
152,0,9.731,15.34,63.78,300.2,0.1072,0.1599,0.4108,0.07857,0.2548,...,11.02,19.49,71.04,380.5,0.1292,0.2772,0.8216,0.1571,0.3108,0.1259
304,0,11.46,18.16,73.59,403.1,0.08853,0.07694,0.03344,0.01502,0.1411,...,12.68,21.61,82.69,489.8,0.1144,0.1789,0.1226,0.05509,0.2208,0.07638


Prepare the validation data and upload it to the S3 bucket from which the XGBoost algorithm will read it

In [None]:
label_column = validation_data['label']
validation_data = validation_data.drop(['label'], axis=1)
validation_data = pd.concat([label_column, validation_data], axis=1)

validation_data.to_csv('validation.csv', index=False, header=False)

boto3.Session().resource("s3").Bucket(bucket_name).Object(os.path.join(prefix, "validation/validation.csv")).upload_file("validation.csv")

s3_input_validation = sagemaker.TrainingInput(s3_data='s3://{}/{}/validation'.format(bucket_name, prefix), content_type='csv')

In [None]:
validation_data.head()

Unnamed: 0,label,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,...,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
18,1,19.81,22.15,130.0,1260.0,0.09831,0.1027,0.1479,0.09498,0.1582,...,27.32,30.88,186.8,2398.0,0.1512,0.315,0.5372,0.2388,0.2768,0.07615
409,0,12.27,17.92,78.41,466.1,0.08685,0.06526,0.03211,0.02653,0.1966,...,14.1,28.88,89.0,610.2,0.124,0.1795,0.1377,0.09532,0.3455,0.06896
384,0,13.28,13.72,85.79,541.8,0.08363,0.08575,0.05077,0.02864,0.1617,...,14.24,17.37,96.59,623.7,0.1166,0.2685,0.2866,0.09173,0.2736,0.0732
92,0,13.27,14.76,84.74,551.7,0.07355,0.05055,0.03261,0.02648,0.1386,...,16.36,22.35,104.5,830.6,0.1006,0.1238,0.135,0.1001,0.2027,0.06206
76,0,13.53,10.94,87.91,559.2,0.1291,0.1047,0.06877,0.06556,0.2403,...,14.08,12.49,91.36,605.5,0.1451,0.1379,0.08539,0.07407,0.271,0.07191


## Train the model using SageMaker built-in algorithm

Set up the Amazon SageMaker session, create an instance of the XGBoost model (an estimator), and define the model’s hyperparameters.

Create SageMaker session

In [None]:
sess = sagemaker.Session()

Define IAM role

In [None]:
role = get_execution_role()

Specify XGBoost ECR container

In [None]:
xgboost_container = sagemaker.image_uris.retrieve("xgboost", my_region, "latest")

Create XGBoost Estimator



In [None]:
xgb = sagemaker.estimator.Estimator(xgboost_container,
                                    role,
                                    instance_count=1,  # renamed from train_instance_count in sagemaker>=2
                                    instance_type='ml.m4.xlarge',  # renamed from train_instance_type in sagemaker>=2
                                    output_path='s3://{}/{}/output'.format(bucket_name, prefix),
                                    sagemaker_session=sess)


Set initial hyperparameters

In [None]:
xgb.set_hyperparameters(max_depth=5,
                        eta=0.2,
                        gamma=4,
                        min_child_weight=6,
                        subsample=0.8,
                        silent=0,
                        objective='binary:logistic',
                        num_round=100)

Fit the model

This code trains the model using gradient boosting on an ml.m4.xlarge instance. Only the `train` channel is passed here; the validation set is used later for hyperparameter tuning. After a few minutes, you should see the training logs being generated in your Jupyter notebook.

In [None]:
xgb.fit({'train': s3_input_train})

INFO:sagemaker:Creating training-job with name: xgboost-2023-03-16-14-35-29-280


2023-03-16 14:35:29 Starting - Starting the training job......
2023-03-16 14:36:07 Starting - Preparing the instances for training......
2023-03-16 14:37:15 Downloading - Downloading input data...
2023-03-16 14:37:40 Training - Downloading the training image...
2023-03-16 14:38:15 Training - Training image download completed. Training in progress..
Arguments: train
[2023-03-16:14:38:29:INFO] Running standalone xgboost training.
[2023-03-16:14:38:29:INFO] Path /opt/ml/input/data/validation does not exist!
[2023-03-16:14:38:29:INFO] File size need to be processed in the node: 0.08mb. Available memory size in the node: 8603.4mb
[2023-03-16:14:38:29:INFO] Determined delimiter of CSV input is ','
[14:38:29] S3DistributionType set as FullyReplicated
[14:38:29] 398x30 matrix with 11940 entries loaded from /opt/ml/input/data/train?format=csv&label_column=0&delimiter=,
[14:38:29] src/tree/updater_prune.cc:74: tree pruning end, 1

### Hyperparameter tuning

- We will do some hyperparameter tuning using:
    - `Bayesian` optimization 
    - `Linear` scaling
    - The maximum number of training jobs is set to 5 for the purpose of this lecture.
    - `Area under the ROC Curve` (`auc`) is used as the objective metric for validating the models.

We will tune the `alpha` and `lambda` hyperparameters.

`alpha` hyperparameter:
- It is the L1 regularization term on weights (analogous to Lasso regression).
- It can be used in case of very high dimensionality so that the algorithm runs faster.
- Increasing this value makes the model more conservative.

`lambda` hyperparameter:
- It is the L2 regularization term on weights (analogous to Ridge regression).
- It penalizes large leaf weights and helps to reduce overfitting.
- Increasing this value makes the model more conservative.
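
To see why larger values make the model more conservative, consider the optimal weight of a single tree leaf in the regularized XGBoost objective, w = -sign(G) * max(|G| - alpha, 0) / (H + lambda), where G and H are the sums of gradients and Hessians of the samples in the leaf. The sketch below is a simplified illustration under that formula; `leaf_weight` is a hypothetical helper, and it ignores the learning rate and other settings such as `max_delta_step`.

```python
def leaf_weight(grad_sum, hess_sum, alpha=0.0, lam=1.0):
    """Optimal leaf weight with L1 (alpha) and L2 (lam) regularization."""
    shrunk = max(abs(grad_sum) - alpha, 0.0)  # L1 soft-thresholds the gradient sum
    sign = -1.0 if grad_sum > 0 else 1.0      # the weight moves against the gradient
    return sign * shrunk / (hess_sum + lam)   # L2 enlarges the denominator


print(leaf_weight(-4.0, 3.0, alpha=0.0, lam=1.0))  # 1.0
print(leaf_weight(-4.0, 3.0, alpha=1.0, lam=1.0))  # 0.75
print(leaf_weight(-4.0, 3.0, alpha=0.0, lam=5.0))  # 0.5
print(leaf_weight(-0.5, 3.0, alpha=1.0, lam=1.0))  # 0.0
```

Raising `alpha` can push small leaf weights all the way to zero, while raising `lambda` shrinks every weight smoothly toward zero.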

In [None]:
objective_metric_name = "validation:auc"
MAX_JOBS = 5
MAX_PARALLEL_JOBS = 4
STRATEGY = 'Bayesian'
SCALING_TYPE = 'Linear'
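
The `validation:auc` objective is the area under the ROC curve on the validation set. One way to read it is as the probability that a randomly chosen positive sample gets a higher score than a randomly chosen negative one. The sketch below computes it by comparing all pairs; `roc_auc` is a hypothetical helper for illustration, not the implementation used inside the XGBoost container.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # A correctly ordered pair counts 1, a tie counts 0.5, a wrong order counts 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Since AUC depends only on the ranking of the scores, it does not require choosing a classification threshold.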

In [None]:
tuninig_job_name = "xgb-linsearch-" + strftime("%Y%m%d-%H-%M-%S", gmtime())

hyperparameter_ranges_linear = {
    "alpha": ContinuousParameter(0.05, 1, scaling_type=SCALING_TYPE),
    "lambda": ContinuousParameter(0.05, 1, scaling_type=SCALING_TYPE),
}

tuner_linear = HyperparameterTuner(
    xgb,
    objective_metric_name,
    hyperparameter_ranges_linear,
    max_jobs=MAX_JOBS,
    max_parallel_jobs=MAX_PARALLEL_JOBS,
    strategy=STRATEGY,
)

tuner_linear.fit(
    {"train": s3_input_train, "validation": s3_input_validation},
    include_cls_metadata=False,
    job_name=tuninig_job_name)

INFO:sagemaker:Creating hyperparameter tuning job with name: xgb-linsearch-20230316-14-39-17


..................................................!


Check the status of the hyperparameter tuning job

In [None]:
boto3.client("sagemaker").describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuner_linear.latest_tuning_job.job_name
)["HyperParameterTuningJobStatus"]

'Completed'

#### Fetch all results as DataFrame

We can list hyperparameters and objective metrics of all training jobs and pick up the training job with the best objective metric.

In [None]:
tuner = sagemaker.HyperparameterTuningJobAnalytics(tuninig_job_name)

full_df = tuner.dataframe()

if len(full_df) > 0:
    df = full_df[full_df["FinalObjectiveValue"] > -float("inf")]
    if len(df) > 0:
        df = df.sort_values("FinalObjectiveValue", ascending=False)
        print("Number of training jobs with valid objective: %d" % len(df))
        print({"lowest": min(df["FinalObjectiveValue"]), "highest": max(df["FinalObjectiveValue"])})
        pd.set_option("display.max_colwidth", None)  # Don't truncate TrainingJobName
    else:
        print("No training jobs have reported valid results yet.")

df

Number of training jobs with valid objective: 5
{'lowest': 0.9771130084991455, 'highest': 0.977446973323822}


Unnamed: 0,alpha,lambda,TrainingJobName,TrainingJobStatus,FinalObjectiveValue,TrainingStartTime,TrainingEndTime,TrainingElapsedTimeSeconds
1,0.133595,0.135777,xgb-linsearch-20230316-14-39-17-004-e8080599,Completed,0.977447,2023-03-16 14:40:42+00:00,2023-03-16 14:42:14+00:00,92.0
2,0.126246,0.148364,xgb-linsearch-20230316-14-39-17-003-30c3b2fa,Completed,0.977447,2023-03-16 14:41:32+00:00,2023-03-16 14:43:20+00:00,108.0
3,0.12044,0.141841,xgb-linsearch-20230316-14-39-17-002-99da3aa1,Completed,0.977447,2023-03-16 14:40:56+00:00,2023-03-16 14:42:44+00:00,108.0
0,0.074075,0.171807,xgb-linsearch-20230316-14-39-17-005-67721c88,Completed,0.977113,2023-03-16 14:41:02+00:00,2023-03-16 14:42:35+00:00,93.0
4,0.101557,0.12456,xgb-linsearch-20230316-14-39-17-001-1e824bc4,Completed,0.977113,2023-03-16 14:41:24+00:00,2023-03-16 14:43:06+00:00,102.0
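
Picking the best training job amounts to taking the row with the highest objective value. Here is a minimal sketch in plain Python, using three rows copied from the table above as a list of dicts; the variable names are chosen for illustration. The displayed values are rounded, so jobs that appear tied may differ slightly in the unrounded metric.

```python
results = [
    {"TrainingJobName": "xgb-linsearch-20230316-14-39-17-004-e8080599",
     "alpha": 0.133595, "lambda": 0.135777, "FinalObjectiveValue": 0.977447},
    {"TrainingJobName": "xgb-linsearch-20230316-14-39-17-003-30c3b2fa",
     "alpha": 0.126246, "lambda": 0.148364, "FinalObjectiveValue": 0.977447},
    {"TrainingJobName": "xgb-linsearch-20230316-14-39-17-005-67721c88",
     "alpha": 0.074075, "lambda": 0.171807, "FinalObjectiveValue": 0.977113},
]

# AUC is maximized, so the best job is the one with the largest objective value.
# On ties, max() returns the first such row in the list.
best = max(results, key=lambda row: row["FinalObjectiveValue"])
print(best["TrainingJobName"], best["alpha"], best["lambda"])
```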


### Deploy the model

Creating the endpoint takes a few minutes.

In [None]:
xgb_predictor = xgb.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

INFO:sagemaker:Creating model with name: xgboost-2023-03-16-14-52-39-726
INFO:sagemaker:Creating endpoint-config with name xgboost-2023-03-16-14-52-39-726
INFO:sagemaker:Creating endpoint with name xgboost-2023-03-16-14-52-39-726


-------!

NOTE

Only certain instance types can be used for deployment.

### Testing and evaluating the model

Load the test data into an array

In [None]:
test_data_array = test_data.drop(['label'], axis=1).values

Set the serializer type

In [None]:
# csv_serializer is deprecated in sagemaker>=2; use the CSVSerializer class instead
from sagemaker.serializers import CSVSerializer

xgb_predictor.serializer = CSVSerializer()

Predict on testing data

In [None]:
predictions = xgb_predictor.predict(test_data_array).decode('utf-8')
predictions_array = np.fromstring(predictions[1:], sep=',')
print(predictions_array.shape)



(57,)


### Evaluate the model performance

As a reminder:

The key challenge is how to classify tumors as malignant (cancerous) or benign (non-cancerous).

In [None]:
from sklearn.metrics import classification_report
print(classification_report(test_data['label'], np.round(predictions_array), target_names=['benign', 'malignant']))

              precision    recall  f1-score   support

      benign       0.97      0.97      0.97        40
   malignant       0.94      0.94      0.94        17

    accuracy                           0.96        57
   macro avg       0.96      0.96      0.96        57
weighted avg       0.96      0.96      0.96        57
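
The precision and recall values in the report can be derived from simple counts. The sketch below parses a comma-separated score string, thresholds the scores at 0.5, and computes both metrics for the positive class. `parse_scores` and `precision_recall` are hypothetical helpers, and the payload and labels are made-up sample values, not output from the endpoint.

```python
def parse_scores(payload):
    """Turn a comma-separated string of scores into a list of floats."""
    return [float(token) for token in payload.strip().split(",") if token.strip()]


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


scores = parse_scores("0.9,0.2,0.6,0.4\n")
predicted = [int(score > 0.5) for score in scores]  # 1 = malignant, 0 = benign
actual = [1, 0, 0, 1]
print(precision_recall(actual, predicted))  # (0.5, 0.5)
```

For a cancer screening task, recall on the malignant class is usually the number to watch, because a missed malignant tumor is the costlier error.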



### Terminate your resources

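Do not forget to delete the endpoint and empty the bucket when you are done, so that you are not charged for unused resources!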

In [None]:
sess.delete_endpoint(endpoint_name=xgb_predictor.endpoint_name)
bucket_to_delete = boto3.resource('s3').Bucket(bucket_name)
bucket_to_delete.objects.all().delete()

INFO:sagemaker:Deleting endpoint with name: xgboost-2023-03-16-14-52-39-726


[{'ResponseMetadata': {'RequestId': '9X2SY0C7WBT3AA73',
   'HostId': 'S0QOK0c5tF70NnbjqWX7Jyau1Zp7oB09pWeSPfpGkI5+2clXy57CTk1yLZIx/yJCXUoJnqp7eiM=',
   'HTTPStatusCode': 200,
   'HTTPHeaders': {'x-amz-id-2': 'S0QOK0c5tF70NnbjqWX7Jyau1Zp7oB09pWeSPfpGkI5+2clXy57CTk1yLZIx/yJCXUoJnqp7eiM=',
    'x-amz-request-id': '9X2SY0C7WBT3AA73',
    'date': 'Thu, 16 Mar 2023 14:56:14 GMT',
    'content-type': 'application/xml',
    'transfer-encoding': 'chunked',
    'server': 'AmazonS3',
    'connection': 'close'},
   'RetryAttempts': 0},
  'Deleted': [{'Key': 'sagemaker/MLI-DEMO-xgboost-dm/output/xgboost-2023-03-16-14-35-29-280/profiler-output/framework/training_job_end.ts'},
   {'Key': 'sagemaker/MLI-DEMO-xgboost-dm/output/xgb-linsearch-20230316-14-39-17-002-99da3aa1/output/model.tar.gz'},
   {'Key': 'sagemaker/MLI-DEMO-xgboost-dm/output/xgb-linsearch-20230316-14-39-17-001-1e824bc4/output/model.tar.gz'},
   {'Key': 'sagemaker/MLI-DEMO-xgboost-dm/output/xgboost-2023-03-16-14-35-29-280/profiler-outpu