# Build, Train and Deploy an XGBoost Model on SageMaker

In this notebook, we will walk through how to process data, train a model and deploy a trained model endpoint. 

We will also demonstrate how to use SageMaker Processing to build a custom data processing pipeline and SageMaker Experiments to track the lineage of the trained models.

### Install the necessary libraries

In [None]:
!pip install sagemaker-experiments xgboost -Uqq
!conda install -c conda-forge -q -y shap > /dev/null

### Import libraries

In [None]:
# Let's inspect the role we have created for our notebook here:
import boto3
import sagemaker
from sagemaker import get_execution_role

role = get_execution_role()
sess = sagemaker.Session()
region = boto3.session.Session().region_name
print("Region = {}".format(region))
sm = boto3.Session().client('sagemaker')

Import the other libraries used in this notebook, including SageMaker Experiments.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import os
from time import sleep, gmtime, strftime
import json
import time
import s3fs

In [None]:
# Import SageMaker Experiments 

from sagemaker.analytics import ExperimentAnalytics
from smexperiments.experiment import Experiment
from smexperiments.trial import Trial
from smexperiments.trial_component import TrialComponent
from smexperiments.tracker import Tracker

### Specify buckets for storing data

Here you specify the S3 bucket and file paths where a number of artifacts will be stored.

In [None]:
# Use the session's default bucket here.
# To use a custom bucket, replace this with its name and make sure the execution role has access to it.
rawbucket = sess.default_bucket()
prefix = 'sagemaker-modelmonitor-cc-default' # use this prefix to store all files pertaining to this workshop.

dataprefix = prefix + '/data'
traindataprefix = prefix + '/train_data'
testdataprefix = prefix + '/test_data'
testdatanolabelprefix = prefix + '/test_data_no_label'
trainheaderprefix = prefix + '/train_headers'

s3 = s3fs.S3FileSystem(anon=False)
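
To make the layout concrete, the sketch below shows how the prefixes combine with a bucket name into full S3 URIs. The bucket name and the `s3_uri` helper are hypothetical and used only for illustration; at run time `sess.default_bucket()` supplies the real bucket name.

```python
def s3_uri(bucket, *parts):
    """Join a bucket name and key parts into an s3:// URI (hypothetical helper)."""
    key = "/".join(p.strip("/") for p in parts if p)
    return f"s3://{bucket}/{key}"

demo_bucket_name = "my-example-bucket"  # assumption: stand-in for sess.default_bucket()
demo_prefix = "sagemaker-modelmonitor-cc-default"

demo_uris = {
    name: s3_uri(demo_bucket_name, demo_prefix, name)
    for name in ["data", "train_data", "test_data", "test_data_no_label", "train_headers"]
}
for name, uri in demo_uris.items():
    print(f"{name:20s} -> {uri}")
```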

# Pre-processing and Feature Engineering

A key part of the data science lifecycle is data exploration, pre-processing and feature engineering. We will demonstrate how to use SageMaker notebooks for data exploration and SageMaker Processing for feature engineering and data pre-processing.

### Download and Import the data

In [None]:
!wget https://github.com/stefannatu/ModelMonitor/raw/master/credit_card_default_data.xls

In [None]:
data = pd.read_excel('./credit_card_default_data.xls', header=1)
data = data.drop(columns = ['ID'])
data.head()

In [None]:
# Note that the categorical columns SEX, Education and Marriage have been Integer Encoded in this case.
# For example:
data.SEX.value_counts()

In [None]:
data.rename(columns={"default payment next month": "Label"}, inplace=True)
lbl = data.Label
data = pd.concat([lbl, data.drop(columns=['Label'])], axis = 1)
data.head()

### Data Exploration

In [None]:
import seaborn as sns
sns.countplot(x=data.Label)  # pass the column by keyword; recent seaborn releases no longer accept it positionally
plt.title('Counts of Default versus Non Default Labels')
plt.show()

In [None]:
## Corr plot
f = plt.figure(figsize=(19, 15))
plt.matshow(data.corr(), fignum=f.number)
plt.xticks(range(data.shape[1]), data.columns, fontsize=14, rotation=45)
plt.yticks(range(data.shape[1]), data.columns, fontsize=14)
cb = plt.colorbar()
cb.ax.tick_params(labelsize=14)
plt.title('Correlation Matrix', fontsize=16);

In [None]:
from pandas.plotting import scatter_matrix
SCAT_COLUMNS = ['BILL_AMT1', 'BILL_AMT2', 'PAY_AMT1', 'PAY_AMT2']
scatter_matrix(data[SCAT_COLUMNS],figsize=(10, 10), diagonal ='kde')
plt.show()

### Preprocessing and Feature Engineering in Notebook

For small datasets or testing out your ETL script, you can run pre-processing inside a notebook directly. 

In [None]:
# Write the raw data to a local folder if it has not been written already
if not os.path.exists('rawdata/rawdata.csv'):
    os.makedirs('rawdata', exist_ok=True)
    data.to_csv('rawdata/rawdata.csv', index=False)

In [None]:
# Upload the raw dataset
raw_data_location = sess.upload_data('rawdata', bucket=rawbucket, key_prefix=dataprefix)
print(raw_data_location)

In [None]:
# Run the preprocessing job in the notebook and upload the training and validation datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.compose import make_column_transformer

COLS = data.columns
X_train, X_test, y_train, y_test = train_test_split(data.drop('Label', axis=1), data['Label'], 
                                                       test_size=0.2, random_state=0)

newcolorder = ['PAY_AMT1','BILL_AMT1'] + list(COLS[1:])[:11] + list(COLS[1:])[12:17] + list(COLS[1:])[18:]

preprocess = make_column_transformer(
        (StandardScaler(), ['PAY_AMT1']),
        (MinMaxScaler(),['BILL_AMT1']),
    remainder='passthrough')
    
print('Running preprocessing and feature engineering transformations')
train_features = pd.DataFrame(preprocess.fit_transform(X_train), columns = newcolorder)
test_features = pd.DataFrame(preprocess.transform(X_test), columns = newcolorder)

train_full = pd.concat([pd.DataFrame(y_train.values, columns=['Label']), pd.DataFrame(train_features)], axis=1)
test_full = pd.concat([pd.DataFrame(y_test.values, columns=['Label']), pd.DataFrame(test_features)], axis=1)
train_full.to_csv('train_data.csv', index=False, header=False)
test_full.to_csv('test_data.csv', index=False, header=False)                                                   
train_full.to_csv('train_data_with_headers.csv', index=False)                                                    
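
For intuition, the two scalers used above can be written out in plain Python. `StandardScaler` subtracts the training mean and divides by the training (population) standard deviation; `MinMaxScaler` maps the training minimum and maximum to 0 and 1. In both cases the statistics come from the training split only and are then reused on the test split, which is why the cell calls `fit_transform` on `X_train` but only `transform` on `X_test`. The values below are made up for illustration.

```python
from statistics import mean, pstdev

train_vals = [1000.0, 2000.0, 3000.0, 6000.0]   # made-up PAY_AMT1-like values
test_vals = [2500.0, 9000.0]

# "fit": learn the statistics from the training split only
mu, sigma = mean(train_vals), pstdev(train_vals)
lo, hi = min(train_vals), max(train_vals)

def standard_scale(x):
    return (x - mu) / sigma

def min_max_scale(x):
    return (x - lo) / (hi - lo)

# "transform": apply the training statistics to both splits
train_std = [standard_scale(x) for x in train_vals]
test_minmax = [min_max_scale(x) for x in test_vals]
print(train_std)
print(test_minmax)
```

Note that a test value larger than the training maximum scales to more than 1, so downstream code should not assume the test features stay inside the unit interval.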

In [None]:
# Upload the training data (with and without headers) and the test data
train_header_location = sess.upload_data('train_data_with_headers.csv', bucket=rawbucket, key_prefix=trainheaderprefix)
train_data_location = sess.upload_data('train_data.csv', bucket=rawbucket, key_prefix=traindataprefix)
test_data_location = sess.upload_data('test_data.csv', bucket=rawbucket, key_prefix=testdataprefix)

### Secure Feature Processing pipeline using SageMaker Processing

While you can pre-process small amounts of data directly in a notebook as shown above, SageMaker Processing offloads the heavy lifting for larger datasets. It provisions the underlying infrastructure, downloads the data from an S3 location to the processing container, runs the processing script, stores the processed data in an output location in Amazon S3, and deletes the transient resources needed to run the job. Once the processing job is complete, the infrastructure used to run it is wiped, and any temporary data stored on it is deleted.

In [None]:
## Use SageMaker Processing with Sklearn. -- combine data into train and test at this stage if possible.
from sagemaker.sklearn.processing import SKLearnProcessor
sklearn_processor = SKLearnProcessor(framework_version='0.20.0',
                                     role=role,
                                     instance_type='ml.c4.xlarge',
                                     instance_count=1
                                    )

### Write a preprocessing script (same as above)

In [None]:
%%writefile preprocessing.py

import argparse
import os
import warnings

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.exceptions import DataConversionWarning
from sklearn.compose import make_column_transformer

warnings.filterwarnings(action='ignore', category=DataConversionWarning)

if __name__=='__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--train-test-split-ratio', type=float, default=0.3)
    parser.add_argument('--random-split', type=int, default=0)
    args, _ = parser.parse_known_args()
    
    print('Received arguments {}'.format(args))

    input_data_path = os.path.join('/opt/ml/processing/input', 'rawdata.csv')
    
    print('Reading input data from {}'.format(input_data_path))
    df = pd.read_csv(input_data_path)
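    # Shuffle the rows; sample() returns a new frame, so assign the result back
    df = df.sample(frac=1, random_state=args.random_split).reset_index(drop=True)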
    
    COLS = df.columns
    newcolorder = ['PAY_AMT1','BILL_AMT1'] + list(COLS[1:])[:11] + list(COLS[1:])[12:17] + list(COLS[1:])[18:]
    
    split_ratio = args.train_test_split_ratio
    random_state=args.random_split
    
    X_train, X_test, y_train, y_test = train_test_split(df.drop('Label', axis=1), df['Label'], 
                                                        test_size=split_ratio, random_state=random_state)
    
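    # Note: scikit-learn 0.20.0 (the framework_version used for this job) expects
    # (columns, transformer) tuples; later releases expect (transformer, columns).
    preprocess = make_column_transformer(
        (['PAY_AMT1'], StandardScaler()),
        (['BILL_AMT1'], MinMaxScaler()),
        remainder='passthrough')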
    
    print('Running preprocessing and feature engineering transformations')
    train_features = pd.DataFrame(preprocess.fit_transform(X_train), columns = newcolorder)
    test_features = pd.DataFrame(preprocess.transform(X_test), columns = newcolorder)
    
    # concat to ensure Label column is the first column in dataframe
    train_full = pd.concat([pd.DataFrame(y_train.values, columns=['Label']), train_features], axis=1)
    test_full = pd.concat([pd.DataFrame(y_test.values, columns=['Label']), test_features], axis=1)
    
    print('Train data shape after preprocessing: {}'.format(train_features.shape))
    print('Test data shape after preprocessing: {}'.format(test_features.shape))
    
    train_features_headers_output_path = os.path.join('/opt/ml/processing/train_headers', 'train_data_with_headers.csv')
    
    train_features_output_path = os.path.join('/opt/ml/processing/train', 'train_data.csv')
    
    test_features_output_path = os.path.join('/opt/ml/processing/test', 'test_data.csv')
    
    print('Saving training features to {}'.format(train_features_output_path))
    train_full.to_csv(train_features_output_path, header=False, index=False)
    print("Complete")
    
    print("Save training data with headers to {}".format(train_features_headers_output_path))
    train_full.to_csv(train_features_headers_output_path, index=False)
                 
    print('Saving test features to {}'.format(test_features_output_path))
    test_full.to_csv(test_features_output_path, header=False, index=False)
    print("Complete")
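
Command-line arguments supplied to a processing job reach the script as ordinary `sys.argv` entries. A minimal sketch of how the parser above turns them into values, using an explicit argument list in place of `sys.argv` (the extra `--unused-flag` is hypothetical, included to show why `parse_known_args` is used):

```python
import argparse

demo_parser = argparse.ArgumentParser()
demo_parser.add_argument('--train-test-split-ratio', type=float, default=0.3)
demo_parser.add_argument('--random-split', type=int, default=0)

# parse_known_args returns (namespace, leftovers) instead of failing on unknown flags
demo_args, unknown = demo_parser.parse_known_args(
    ['--train-test-split-ratio', '0.2', '--unused-flag', 'x'])

print(demo_args.train_test_split_ratio)  # dashes become underscores in attribute names
print(demo_args.random_split)            # not supplied, so the default is used
print(unknown)                           # arguments the parser did not recognise
```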
    

In [None]:
# Copy the preprocessing code over to the s3 bucket
codeprefix = prefix + '/code'
codeupload = sess.upload_data('preprocessing.py', bucket=rawbucket, key_prefix=codeprefix)
print(codeupload)

In [None]:
train_data_location = rawbucket + '/' + traindataprefix
test_data_location = rawbucket+'/'+testdataprefix
print("Training data location = {}".format(train_data_location))
print("Test data location = {}".format(test_data_location))

In [None]:
from sagemaker.processing import ProcessingInput, ProcessingOutput

sklearn_processor.run(code=codeupload,
                      inputs=[ProcessingInput(
                        source=raw_data_location,
                        destination='/opt/ml/processing/input')],
                      outputs=[ProcessingOutput(output_name='train_data',
                                                source='/opt/ml/processing/train',
                               destination='s3://' + train_data_location),
                               ProcessingOutput(output_name='test_data',
                                                source='/opt/ml/processing/test',
                                               destination="s3://"+test_data_location),
                               ProcessingOutput(output_name='train_data_headers',
                                                source='/opt/ml/processing/train_headers',
                                               destination="s3://" + rawbucket + '/' + prefix + '/train_headers')],
                      arguments=['--train-test-split-ratio', '0.2'] # specify any arguments needed here
                     )

preprocessing_job_description = sklearn_processor.jobs[-1].describe()

output_config = preprocessing_job_description['ProcessingOutputConfig']
for output in output_config['Outputs']:
    if output['OutputName'] == 'train_data':
        preprocessed_training_data = output['S3Output']['S3Uri']
    if output['OutputName'] == 'test_data':
        preprocessed_test_data = output['S3Output']['S3Uri']
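
The loop above relies on the shape of the job description returned by `describe()`. The sketch below applies the same lookup to a hand-written dictionary that mimics only the fields used here; the bucket name and URIs are placeholders, not real job output.

```python
sample_description = {   # hypothetical, trimmed-down stand-in for describe() output
    'ProcessingOutputConfig': {
        'Outputs': [
            {'OutputName': 'train_data',
             'S3Output': {'S3Uri': 's3://example-bucket/prefix/train_data'}},
            {'OutputName': 'test_data',
             'S3Output': {'S3Uri': 's3://example-bucket/prefix/test_data'}},
        ]
    }
}

# Build a name -> S3 URI lookup in one pass
output_uris = {
    out['OutputName']: out['S3Output']['S3Uri']
    for out in sample_description['ProcessingOutputConfig']['Outputs']
}
print(output_uris['train_data'])
print(output_uris['test_data'])
```

A dictionary keyed by output name avoids one `if` per output and fails loudly with a `KeyError` when an expected output is missing.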

# Model development and Training

## Traceability and Auditability 

We use SageMaker Experiments so that data scientists can track the lineage of the model from the raw data source through the preprocessing steps to the model training pipeline. With SageMaker Experiments, data scientists can compare, track and manage multiple different model training jobs, data processing jobs and hyperparameter tuning jobs. They can also retain a lineage from the source data to the training job artifacts, the model hyperparameters and any custom metrics they want to monitor as part of model training.

In [None]:
# Create a SageMaker Experiment
cc_experiment = Experiment.create(
    experiment_name=f"Sagemaker-Workshop-Morgan-{int(time.time())}", 
    description="Predict credit card default from payments data", 
    sagemaker_boto_client=sm)
print(cc_experiment)

In [None]:
# Start Tracking parameters used in the Pre-processing pipeline.
with Tracker.create(display_name="Preprocessing", sagemaker_boto_client=sm) as tracker:
    tracker.log_parameters({
        "train_test_split_ratio": 0.2,
        "random_state":0
    })
    # Log the S3 URIs of the raw, training and test datasets
    tracker.log_input(name="ccdefault-raw-dataset", media_type="s3/uri", value=raw_data_location)
    tracker.log_input(name="ccdefault-train-dataset", media_type="s3/uri", value=f"s3://{rawbucket}/{traindataprefix}")
    tracker.log_input(name="ccdefault-test-dataset", media_type="s3/uri", value=f"s3://{rawbucket}/{testdataprefix}")
preprocessing_trial_component = tracker.trial_component

### Train the Model

The same security postures we applied previously during SM Processing apply to training jobs. We will also have SageMaker experiments track the training job and store metadata such as model artifact location, training/validation data location, model hyperparameters etc.

Now you will kick off an XGBoost model training job directly on the processed data. You will use the built-in SageMaker XGBoost algorithm.

In [None]:
# Retrieve the container image URI for the built-in XGBoost algorithm
image = sagemaker.image_uris.retrieve("xgboost", region, "1.2-1")  # ECR image URI

In [None]:
from sagemaker.inputs import TrainingInput  # replaces s3_input in SageMaker Python SDK v2
s3_input_train = TrainingInput(s3_data=f's3://{rawbucket}/{prefix}/train_data/train_data.csv', content_type='csv')
inputs = {'train': s3_input_train}

In [None]:
# specify some parameters for SageMaker Experiments to track your Experiments

trial_name = f"cc-fraud-training-job-{int(time.time())}"
cc_trial = Trial.create(
    trial_name=trial_name,
    experiment_name=cc_experiment.experiment_name,
    sagemaker_boto_client=sm
)

cc_trial.add_trial_component(preprocessing_trial_component)
cc_training_job_name = "cc-training-job-{}".format(int(time.time()))

experiment_config = {
    "TrialName": cc_trial.trial_name,  # log the training job in the trial for lineage
    "TrialComponentDisplayName": "Train",
}

In [None]:
hyperparameters={
    "objective": "binary:logistic",
    "num_round": "100",
    "eval_metric": "auc"
}

xgb = sagemaker.estimator.Estimator(image,
                                    role, 
                                    instance_count=1, 
                                    instance_type='ml.m4.xlarge',
                                    output_path='s3://{}/{}/models'.format(rawbucket, prefix),
                                    hyperparameters=hyperparameters,
                                    sagemaker_session=sess,
                                    disable_profiler=True) # set to true for distributed training

xgb.fit(inputs = inputs, job_name=cc_training_job_name, experiment_config=experiment_config)
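
With `objective` set to `binary:logistic` the model outputs a probability of default, and `eval_metric` set to `auc` scores how well those probabilities rank defaulters above non-defaulters. As a rough illustration of the metric (not the implementation XGBoost uses), AUC can be computed as the fraction of positive/negative pairs that are ordered correctly, counting ties as half. The labels and scores below are made up.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                  for p in pos for n in neg)
    return correct / (len(pos) * len(neg))

auc_labels = [0, 0, 1, 1, 0, 1]
auc_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
toy_auc = pairwise_auc(auc_labels, auc_scores)
print(toy_auc)
```

This quadratic version is only practical for small samples; it is meant to show what the number means, not how to compute it at scale.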

## Traceability and Auditability from source control to Model artifacts

Having used SageMaker Experiments to track the training runs, we can now extract model metadata to get the entire lineage of the model from the source data to the model artifacts and the hyperparameters.

To do this, simply call the **describe_trial_component** API.

In [None]:
# Present the Model Lineage as a dataframe
from sagemaker.session import Session
session = boto3.Session()
lineage_table = ExperimentAnalytics(
    sagemaker_session=Session(session, sm), 
    search_expression={
        "Filters":[{
            "Name": "Parents.TrialName",
            "Operator": "Equals",
            "Value": trial_name
        }]
    },
    sort_by="CreationTime",
    sort_order="Ascending",
)
lineagedf= lineage_table.dataframe()

lineagedf

In [None]:
# get detailed information about a particular trial
sm.describe_trial_component(TrialComponentName=lineagedf.TrialComponentName[1])

##  Model Explainability

Model artifacts generated by a SageMaker Training job can be easily downloaded for further analysis. In this section, we'll use the [shap](https://github.com/slundberg/shap) Python library to explain the output of the machine learning model that we've trained.

In [None]:
# get model artifact from S3
import tarfile
import pickle
print(f"Downloading model artifact from {xgb.model_data}")
with s3.open(xgb.model_data, "rb") as f:
    tar = tarfile.open(fileobj=f, mode='r:gz')
    for item in tar:
        model = pickle.load(tar.extractfile(item.name))
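
The download cell treats `model.tar.gz` as a gzip-compressed tar archive whose member is a pickled model object. The round trip below reproduces that pattern entirely in memory, with a plain dictionary standing in for the model; the member name `xgboost-model` is an assumption made for the example.

```python
import io
import pickle
import tarfile

stand_in_model = {'num_round': 100, 'objective': 'binary:logistic'}  # not a real model

# Write: pickle the object and pack it into an in-memory tar.gz
pickled_bytes = pickle.dumps(stand_in_model)
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode='w:gz') as archive:
    info = tarfile.TarInfo(name='xgboost-model')  # assumed member name
    info.size = len(pickled_bytes)
    archive.addfile(info, io.BytesIO(pickled_bytes))

# Read: same pattern as the cell above, with the buffer in place of the S3 file
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode='r:gz') as archive:
    restored = [pickle.load(archive.extractfile(member)) for member in archive]

print(restored[0] == stand_in_model)
```

Unpickling can execute arbitrary code, so this pattern should only be applied to artifacts produced by your own training jobs.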

In [None]:
import shap
shap.initjs()
from xgboost import DMatrix

model.feature_names = test_full.columns[1:]
explainer = shap.TreeExplainer(model)

shap_values = explainer.shap_values(train_full.iloc[:, 1:])

__MODEL FEATURES__
- LIMIT_BAL: Amount of given credit in NT dollars (includes individual and family/supplementary credit)
- SEX: Gender (1=male, 2=female)
- EDUCATION: (1=graduate school, 2=university, 3=high school, 4=others)
- MARRIAGE: Marital status (1=married, 2=single, 3=others)
- AGE: Age in years
- PAY_0: Repayment status in September, 2005 (-1=pay duly, 1=payment delay for one month, 2=payment delay for two months, … 8=payment delay for eight months, 9=payment delay for nine months and above)
- PAY_2: Repayment status in August, 2005 (scale same as above)
- PAY_3: Repayment status in July, 2005 (scale same as above)
- PAY_4: Repayment status in June, 2005 (scale same as above)
- PAY_5: Repayment status in May, 2005 (scale same as above)
- PAY_6: Repayment status in April, 2005 (scale same as above)
- BILL_AMT1: Amount of bill statement in September, 2005 (NT dollar)
- BILL_AMT2: Amount of bill statement in August, 2005 (NT dollar)
- BILL_AMT3: Amount of bill statement in July, 2005 (NT dollar)
- BILL_AMT4: Amount of bill statement in June, 2005 (NT dollar)
- BILL_AMT5: Amount of bill statement in May, 2005 (NT dollar)
- BILL_AMT6: Amount of bill statement in April, 2005 (NT dollar)
- PAY_AMT1: Amount of previous payment in September, 2005 (NT dollar)
- PAY_AMT2: Amount of previous payment in August, 2005 (NT dollar)
- PAY_AMT3: Amount of previous payment in July, 2005 (NT dollar)
- PAY_AMT4: Amount of previous payment in June, 2005 (NT dollar)
- PAY_AMT5: Amount of previous payment in May, 2005 (NT dollar)
- PAY_AMT6: Amount of previous payment in April, 2005 (NT dollar)

In [None]:
# summary plot to get the most impactful features
shap.summary_plot(shap_values, train_full.iloc[:,1:], plot_type="bar")

In [None]:
# summary plot showing the marginal contribution of each feature to every prediction 
shap.summary_plot(shap_values, train_full.iloc[:,1:])

In [None]:
# dependence plot with an automatically selected interaction feature
shap.dependence_plot("SEX", shap_values, train_full.iloc[:,1:])

In [None]:
# explain an individual prediction
shap.force_plot(explainer.expected_value, shap_values[0,:], train_full.iloc[0,1:], link="logit")
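
The `link="logit"` argument matters because, for a `binary:logistic` model, `TreeExplainer` by default explains the raw margin (log-odds): the expected value plus the sum of a row's SHAP values gives the model's margin for that row, and the logistic function turns that margin into a probability. The numbers below are invented to show the arithmetic only.

```python
import math

base_value = -1.2   # invented stand-in for explainer.expected_value
row_shap = {'PAY_0': 0.9, 'LIMIT_BAL': -0.3, 'BILL_AMT1': 0.1}   # invented SHAP values

margin = base_value + sum(row_shap.values())      # log-odds of default for this row
probability = 1.0 / (1.0 + math.exp(-margin))     # logistic (inverse logit)

print(round(margin, 3), round(probability, 3))
```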

# Model Deployment

Now you will deploy the model to a hosted endpoint and run offline inference using SageMaker Batch Transform.

## Batch Transform

Let's first use Batch Transform to run through inference for the test dataset. This is used for offline inference. 

In [None]:
test_data_no_label = test_full.drop(columns = ['Label'], axis=1)
label = test_full['Label']
test_data_no_label.to_csv('test_data_no_label.csv', index=False, header=False)
test_data_no_label.shape

In [None]:
test_data_nohead_location = sess.upload_data('test_data_no_label.csv', bucket=rawbucket, key_prefix=testdatanolabelprefix)

To run Batch Transform, you simply need to call the Transformer API below!

In [None]:
%%time

sm_transformer = xgb.transformer(1, 'ml.m5.xlarge', accept = 'text/csv')

# start a transform job
sm_transformer.transform(test_data_nohead_location, split_type='Line', content_type='text/csv',
                         experiment_config={
                             "TrialName": cc_trial.trial_name,  # log the transform job in the same trial for lineage
                             "TrialComponentDisplayName": "Batch-Transform",
                         })
sm_transformer.wait()

In [None]:
import json
import io
from urllib.parse import urlparse

def get_csv_output_from_s3(s3uri, file_name):
    parsed_url = urlparse(s3uri)
    bucket_name = parsed_url.netloc
    prefix = parsed_url.path[1:]
    s3 = boto3.resource('s3')
    obj = s3.Object(bucket_name, '{}/{}'.format(prefix, file_name))
    return obj.get()["Body"].read().decode('utf-8')
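
The helper splits an S3 URI with `urlparse`: the bucket lands in `netloc` and the key prefix in `path`, minus its leading slash. A quick check on a placeholder URI (not a real location):

```python
from urllib.parse import urlparse

parsed_example = urlparse('s3://example-bucket/some/prefix/output')
parsed_bucket = parsed_example.netloc
parsed_key_prefix = parsed_example.path[1:]   # drop the leading '/'
parsed_key = '{}/{}'.format(parsed_key_prefix, 'test_data_no_label.csv.out')

print(parsed_bucket)
print(parsed_key)
```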

In [None]:
output = get_csv_output_from_s3(sm_transformer.output_path, 'test_data_no_label.csv.out')
output_df = pd.read_csv(io.StringIO(output), sep=",", header=None)
output_df.head(8)

In [None]:
from sklearn.metrics import confusion_matrix, accuracy_score

In [None]:
output_df['result'] = np.round(output_df[0])

In [None]:
print("Baseline Accuracy = {}".format(1- np.unique(data['Label'], return_counts=True)[1][1]/(len(data['Label']))))
print("Accuracy Score = {}".format(accuracy_score(label, output_df['result'])))
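
The baseline here is the accuracy of always predicting the majority class (no default), which is the bar a useful classifier has to clear on an imbalanced dataset. A small sketch with invented labels and probabilities, thresholding at 0.5 in the same spirit as the rounding above:

```python
demo_labels = [0, 0, 0, 1, 0, 1, 0, 0]
demo_probs = [0.1, 0.3, 0.2, 0.8, 0.3, 0.4, 0.1, 0.2]

positive_rate = sum(demo_labels) / len(demo_labels)
baseline_accuracy = 1 - positive_rate            # always predict class 0

demo_preds = [1 if p > 0.5 else 0 for p in demo_probs]
model_accuracy = sum(int(p == y) for p, y in zip(demo_preds, demo_labels)) / len(demo_labels)

print(baseline_accuracy, model_accuracy)
```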

In [None]:
output_df['Predicted']=output_df['result'].values
output_df['Label'] = label
# Rows are the predicted class, columns are the actual label
confusion_df = pd.crosstab(output_df['Predicted'], output_df['Label'], rownames=['Predicted'], colnames=['Actual'], margins=True)
confusion_df
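
A confusion matrix is just a count of (actual, predicted) pairs. The sketch below builds one from invented labels with `collections.Counter` and derives precision and recall for the default class, two numbers that are often more informative than accuracy when classes are imbalanced.

```python
from collections import Counter

actual = [0, 0, 1, 1, 0, 1, 0, 0]
predicted = [0, 1, 1, 0, 0, 1, 0, 0]

counts = Counter(zip(actual, predicted))   # keys are (actual, predicted)
tn, fp = counts[(0, 0)], counts[(0, 1)]
fn, tp = counts[(1, 0)], counts[(1, 1)]

precision = tp / (tp + fp)   # of the predicted defaults, how many were real
recall = tp / (tp + fn)      # of the real defaults, how many were caught
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"precision={precision:.3f} recall={recall:.3f}")
```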

## Model Deployment as Hosted Endpoint

Now we will deploy the model as a hosted endpoint. Again this is a single line of code!

In [None]:
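# Import the Model Monitor API
from sagemaker.model_monitor import DataCaptureConfig
# SageMaker Python SDK v2 names: Predictor replaces RealTimePredictor, CSVSerializer replaces csv_serializer
from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer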

### Create a SageMaker Model object

SageMaker model will package your trained artifacts into a container.

In [None]:
sm_client = boto3.client('sagemaker')

latest_training_job = sm_client.list_training_jobs(
    MaxResults=1,
    SortBy='CreationTime',
    SortOrder='Descending')

training_job_name = latest_training_job['TrainingJobSummaries'][0]['TrainingJobName']

training_job_description = sm_client.describe_training_job(TrainingJobName=training_job_name)

model_data = training_job_description['ModelArtifacts']['S3ModelArtifacts']
container_uri = training_job_description['AlgorithmSpecification']['TrainingImage']


In [None]:
xgb_model = sagemaker.model.Model(image_uri=container_uri,
                                 model_data = model_data,
                                 role =role,
                                 )

In [None]:
s3_capture_upload_path = 's3://{}/{}/monitoring/datacapture'.format(rawbucket, prefix)

data_capture_configuration = DataCaptureConfig(enable_capture =True,                                          
                                               sampling_percentage=100,
                                               destination_s3_uri=s3_capture_upload_path,
                                             capture_options=["Input", "Output"],
                                            csv_content_types= ["text/csv"],
                                            json_content_types= ["application/json"]
                                              )

### Deploy!

You can deploy your model with a single line of code

In [None]:
xgb_model.deploy(1, 'ml.m4.xlarge', data_capture_config=data_capture_configuration)

### Check the model deployed properly.

The cell below looks up the most recently created endpoint. If you have several endpoints, set `endpoint_name` to the name of the endpoint deployed above.

In [None]:
client = boto3.client('sagemaker-runtime')
endpoint_name = sm_client.list_endpoints(
    MaxResults=1,
    SortBy='CreationTime',
    SortOrder='Descending')['Endpoints'][0]['EndpointName']
print(endpoint_name)

!head -10 test_data.csv > test_sample.csv

with open('test_sample.csv', 'r') as f:
    for row in f:
        payload = row.rstrip('\n')
        # Drop the label (the first column) before sending the features to the endpoint
        response = client.invoke_endpoint(
            EndpointName=endpoint_name,
            Body=payload.split(',', 1)[1],
            ContentType='text/csv')
        sleep(0.5)
print('done!')
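
Each row of `test_data.csv` starts with the label, which the endpoint must not receive. Slicing off a fixed number of characters works only while the label is a single character; splitting on the first comma is a more forgiving way to drop it. Illustrated on an invented row:

```python
sample_row = "1,0.25,0.5,20000,2\n"   # invented: label first, then features

row_text = sample_row.rstrip('\n')
label_text, features_text = row_text.split(',', 1)   # split on the first comma only

print(label_text)
print(features_text)
```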

In [None]:
# Extract the captured json files.
data_capture_prefix = '{}/monitoring/datacapture'.format(prefix)
s3_client = boto3.Session().client('s3')
current_endpoint_capture_prefix = '{}/{}/AllTraffic'.format(data_capture_prefix, endpoint_name)
result = s3_client.list_objects(Bucket=rawbucket, Prefix=current_endpoint_capture_prefix)
capture_files = [capture_file.get("Key") for capture_file in result.get('Contents')]
print("Found Capture Files:")
print("\n ".join(capture_files))

capture_files[0]

If you get an error above, wait a minute or so and rerun the cell: it takes some time for the captured data to arrive in the S3 bucket.

In [None]:
# View the contents of a captured file.
def get_obj_body(bucket, obj_key):
    return s3_client.get_object(Bucket=bucket, Key=obj_key).get('Body').read().decode("utf-8")

capture_file = get_obj_body(rawbucket, capture_files[-1])
print(json.dumps(json.loads(capture_file.split('\n')[5]), indent = 2, sort_keys =True))
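
Capture files are JSON Lines: one JSON document per captured request. The record below is a hand-written approximation of that layout, assuming the `captureData`, `endpointInput` and `endpointOutput` field names that SageMaker data capture is understood to use; the values are invented, so check a real file from your bucket before relying on the exact keys.

```python
import json

sample_line = json.dumps({   # invented record, approximate layout
    "captureData": {
        "endpointInput": {"observedContentType": "text/csv", "mode": "INPUT",
                          "data": "0.25,0.5,20000,2", "encoding": "CSV"},
        "endpointOutput": {"observedContentType": "text/csv; charset=utf-8",
                           "mode": "OUTPUT", "data": "0.137", "encoding": "CSV"},
    },
    "eventMetadata": {"eventId": "example-id", "inferenceTime": "2020-01-01T00:00:00Z"},
    "eventVersion": "0",
})

# Parse one line and pull out the request features and the model's score
record = json.loads(sample_line)
captured_features = [float(v) for v in record["captureData"]["endpointInput"]["data"].split(",")]
captured_score = float(record["captureData"]["endpointOutput"]["data"])
print(captured_features, captured_score)
```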


In [None]:
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

my_default_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

my_default_monitor.suggest_baseline(
    baseline_dataset=f"s3://{rawbucket}/{prefix}/train_data/train_data.csv",
    dataset_format=DatasetFormat.csv(header=False),
    output_s3_uri=f"s3://{rawbucket}/{prefix}/baseline_mm/",
    wait=True
)

## Delete Underlying Resources

Make sure to delete your endpoint once you are done. Other resources such as instances for training and processing are automatically deleted by SageMaker.

In [None]:
sm.delete_endpoint(EndpointName = endpoint_name)