# Introduction to Training and Deploying Models with Amazon SageMaker

This unit introduces training and deploying machine learning models with Amazon SageMaker. It covers the following steps:

- Setting Up SageMaker Resources
- Preparing Data
- Configuring the Estimator
- Training the Model
- Deploying the Model
- Endpoint Management

Throughout the unit, the data is kept deliberately simple so that the focus stays on the tools and processes involved in model training and deployment with Amazon SageMaker. The unit aims to equip learners with the foundational skills needed to use SageMaker for scalable machine learning tasks, from data preparation to model deployment and management.

The Iris dataset is selected for training in this example due to its simplicity and the clarity it provides in demonstrating machine learning concepts. It's a widely recognized toy dataset, ideal for focusing on the functionality and capabilities of the tools available in Amazon SageMaker without the complexity of larger datasets. This choice allows for an emphasis on the process and techniques of machine learning, rather than the intricacies of data preprocessing and analysis.

In [1]:
from utils.helpers import get_secret
from utils.toy_datasets import upload_dataset_to_s3

In [2]:
s3_bucket_uri = get_secret('s3_bucket_uri')
s3_bucket_name = get_secret('s3_bucket_name')

dataset_name = 'iris'
upload_dataset_to_s3(dataset_name, s3_bucket_name)

Files uploaded to S3 successfully.
Cleanup complete!


`sagemaker.Session()` initializes a new Amazon SageMaker session. This session acts as the **main interface for managing interactions with the Amazon SageMaker environment and AWS services**. It encapsulates the configuration and state for the operations you perform in SageMaker, such as training models, deploying endpoints, and accessing data in S3, so that all of these tasks run within one specific AWS context (credentials and region).

In [3]:
import sagemaker

session = sagemaker.Session()
role = get_secret('role_arn')

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml


As its name indicates, `sagemaker.image_uris.retrieve` returns the URI of the Docker image that contains the XGBoost algorithm. This URI points to a pre-built image that can be used for both training and deployment of XGBoost models. SageMaker offers a variety of [other model images](https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-available-images.html) as well, which can be retrieved in the same manner.

Not all versions of a model image are available in every AWS region. The SageMaker Python SDK repository includes a [configuration file](https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/image_uri_config/xgboost.json) that lists which XGBoost versions are accessible in which regions.

In [4]:
image_uri = sagemaker.image_uris.retrieve('xgboost', region='us-east-1', version='1.5-1')
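Since availability differs by region and version, it can help to try a short list of candidate versions and keep the first one that resolves. The following is a minimal sketch under two assumptions: `first_available_image` is a hypothetical helper (not part of the SDK), and the lookup function raises `ValueError` for an unsupported version or region, which is how `sagemaker.image_uris.retrieve` is expected to behave. The lookup function is passed in as an argument so the logic can be read and tried without AWS access.

```python
from typing import Callable, Iterable, Optional


def first_available_image(
    retrieve: Callable[..., str],
    framework: str,
    region: str,
    versions: Iterable[str],
) -> Optional[str]:
    """Return the URI for the first version that resolves in `region`, else None.

    `retrieve` is assumed to behave like `sagemaker.image_uris.retrieve`,
    i.e. to raise ValueError for an unsupported version or region.
    """
    for version in versions:
        try:
            return retrieve(framework, region=region, version=version)
        except ValueError:
            # This version is not published for the region; try the next one.
            continue
    return None


# Usage in the notebook (requires the sagemaker package, so it is commented out;
# the candidate version list is only an illustration):
# image_uri = first_available_image(
#     sagemaker.image_uris.retrieve, "xgboost", "us-east-1", ["1.7-1", "1.5-1"]
# )
```

A `None` result means none of the candidates is available in the region, in which case the configuration file linked above is the place to look for a supported version.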

The `Estimator` object in Amazon SageMaker is a high-level interface for configuring and running training jobs and for deploying the resulting model. In this snippet, an `Estimator` is created to train a machine learning model, with several key parameters that configure the training environment:

- `image_uri` specifies the Docker container image containing the training algorithm, in this case XGBoost.
- `role` is the AWS IAM role that SageMaker assumes to perform tasks on your behalf, such as accessing data in S3.
- `instance_count=1` indicates that one instance will be used for the training job, suitable for handling the Iris dataset's size.
- `instance_type="ml.m5.large"` defines the type of computing instance to use, balancing cost and computational capability for the task.
- `output_path` sets the S3 location where the trained model artifacts will be stored.
- `sagemaker_session` links the estimator to the current SageMaker session, facilitating access to AWS resources and management of the training job.

In [5]:
estimator = sagemaker.estimator.Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
    output_path=f"{s3_bucket_uri}/pipelines-output",
    sagemaker_session=session
)

[Here](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost_hyperparameters.html) is the list of hyperparameters that can be configured for XGBoost.

In [6]:
estimator.set_hyperparameters(
    max_depth=5,
    objective='multi:softmax',
    num_class=3,
    num_round=10
)
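Because `objective='multi:softmax'` is combined with `num_class=3`, XGBoost expects every label to be an integer in the range 0 to 2. A quick check like the one below, run on the label column before uploading, can catch encoding mistakes early. This is a minimal sketch; `check_multiclass_labels` is a hypothetical helper and not part of SageMaker or XGBoost.

```python
from typing import Iterable


def check_multiclass_labels(labels: Iterable[float], num_class: int) -> None:
    """Raise ValueError unless every label is an integer in [0, num_class)."""
    seen = set()
    for label in labels:
        if label != int(label) or not 0 <= int(label) < num_class:
            raise ValueError(f"label {label!r} is outside 0..{num_class - 1}")
        seen.add(int(label))
    missing = set(range(num_class)) - seen
    if missing:
        # Not an error for XGBoost, but often a sign of a bad split or encoding.
        print(f"warning: no examples for classes {sorted(missing)}")


check_multiclass_labels([0, 1, 2, 1, 0], num_class=3)  # passes silently
```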

The `TrainingInput` class is used to specify the location and format of the data stored in S3. This is important for informing SageMaker about where to find the data and how to interpret it during the training process.

*When preparing data for training with Amazon SageMaker, note that the built-in algorithms expect the target variable (or label) to be in the **first column** of a CSV dataset, and the file should not contain a header row. The CSV files used for training and validation should be formatted accordingly: the first column holds the label for each entry, and the remaining columns hold the features.*

More information about common data formats for training is available [here](https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html).
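The label-first layout can be illustrated with a few lines of standard-library Python. The sketch below assumes rows that store the label in the last position and writes them as header-less CSV with the label moved to the first column. `to_label_first_csv` and the sample rows are illustrative only, and the integer label encoding is an assumption rather than what `upload_dataset_to_s3` necessarily produces.

```python
import csv
import io
from typing import Iterable, Sequence


def to_label_first_csv(rows: Iterable[Sequence[float]], label_index: int = -1) -> str:
    """Serialize rows as header-less CSV with the label moved to column 0."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        values = list(row)
        label = values.pop(label_index)
        writer.writerow([label, *values])
    return buffer.getvalue()


# Two Iris-style rows: four measurements followed by an integer class label.
sample = [
    [5.1, 3.5, 1.4, 0.2, 0],
    [7.2, 3.0, 6.0, 1.6, 2],
]
print(to_label_first_csv(sample))
```

Each printed line starts with the class label, followed by the four feature values, which is the shape the training and validation files are expected to have.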

In [7]:
from sagemaker.inputs import TrainingInput

s3_train = TrainingInput(
    s3_data=f's3://{s3_bucket_name}/iris_dataset/train_data.csv',
    content_type='csv'
)

s3_validate = TrainingInput(
    s3_data=f's3://{s3_bucket_name}/iris_dataset/test_data.csv',
    content_type='csv'
)

The `estimator.fit()` method starts a *training job* on SageMaker's cloud infrastructure, using the instance specification defined in the estimator (`ml.m5.large` in this case). The call reuses everything configured on the `Estimator` object: the algorithm image, the instance type and count, and the hyperparameters. Passing a dictionary with the keys `'train'` and `'validation'` tells the job which S3 locations to use as training and validation data, respectively. For a training job, [you are only billed for the time spent training](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeTrainingJob.html#:~:text=You%20are%20billed%20for%20the%20time%20interval%20between%20the%20value%20of%20TrainingStartTime%20and%20this%20time.).

In [8]:
estimator.fit({
    'train': s3_train,
    'validation': s3_validate
})

INFO:sagemaker:Creating training-job with name: sagemaker-xgboost-2024-02-06-15-44-14-363


2024-02-06 15:44:14 Starting - Starting the training job...
2024-02-06 15:44:30 Starting - Preparing the instances for training......
2024-02-06 15:45:36 Downloading - Downloading input data......
  from pandas import MultiIndex, Int64Index
[2024-02-06 15:47:21.582 ip-10-0-128-70.ec2.internal:7 INFO utils.py:28] RULE_JOB_STOP_SIGNAL_FILENAME: None
[2024-02-06 15:47:21.603 ip-10-0-128-70.ec2.internal:7 INFO profiler_config_parser.py:111] User has disabled profiler.
[2024-02-06:15:47:21:INFO] Imported framework sagemaker_xgboost_container.training
[2024-02-06:15:47:21:INFO] Failed to parse hyperparameter objective value multi:softmax to Json.
Returning the value itself
[2024-02-06:15:47:21:INFO] No GPUs detected (normal if no gpus installed)
[2024-02-06:15:47:21:INFO] Running XGBoost Sagemaker in algorithm mode
[2024-02-06:15:47:21:INFO] Determined 0 GPU(s) available on the instance.
[2024-02-06:15:47:21:INF

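To see how much training time a job actually consumed, the interval between `TrainingStartTime` and `TrainingEndTime` in the `DescribeTrainingJob` response can be inspected once the job has finished. The sketch below works on a plain dictionary so that it can be tried without AWS access. `training_seconds` is a hypothetical helper, and the example timestamps are made up for illustration.

```python
from datetime import datetime, timezone
from typing import Any, Mapping, Optional


def training_seconds(description: Mapping[str, Any]) -> Optional[float]:
    """Seconds between TrainingStartTime and TrainingEndTime.

    Returns None while the job has not finished (one of the fields is missing).
    """
    start = description.get("TrainingStartTime")
    end = description.get("TrainingEndTime")
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


# Made-up response fragment; real values come from DescribeTrainingJob.
example = {
    "TrainingStartTime": datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    "TrainingEndTime": datetime(2024, 1, 1, 12, 2, 4, tzinfo=timezone.utc),
}
print(training_seconds(example))  # 124.0

# In the notebook (needs AWS credentials, so it is commented out):
# description = session.sagemaker_client.describe_training_job(
#     TrainingJobName=estimator.latest_training_job.name
# )
# print(training_seconds(description))
```

The same response also carries a `BillableTimeInSeconds` field, which is the more direct figure to check when it is present.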
In the following snippet, the `estimator.deploy()` method is used to deploy the trained model to an endpoint on Amazon SageMaker, making it available for **real-time predictions**. This deployment process involves specifying the configuration for the endpoint:

- `initial_instance_count=1`: This specifies that one instance of the specified type will be used to host the model. This is typically sufficient for development or light production workloads.
- `instance_type='ml.t2.medium'`: This sets the type of AWS compute instance that will serve the model. You can consult the available instance types and their pricing [here](https://aws.amazon.com/sagemaker/pricing/). Also make sure that your account's [service quota](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) allows endpoint usage for the chosen instance type.
- `endpoint_name='iris-endpoint'`: This assigns a unique name to the endpoint, allowing it to be easily identified and accessed for making predictions.
- `serializer=CSVSerializer()`: This parameter specifies how input data should be serialized before being sent to the model for inference. The `CSVSerializer` converts input data into CSV format, which many of SageMaker's built-in algorithms, such as XGBoost, accept for inference.

The `deploy()` method creates a fully managed, scalable endpoint for your model, abstracting away the infrastructure management and allowing you to focus on consuming the model's predictions. Once deployed, the `predictor` object can be used to make real-time predictions by sending data to `iris-endpoint` and receiving the model's output.

In [9]:
from sagemaker.serializers import CSVSerializer

predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type='ml.t2.medium',
    endpoint_name='iris-endpoint',
    serializer=CSVSerializer()
)

INFO:sagemaker:Creating model with name: sagemaker-xgboost-2024-02-06-15-47-56-749
INFO:sagemaker:Creating endpoint-config with name iris-endpoint
INFO:sagemaker:Creating endpoint with name iris-endpoint


--------!
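To make the role of the serializer concrete, the sketch below shows roughly what a request to the endpoint looks like without the predictor object: the feature list becomes one CSV line, which is sent through the `invoke_endpoint` call of the boto3 `sagemaker-runtime` client. `features_to_csv` and `invoke_csv` are hypothetical helpers, and the exact string produced by `CSVSerializer` may differ in detail. The client is passed in as an argument, so nothing here contacts AWS until it is called with a real client.

```python
from typing import Any, Sequence


def features_to_csv(features: Sequence[float]) -> str:
    """Join one feature vector into a single CSV line, e.g. '7.2,3,6,1.6'."""
    return ",".join(str(value) for value in features)


def invoke_csv(runtime_client: Any, endpoint_name: str, features: Sequence[float]) -> bytes:
    """Send one CSV row to the endpoint and return the raw response body."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=features_to_csv(features),
    )
    # The response body is a stream; read() yields the prediction as bytes.
    return response["Body"].read()


# Usage (needs boto3 and AWS credentials, so it is commented out):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# invoke_csv(runtime, "iris-endpoint", [7.2, 3, 6, 1.6])
```

This route is useful when the caller is another application that only has the endpoint name and does not hold the `predictor` object created in the notebook.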

The `predict()` method of the predictor object generates predictions for a set of input features. By default, if no deserializer was specified when deploying the endpoint, the prediction is returned as raw bytes.

In [10]:
predictor.predict([7.2, 3, 6, 1.6])

b'2.0\n'

In [11]:
predictor.predict([7.2, 3, 6, 1.6]).decode('utf-8').strip()

'2.0'
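With `multi:softmax`, the decoded value is a class index rather than a species name. The sketch below turns the raw bytes into integer indices and, under an assumed label encoding, into names. `parse_predictions` is a hypothetical helper, and `IRIS_CLASS_NAMES` follows the scikit-learn ordering, which may not match how the training CSV was built. As a precaution, the parser accepts either newline- or comma-separated values for responses that contain several predictions.

```python
from typing import List

# Assumed label encoding (the scikit-learn ordering); verify it against how
# the training CSV was actually built before relying on these names.
IRIS_CLASS_NAMES = ["setosa", "versicolor", "virginica"]


def parse_predictions(payload: bytes) -> List[int]:
    """Decode a `multi:softmax` response into integer class indices."""
    text = payload.decode("utf-8").strip()
    # Accept newline- or comma-separated values.
    tokens = text.replace(",", "\n").split()
    return [int(float(token)) for token in tokens]


indices = parse_predictions(b"2.0\n")
print(indices, [IRIS_CLASS_NAMES[i] for i in indices])  # [2] ['virginica']
```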

**IMPORTANT!**

When you deploy a model to an endpoint in Amazon SageMaker using the `estimator.deploy()` method, as shown in the code snippet, **the deployed endpoint is live and incurs charges until you explicitly shut it down**. The code assigns the live endpoint to a variable named `predictor`, but the endpoint's lifecycle and cost do not depend on that variable or on the notebook session.

The endpoint, once deployed, continues to run and serve inference requests until it is manually stopped. To avoid incurring unnecessary charges, you should delete the endpoint when it is no longer needed. There are two primary ways to shut down a SageMaker endpoint:

1. Programmatically deleting the endpoint by calling the `delete_endpoint()` method on the predictor object.
2. Using the AWS Management Console: navigate to the SageMaker service, open the "Inference" section, and select "Endpoints". From there, find the endpoint you wish to delete, select it, and choose the "Delete" action.

![Delete endpoint via the AWS Management Console](./img/delete_endpoint.png)

In [12]:
predictor.delete_endpoint()

INFO:sagemaker:Deleting endpoint configuration with name: iris-endpoint
INFO:sagemaker:Deleting endpoint with name: iris-endpoint
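After cleanup, it can be worth confirming that no endpoint is left running. The sketch below lists endpoint names through the `list_endpoints` call of the boto3 `sagemaker` client and follows `NextToken` pagination. `endpoints_still_running` is a hypothetical helper, and the client is passed in so the function can be read and tried without AWS access.

```python
from typing import Any, List


def endpoints_still_running(sagemaker_client: Any, name_contains: str = "") -> List[str]:
    """Return the names of endpoints that still exist, following pagination."""
    names: List[str] = []
    kwargs = {"NameContains": name_contains} if name_contains else {}
    while True:
        page = sagemaker_client.list_endpoints(**kwargs)
        names.extend(item["EndpointName"] for item in page.get("Endpoints", []))
        token = page.get("NextToken")
        if not token:
            return names
        kwargs["NextToken"] = token


# Usage (needs boto3 and AWS credentials, so it is commented out):
# import boto3
# leftovers = endpoints_still_running(boto3.client("sagemaker"), "iris")
# print(leftovers)
```

Deletion is asynchronous, so an endpoint may still be listed for a short time with a deleting status right after `delete_endpoint()` returns.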
