# Part 1: Packaging and Uploading your Algorithm for use with Amazon SageMaker

### An overview of Docker

If you're familiar with Docker already, you can skip ahead to the next section.

For many data scientists, Docker containers are a new concept, but they are not difficult, as you'll see here. 

Docker provides a simple way to package arbitrary code into an _image_ that is totally self-contained. Once you have an image, you can use Docker to run a _container_ based on that image. Running a container is just like running a program on the machine except that the container creates a fully self-contained environment for the program to run. Containers are isolated from each other and from the host environment, so the way you set up your program is the way it runs, no matter where you run it.

Docker is more powerful than environment managers like conda or virtualenv because (a) it is completely language independent and (b) it comprises your whole operating environment, including startup commands, environment variables, etc.




In some ways, a Docker container is like a virtual machine, but it is much lighter weight. 

For example, a program running in a container can start in less than a second and many containers can run on the same physical machine or virtual machine instance.


## Dockerfile

Docker uses a simple file called a `Dockerfile` to specify how the image is assembled. 

We'll see an example of that below. You can build your Docker images based on Docker images built by yourself or others, which can simplify things quite a bit.

Docker has become very popular in the programming and devops communities for its flexibility and well-defined specification of the code to be run. It is the underpinning of many services built in the past few years, such as [Amazon ECS].

### Amazon SageMaker

Amazon SageMaker uses Docker to allow users to train and deploy arbitrary algorithms.

In Amazon SageMaker, Docker containers are invoked in a certain way for training and a slightly different way for hosting. The following sections outline how to build containers for the SageMaker environment.

Some helpful links:

* [Docker home page](http://www.docker.com)
* [Getting started with Docker](https://docs.docker.com/get-started/)
* [Dockerfile reference](https://docs.docker.com/engine/reference/builder/)
* [`docker run` reference](https://docs.docker.com/engine/reference/run/)

[Amazon ECS]: https://aws.amazon.com/ecs/



# How Amazon SageMaker runs your Docker container

In Amazon SageMaker, Docker containers are utilized differently for training and hosting purposes. This distinction is crucial for understanding how to build and use containers in the SageMaker environment:

1. Training:
   - When used for training, the container is invoked with the 'train' argument.
   - It's responsible for running the machine learning algorithm on the input data.
   - The container needs to handle tasks such as data preprocessing, model training, and saving the trained model.

2. Hosting (Inference):
   - For hosting, the container is invoked with the 'serve' argument.
   - It's used to deploy the trained model and serve predictions.
   - The container should be able to load the trained model and process incoming requests for predictions.

The following sections will provide detailed information on how to construct Docker containers 
that can effectively operate within the SageMaker ecosystem, addressing both the training and hosting scenarios.
This includes specifics on directory structures, input/output handling, and best practices for each use case.


 
 ### How Amazon SageMaker runs your Docker container
 
 Amazon SageMaker uses Docker containers to run your machine learning algorithms for both training and hosting (inference) purposes. This section explains in detail how SageMaker interacts with your Docker container.

 
 ### Container Execution
 
 When SageMaker runs your container, it passes either `train` or `serve` as an argument. This argument determines whether the container should perform training or serving tasks. The way your container handles this argument can vary:


 
 1. **No ENTRYPOINT defined:**
    - If you don't specify an `ENTRYPOINT` in your Dockerfile, SageMaker will execute the container as follows:
      - For training: `docker run <your-image> train`
      - For serving: `docker run <your-image> serve`
    - In this case, `train` and `serve` should be executable programs (e.g., Python scripts) within your container.
    - Example:
      ```
      # At training time
      $ ./train

      # At serving time
      $ ./serve
      ```

 
 2. **ENTRYPOINT defined:**
    - If you specify an `ENTRYPOINT` in your Dockerfile, that program will be executed when the container starts.
    - SageMaker will pass `train` or `serve` as the first argument to this program.
    - Your entrypoint program should examine this argument and decide how to proceed.
    - Example Dockerfile:
      ```dockerfile
      ENTRYPOINT ["python", "my_script.py"]
      ```
    - In this case, `my_script.py` would receive `train` or `serve` as its first command-line argument:
      ```python
      import sys

      if sys.argv[1] == 'train':
          pass  # Execute training code here
      elif sys.argv[1] == 'serve':
          pass  # Execute serving code here
      ```


 
 3. **Separate containers for training and hosting:**
    - You can create distinct containers for training and hosting.
    - In this scenario, you can define an `ENTRYPOINT` that's specific to each container's purpose.
    - The `train` or `serve` argument can be ignored or used for verification.
    - Example Dockerfile for a training-only container:
      ```dockerfile
      ENTRYPOINT ["python", "train.py"]
      ```
    - Example Dockerfile for a serving-only container:
      ```dockerfile
      ENTRYPOINT ["python", "serve.py"]
      ```


 
 By understanding these execution patterns, you can design your Docker containers to work seamlessly with Amazon SageMaker, whether for training, inference, or both.


#### Running your container during training

When Amazon SageMaker runs training, your `train` script is run just like a regular Python program. A number of files are laid out for your use, under the `/opt/ml` directory:

    /opt/ml
    |-- input
    |   |-- config
    |   |   |-- hyperparameters.json
    |   |   `-- resourceConfig.json
    |   `-- data
    |       `-- <channel_name>
    |           `-- <input data>
    |-- model
    |   `-- <model files>
    `-- output
        `-- failure
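
As a rough sketch of how a training script might refer to this layout, the snippet below collects the paths in one place. The `SageMakerPaths` class, its `prefix` parameter, and the `channels()` helper are illustrative names, not part of any SageMaker API. The overridable prefix is an assumption made so the same code can be pointed at a local test directory.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SageMakerPaths:
    """Hypothetical helper that mirrors the /opt/ml layout shown above."""

    prefix: str = "/opt/ml"

    @property
    def hyperparameters(self) -> str:
        return os.path.join(self.prefix, "input", "config", "hyperparameters.json")

    @property
    def resource_config(self) -> str:
        return os.path.join(self.prefix, "input", "config", "resourceConfig.json")

    @property
    def model_dir(self) -> str:
        return os.path.join(self.prefix, "model")

    @property
    def failure_file(self) -> str:
        return os.path.join(self.prefix, "output", "failure")

    def channel_dir(self, channel_name: str) -> str:
        return os.path.join(self.prefix, "input", "data", channel_name)

    def channels(self) -> list:
        """Return the channel directories that exist under input/data, sorted."""
        data_dir = os.path.join(self.prefix, "input", "data")
        if not os.path.isdir(data_dir):
            return []
        return sorted(
            name for name in os.listdir(data_dir)
            if os.path.isdir(os.path.join(data_dir, name))
        )
```

With the default prefix, `SageMakerPaths().channel_dir("train")` points at `/opt/ml/input/data/train`. Passing a different prefix lets the same script run against a local copy of the directory tree.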


##### The input

---

`/opt/ml/input/config` contains information to control how your program runs. `hyperparameters.json` is a JSON-formatted dictionary of hyperparameter names to values. 

These values will always be strings, so you may need to convert them. 

`resourceConfig.json` is a JSON-formatted file that describes the network layout used for distributed training. 

Since scikit-learn doesn't support distributed training, we'll ignore it here.
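
Because every hyperparameter arrives as a string, a training script usually converts the values before using them. The following is a minimal sketch of one way to do that. The function names `parse_value` and `load_hyperparameters` are hypothetical, and the conversion rules (booleans first, then int, then float, otherwise leave the string alone) are an assumption rather than anything SageMaker prescribes.

```python
import json


def parse_value(raw):
    """Best-effort conversion of a string hyperparameter to bool, int, or float."""
    if not isinstance(raw, str):
        return raw
    lowered = raw.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw  # leave anything else as the original string


def load_hyperparameters(path="/opt/ml/input/config/hyperparameters.json"):
    """Read the hyperparameter file and convert each string value."""
    with open(path) as f:
        raw_params = json.load(f)
    return {name: parse_value(value) for name, value in raw_params.items()}
```

If your algorithm needs stricter handling, an explicit per-parameter conversion such as `int(params["max_depth"])` is easier to reason about than automatic guessing.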

---


`/opt/ml/input/data/<channel_name>/` (for File mode) contains the input data for that channel. 

The channels are created based on the __call to CreateTrainingJob__ but it's generally important that channels match what the algorithm expects. The files for each channel will be copied from S3 to this directory, preserving the tree structure indicated by the S3 key structure. 



Explanation:
This paragraph describes the structure and purpose of the input data directory in Amazon SageMaker's File mode. Here's a breakdown of the key concepts:

1. File mode: This is one of the data input modes in SageMaker, where data is downloaded to the container's file system before training starts.

2. Channel: In SageMaker, a channel is a named input source for the training data. Different channels can be used for different types of input data (e.g., training data, validation data).

3. CreateTrainingJob: This is an API call in SageMaker that initiates a training job. It's where you specify the input channels and their configurations.

4. S3 to container copying: SageMaker automatically copies the data from the specified S3 locations to the appropriate directories in the container.

5. Preserving directory structure: The hierarchy of files and folders in S3 is maintained when copied to the container, which can be useful for organizing different types or subsets of data.



6. Matching algorithm expectations: It's crucial that the channel names and data organization align with what your training algorithm is designed to use, ensuring smooth data ingestion during the training process.

For example, if your training script expects data in a specific format and location, you need to ensure that the data is organized accordingly in your S3 bucket and that you specify the correct channel names when creating the training job.

The following Python sketch shows how the channel names passed to `fit()` map to directories inside the container, and how a training script might read the data from them:

```python
# Example 1: Simple training with one channel
import sagemaker

estimator = sagemaker.estimator.Estimator(
    # ... other parameters ...
)

estimator.fit({
    'train': 's3://my-bucket/path/to/train/data'
})

# In this case, your training script should expect data in /opt/ml/input/data/train/

# Example 2: Multiple channels for train and validation data
estimator.fit({
    'train': 's3://my-bucket/path/to/train/data',
    'validation': 's3://my-bucket/path/to/validation/data'
})

# Your training script should now look for:
# - Training data in /opt/ml/input/data/train/
# - Validation data in /opt/ml/input/data/validation/

# Example 3: Custom channel names
estimator.fit({
    'time_series': 's3://my-bucket/path/to/time_series/data',
    'static_features': 's3://my-bucket/path/to/static_features/data'
})

# Your training script should expect:
# - Time series data in /opt/ml/input/data/time_series/
# - Static features in /opt/ml/input/data/static_features/

# Example 4: Accessing data in your training script
import os
import pandas as pd

def train():
    train_data_path = '/opt/ml/input/data/train'
    train_files = [os.path.join(train_data_path, file) for file in os.listdir(train_data_path)]
    
    df = pd.concat([pd.read_csv(file) for file in train_files])
    # ... rest of your training code ...

# This script will work with the data structure created by Example 1
```
---





`/opt/ml/input/data/<channel_name>_<epoch_number>` (for Pipe mode) is the pipe for a given epoch. Epochs start at zero and go up by one each time you read them. There is no limit to the number of epochs that you can run, but you must close each pipe before reading the next epoch.
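
The epoch-by-epoch reading pattern described above can be sketched as follows. This is an illustration under assumptions: the function name `read_epochs`, its parameters, and the chunk size are hypothetical, and the code treats each pipe as an ordinary file-like path, which is how a FIFO appears to Python's `open()`.

```python
import os


def read_epochs(channel_name, num_epochs, data_dir="/opt/ml/input/data", chunk_size=1 << 20):
    """Yield (epoch, chunk) pairs, reading one epoch's pipe at a time."""
    for epoch in range(num_epochs):
        pipe_path = os.path.join(data_dir, f"{channel_name}_{epoch}")
        # The with-block closes this epoch's pipe before the next one is opened.
        with open(pipe_path, "rb") as pipe:
            while True:
                chunk = pipe.read(chunk_size)
                if not chunk:
                    break
                yield epoch, chunk
```

A training loop could consume it with `for epoch, chunk in read_epochs("train", 3): ...`, parsing each chunk as it arrives instead of waiting for the whole dataset.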



##### The output

* `/opt/ml/model/` is the directory where you write the model that your algorithm generates. Your model can be in any format that you want. It can be a single file or a whole directory tree. SageMaker will package any files in this directory into a compressed tar archive file. This file will be available at the S3 location returned in the `DescribeTrainingJob` result.
* `/opt/ml/output` is a directory where the algorithm can write a file `failure` that describes why the job failed. The contents of this file will be returned in the `FailureReason` field of the `DescribeTrainingJob` result. For jobs that succeed, there is no reason to write this file as it will be ignored.
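
To make the output contract concrete, here is a minimal sketch of a wrapper that a `train` program could use. The names `run_training` and `train_fn` are hypothetical, and the exact text written to the `failure` file is an assumption. The only parts taken from the description above are the two directories and the `failure` file name.

```python
import os
import traceback


def run_training(train_fn, model_dir="/opt/ml/model", output_dir="/opt/ml/output"):
    """Run train_fn(model_dir); on error, record the reason in <output_dir>/failure.

    Returns a process exit code: 0 on success, 1 on failure.
    """
    try:
        os.makedirs(model_dir, exist_ok=True)
        train_fn(model_dir)  # expected to write its model files into model_dir
        return 0
    except Exception as exc:
        os.makedirs(output_dir, exist_ok=True)
        with open(os.path.join(output_dir, "failure"), "w") as f:
            f.write(f"Exception during training: {exc}\n{traceback.format_exc()}")
        return 1
```

A script might finish with `sys.exit(run_training(my_train_function))` so that a failed run exits with a non-zero status in addition to leaving the `failure` file behind.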




#### Running your container during hosting

Hosting has a very different model than training because hosting responds to inference requests that come in via HTTP. 

In this example, we use our recommended Python serving stack to provide robust and scalable serving of inference requests:



![Request serving stack](stack.png)



This stack is implemented in the sample code here and you can mostly just leave it alone. 

Amazon SageMaker uses two URLs in the container:

* `/ping` will receive `GET` requests from the infrastructure. Your program returns 200 if the container is up and accepting requests.
* `/invocations` is the endpoint that receives client inference `POST` requests. The format of the request and the response is up to the algorithm. If the client supplied `ContentType` and `Accept` headers, these will be passed in as well. 
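
The logic behind these two URLs can be sketched independently of any web framework. The code below is an illustration, not the sample's `predictor.py`: the function names, the choice of 404 for an unhealthy container, the 415 response for unsupported content types, and the CSV-in, one-value-per-line-out format are all assumptions. In a real container these functions would be called from the route handlers of the web framework you use.

```python
import csv
import io


def handle_ping(model):
    """Return an HTTP status code: 200 when the model is loaded, 404 otherwise."""
    return 200 if model is not None else 404


def handle_invocations(model, body, content_type="text/csv"):
    """Return (status, response_body) for one inference request.

    `model` is any callable that maps a list of float rows to a list of predictions.
    """
    if content_type != "text/csv":
        return 415, "This sketch only supports text/csv input"
    rows = [[float(cell) for cell in row] for row in csv.reader(io.StringIO(body)) if row]
    predictions = model(rows)
    # One prediction per line, with a trailing newline after the last one
    return 200, "".join(f"{p}\n" for p in predictions)
```

Keeping the request handling in plain functions like these makes it possible to test the prediction path without starting a server.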



The container will have the model files in the same place they were written during training:

    /opt/ml
    `-- model
        `-- <model files>


### The parts of the sample container

In the `container` directory are all the components you need to package the sample algorithm for Amazon SageMaker:

    .
    |-- Dockerfile
    |-- build_and_push.sh
    |-- local_test
    `-- app_name (decision_trees)
        |-- nginx.conf
        |-- predictor.py
        |-- serve
        |-- train
        `-- wsgi.py



Let's discuss each of these in turn:

* __`Dockerfile`__ describes how to build your Docker container image. More details below.
* __`build_and_push.sh`__ is a script that uses the Dockerfile to build your container image and then pushes it to ECR. We'll invoke the commands directly later in this notebook, but you can just copy and run the script for your own algorithms.
* __`(app_name) decision_trees`__ is the directory which contains the files that will be installed in the container.
* __`local_test`__ is a directory that shows how to test your new container on any computer that can run Docker, including an Amazon SageMaker notebook instance. Using this method, you can quickly iterate using small datasets to eliminate any structural bugs before you use the container with Amazon SageMaker. We'll walk through local testing later in this notebook.



In this simple application, we only install five files in the container. You may only need that many or, if you have many supporting routines, you may wish to install more. These five show the standard structure of our Python containers, although you are free to choose a different toolset and therefore could have a different layout. If you're writing in a different programming language, you'll certainly have a different layout depending on the frameworks and tools you choose.



The files that we'll put in the container are:

* __`nginx.conf`__ is the configuration file for the nginx front-end. Generally, you should be able to take this file as-is.
* __`predictor.py`__ (`app.py`) is the program that actually implements the Flask web server and the decision tree predictions for this app. You'll want to customize the actual prediction parts to your application. Since this algorithm is simple, we do all the processing here in this file, but you may choose to have separate files for implementing your custom logic.


* __`serve`__ is the program started when the container is started for hosting. It simply launches the gunicorn server which runs multiple instances of the Flask app defined in `predictor.py`. You should be able to take this file as-is.


* __`train`__ is the program that is invoked when the container is run for training. You will modify this program to implement your training algorithm.


* __`wsgi.py`__ is a small wrapper used to invoke the Flask app. You should be able to take this file as-is.



In summary, the two files you will probably want to change for your application are `train` and `predictor.py`.

### The Dockerfile

The Dockerfile describes the image that we want to build. You can think of it as describing the complete operating system installation of the system that you want to run. A running Docker container is quite a bit lighter than a full operating system, however, because it takes advantage of Linux on the host machine for the basic operations. 

For the Python science stack, we start from a base image (a Python 3.7 image in this example) and run the normal tools to install the things needed by scikit-learn. Finally, we add the code that implements our specific algorithm to the container and set up the right environment to run under.

Along the way, we clean up extra space. This makes the container smaller and faster to start.

Let's look at the Dockerfile for the example:

## Code Cell 1

In [1]:
!cat container/Dockerfile

# Build an image that can do training and inference in SageMaker
# This is a Python 3 image that uses the nginx, gunicorn, flask stack
# for serving inferences in a stable way.

FROM public.ecr.aws/bitnami/python:3.7

MAINTAINER Amazon AI <sage-learner@amazon.com>


RUN apt-get -y update && apt-get install -y --no-install-recommends \
         wget \
         python3-pip \
         python3-setuptools \
         nginx \
         ca-certificates \
    && rm -rf /var/lib/apt/lists/*

RUN ln -s /usr/bin/python3 /usr/bin/python
RUN ln -sf /usr/bin/pip3 /usr/bin/pip

# Here we get all python packages.
# There's substantial overlap between scipy and numpy that we eliminate by
# linking them together. Likewise, pip leaves the install caches populated which uses
# a significant amount of space. These optimizations save a fair amount of space in the
# image, which reduces start up time.
RUN pip --no-cache-dir install numpy==1.16.2 scipy==1.2.1 scikit-learn==0.20.2 pandas flask gunicorn

# Set so

### Building and registering the container

The following shell code shows how to build the container image using `docker build` and push the container image to ECR using `docker push`. 

This code is also available as the shell script `container/build_and_push.sh`, which you can run as `build_and_push.sh linear-regression_sample` to build the image `linear-regression_sample`. 



This code looks for an ECR repository in the account you're using and the current default region (if you're using a SageMaker notebook instance, this will be the region where the notebook instance was created). If the repository doesn't exist, the script will create it.

## Code Cell 2

In [None]:
# Install sm-docker and other necessary tools. 
# To install, simply use pip install within your notebook environment
!pip install setuptools
!pip install sagemaker-studio-image-build
!pip install matplotlib

## Code Cell 3

In [None]:
%%sh

# The name of our algorithm
# This line sets a shell variable 'algorithm_name' to 'sagemaker-linear-regression'
# This will be used as the base name for our Docker image
algorithm_name=sagemaker-linear-regression

# Change directory to the 'container' folder
# This is where our Dockerfile and other necessary files are located
cd container

# Make the 'train' script executable
# This script will be used to train our model inside the Docker container
chmod +x linear_regression/train

# Make the 'serve' script executable
# This script will be used to serve predictions from our model inside the Docker container
chmod +x linear_regression/serve

# The name of the image
# This creates the full name for our Docker image, appending ':latest' to indicate it's the most recent version
fullname="${algorithm_name}:latest"

# Use sagemaker-studio-image-build (sm-docker) to create and push the Docker image
# This command builds the Docker image and pushes it to Amazon ECR
# Parameters:
#   . : Build context (current directory)
#   --role sagemaker_studio_role : IAM role to use for building and pushing the image
#   --repository ${fullname} : The name of the ECR repository to push the image to
#   --bucket <Enter_Your_Bucket_Name> : S3 bucket to store temporary files during the build process
sm-docker build . --role sagemaker_studio_role --repository ${fullname} \
    --bucket <Enter_Your_Bucket_Name>

# When this script is run, the following actions occur:
# 1. The sm-docker command is executed, which is part of the sagemaker-studio-image-build tool.
# 2. It builds a Docker image from the current directory (.) using the Dockerfile present there.
# 3. The --role parameter specifies the IAM role to be used for building and pushing the image.
# 4. The --repository parameter sets the name of the ECR repository where the image will be pushed.
# 5. The --bucket parameter specifies the S3 bucket to store temporary files during the build process.
# 6. The image is built based on the instructions in the Dockerfile.
# 7. Once built, the image is automatically pushed to the specified ECR repository.
# 8. Any temporary files created during the build process are stored in the specified S3 bucket.

# Note: You need to replace <Enter_Your_Bucket_Name> with an actual S3 bucket name for this to work.

# Part 2: Review and prepare your data

Your data is loaded in the `data/` folder. Review the data and split it into training and test sets.


## Code Cell 4

In [2]:
# Import Pandas and Numpy
# To learn more about Pandas, go to: https://pandas.pydata.org
# To learn more about Numpy, go to https://numpy.org/
import os
import pandas as pd
import numpy as np

This dataset has five columns. It was created for demonstration purposes only.

* `book_id` - Unique ID number of the book
* `customer_ratings` - Customer rating received, on a scale of 1-5
* `helpful_votes` - Number of ratings that were positive toward the product
* `total_votes` - Total number of ratings
* `price` - Listed price

## Code Cell 5

In [3]:
# Read the file book_data.csv as a dataFrame
df = pd.read_csv('data/book_data.csv')

# Print the first 5 lines of the dataFrame.
df.head()

Unnamed: 0,book_id,customer_ratings,helpful_votes,total_votes,price
0,48541186,5,16,20,2.638007
1,52253037,5,1,1,2.662567
2,52534781,4,16,19,2.531479
3,25947084,5,2,2,2.572494
4,37527885,5,1,1,2.662567


An important part of training a machine learning model is splitting the data into training, validation, and test subsets. Here we make a simple 80/20 train/test split with the `train_test_split()` function from the `sklearn` library ([documentation](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html)).

## Code Cell 6

In [4]:
# Import train_test_split module from Scikit-learn 
# To learn more about data split, 
# go to https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html

from sklearn.model_selection import train_test_split
train, test = train_test_split(df,test_size=0.2)

print(f"Length of train_data is: {train.shape}")
print(f"Length of test_data is: {test.shape}")

Length of train_data is: (3999, 5)
Length of test_data is: (1000, 5)


## Code Cell 7

In [5]:
# Converting the array to the data frame
train_df = pd.DataFrame(train)
test_df = pd.DataFrame(test)

# Save the train_df frames as train_data.csv
train_df.to_csv('data/train_data.csv', header=False, index = False)


# Save the test_df frames as test_data.csv
test_df.to_csv('data/test_data.csv', header=False, index = False)

## Code Cell 8

In [6]:
%%sh

# Move the train and test data to data/train and data/test folder respectively

cd data/
# -p avoids an error if the folders already exist (for example, when the cell is rerun)
mkdir -p test
mkdir -p train
mv train_data.csv train/
mv test_data.csv test/

# Part 3: Using your Algorithm in Amazon SageMaker

Once you have your container packaged and your data reviewed, you can use the container to train models and host them for inference.

## Set up the environment

Here we specify a bucket to use and the role that will be used for working with SageMaker.

## Code Cell 9

In [None]:
# S3 bucket name and prefix
bucket_name = '<Enter_Your_Bucket_Name>'
prefix = "DEMO-scikit-book-price"

# Import boto3 and define IAM role
import boto3
from sagemaker import get_execution_role

role = get_execution_role()

## Create the session

The session remembers our connection parameters to SageMaker. We'll use it to perform all of our SageMaker operations.

## Code Cell 10

In [None]:
import sagemaker as sage
from time import gmtime, strftime

sess = sage.Session()

## Upload the data for training

When training large models with huge amounts of data, you'll typically use big data tools, like Amazon Athena, AWS Glue, or Amazon EMR, to create your data in S3. For the purposes of this example, we're using the small book dataset that we split into training and test sets in Part 2.

We can use the tools provided by the SageMaker Python SDK to upload the data to the bucket specified above. 

## Code Cell 11

In [None]:
WORK_DIRECTORY = "data/train"

data_location = sess.upload_data(WORK_DIRECTORY, bucket=bucket_name, key_prefix=prefix)

# This code uploads the training data to an S3 bucket for use in SageMaker training.
# Here's what's happening:

# 1. WORK_DIRECTORY is set to "data/train", which is likely the local directory 
#    containing the training data files.

# 2. sess.upload_data() is a method provided by the SageMaker Python SDK that:
#    - Takes the local directory specified by WORK_DIRECTORY
#    - Uploads all files in this directory to the specified S3 bucket
#    - Uses the specified key_prefix to organize the files in the S3 bucket

# 3. The upload_data() method returns the S3 URI where the data is stored, 
#    which is then assigned to data_location.

# Behind the scenes, this code will:
# - Connect to your AWS account using your credentials
# - Upload all files from the WORK_DIRECTORY to the bucket named in bucket_name
#   (this bucket must already exist; only the SDK's default bucket is created on demand)
# - Organize the uploaded files under the specified prefix in the bucket

# For a directory upload, the resulting data_location is the S3 URI of the prefix:
# s3://<bucket_name>/<prefix>
# This URI can then be used in subsequent steps to tell SageMaker where to find 
# the training data when setting up a training job.

## Create an estimator and fit the model

In order to use SageMaker to fit our algorithm, we'll create an `Estimator` that defines how to use the container to train. This includes the configuration we need to invoke SageMaker training:

* The __container name__. This is constructed as in the shell commands above.
* The __role__. As defined above.
* The __instance count__ which is the number of machines to use for training.
* The __instance type__ which is the type of machine to use for training.
* The __output path__ determines where the model artifact will be written.
* The __session__ is the SageMaker session object that we defined above.

Then we use `fit()` on the estimator to train against the data that we uploaded above.

## Code Cell 12

In [None]:
output_path=f"s3://{bucket_name}/output"
print(output_path)

## Code Cell 13

In [None]:
account = sess.boto_session.client("sts").get_caller_identity()["Account"]
region = sess.boto_session.region_name
image = f"{account}.dkr.ecr.{region}.amazonaws.com/sagemaker-linear-regression:latest"
print(image)

linear_regression = sage.estimator.Estimator(
    image_uri=image,
    role=role,
    instance_count=1,
    instance_type="ml.m4.xlarge",
    output_path=f"s3://{bucket_name}/output",
    sagemaker_session=sess,
)



In [None]:
# The fit() function initiates the training process for the linear regression model
# Here's what happens behind the scenes:

# 1. SageMaker prepares the training job:
#    - It sets up the specified EC2 instance (ml.m4.xlarge in this case)
#    - It pulls the Docker image we built earlier from ECR

# 2. SageMaker copies the training data from the specified S3 location (data_location)
#    to the EC2 instance's local storage

# 3. SageMaker runs the training script (train) inside the Docker container:
#    - The script loads the data
#    - It trains the linear regression model using the data
#    - It saves the trained model artifacts

# 4. After training completes, SageMaker copies the model artifacts to the
#    specified S3 output location (s3://{bucket_name}/output)

# 5. SageMaker terminates the EC2 instance

# 6. The fit() function returns, and the trained model is now ready for deployment

linear_regression.fit(data_location)

## Hosting your model
You can use a trained model to get real-time predictions from an HTTP endpoint. The following steps walk you through the process.

### Deploy the model

Deploying the model to SageMaker hosting just requires a `deploy` call on the fitted model. This call takes an instance count, instance type, and optionally serializer and deserializer functions. These are used when the resulting predictor is created on the endpoint.

## Code Cell 14

In [None]:
from sagemaker.serializers import CSVSerializer

# Deploy the model
predictor = linear_regression.deploy(1, "ml.m5.xlarge", serializer=CSVSerializer())

# This code deploys a trained machine learning model using SageMaker. Here's a step-by-step explanation:

# 1. Import the CSVSerializer from the sagemaker.serializers module.
#    This serializer converts the data passed to predict() into CSV format before it is sent to the endpoint.

# 2. Deploy the model:
#    - 1 is the number of instances to deploy.
#    - "ml.m5.xlarge" is the instance type to use for hosting.

# Part 4: Validate the Endpoint

### Use test data for a prediction

To try out the endpoint, we'll send it the test data and collect the predictions. This is a way to see how the mechanism works.

## Code Cell 15

In [None]:
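# The test file was saved without a header row, so tell pandas not to treat the first row as one
test_data = pd.read_csv("data/test/test_data.csv", header=None)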

## Code Cell 16

In [None]:
x_data = test_data.iloc[:,1:4]
y_data = test_data.iloc[:,-1]

Prediction is as easy as calling predict with the predictor we got back from deploy and the data we want to do predictions with. The serializers take care of doing the data conversions for us.
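
To show roughly what this conversion involves, the sketch below serializes rows to CSV text and parses a newline-separated response back into numbers using only the standard library. It is not the SDK's `CSVSerializer` implementation. The function names are hypothetical, the sample rows are made-up values, and the one-value-per-line response format is an assumption about what the container returns.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize an iterable of rows into CSV text, one line per row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def parse_predictions(response_text):
    """Turn a newline-separated response into a list of floats, skipping blank lines."""
    return [float(line) for line in response_text.splitlines() if line.strip()]


# Made-up feature rows, used only to illustrate the request body format
payload = rows_to_csv([[5, 16, 20], [4, 16, 19]])
print(payload)
```

The payload printed here is the kind of text that would travel in the request body, and `parse_predictions` handles the reverse direction for a response with one prediction per line.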

## Code Cell 17

In [None]:
predictions = predictor.predict(x_data.values).decode("utf-8")

## Code Cell 18

In [None]:
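# Convert the received prediction results into a list of numbers.
# This assumes the endpoint returns one numeric prediction per line.

def convert(string):
    # Split on newlines and skip empty entries, such as the one after the final newline
    return [float(value) for value in string.split("\n") if value]

predictions = convert(predictions)

# Print the prediction values as a list
print(predictions)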

#### Review the predicted result and the test result in a scatter plot

A scatter plot is a graph in which the values of two variables are plotted along two axes, so that the pattern of the resulting points reveals any correlation present.
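
As a numeric complement to the visual check, the correlation between predicted and actual values can also be computed directly. The following is a minimal sketch of the Pearson correlation coefficient using only the standard library. The function name `pearson_correlation` is hypothetical, and it is offered as one possible summary statistic rather than the evaluation method of this example.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)
```

With the variables from this notebook, a call such as `pearson_correlation([float(p) for p in predictions], y_list)` would return a value between -1 and 1, where values near 1 indicate that predictions rise together with the actual prices.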

## Code Cell 19

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

## Code Cell 20

In [None]:
# Convert the y_data array to a list
y_list = y_data.tolist()

## Code Cell 21

In [None]:
plt.scatter(predictions, y_list)

### Optional cleanup
When you're done with the endpoint, you'll want to clean it up.

## Code Cell 22

In [None]:
# predictor.delete_model()
# predictor.delete_endpoint()