![ML Logo](../../images/mod00_logo.png "Logo") 

# Module 7 : ML Model Deployment into Production for Batch & Real-Time Predictions (Bring Your Own Algorithm with Containers)


`Revision History: PA1, 2020-04-15, @akirmak: Initial version`

## Module Overview

This notebook is a slightly modified version of the [AWS SageMaker Samples in Github: Advanced Functionality: SciKit Bring Your Own Model](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb) and is based on the [AWS Blog: Train and host Scikit-Learn models in Amazon SageMaker by building a Scikit Docker container](https://aws.amazon.com/blogs/machine-learning/train-and-host-scikit-learn-models-in-amazon-sagemaker-by-building-a-scikit-docker-container/).
 
In this module, you will 
 1. train your model locally and then package it in a container 
 1. make SageMaker use your custom container for training & inference.

Unlike the original example, we will mostly use the CLI and the SageMaker console, so that you become familiar with those environments beyond a Jupyter notebook.


# Building your own algorithm container

With Amazon SageMaker, you can package your own algorithms that can then be trained and deployed in the SageMaker environment. This notebook will guide you through an example that shows you how to build a Docker container for SageMaker and use it for training and inference.

By packaging an algorithm in a container, you can bring almost any code to the Amazon SageMaker environment, regardless of programming language, environment, framework, or dependencies. 

_**Note:**_ SageMaker now includes a [pre-built scikit container](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_iris/Scikit-learn%20Estimator%20Example%20With%20Batch%20Transform.ipynb).  We recommend the pre-built container be used for almost all cases requiring a scikit algorithm.  However, this example remains relevant as an outline for bringing in other libraries to SageMaker as your own container.



## When should I build my own algorithm container?

You may not need to create a container to bring your own code to Amazon SageMaker. When you are using a framework (such as Apache MXNet or TensorFlow) that has direct support in SageMaker, you can simply supply the Python code that implements your algorithm using the SDK entry points for that framework. This set of frameworks is continually expanding, so we recommend that you check the current list if your algorithm is written in a common machine learning environment.

Even if there is direct SDK support for your environment or framework, you may find it more effective to build your own container. If the code that implements your algorithm is quite complex on its own or you need special additions to the framework, building your own container may be the right choice.

If there isn't direct SDK support for your environment, don't worry. You'll see in this walk-through that building your own container is quite straightforward.

## Permissions

Running this notebook requires permissions in addition to the normal `AmazonSageMakerFullAccess` permissions. This is because we'll be creating new repositories in Amazon ECR. The easiest way to add these permissions is simply to add the managed policy `AmazonEC2ContainerRegistryFullAccess` to the role that you used to start your notebook instance. There's no need to restart your notebook instance when you do this; the new permissions will be available immediately.

## The example

Here, we'll show how to package a simple Python example which showcases the [decision tree][] algorithm from the widely used [scikit-learn][] machine learning package. The example is purposefully fairly trivial since the point is to show the surrounding structure that you'll want to add to your own code so you can train and host it in Amazon SageMaker.

The ideas shown here will work in any language or environment. You'll need to choose the right tools for your environment to serve HTTP requests for inference, but good HTTP environments are available in every language these days.

In this example, we use a single image to support training and hosting. This is easy because it means that we only need to manage one image and we can set it up to do everything. Sometimes you'll want separate images for training and hosting because they have different requirements. Just separate the parts discussed below into separate Dockerfiles and build two images. Choosing whether to have a single image or two images is really a matter of which is more convenient for you to develop and manage.

If you're only using Amazon SageMaker for training or hosting, but not both, there is no need to build the unused functionality into your container.

[scikit-learn]: http://scikit-learn.org/stable/
[decision tree]: http://scikit-learn.org/stable/modules/tree.html



# Part 1: Packaging and Uploading your Algorithm for use with Amazon SageMaker

### An overview of Docker

If you're familiar with Docker already, you can skip ahead to the next section.

For many data scientists, Docker containers are a new concept, but they are not difficult, as you'll see here. 

Docker provides a simple way to package arbitrary code into an _image_ that is totally self-contained. Once you have an image, you can use Docker to run a _container_ based on that image. Running a container is just like running a program on the machine except that the container creates a fully self-contained environment for the program to run. Containers are isolated from each other and from the host environment, so the way you set up your program is the way it runs, no matter where you run it.

Docker is more powerful than environment managers like conda or virtualenv because (a) it is completely language independent and (b) it comprises your whole operating environment, including startup commands, environment variables, etc.

In some ways, a Docker container is like a virtual machine, but it is much lighter weight. For example, a program running in a container can start in less than a second and many containers can run on the same physical machine or virtual machine instance.

Docker uses a simple file called a `Dockerfile` to specify how the image is assembled. We'll see an example of that below. You can build your Docker images based on Docker images built by yourself or others, which can simplify things quite a bit.

Docker has become very popular in the programming and devops communities for its flexibility and well-defined specification of the code to be run. It is the underpinning of many services built in the past few years, such as [Amazon ECS].

Amazon SageMaker uses Docker to allow users to train and deploy arbitrary algorithms.

In Amazon SageMaker, Docker containers are invoked in a certain way for training and a slightly different way for hosting. The following sections outline how to build containers for the SageMaker environment.

Some helpful links:

* [Docker home page](http://www.docker.com)
* [Getting started with Docker](https://docs.docker.com/get-started/)
* [Dockerfile reference](https://docs.docker.com/engine/reference/builder/)
* [`docker run` reference](https://docs.docker.com/engine/reference/run/)

[Amazon ECS]: https://aws.amazon.com/ecs/

### How Amazon SageMaker runs your Docker container

![Figure](../../images/mod07_fig03.png "Figure") 


Because you can run the same image in training or hosting, Amazon SageMaker runs your container with the argument `train` or `serve`. How your container processes this argument depends on the container:

* In the example here, we don't define an `ENTRYPOINT` in the Dockerfile so Docker will run the command `train` at training time and `serve` at serving time. In this example, we define these as executable Python scripts, but they could be any program that we want to start in that environment.
* If you specify a program as an `ENTRYPOINT` in the Dockerfile, that program will be run at startup and its first argument will be `train` or `serve`. The program can then look at that argument and decide what to do.
* If you are building separate containers for training and hosting (or building only for one or the other), you can define a program as an `ENTRYPOINT` in the Dockerfile and ignore (or verify) the first argument passed in. 
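The second option above, a single `ENTRYPOINT` program that inspects its first argument, can be sketched in Python as follows. This is a minimal illustration rather than the sample's actual code: `run_training`, `run_serving`, and `dispatch` are hypothetical names, and the two actions are placeholders.

```python
def run_training():
    # Placeholder: a real container would fit a model and write it to /opt/ml/model.
    return "training"


def run_serving():
    # Placeholder: a real container would start its HTTP server here.
    return "serving"


# Map the argument SageMaker passes to the action it should trigger.
MODES = {"train": run_training, "serve": run_serving}


def dispatch(argv):
    """Pick the action from the first container argument."""
    if not argv or argv[0] not in MODES:
        raise SystemExit(f"expected one of {sorted(MODES)}, got {argv!r}")
    return MODES[argv[0]]()
```

In an entrypoint script you would call `dispatch(sys.argv[1:])`, so that `docker run <image> train` and `docker run <image> serve` reach the matching branch.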

#### Running your container during training

When Amazon SageMaker runs training, your `train` script is run just like a regular Python program. A number of files are laid out for your use, under the `/opt/ml` directory:

    /opt/ml
    |-- input
    |   |-- config
    |   |   |-- hyperparameters.json
    |   |   `-- resourceConfig.json
    |   `-- data
    |       `-- <channel_name>
    |           `-- <input data>
    |-- model
    |   `-- <model files>
    `-- output
        `-- failure

##### The input

* `/opt/ml/input/config` contains information to control how your program runs. `hyperparameters.json` is a JSON-formatted dictionary of hyperparameter names to values. These values will always be strings, so you may need to convert them. `resourceConfig.json` is a JSON-formatted file that describes the network layout used for distributed training. Since scikit-learn doesn't support distributed training, we'll ignore it here.
* `/opt/ml/input/data/<channel_name>/` (for File mode) contains the input data for that channel. The channels are created based on the call to CreateTrainingJob but it's generally important that channels match what the algorithm expects. The files for each channel will be copied from S3 to this directory, preserving the tree structure indicated by the S3 key structure. 
* `/opt/ml/input/data/<channel_name>_<epoch_number>` (for Pipe mode) is the pipe for a given epoch. Epochs start at zero and go up by one each time you read them. There is no limit to the number of epochs that you can run, but you must close each pipe before reading the next epoch.
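As a rough sketch of how a `train` program might consume these inputs in File mode, the helpers below read `hyperparameters.json`, convert a string value, and list the files of a channel. The function names and the `prefix` parameter (added so the code can be tried outside a container) are assumptions for illustration; `max_leaf_nodes` is the hyperparameter used by this example's algorithm.

```python
import json
import os


def load_hyperparameters(prefix="/opt/ml"):
    """Read hyperparameters.json; SageMaker delivers every value as a string."""
    path = os.path.join(prefix, "input", "config", "hyperparameters.json")
    with open(path) as f:
        return json.load(f)


def parse_max_leaf_nodes(params):
    """Convert the string-valued hyperparameter to an int, or None if unset."""
    value = params.get("max_leaf_nodes")
    return int(value) if value is not None else None


def list_channel_files(channel, prefix="/opt/ml"):
    """Return the File-mode input files for one channel, sorted by path."""
    channel_dir = os.path.join(prefix, "input", "data", channel)
    found = []
    for root, _dirs, files in os.walk(channel_dir):
        found.extend(os.path.join(root, name) for name in files)
    return sorted(found)
```

Walking the channel directory recursively matters because the S3 key structure is preserved, so input files may sit in subdirectories.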

##### The output

* `/opt/ml/model/` is the directory where you write the model that your algorithm generates. Your model can be in any format that you want. It can be a single file or a whole directory tree. SageMaker will package any files in this directory into a compressed tar archive file. This file will be available at the S3 location returned in the `DescribeTrainingJob` result.
* `/opt/ml/output` is a directory where the algorithm can write a file `failure` that describes why the job failed. The contents of this file will be returned in the `FailureReason` field of the `DescribeTrainingJob` result. For jobs that succeed, there is no reason to write this file as it will be ignored.
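The output contract can be sketched the same way: write the model under `model/` on success, or a `failure` file under `output/` on error. This is an illustrative outline under stated assumptions, not the sample's `train` script; `run_training_job`, the `model.pkl` file name, and the `prefix` parameter are hypothetical.

```python
import os
import pickle
import traceback


def run_training_job(train_fn, prefix="/opt/ml"):
    """Run train_fn, save its result as the model, and report failures."""
    model_dir = os.path.join(prefix, "model")
    output_dir = os.path.join(prefix, "output")
    try:
        model = train_fn()
        os.makedirs(model_dir, exist_ok=True)
        # Anything written under model/ is packaged into the tar archive.
        with open(os.path.join(model_dir, "model.pkl"), "wb") as f:
            pickle.dump(model, f)
        return 0
    except Exception as exc:
        os.makedirs(output_dir, exist_ok=True)
        # The contents of this file surface as FailureReason.
        with open(os.path.join(output_dir, "failure"), "w") as f:
            f.write(f"Exception during training: {exc}\n{traceback.format_exc()}")
        return 255
```

The return value is intended to be passed to `sys.exit()`, since a non-zero exit status is what marks the training job as failed.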

#### Running your container during hosting

Hosting has a very different model than training because hosting is responding to inference requests that come in via HTTP. In this example, we use our recommended Python serving stack to provide robust and scalable serving of inference requests:

![Request serving stack](stack.png)

This stack is implemented in the sample code here and you can mostly just leave it alone. 

Amazon SageMaker uses two URLs in the container:

* `/ping` will receive `GET` requests from the infrastructure. Your program returns 200 if the container is up and accepting requests.
* `/invocations` is the endpoint that receives client inference `POST` requests. The format of the request and the response is up to the algorithm. If the client supplied `ContentType` and `Accept` headers, these will be passed in as well. 
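To make the two-URL contract concrete, here is a minimal sketch written as a plain WSGI callable using only the standard library. Note that this swaps in raw WSGI for the Flask app that the sample's `predictor.py` uses; `make_app`, `predict_fn`, and `is_healthy` are hypothetical names, and the CSV-only check is an assumption for illustration.

```python
def make_app(predict_fn, is_healthy=lambda: True):
    """Build a WSGI callable exposing /ping and /invocations."""

    def app(environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET")
        path = environ.get("PATH_INFO", "/")

        if path == "/ping" and method == "GET":
            # 200 tells SageMaker the container is up and accepting requests.
            status = "200 OK" if is_healthy() else "404 Not Found"
            start_response(status, [("Content-Type", "application/json")])
            return [b"\n"]

        if path == "/invocations" and method == "POST":
            if environ.get("CONTENT_TYPE", "") != "text/csv":
                start_response("415 Unsupported Media Type",
                               [("Content-Type", "text/plain")])
                return [b"This predictor only supports CSV data"]
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(length).decode("utf-8")
            result = predict_fn(body)
            start_response("200 OK", [("Content-Type", "text/csv")])
            return [result.encode("utf-8")]

        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]

    return app
```

Because the result is an ordinary WSGI application, a server such as gunicorn could host it in the same way it hosts the Flask app in the sample stack.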

The container will have the model files in the same place they were written during training:

    /opt/ml
    `-- model
        `-- <model files>



### The parts of the sample container

In the `container` directory are all the components you need to package the sample algorithm for Amazon SageMaker:

    .
    |-- Dockerfile
    |-- build_and_push.sh
    `-- decision_trees
        |-- nginx.conf
        |-- predictor.py
        |-- serve
        |-- train
        `-- wsgi.py

Let's discuss each of these in turn:

* __`Dockerfile`__ describes how to build your Docker container image. More details below.
* __`build_and_push.sh`__ is a script that uses the Dockerfile to build your container image and then pushes it to ECR. We'll invoke the commands directly later in this notebook, but you can just copy and run the script for your own algorithms.
* __`decision_trees`__ is the directory which contains the files that will be installed in the container.
* __`local_test`__ is a directory that shows how to test your new container on any computer that can run Docker, including an Amazon SageMaker notebook instance. Using this method, you can quickly iterate using small datasets to eliminate any structural bugs before you use the container with Amazon SageMaker. We'll walk through local testing later in this notebook.

In this simple application, we only install five files in the container. You may only need that many or, if you have many supporting routines, you may wish to install more. These five show the standard structure of our Python containers, although you are free to choose a different toolset and therefore could have a different layout. If you're writing in a different programming language, you'll certainly have a different layout depending on the frameworks and tools you choose.

The files that we'll put in the container are:

* __`nginx.conf`__ is the configuration file for the nginx front-end. Generally, you should be able to take this file as-is.
* __`predictor.py`__ is the program that actually implements the Flask web server and the decision tree predictions for this app. You'll want to customize the actual prediction parts to your application. Since this algorithm is simple, we do all the processing here in this file, but you may choose to have separate files for implementing your custom logic.
* __`serve`__ is the program started when the container is started for hosting. It simply launches the gunicorn server which runs multiple instances of the Flask app defined in `predictor.py`. You should be able to take this file as-is.
* __`train`__ is the program that is invoked when the container is run for training. You will modify this program to implement your training algorithm.
* __`wsgi.py`__ is a small wrapper used to invoke the Flask app. You should be able to take this file as-is.

In summary, the two files you will probably want to change for your application are `train` and `predictor.py`.


### The Dockerfile

The Dockerfile describes the image that we want to build. You can think of it as describing the complete operating system installation of the system that you want to run. A running Docker container is quite a bit lighter than a full operating system, however, because it takes advantage of Linux on the host machine for the basic operations.

For the Python science stack, we will start from a standard Ubuntu installation and run the normal tools to install the things needed by scikit-learn. Finally, we add the code that implements our specific algorithm to the container and set up the right environment to run under.

Along the way, we clean up extra space. This makes the container smaller and faster to start.

Let's look at the Dockerfile for the example:

1. Open a terminal from Jupyter (go to the Jupyter Home tab and select `New -> Terminal` from the top right).

![Figure](../../images/mod07_fig001.png "Figure") 


2. Run `vi ~/.bashrc`
- Append the following:

        export PS1="\[$(tput setaf 6)\]\u@\h:\w $ \[$(tput sgr0)\]"
        export CLICOLOR=1
        export LSCOLORS=ExFxCxDxBxegedabagacad

        alias ll='ls -lah'
        export EDITOR=vim

- Do `source ~/.bashrc`
- Do `sudo yum install htop -y`
- Do `cd ~/SageMaker/architectingMLonAWS/mod7-deploy-scikit-byom/container`
- Do `less Dockerfile`

## Build the Containers and push to ECR Repo

The `build_and_push.sh` script builds the container image using `docker build` and pushes it to ECR using `docker push`. It is available as `container/build_and_push.sh`; for example, running `./build_and_push.sh decision_trees_sample` builds the image `decision_trees_sample`.

This code looks for an ECR repository in the account you're using and the current default region (if you're using a SageMaker notebook instance, this will be the region where the notebook instance was created). If the repository doesn't exist, the script will create it.

- Run `./build_and_push.sh decision_trees`
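The image pushed to ECR is addressed by a full name made up of the account ID, the region, the repository name, and a tag. The helper below sketches how that name is composed for standard commercial regions; `ecr_image_uri` is a hypothetical function, and the account ID and region in the usage note are placeholders.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build the full ECR image name that `docker push` expects."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"
```

For example, `ecr_image_uri("123456789012", "eu-west-1", "decision_trees")` yields `123456789012.dkr.ecr.eu-west-1.amazonaws.com/decision_trees:latest`, which is the kind of URI you later enter in the SageMaker console.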

## Testing and debugging

### Testing your algorithm on your local machine or on an Amazon SageMaker notebook instance

While you're first packaging an algorithm for use with Amazon SageMaker, you probably want to test it yourself to make sure it's working right. In the directory `container/local_test`, there is a framework for doing this. It includes three shell scripts for running and using the container and a directory structure that mimics the one outlined above.

The scripts are:

* `train_local.sh`: Run this with the name of the image and it will run training on the local tree. For example, you can run `$ ./train_local.sh sagemaker-decision-trees`. It will generate a model under the `test_dir/model` directory. You'll want to modify the directory `test_dir/input/data/...` to be set up with the correct channels and data for your algorithm. Also, you'll want to modify the file `test_dir/input/config/hyperparameters.json` to have the hyperparameter settings that you want to test (as strings).
* `serve_local.sh`: Run this with the name of the image once you've trained the model and it should serve the model. For example, you can run `$ ./serve_local.sh sagemaker-decision-trees`. It will run and wait for requests. Simply use the keyboard interrupt to stop it.
* `predict.sh`: Run this with the name of a payload file and (optionally) the HTTP content type you want. The content type will default to `text/csv`. For example, you can run `$ ./predict.sh payload.csv text/csv`.

The directories as shipped are set up to test the decision trees sample algorithm presented here.
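If you prefer to send the test request from Python instead of `predict.sh`, a sketch could look like the following. It assumes that the locally served container listens on `http://localhost:8080/invocations`; that URL, and the helper names `build_invocation_request` and `send`, are assumptions for illustration rather than something this module specifies.

```python
import urllib.request


def build_invocation_request(payload, content_type="text/csv",
                             url="http://localhost:8080/invocations"):
    """Create (but do not send) a POST request for the local container."""
    data = payload if isinstance(payload, bytes) else payload.encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": content_type}, method="POST"
    )


def send(request, timeout=10):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the container running via `serve_local.sh`, something like `send(build_invocation_request(open("payload.csv").read()))` would return the predictions, one per line.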

### Do a Local Training

- `cd ~/SageMaker/architectingMLonAWS/mod7-deploy-scikit-byom/container/local_test`

- Run `./train_local.sh decision_trees`
- Run `ll` 

You should see an output like below. 

- The **Model** is a `.pkl` (pickle) file in this case: a binary format for serializing Python objects, used here to save and load the trained scikit-learn model.
- Training is provided with configuration parameters (e.g. hyperparameters) via the `hyperparameters.json` and `resourceConfig.json` files.

![ML Logo](../../images/mod07_fig00.png "Figure") 

### Spawn a Local Container for Inference & accept inference requests from a local host over HTTP

Run `./serve_local.sh decision_trees`

You should see output like the figure below. This is standard Docker output; SageMaker notebook instances come with Docker preinstalled.

![ML Logo](../../images/mod07_fig02.png "Figure") 

### Open a new Terminal and Do Batch Prediction Unit Test Towards the Inference Endpoint

- Open another terminal from Jupyter (go to the Jupyter Home tab and select `New -> Terminal` from the top right)
- Do `source ~/.bashrc`
- Do `cd ~/SageMaker/architectingMLonAWS/mod7-deploy-scikit-byom/container/local_test`
- run `$ ./predict.sh payload.csv text/csv`

You should see output like the figure below. As explained above:

- the **Server** logic consists of Flask, gunicorn, and nginx, which together make up a simple app server
- the **Client** logic sends features in an HTTP POST request.

![ML Figure](../../images/mod07_fig01.png "Figure") 


#  Part 2: Using your Algorithm in Amazon SageMaker Console

Once you have your container packaged, you can use it to train models and use the model for hosting or batch transforms. Let's do that with the algorithm we made above.

## Set up the environment


### The registry path of your own training image for BYOA (Bring Your own Algorithm)
After the image is built and pushed to Amazon ECR, Amazon SageMaker invokes the training service for your algorithm by running the Docker command we introduced earlier. You can use the Amazon SageMaker console to tell SageMaker the registry path where your own training image is stored in Amazon ECR. 

Scroll down to the Resource configuration. These configurations can be left alone for our example, but feel free to customize them for your own models. The main configuration here is the Instance type, as well as the number of instances under Instance count. Note, some algorithms, like Scikit algorithms, can only take advantage of a single instance. Setting this higher won’t automatically cause your algorithm to scale across multiple instances.

Your primary choices for instance type are from the m, c, and p families of instances. The m family is a balanced instance type, that is a good place to start if you are uncertain about where your model might be most intensive. The c family is for compute intensive workloads, while the p family provides GPU capacity, often used for deep learning. You can better see where your model is most intensive by monitoring CloudWatch for SageMaker.


Summarizing the configuration required:
* The __container name__. This is constructed as in the shell commands above.
* The __role__. As defined above.
* The __instance count__ which is the number of machines to use for training.
* The __instance type__ which is the type of machine to use for training.
* The __output path__ determines where the model artifact will be written.

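In the console, these settings are entered in the Create training job form. With the SageMaker Python SDK, the equivalent step is calling `fit()` on an estimator to train against the uploaded data.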

![ML Figure](../../images/mod07_fig04.png "Figure") 

### Data Sources & Hyperparameters for your Algorithm
You can use the Amazon SageMaker console to tell SageMaker to configure your algorithm:

Set any hyperparameters for this job in the Hyperparameters box. In our example, the only hyperparameter we can set is `max_leaf_nodes`. You may optionally set a value for this here. If you are interested in enabling more hyperparameter options for your algorithm, remember these need to be passed in your train file in the Docker image.


Next, set Input data configuration. The Channel name has to align with what you programmed in your `train` script. We recommend that the S3 location that you set up for your data input matches the channel name, especially when you have multiple channels. This way Amazon SageMaker can inject channel data into the corresponding container location (e.g. `/opt/ml/input/data/<channel_name>`), which is where the `train` program looks for data. In our example, we have a single channel called `training`. Our model is expecting a CSV file, with no header, where the first column is our target label. Feel free to upload the famous Iris data set to your own S3 location for this, as it is what is used in this example.

![ML Figure](../../images/mod07_fig05.png "Figure") 
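The console form maps onto the `CreateTrainingJob` API. As a sketch of the same settings expressed in code, the function below assembles the request as a plain dictionary; `training_job_request`, the default instance type, volume size, and runtime limit are illustrative assumptions, and the resulting dictionary could be passed to `boto3.client("sagemaker").create_training_job(**request)`.

```python
def training_job_request(job_name, image_uri, role_arn, train_s3_uri,
                         output_s3_uri, hyperparameters=None,
                         instance_type="ml.m5.large", instance_count=1):
    """Assemble keyword arguments for the CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # ECR registry path of the image
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # Hyperparameter values must be strings in the API call.
        "HyperParameters": {k: str(v) for k, v in (hyperparameters or {}).items()},
        "InputDataConfig": [{
            # The channel name becomes /opt/ml/input/data/training in the container.
            "ChannelName": "training",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
            "ContentType": "text/csv",
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }
```

Keeping the request as data makes it easy to review the channel name and hyperparameters before any job is started.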



The trained model will be saved to S3 (so that you can bring it into your on-premises infrastructure, IoT devices, or elsewhere).

Finally, you need to specify where the model artifact is going to be saved on Amazon S3 after the training job is done. Remember, SageMaker takes everything that is written to `/opt/ml/model/` in the container, archives it with tar, and compresses it with gzip. Specify an Amazon S3 bucket in the `S3 output path` field.

![ML Figure](../../images/mod07_fig06.png "Figure") 



### Model Artifact

Next, choose the Create training job button.

After the created training job is completed, you should be able to see the generated model in the S3 output path you provided. The full path of the model is

`s3://<bucket name>/models/<job name>/output/model.tar.gz`
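The path pattern above can be captured in a small helper, together with a way to inspect a downloaded archive locally. Both function names are hypothetical, and the sketch assumes the S3 output path you entered plays the role of `s3://<bucket name>/models`.

```python
import tarfile


def model_artifact_uri(s3_output_path, job_name):
    """Compose the S3 URI of the model archive written by a training job."""
    return f"{s3_output_path.rstrip('/')}/{job_name}/output/model.tar.gz"


def list_model_files(archive_path):
    """List the files inside a downloaded model.tar.gz."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Listing the archive contents is a quick way to confirm that everything written to `/opt/ml/model/` during training actually made it into the artifact.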

# Part 3: Deploying to an endpoint
After the training job has finished, we can deploy an endpoint, to take our model to production. We can do this from the SageMaker console. There are three steps: First, we have to create a model resource. Second, we create an endpoint configuration. Third, we deploy the actual endpoint.

1. The first step is to create the model resource. This contains the information Amazon SageMaker needs about the Docker image and the Amazon S3 location of the artifact.

1. In the Amazon SageMaker console, in the left navigation pane, under Resources select Models. Then choose the Create model button.

We need to input a Model name, with a name like `decision-trees`, and then select or create an IAM role. You can use the same role that you used for your training job, as long as it has the `AmazonSageMakerFullAccess` IAM policy attached.

## Create Model
### The registry path of your own inference image (Bring your own Model Artifact)
Similarly, once you indicate the location of your inference code image stored in Amazon ECR and of a trained model artifact in Amazon S3, Amazon SageMaker invokes the hosting service for your algorithm through a SageMaker Runtime HTTPS endpoint, to which inference requests are sent to get predictions from the model, as shown in the following screenshot.


We need to provide information about our container. We can use the same container we used for the training job. Provide the ECR URI that we saved earlier for our container in the Location of inference code image field. Make sure to include an image tag, like the `latest` tag. Next, in the Location of model artifacts field, provide the S3 URI for the model artifact that you created in the training job. This is the `model.tar.gz` file under the S3 output path you set during training.

Choose the Create model button.



![ML Figure](../../images/mod07_fig07.png "Figure") 
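The Create model form corresponds to the `CreateModel` API. A minimal sketch of the equivalent request is shown below; `model_request` is a hypothetical helper, and its result could be passed to `boto3.client("sagemaker").create_model(**request)`.

```python
def model_request(model_name, image_uri, model_data_url, role_arn):
    """Assemble keyword arguments for the CreateModel API."""
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,              # location of inference code image
            "ModelDataUrl": model_data_url,  # S3 URI of model.tar.gz
        },
        "ExecutionRoleArn": role_arn,
    }
```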


## Hosting your model
You can use a trained model to get real-time predictions through an HTTP endpoint. The following steps walk you through the process.

## Create Endpoint
Next, we need to create an Endpoint configuration. We combine this with model resources to determine what kind of instances to provision. You can also use this to specify more than one model for doing A/B testing. For our example, we’ll only use a single model.

On the left navigation pane, under Resources, select Endpoint configuration. Choose the Create endpoint configuration button.

Give your endpoint configuration a name under Endpoint configuration name. Unless you anticipate having multiple versions of the same model, use the same name as your model. Next, choose the Add model link in the lower left. Select the model you just created, and choose Save. You can leave the additional values as their defaults for this example.

Choose the Create endpoint configuration button.

![ML Figure](../../images/mod07_fig08.png "Figure") 
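An endpoint configuration is essentially a list of production variants, each pointing at a model with a traffic weight. The sketch below builds the `CreateEndpointConfig` request for one or more models; `endpoint_config_request`, the variant naming scheme, and the default instance type are assumptions for illustration.

```python
def endpoint_config_request(config_name, variants, instance_type="ml.m5.large"):
    """Assemble keyword arguments for the CreateEndpointConfig API.

    variants: list of (model_name, weight) pairs; one pair means no A/B split.
    """
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": f"variant-{i + 1}",
                "ModelName": model_name,
                "InitialInstanceCount": 1,
                "InstanceType": instance_type,
                "InitialVariantWeight": float(weight),
            }
            for i, (model_name, weight) in enumerate(variants)
        ],
    }
```

For this module, `endpoint_config_request("decision-trees", [("decision-trees", 1)])` describes the single-model case; passing two pairs with different weights would describe an A/B test.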



## Deploy to an Actual Endpoint

Now that we have a model and an endpoint configuration, we can deploy an actual endpoint. Activating an endpoint will provision an instance. Note that as long as the endpoint is active, charges will accrue, so it’s important to only activate an endpoint when you intend to direct traffic towards it.

In the left navigation pane, under Resources, select Endpoints. Choose the Create endpoint button.

Give your endpoint a name under Endpoint name. Use the same name as your model when you are only deploying a single model with no variants. In this case, use `decision-trees`. Leave Use an existing endpoint configuration selected.

Under Endpoint configuration, select the endpoint configuration we created in the last step. Then choose the Select endpoint configuration button. Finally, choose the Create endpoint button.

![ML Figure](../../images/mod07_fig09.png "Figure") 
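Outside the console, the same step is a `CreateEndpoint` call followed by waiting until the endpoint is in service. The sketch below takes the SageMaker client as a parameter, so it is an outline under that assumption rather than a complete deployment script; `deploy_endpoint` is a hypothetical name.

```python
def deploy_endpoint(client, endpoint_name, config_name):
    """Create the endpoint, block until it is in service, return its status."""
    client.create_endpoint(
        EndpointName=endpoint_name, EndpointConfigName=config_name
    )
    # Provisioning the instance takes a while; the waiter polls until it is ready.
    client.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    return client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
```

You would call it as `deploy_endpoint(boto3.client("sagemaker"), "decision-trees", "decision-trees")`; remember that the endpoint accrues charges from this point until it is deleted.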


## Part 4: Inference
### You are Done! Now Test inferences using your Model Deployed to an Endpoint
The easiest way to test our endpoint is from the SageMaker notebook instance. Run the following code. It takes a sample of the data in your data file, sends it to the endpoint, and prints out the result.

### Choose some data and use it for a prediction
We'll extract some of the data we used for training and do predictions against it. This is, of course, bad statistical practice, but a good way to see how the mechanism works.

    import boto3
    import io
    import pandas as pd
    import itertools



    # Set the parameters below to match your own bucket, data file, and endpoint
    bucket = 'prj-ml'
    key = 'decision-tree-scikit-iris-dataset/training/iris.csv'
    endpointName = 'hba-decision-trees-scikit-decsntree-iris-2020-03-14-runV1'

    # Pull our data from S3
    s3 = boto3.client('s3')
    f = s3.get_object(Bucket=bucket, Key=key)

    # Make a dataframe (the file has no header row)
    shape = pd.read_csv(io.BytesIO(f['Body'].read()), header=None)



    shape.sample(3)

Sample output:

                 0    1    2    3    4
    93  versicolor  5.0  2.3  3.3  1.0
    42      setosa  4.4  3.2  1.3  0.2
    54  versicolor  6.5  2.8  4.6  1.5


    # Take a sample: rows 40-49 of each of the three classes (50 rows per class),
    # dropping the very last index, which leaves 29 rows
    a = [50*i for i in range(3)]
    b = [40+i for i in range(10)]
    indices = [i+j for i,j in itertools.product(a,b)]
    test_data=shape.iloc[indices[:-1]]
    test_X=test_data.iloc[:,1:]   # features
    test_y=test_data.iloc[:,0]    # true labels, kept for comparison

    # Convert the dataframe to csv data
    test_file = io.StringIO()
    test_X.to_csv(test_file, header=None, index=None)

    # Talk to SageMaker
    client = boto3.client('sagemaker-runtime')
    response = client.invoke_endpoint(
        EndpointName=endpointName,
        Body=test_file.getvalue(),
        ContentType='text/csv',
        Accept='text/csv'
    )

    print(response['Body'].read().decode('ascii'))

Output:

    setosa
    setosa
    setosa
    setosa
    setosa
    setosa
    setosa
    setosa
    setosa
    setosa
    versicolor
    versicolor
    versicolor
    versicolor
    versicolor
    versicolor
    versicolor
    versicolor
    versicolor
    versicolor
    virginica
    virginica
    virginica
    virginica
    virginica
    virginica
    virginica
    virginica
    virginica



## Conclusion
Congratulations! You have now built, trained, deployed, and tested your Scikit model in Amazon SageMaker. Feel free to modify any part of this workflow or code to better suit your own needs.



### Optional cleanup
When you're done with the endpoint, you'll want to clean it up.

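    import boto3

    # Delete the endpoint created in the console so that it stops accruing charges
    boto3.client('sagemaker').delete_endpoint(EndpointName=endpointName)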

## [Optional] Run Batch Transform Job
You can use a trained model to get inference on large data sets by using [Amazon SageMaker Batch Transform](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-batch.html). A batch transform job takes your input data S3 location and outputs the predictions to the specified S3 output folder. Similar to hosting, you can extract inferences for training data to test batch transform.

### Create a Transform Job
We'll create a `Batch Transform job` that defines how to use the container to get inference results on a data set. This includes the configuration we need to invoke SageMaker batch transform:

* The __instance count__ which is the number of machines to use to extract inferences
* The __instance type__ which is the type of machine to use to extract inferences
* The __output path__ determines where the inference results will be written

![ML Figure](../../images/mod07_fig10.png "Figure") 


We use `transform()` on the transformer to get inference results against the data that we uploaded. You can use these options when invoking the transformer.

* The __data_location__ which is the location of input data
* The __content_type__ which is the content type set when making HTTP request to container to get prediction
* The __split_type__ which is the delimiter used for splitting input data 
* The __input_filter__ which indicates the first column (ID) of the input will be dropped before making HTTP request to container

For more information on the configuration options, see [CreateTransformJob API](https://docs.aws.amazon.com/sagemaker/latest/dg/API_CreateTransformJob.html)
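As a sketch of how the options above map onto the `CreateTransformJob` API, the function below assembles the request as a dictionary. `transform_job_request` and its defaults are illustrative assumptions; the result could be passed to `boto3.client("sagemaker").create_transform_job(**request)`.

```python
def transform_job_request(job_name, model_name, input_s3_uri, output_s3_uri,
                          instance_type="ml.m5.large", instance_count=1,
                          input_filter=None):
    """Assemble keyword arguments for the CreateTransformJob API."""
    request = {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": input_s3_uri,       # data_location
            }},
            "ContentType": "text/csv",       # content_type sent to the container
            "SplitType": "Line",             # split_type: one record per line
        },
        "TransformOutput": {"S3OutputPath": output_s3_uri},
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }
    if input_filter is not None:
        # For example "$[1:]" drops the first column before invoking the container.
        request["DataProcessing"] = {"InputFilter": input_filter}
    return request
```

Leaving `input_filter` unset sends each line to the container unchanged, which fits input files that contain only feature columns.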