# R Serving with Plumber

## Dockerfile

* The Dockerfile defines the environment in which our server will be executed.
* Below, you can see that the entrypoint for our container is [deploy.R](deploy.R).

In [None]:
%pycat Dockerfile

## Code: deploy.R

The **deploy.R** script handles the following steps:
* Loads the R libraries used by the server.
* Loads a pretrained `xgboost` model that was trained on the classic [Iris](https://archive.ics.uci.edu/ml/datasets/iris) dataset.
  * Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
* Defines an inference function that takes a matrix of iris features and returns predictions for those iris examples.
* Finally, it imports the [endpoints.R](endpoints.R) script and launches the Plumber server app using those endpoint definitions.


In [None]:
%pycat deploy.R

## Code: endpoints.R

**endpoints.R** defines two routes:
* `/ping` returns a string 'Alive' to indicate that the application is healthy
* `/invocations` applies the previously defined inference function to the input features from the request body

For more information about the requirements for building your own inference container, see:
[Use Your Own Inference Code with Hosting Services](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html)

In [None]:
%pycat endpoints.R

## Build the Serving Image

In [None]:
# docker image name
image_name = "iris-xgb-serving-plumber"

In [None]:
!docker build -t $image_name .

## Launch the Serving Container Locally

In [None]:
!echo "Launching Plumber"
!docker run -d --rm -p 5000:8080 $image_name
!echo "Waiting for the server to start.." && sleep 10

In [None]:
!docker container list

## Define Simple Python Client

In [None]:
import requests
from tqdm import tqdm
import pandas as pd

pd.set_option("display.max_rows", 500)

In [None]:
def get_predictions(examples, instance=requests, port=5000):
    payload = {"features": examples}
    return instance.post(f"http://127.0.0.1:{port}/invocations", json=payload)

In [None]:
def get_health(instance=requests, port=5000):
    # Return the response so the caller can inspect the status code and body.
    return instance.get(f"http://127.0.0.1:{port}/ping")
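The launch cell waits a fixed `sleep 10` for the server to come up. A minimal sketch of an alternative is to poll `/ping` until it answers. The helper names `ping_once` and `wait_until_healthy` are hypothetical, and the sketch uses only the standard library so that it does not depend on the client defined above.

```python
import time
import urllib.error
import urllib.request


def ping_once(port=5000, timeout=2.0):
    """Return True if GET /ping answers with HTTP 200, False otherwise."""
    url = f"http://127.0.0.1:{port}/ping"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not healthy yet.
        return False


def wait_until_healthy(probe=ping_once, retries=30, delay=1.0):
    """Call probe() until it returns True or the retries are exhausted."""
    for attempt in range(1, retries + 1):
        if probe():
            return attempt  # number of attempts it took
        time.sleep(delay)
    raise TimeoutError(f"server not healthy after {retries} attempts")
```

Calling `wait_until_healthy()` after `docker run` blocks only as long as needed and raises a `TimeoutError` if the container never becomes healthy.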

## Define Example Inputs

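Let's define an example input. Each Iris example has four features, so the input is a list of four values. The all-zero vector below is a placeholder for testing the request path, not a real measurement.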

In [None]:
x = [0, 0, 0, 0]
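For a more realistic test, the sketch below lists one real row per species from the Iris dataset. The feature order shown is the order used by the UCI dataset; whether the pretrained model expects the same order is an assumption, and the names `iris_examples` and `x_real` are hypothetical.

```python
# One real row per species from the Iris dataset.
# Assumed feature order: sepal length, sepal width, petal length, petal width (cm).
iris_examples = {
    "setosa": [5.1, 3.5, 1.4, 0.2],
    "versicolor": [7.0, 3.2, 4.7, 1.4],
    "virginica": [6.3, 3.3, 6.0, 2.5],
}

# Pick one example to send to the server in place of the placeholder.
x_real = iris_examples["setosa"]
```

Passing `x_real` to `get_predictions` should then return a prediction for a plausible flower rather than for an all-zero input.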

### Plumber

In [None]:
predicted = get_predictions(x)

In [None]:
predicted.text
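Reading `predicted.text` alone does not reveal whether the request succeeded. The sketch below checks the status code first and then tries to decode the body as JSON. The exact response format depends on how `endpoints.R` serializes its output, so the JSON decoding is an assumption, and `parse_prediction` is a hypothetical helper.

```python
import json


def parse_prediction(status_code, text):
    """Raise on a non-200 status, then decode the body as JSON if possible."""
    if status_code != 200:
        raise RuntimeError(f"request failed with HTTP {status_code}: {text}")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text  # not JSON; hand back the raw body


# Usage with the response object returned by get_predictions:
# parse_prediction(predicted.status_code, predicted.text)
```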

### Push Image to ECR

In [None]:
!./build_and_push.sh $image_name

In [None]:
import boto3

In [None]:
session = boto3.session.Session()
region = session.region_name
account_id = boto3.client('sts').get_caller_identity().get('Account')

region, account_id

In [None]:
# Construct the image URI from the account ID and region retrieved above.
# Verify that it matches the URI shown in the ECR console.
r_plumber_ecr_repo_uri = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image_name}:latest"

### Create model and deploy on Endpoint

In [None]:
import boto3
import sagemaker
from sagemaker.model import Model
from sagemaker import get_execution_role

role = get_execution_role()

In [None]:
model_name = image_name  # must be unique
r_model = Model(image_uri=r_plumber_ecr_repo_uri, role=role, name=model_name)

In [None]:
r_model.deploy(initial_instance_count=1, instance_type="ml.t2.medium")
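When `deploy` is called without a name, the endpoint name is generated for you and has to be looked up before invoking it. One option is to choose the name up front and pass it through the `endpoint_name` argument of `deploy`. The sketch below builds a timestamped name that stays within the 63-character limit for endpoint names; `make_endpoint_name` is a hypothetical helper, not part of the SageMaker SDK.

```python
import re
from datetime import datetime, timezone


def make_endpoint_name(base, now=None):
    """Append a UTC timestamp to base and keep the result within 63 characters."""
    now = now or datetime.now(timezone.utc)
    suffix = now.strftime("%Y-%m-%d-%H-%M-%S")
    # Replace anything other than letters, digits, and hyphens.
    cleaned = re.sub(r"[^a-zA-Z0-9-]", "-", base).strip("-")
    # Reserve room for the joining hyphen and the timestamp suffix.
    cleaned = cleaned[: 63 - len(suffix) - 1].rstrip("-")
    return f"{cleaned}-{suffix}"


endpoint_name = make_endpoint_name("iris-xgb-serving-plumber")
# Hypothetical usage with the model defined above:
# r_model.deploy(initial_instance_count=1, instance_type="ml.t2.medium",
#                endpoint_name=endpoint_name)
```

Keeping the name in a variable means later cells can reference it instead of a hard-coded string.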

### Invoke endpoint

In [None]:
client = boto3.client('sagemaker-runtime')

In [None]:
x = [3, 4]

payload = str(x)

response = client.invoke_endpoint(
    EndpointName = "iris-xgb-serving-plumber-2023-09-11-11-10-31-186", # must match the name of the deployed endpoint
    Body = payload,
    ContentType='text/csv'
)
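The local client posts a JSON body of the form `{"features": ...}` to `/invocations`, whereas the call above sends `str(x)` labelled as `text/csv`. Since SageMaker forwards the request body to the same `/invocations` route, a sketch that mirrors the local client may be closer to what the container expects. This is an assumption about `endpoints.R`, and `build_invocation_request` is a hypothetical helper.

```python
import json


def build_invocation_request(features):
    """Return (body, content_type) mirroring the JSON sent by the local client."""
    body = json.dumps({"features": features})
    return body, "application/json"


body, content_type = build_invocation_request([5.1, 3.5, 1.4, 0.2])
# Hypothetical usage with the runtime client defined above:
# response = client.invoke_endpoint(
#     EndpointName=endpoint_name,  # assumed to hold the deployed endpoint name
#     Body=body,
#     ContentType=content_type,
# )
```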

In [None]:
response

In [None]:
# response['Body'] is a stream and can only be read once
result = response['Body'].read().decode()

In [None]:
result

In [None]:
print('output: ', result)

In [None]:
print('input:', x)

### Stop All Serving Containers

Finally, we shut down the serving container launched for the test. Note that the command below kills every running container on the host, not only the one started here.

In [None]:
!docker kill $(docker ps -q)
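If other containers are running on the same host, a narrower cleanup may be preferable. The sketch below uses Docker's `ancestor` filter to select only the containers started from the serving image. The function names are hypothetical, and the code assumes the `docker` CLI is on the path.

```python
import subprocess


def list_containers_cmd(image):
    """Build the docker command that lists running containers started from image."""
    return ["docker", "ps", "-q", "--filter", f"ancestor={image}"]


def kill_containers_for_image(image):
    """Kill only the running containers created from the given image."""
    result = subprocess.run(
        list_containers_cmd(image), capture_output=True, text=True, check=True
    )
    container_ids = result.stdout.split()
    if container_ids:
        subprocess.run(["docker", "kill", *container_ids], check=True)
    return container_ids


# Usage (requires a running Docker daemon):
# kill_containers_for_image("iris-xgb-serving-plumber")
```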