# R Serving with Plumber

## Dockerfile

* The Dockerfile defines the environment in which the server runs.
* The container's entrypoint is [deploy.R](deploy.R).

In [1]:
%pycat Dockerfile

FROM rocker/r-base:latest

MAINTAINER Amazon SageMaker Examples <amazon-sagemaker-examples@amazon.com>

## Create directories
RUN mkdir -p /opt/ml/01_data
RUN mkdir -p /opt/ml/02_code
RUN mkdir -p /opt/ml/03_output

RUN apt

## Code: deploy.R

The **deploy.R** script handles the following steps:
* Loads the R libraries used by the server.
* Loads a pretrained `xgboost` model that has been trained on the classical [Iris](https://archive.ics.uci.edu/ml/datasets/iris) dataset.
  * Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
* Defines an inference function that takes a matrix of iris features and returns predictions for those iris examples.
* Finally, it imports the [endpoints.R](endpoints.R) script and launches the Plumber server app using those endpoint definitions.


In [2]:
%pycat 02_code/deploy.R

suppressPackageStartupMessages(library(xgboost))
library(plumber)
library(jsonlite)

library(caret)
library(data.table)
library(dplyr)
library(purrr)
library(readr)
library(stringr)
library(tidyr)
library(zeallot)

options(s

## Code: endpoints.R

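**endpoints.R** defines the server's routes. The listing below shows two:
* `GET /ping` returns the string 'Alive' to indicate that the application is healthy.
* `POST /Predict` converts the request body to a data frame and passes it to the inference function.

Note that SageMaker hosting sends inference requests to `/invocations`, and the Python client later in this notebook posts to that path. The visible part of the listing does not define it.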

For more information about the requirements for building your own inference container, see:
[Use Your Own Inference Code with Hosting Services](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html)

In [3]:
%pycat endpoints.R


#' Ping to show server is there
#' @get /ping
function() {
    return('Alive')
}


#' Predict Pricing Guidance
#' @param input json
#' @post /Predict
function(req,res) {
  predict(input = data.frame(req$body))
}

# function(req) {

#     # Read in data

## Build the Serving Image

In [4]:
!docker build -t r-plumber .

Sending build context to Docker daemon  29.87MB
Step 1/17 : FROM rocker/r-base:latest
 ---> ce611fb80498
Step 2/17 : MAINTAINER Amazon SageMaker Examples <amazon-sagemaker-examples@amazon.com>
 ---> Using cache
 ---> 774bf2975e2f
Step 3/17 : RUN mkdir -p /opt/ml/01_data
 ---> Using cache
 ---> 9a379ffdc99e
Step 4/17 : RUN mkdir -p /opt/ml/02_code
 ---> Using cache
 ---> ee86d44ce403
Step 5/17 : RUN mkdir -p /opt/ml/03_output
 ---> Using cache
 ---> 91ebb58163b0
Step 6/17 : RUN apt-get -y update && apt-get install -y --no-install-recommends     wget     apt-transport-https     ca-certificates     libcurl4-openssl-dev     libsodium-dev
 ---> Using cache
 ---> 09200f4aa1eb
Step 7/17 : RUN R -e "install.packages(c('caret','data.table','dplyr','purrr','readr','stringr','tidyr','zeallot','xgboost','plumber'),repos='https://cloud.r-project.org')"
 ---> Using cache
 ---> cb828041d721
Step 8/17 : COPY 02_code/main.R /opt/ml/02_code/main.R
 ---> Using cache
 ---> 7767632ddfe5
Step 9/17 : COPY 02

## Launch the Serving Container

In [5]:
!echo "Launching Plumber"
!docker run -d --rm -p 5000:8080 r-plumber
!echo "Waiting for the server to start.." && sleep 10
print("Done")

Launching Plumber
a9b927ab5c3659c1882a815fe2497a9afc285583cbcc38b5bf16cdad3ceea142
Waiting for the server to start..
Done
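
A fixed `sleep 10` can be too short, and it also hides a container that exited during startup. The following is a minimal sketch of polling the health route instead, assuming the `/ping` route and the `5000:8080` port mapping used in the cell above. `ping_ok` and `wait_until_ready` are hypothetical helper names, and the standard library is used only so the snippet has no extra dependencies.

```python
import http.client
import time
import urllib.error
import urllib.request


def ping_ok(url="http://127.0.0.1:5000/ping", timeout=2.0):
    """Return True if the health route answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a malformed reply: not ready yet.
        return False


def wait_until_ready(probe=ping_ok, attempts=30, delay=1.0):
    """Call probe() until it returns True or the attempts run out.

    Returns the number of tries it took, or None if the server never answered.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        time.sleep(delay)
    return None
```

If `wait_until_ready()` returns `None`, the container has probably exited. Because the container is started with `--rm`, it is removed as soon as it exits, so dropping `--rm` while debugging keeps it around for `docker logs`.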


In [6]:
!docker container list

CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES


In [7]:
# !docker container
# !docker --version
# !docker logs cf6e08a4d074
# !docker diff cf6e08a4d074
# !docker inspect cf6e08a4d074
!docker info

Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 17
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux neuron nvidia runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: 1e7bb5b773162b57333d57f612fd72e3f8612d94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.10.147-133.644.amzn2.x86_64
 Operating System: Amazon Linux 2
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.773GiB
 Name

In [8]:
!docker cp fa5248e73f3d:/opt/ml/03_output/expected.csv ~/03_output/expected.csv

invalid output path: directory "/home/ec2-user/03_output" does not exist
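
The error above shows that `docker cp` does not create a missing destination directory on the host. A small sketch that creates it first; `ensure_parent_dir` is a hypothetical helper, and the path in the usage note mirrors the one in the failed command.

```python
from pathlib import Path


def ensure_parent_dir(dest):
    """Create the parent directory of dest if it is missing; return dest as a Path."""
    dest_path = Path(dest).expanduser()  # resolve a leading "~" to the home directory
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    return dest_path
```

Calling `ensure_parent_dir("~/03_output/expected.csv")` before the `docker cp` cell creates the directory named in the error message.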


## Define Simple Python Client

In [10]:
import requests
from tqdm import tqdm
import pandas as pd

pd.set_option("display.max_rows", 500)

In [11]:
def get_predictions(examples, instance=requests, port=5000):
    payload = {"features": examples}
    return instance.post(f"http://127.0.0.1:{port}/invocations", json=payload)

In [12]:
def get_health(instance=requests, port=5000):
    # Return the response so callers can inspect the status code and body.
    return instance.get(f"http://127.0.0.1:{port}/ping")
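
The client helpers above send requests without a timeout, so a hung server would block the notebook. Below is a standard-library sketch of the same calls with an explicit timeout and `Content-Type` header. `build_request` and `call` are hypothetical names, and `urllib` is used here only to keep the snippet free of third-party dependencies.

```python
import json
import urllib.request


def build_request(path, payload=None, port=5000):
    """Build a GET request (no payload) or a JSON POST request for the local server."""
    url = f"http://127.0.0.1:{port}{path}"
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def call(path, payload=None, port=5000, timeout=10.0):
    """Send the request and return (status, decoded body).

    Raises urllib.error.HTTPError on 4xx/5xx replies and
    urllib.error.URLError when the connection fails.
    """
    request = build_request(path, payload, port)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

With `requests`, the equivalent change is passing `timeout=` to `requests.get` and `requests.post`.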

In [None]:
# def get_predictions(examples, instance=requests, port=5000):
#     payload = {"features": examples}
#     return instance.post(f"http://127.0.0.1:{port}/sum")

In [None]:
resp = requests.get("http://127.0.0.1:5000/ping")

In [None]:
# resp = requests.post(f"http://127.0.0.1:5000/invocations")

In [None]:
# url= 'http://127.0.0.1:5000/sum'
# payload = { 'a' : '2', 'b' : '5'  }
# headers = {}
# res = requests.post(url, data=json.dumps(payload), headers=headers)


In [None]:
resp.content

## Define Example Inputs

Let's load example inputs from `sample_data/input.csv`, which holds deal and product line records.

In [13]:
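# Note: `input` shadows Python's built-in input(); the name is kept because later cells use it.
input = pd.read_csv("sample_data/input.csv")

# Placeholder for testing: replace every missing value with 1.
input = input.fillna(1)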

In [14]:
input.head()

Unnamed: 0,dealId,extStart,extEnd,extStId,extDealDescription,extBusinessModelDescription,customerId,currencyCode,extCategory,accountId,...,extResellerB,extAuth,extDistMgr,extCustRfp,extNonTnC,extCustomerName,extWarFlag,extProductName,extSKU1,extCos
0,30206808,8/19/2021,8/31/2022,601590151,Async UAT Testing - JP,Indirect/Partner Dir,601590151,USD,PC,601590151,...,'71282509-MAIN STREET TECHNOLOGIES',1.0,1.0,1.0,1.0,1.0,1.0,HP EBx3601030G3 i7-8650U 13 16GB/256 PC,7CZ10UP,25700.88
1,30206808,8/19/2021,8/31/2022,601590151,Async UAT Testing - JP,Indirect/Partner Dir,601590151,USD,PC,601590151,...,'71282509-MAIN STREET TECHNOLOGIES',1.0,1.0,1.0,1.0,1.0,1.0,HP 3y Travel Pickup Return NB Only SVC,U7NT8E,25700.88
2,30206808,8/19/2021,8/31/2022,601590151,Async UAT Testing - JP,Indirect/Partner Dir,601590151,USD,PC,601590151,...,'71282509-MAIN STREET TECHNOLOGIES',1.0,1.0,1.0,1.0,1.0,1.0,HP EBx3601030G3 i7-8650U 13 16GB/512 PC,7HC41UP,25704.56
3,30206808,8/19/2021,8/31/2022,601590151,Async UAT Testing - JP,Indirect/Partner Dir,601590151,USD,PC,601590151,...,'71282509-MAIN STREET TECHNOLOGIES',1.0,1.0,1.0,1.0,1.0,1.0,HP 3y Travel Pickup Return NB Only SVC,U7NT8E,25704.56
4,30206808,8/19/2021,8/31/2022,601590151,Async UAT Testing - JP,Indirect/Partner Dir,601590151,USD,PC,601590151,...,'71282509-MAIN STREET TECHNOLOGIES',1.0,1.0,1.0,1.0,1.0,1.0,HP Ex21013G3 i7-8650U 13 16GB/256 PC,7MC28UP,0.0


In [15]:
input_features = input[['dealId', 'extStart', 'extEnd', 'extStId', 'extDealDescription',
       'extBusinessModelDescription', 'customerId', 'currencyCode',
       'extCategory', 'accountId', 'extDealCountry', 'extPriceGeo',
       'extCurrency', 'extPriceTerms', 'extDealVersion', 'extDealtype',
       'extCustomerSegmentCode', 'extCustomerSegment', 'extMiscChargeCode',
       'extGlobalBusinessUnitName', 'extBundleConfigurationId', 'productId',
       'lineId', 'extBandedProductFlag', 'extUnbundledProductDesc',
       'extUnbundledProductLineDesc', 'extUnbundledLineItemNumber',
       'extProductBaseCategoryDesc', 'extBundledProductId',
       'extBundledProductDesc', 'qty', 'price', 'extQuotedCostOfSalesUSD',
       'extQuotedStandardDiscountUSD', 'extQuotedAdditionalDiscountUSD',
       'extProductLine', 'extBundleLineID', 'extBomUsg', 'extFloor',
       'extTypical', 'extExpert', 'extHighRisk', 'extIndustryName',
       'extResellerB', 'extAuth', 'extDistMgr', 'extCustRfp', 'extNonTnC',
       'extCustomerName', 'extWarFlag', 'extProductName', 'extSKU1', 'extCos']]

In [16]:
example_inputs = input_features.values.tolist()
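
`values.tolist()` drops the column names, while the `/Predict` handler in the endpoints listing builds a `data.frame` from the request body, which suggests it may expect named fields. The following is a minimal sketch, assuming that reading is right, that pairs each row with its column names and maps NaN to `None` (JSON `null`). `rows_to_records` is a hypothetical helper, and the sample rows are made up for illustration.

```python
import json
import math


def rows_to_records(columns, rows):
    """Turn a list of row lists into a list of {column: value} dicts, mapping NaN to None."""
    def clean(value):
        if isinstance(value, float) and math.isnan(value):
            return None
        return value

    return [
        {col: clean(val) for col, val in zip(columns, row)}
        for row in rows
    ]


# Illustrative rows only; the real notebook has many more columns.
records = rows_to_records(
    ["dealId", "qty", "price"],
    [[30206808, 1, 180.24], [30206808, 2, float("nan")]],
)
print(json.dumps(records))
```

With the notebook's objects, the call would be `rows_to_records(list(input_features.columns), example_inputs)`.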

In [17]:
# example_inputs

### Plumber

In [18]:
predicted = get_predictions(example_inputs)#.json()#["output"]

ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=5000): Max retries exceeded with url: /invocations (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f73e8b5b1c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
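
`Connection refused` means nothing is listening on port 5000. That is consistent with the empty `docker container list` output earlier, most likely because the container had already exited. A quick sketch for checking the port before sending requests; `port_is_open` is a hypothetical helper.

```python
import socket


def port_is_open(host="127.0.0.1", port=5000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

An open port only shows that something is listening; the `/ping` route is still the check that the Plumber app itself is healthy.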

In [None]:
predicted

### Stop All Serving Containers

Finally, shut down the serving container launched for the test. Note that this command kills every running container on the host, not only the Plumber one.

In [20]:
!docker kill $(docker ps -q)

"docker kill" requires at least 1 argument.
See 'docker kill --help'.

Usage:  docker kill [OPTIONS] CONTAINER [CONTAINER...]

Kill one or more running containers
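
The usage error above appears because no containers were running, so `$(docker ps -q)` expanded to nothing and `docker kill` received no arguments. A minimal sketch that kills only when there is something to kill; `kill_command` and `kill_running_containers` are hypothetical helpers built on the same `docker ps -q` and `docker kill` commands.

```python
import subprocess


def kill_command(container_ids):
    """Return the `docker kill` argv for the given ids, or None if there are none."""
    ids = [cid for cid in container_ids if cid.strip()]
    return ["docker", "kill", *ids] if ids else None


def kill_running_containers():
    """Kill all running containers; do nothing when none are running."""
    listing = subprocess.run(
        ["docker", "ps", "-q"], capture_output=True, text=True, check=True
    )
    command = kill_command(listing.stdout.split())
    if command is not None:
        subprocess.run(command, check=True)
    return command
```

Like the original cell, `kill_running_containers()` targets every running container, so it is best used on a host dedicated to this test.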


In [None]:
# predict("sample_data/input.csv")

In [None]:
data = {
        "dealId": "T000518649-01",
        "extStart": "07/30/2022",
        "extEnd": "08/19/2022",
        "extStId": "500725309",
        "extDealDescription": "unavailable",
        "extBusinessModelDescription": "Indirect/Partner Dir",
        "extBusinessModel": "CH",
        "currencyCode": "USD",
        "extDealCountry": "SG",
        "extCurrency": "SGD",
        "extPriceTerms": "DDU",
        "extDealVersion": "1",
        "extCustomerSegment":  "CORPORATE",
        "extMiscChargeCode": "A9R",
        "extGlobalBusinessUnitName": "1",
        "extBundleConfigurationId": None,
        "productId": "D9Y32AA",
        "lineId": "10",
        "extBandedProductFlag": "N",
        "extUnbundledProductDesc": "HP ULTRASLIM DOCKING STATION",
        "extUnbundledProductLineDesc": "MP - COMMERCIAL NOTEBOOK ACCESSORIES",
        "extProductBaseCategoryDesc": "1",
        "extBundledProductId": "D9Y32AA",
        "extBundledProductDesc": "HP ULTRASLIM DOCKING STATION",
        "qty": "1",
        "price": "180.24",
        "extQuotedCostOfSalesUSD": "72",
        "extQuotedStandardDiscountUSD": "0",
        "extQuotedAdditionalDiscountUSD": "0",
        "extProductLine": "MP",
        "extBomUsg": "0",
        "extResellerB": "None",
        "extAuth": "1",
        "extDistMgr": "1",
        "extCustRfp": "None",
        "extNonTnC": "None",
        "extCustomerName": "1",
        "extCos": "72"
    }

In [None]:
# r = requests.post(f"http://127.0.0.1:5000/invocations", data=payload)

In [None]:
# data

In [None]:
import requests
import json

url = "http://127.0.0.1:5000/Predict"
headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.post(url, data=json.dumps(data), headers=headers)

In [None]:
r.content

In [None]:
!docker run -d -p 8787:8787 --rm rocker/rstudio

In [19]:
!docker ps

CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES


In [None]:
# !docker info
# !docker inspect 1ae5a3dc126c