## Serverless LightGBM Bring Your Own Container Inference

<b>Additional Resources</b>
- [LightGBM Real-Time BYOC](https://github.com/RamVegiraju/SageMaker-Deployment/tree/master/RealTime/BYOC/LightGBM)
- [LightGBM Docker Installation Local](https://github.com/microsoft/LightGBM/blob/master/docker/dockerfile-python)
- [BYOC Guide](https://towardsdatascience.com/bring-your-own-container-with-amazon-sagemaker-37211d8412f4)
- [BYOC Setup](https://sagemaker-workshop.com/custom/containers.html)

## Serverless Client Setup

For testing, the notebook's execution role must have SageMaker full access (for example, the AmazonSageMakerFullAccess managed policy).

Start by installing the preview wheels of the SageMaker Python SDK, boto3/botocore, and the AWS CLI.

<b>Notebook Setting</b>
- Notebook Instance: ml.c5.xlarge
- Kernel Type: conda_python3

In [1]:
# Fallback in case wheels are unavailable
! pip install sagemaker botocore boto3 awscli --upgrade

Collecting sagemaker
  Downloading sagemaker-2.70.0.tar.gz (466 kB)
Collecting botocore
  Downloading botocore-1.23.19-py3-none-any.whl (8.4 MB)
Collecting boto3
  Downloading boto3-1.20.19-py3-none-any.whl (131 kB)
Collecting awscli
  Downloading awscli-1.22.19-py3-none-any.whl (3.8 MB)
Building wheels for collected packages: sagemaker
  Building wheel for sagemaker (setup.py) ... done
  Created wheel for sagemaker: filename=sagemaker-2.70.0-py2.py3-none-any.whl size=649170 sha256=ffa302eddf206720318eb8e1ff5de327eb89e238a20568475774f87df6dc90f5
  Stored in directory: /home/ec2-user/.cache/pip/wheels/da/11/20/c45ef599886a2b1399effa68f80b98b2166dc624e19636c303
Successfully built sagemaker
Inst

In [2]:
import subprocess


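def execute_cmd(cmd):
    """Run a shell command and return its (exit_code, output) tuple."""
    print(cmd)
    output = subprocess.getstatusoutput(cmd)
    return output


def _download_from_s3(_file_path):
    """Copy an object from the preview-wheels bucket to /tmp/.

    Returns the (exit_code, output) tuple of the copy command.
    """
    _path = f"s3://reinvent21-sm-rc-wheels/{_file_path}"
    print(f"Path is {_path}")
    # List the object first so access or missing-object errors show up in the output
    ls_cmd = f"aws s3 ls {_path}"
    print(execute_cmd(ls_cmd))

    cmd = f"aws s3 cp {_path} /tmp/"
    print("Downloading: ", cmd)
    return execute_cmd(cmd)


def _install_wheel(wheel_name):
    """Install a downloaded archive from /tmp/ and print the resulting package versions."""
    cmd = f"pip install --no-deps --log /tmp/output3.log /tmp/{wheel_name} --force-reinstall"

    ret = execute_cmd(cmd)

    # Derive the package name from the archive name, e.g. "botocore.tar.gz" -> "botocore"
    _name = wheel_name.split(".")[0]
    _, _version = execute_cmd(f"python -c 'import {_name}; print({_name}.__version__)'")

    # Report the versions of all related packages after the install
    for package in ["botocore", "sagemaker", "boto3", "awscli"]:
        print(execute_cmd(f"python -c 'import {package}; print({package}.__version__)'"))

    print(f"Installed {_name}:{_version}")

    return ret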


def install_sm_py_sdk():
    pySDK_name = "sagemaker.tar.gz"

    exit_code, _ = _download_from_s3("dist/sagemaker.tar.gz")

    if not exit_code:
        _install_wheel(pySDK_name)
    else:
        print(f"'{pySDK_name}' is not present in S3 Bucket. Installing from public PyPi...")
        execute_cmd("pip install sagemaker")


def install_boto_wheels():
    WHEELS = ["botocore.tar.gz", "boto3.tar.gz", "awscli.tar.gz"]

    for wheel_name in WHEELS:
        _path = f"boto3/{wheel_name}"
        exit_code, _ = _download_from_s3(_path)

        if not exit_code:
            _install_wheel(wheel_name)
        else:
            print(f"'{wheel_name}' is not present in S3 Bucket. Ignoring...")


install_boto_wheels()
install_sm_py_sdk()

Path is s3://reinvent21-sm-rc-wheels/boto3/botocore.tar.gz
aws s3 ls s3://reinvent21-sm-rc-wheels/boto3/botocore.tar.gz
(255, '\nAn error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied')
Downloading:  aws s3 cp s3://reinvent21-sm-rc-wheels/boto3/botocore.tar.gz /tmp/
aws s3 cp s3://reinvent21-sm-rc-wheels/boto3/botocore.tar.gz /tmp/
pip install --no-deps --log /tmp/output3.log /tmp/botocore.tar.gz --force-reinstall
python -c 'import botocore; print(botocore.__version__)'
python -c 'import botocore; print(botocore.__version__)'
(0, '1.23.7')
python -c 'import sagemaker; print(sagemaker.__version__)'
(0, '2.70.0')
python -c 'import boto3; print(boto3.__version__)'
(0, '1.20.19')
python -c 'import awscli; print(awscli.__version__)'
(0, '1.22.19')
Installed botocore:1.23.7
Path is s3://reinvent21-sm-rc-wheels/boto3/boto3.tar.gz
aws s3 ls s3://reinvent21-sm-rc-wheels/boto3/boto3.tar.gz
(255, '\nAn error occurred (AccessDenied) when calling the ListObjectsV2 
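
The helper cell above shells out to `python -c` to report each package's version. As a lighter alternative, the sketch below reads versions from installed package metadata using the standard library's `importlib.metadata` (Python 3.8+). The function name `installed_versions` is hypothetical and not part of the original notebook.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Report the same four packages the install cell checks
for name, version in installed_versions(["sagemaker", "botocore", "boto3", "awscli"]).items():
    print(f"{name}: {version or 'not installed'}")
```

This avoids starting a new interpreter per package, and it reports a missing package as `None` instead of an import error string.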

In [3]:
# Setup clients
import boto3

client = boto3.client(service_name="sagemaker")
runtime = boto3.client(service_name="sagemaker-runtime")

## Build & Push Docker Image to ECR

### Container Structure

- <b>Pretrained-Model</b>: An LGBM model trained locally. Its pkl file is copied into the container as a pre-trained example.
- <b>Container</b>: Contains the Dockerfile and the inference code. The only files you need to adjust for your framework/model are the Dockerfile and predictor.py.
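
To make the role of the inference code concrete, here is a minimal sketch of the HTTP contract a SageMaker inference container serves: `GET /ping` for health checks and `POST /invocations` for predictions, on port 8080. It uses only the standard library, and `predict` is a placeholder rather than a LightGBM call, so it is not the repository's predictor.py. The `{"input": ...}` / `{"output": ...}` JSON shapes are taken from the invocation cell later in this notebook; all function and class names are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(rows):
    """Placeholder for the model call (NOT LightGBM): returns each row's mean."""
    return [sum(row) / len(row) for row in rows]


def handle_ping():
    """GET /ping: report that the container is healthy."""
    return 200, {"status": "ok"}


def handle_invocation(raw_body):
    """POST /invocations: parse {"input": [[...]]} and return {"output": [...]}."""
    try:
        rows = json.loads(raw_body)["input"]
        return 200, {"output": predict(rows)}
    except (ValueError, KeyError, TypeError, ZeroDivisionError) as exc:
        return 400, {"error": f"bad request: {exc}"}


class InferenceHandler(BaseHTTPRequestHandler):
    def _reply(self, status, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        if self.path == "/ping":
            self._reply(*handle_ping())
        else:
            self._reply(404, {"error": "not found"})

    def do_POST(self):
        if self.path != "/invocations":
            self._reply(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        self._reply(*handle_invocation(self.rfile.read(length)))


def serve(port=8080):
    """Block and serve requests on the given port."""
    HTTPServer(("0.0.0.0", port), InferenceHandler).serve_forever()
```

Calling `serve()` would start the server. In a real container, `predict` would load the pkl file and call the model, and a production-grade server would typically sit in front of the handler.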

In [5]:
%%sh

# Name of algo -> ECR
algorithm_name=serverless-pretrained-byoc

cd container

#executable for serve
chmod +x regressor/serve

account=$(aws sts get-caller-identity --query Account --output text)

# Region, defaults to us-west-2
region=$(aws configure get region)
region=${region:-us-west-2}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

# Authenticate Docker to the ECR registry with a short-lived password
aws ecr get-login-password --region ${region} | docker login --username AWS --password-stdin "${account}.dkr.ecr.${region}.amazonaws.com"

# Build the docker image locally with the image name and then push it to ECR
# with the full name.

docker build  -t ${algorithm_name} .
docker tag ${algorithm_name} ${fullname}

docker push ${fullname}

Login Succeeded
Sending build context to Docker daemon  28.16kB
Step 1/12 : FROM ubuntu:18.04
 ---> 5a214d77f5d7
Step 2/12 : ARG CONDA_DIR=/opt/conda
 ---> Using cache
 ---> 90035084e03d
Step 3/12 : ENV PATH $CONDA_DIR/bin:$PATH
 ---> Using cache
 ---> 27dae9eb586b
Step 4/12 : RUN apt-get -y update && apt-get install -y --no-install-recommends          wget          python3-pip          python3-setuptools          nginx          ca-certificates     && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> efb8ef471124
Step 5/12 : RUN ln -s /usr/bin/python3 /usr/bin/python
 ---> Using cache
 ---> 60ffc02f0c8d
Step 6/12 : RUN ln -s /usr/bin/pip3 /usr/bin/pip
 ---> Using cache
 ---> 3f140f561a3c
Step 7/12 : RUN apt-get update &&     apt-get install -y --no-install-recommends         ca-certificates         cmake         build-essential         gcc         g++         curl         git &&     curl -sL https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o conda.sh &&     /bin


In [6]:
import boto3
from sagemaker import get_execution_role

account_id = boto3.client('sts').get_caller_identity()['Account']
region = boto3.Session().region_name

# Not used in this example; needed only when model artifacts are stored in S3 (e.g., multi-model endpoints)
s3_bucket = 'sagemaker-light-gbm-pretrained'

role = get_execution_role()  # the role needs SageMaker full access
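
The account ID and region gathered here are what the image URI is built from. As a small sketch, the helper below assembles the URI in the same form as the build script's `fullname` and mirrors its us-west-2 fallback for the case where no region is configured. The name `ecr_image_uri` is hypothetical.

```python
def ecr_image_uri(account_id, region, repository, tag="latest", default_region="us-west-2"):
    """Build '<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>'."""
    # boto3.Session().region_name can be None when no region is configured
    region = region or default_region
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"
```

For example, `ecr_image_uri(account_id, region, "serverless-pretrained-byoc")` would give the image URI for the repository created by the build cell.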

## Model Creation

In [7]:
from time import gmtime, strftime
model_name = 'serverless-lgbm-model' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
model_url = 's3://{}/lgbm/'.format(s3_bucket) ## MODEL S3 URL


# Replace with your algorithm_name/ECR repository from the build cell above
container = '{}.dkr.ecr.{}.amazonaws.com/serverless-pretrained-byoc:latest'.format(account_id, region)
# Serverless endpoints do not take an instance type; this value is unused below
instance_type = 'ml.c5d.18xlarge'

print('Model name: ' + model_name)
print('Model data Url: ' + model_url)
print('Container image: ' + container)

container = {
    'Image': container
}

create_model_response = client.create_model(
    ModelName = model_name,
    ExecutionRoleArn = role,
    Containers = [container])

print("Model Arn: " + create_model_response['ModelArn'])

Model name: serverless-lgbm-model2021-12-02-20-37-45
Model data Url: s3://sagemaker-light-gbm-pretrained/lgbm/
Container image: 474422712127.dkr.ecr.us-west-2.amazonaws.com/serverless-pretrained-byoc:latest
Model Arn: arn:aws:sagemaker:us-west-2:474422712127:model/serverless-lgbm-model2021-12-02-20-37-45
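
The model and endpoint config names above are a prefix concatenated with a UTC timestamp. The sketch below wraps that pattern in a hypothetical helper, `timestamped_name`, which adds a hyphen separator and checks the result. The 63-character limit and the allowed character pattern are my understanding of the SageMaker naming constraints and should be treated as assumptions to verify against the API reference.

```python
import re
from time import gmtime, strftime

# Assumed constraints on SageMaker resource names (verify against the API reference)
MAX_NAME_LENGTH = 63
NAME_PATTERN = re.compile(r"[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?")


def timestamped_name(prefix, when=None):
    """Return '<prefix>-<UTC timestamp>', raising ValueError if the name looks invalid."""
    stamp = strftime("%Y-%m-%d-%H-%M-%S", gmtime() if when is None else when)
    name = f"{prefix}-{stamp}"
    if len(name) > MAX_NAME_LENGTH or not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid resource name: {name!r}")
    return name
```

Passing an explicit `when` (a `time.struct_time`) makes the name reproducible, which is convenient when the same timestamp should be shared by the model and the endpoint config.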


## Endpoint Config Creation

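Adjust MemorySizeInMB and MaxConcurrency in ServerlessConfig to fit your workload, staying within the limits listed in the SageMaker Serverless Inference documentation.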

In [8]:
endpoint_config_name = 'serverless-lgbm-ep-config' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print('Endpoint config name: ' + endpoint_config_name)

create_endpoint_config_response = client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "VariantName": "byoVariant",
            "ModelName": model_name,
            "ServerlessConfig": {
                "MemorySizeInMB": 4096,
                "MaxConcurrency": 1,
            },
        }
    ],
)

print("Endpoint config Arn: " + create_endpoint_config_response['EndpointConfigArn'])

Endpoint config name: serverless-lgbm-ep-config2021-12-02-20-37-53
Endpoint config Arn: arn:aws:sagemaker:us-west-2:474422712127:endpoint-config/serverless-lgbm-ep-config2021-12-02-20-37-53
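
Since an out-of-range serverless setting is only rejected when the API call is made, it can help to check the values locally first. The sketch below is one way to do that. The list of memory sizes (1 GB steps from 1024 to 6144 MB) reflects my understanding of the documented values and is an assumption to verify; the concurrency ceiling differs by account and over time, so it is left as a caller-supplied, hypothetical `concurrency_limit` parameter.

```python
# Assumed valid memory sizes in MB (verify against the current documentation)
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_config(memory_mb=4096, max_concurrency=1, concurrency_limit=None):
    """Return a ServerlessConfig dict after basic local validation."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}, got {memory_mb}")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    if concurrency_limit is not None and max_concurrency > concurrency_limit:
        raise ValueError(f"max_concurrency exceeds the limit of {concurrency_limit}")
    return {"MemorySizeInMB": memory_mb, "MaxConcurrency": max_concurrency}
```

The returned dict can be passed as the `"ServerlessConfig"` value of a production variant; the defaults reproduce the settings used in the cell above.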


## Endpoint Creation

In [9]:
endpoint_name = "serverless-lgbm-ep"
create_endpoint_response = client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name,
)

In [10]:
# Poll describe_endpoint until the endpoint leaves the "Creating" state
import time

describe_endpoint_response = client.describe_endpoint(EndpointName=endpoint_name)

while describe_endpoint_response["EndpointStatus"] == "Creating":
    time.sleep(15)
    describe_endpoint_response = client.describe_endpoint(EndpointName=endpoint_name)
    print(describe_endpoint_response["EndpointStatus"])

describe_endpoint_response

Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
InService


{'EndpointName': 'serverless-lgbm-ep',
 'EndpointArn': 'arn:aws:sagemaker:us-west-2:474422712127:endpoint/serverless-lgbm-ep',
 'EndpointConfigName': 'serverless-lgbm-ep-config2021-12-02-20-37-53',
 'ProductionVariants': [{'VariantName': 'byoVariant',
   'DeployedImages': [{'SpecifiedImage': '474422712127.dkr.ecr.us-west-2.amazonaws.com/serverless-pretrained-byoc:latest',
     'ResolvedImage': '474422712127.dkr.ecr.us-west-2.amazonaws.com/serverless-pretrained-byoc@sha256:cbdf163a674023ec708a2f45969ea58025a1b157a54db1b54ca987d32ae6b132',
     'ResolutionTime': datetime.datetime(2021, 12, 2, 20, 37, 58, 290000, tzinfo=tzlocal())}],
   'CurrentWeight': 1.0,
   'DesiredWeight': 1.0,
   'CurrentInstanceCount': 0,
   'CurrentServerlessConfig': {'MemorySizeInMB': 4096, 'MaxConcurrency': 1}}],
 'EndpointStatus': 'InService',
 'CreationTime': datetime.datetime(2021, 12, 2, 20, 37, 56, 857000, tzinfo=tzlocal()),
 'LastModifiedTime': datetime.datetime(2021, 12, 2, 20, 43, 2, 533000, tzinfo=tzloc
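
The polling loop above stops as soon as the status is no longer "Creating", which also covers a failed deployment. A slightly more defensive variant, sketched below under the assumption that the status values "InService" and "Failed" and the `FailureReason` field behave as in the DescribeEndpoint response, raises on failure and gives up after a timeout. The helper name `wait_for_endpoint` and its parameters are hypothetical.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=15, timeout_seconds=1800, sleep=time.sleep):
    """Poll describe_endpoint until InService; raise on Failed or on timeout."""
    waited = 0
    while True:
        response = client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status == "InService":
            return response
        if status == "Failed":
            reason = response.get("FailureReason", "no reason given")
            raise RuntimeError(f"endpoint {endpoint_name} failed: {reason}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"endpoint {endpoint_name} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

With the clients from the setup cell this would be called as `wait_for_endpoint(client, endpoint_name)`. boto3 also ships a built-in waiter, `client.get_waiter("endpoint_in_service")`, which may be preferable when custom handling is not needed.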

## Endpoint Invocation

In [13]:
import json
content_type = "application/json"
request_body = {"input": [[7.5, 3846.0, 9061.0, 0.579]]} #sample data point from dataset

# Serialize the request body to a JSON string for the endpoint
payload = json.dumps(request_body)

#Endpoint invocation
response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Body=payload)

#Parse results
result = json.loads(response['Body'].read().decode())['output'][0]
result

569.3555397493
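
The request and response shapes used above (`{"input": [[...]]}` in, `{"output": [...]}` out) can be wrapped in two small helpers, sketched below. The names `build_payload` and `parse_predictions` are hypothetical, and the shapes are specific to this container's inference code rather than anything SageMaker mandates.

```python
import json


def build_payload(rows):
    """Serialize a list of feature rows into the JSON body the container expects."""
    return json.dumps({"input": [[float(value) for value in row] for row in rows]})


def parse_predictions(body):
    """Extract the list of predictions from a response body (bytes or str)."""
    if isinstance(body, (bytes, bytearray)):
        body = body.decode("utf-8")
    return json.loads(body)["output"]
```

With these, the invocation becomes `Body=build_payload([[7.5, 3846.0, 9061.0, 0.579]])`, and the result is `parse_predictions(response['Body'].read())[0]`.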

## Cleanup

In [33]:
# Delete the endpoint first, then the resources it depends on
client.delete_endpoint(EndpointName=endpoint_name)
client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
client.delete_model(ModelName=model_name)

{'ResponseMetadata': {'RequestId': '6ae68be4-b12f-4caa-aea1-a98ea4f0f422',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '6ae68be4-b12f-4caa-aea1-a98ea4f0f422',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '0',
   'date': 'Mon, 22 Nov 2021 19:32:01 GMT'},
  'RetryAttempts': 0}}
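
If one of the delete calls fails (for example, because a resource was already removed), the remaining resources are left behind. The sketch below shows a hypothetical `cleanup` helper that deletes the endpoint first, then its configuration, then the model, and continues past individual failures. It catches `Exception` broadly to stay dependency-free; in real use, narrowing this to `botocore.exceptions.ClientError` would be a reasonable choice.

```python
def cleanup(client, endpoint_name, endpoint_config_name, model_name):
    """Delete endpoint, endpoint config, and model in that order.

    Returns a dict mapping each resource that could not be deleted to its error message.
    """
    steps = [
        ("endpoint", client.delete_endpoint, {"EndpointName": endpoint_name}),
        ("endpoint config", client.delete_endpoint_config, {"EndpointConfigName": endpoint_config_name}),
        ("model", client.delete_model, {"ModelName": model_name}),
    ]
    failures = {}
    for label, call, kwargs in steps:
        try:
            call(**kwargs)
        except Exception as exc:  # assumption: narrow to botocore's ClientError in real use
            failures[label] = str(exc)
    return failures
```

An empty return value means everything was deleted; otherwise the dict names what still needs attention.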