# Deployment notebook

This notebook deploys the endpoint with the SageMaker SDK, both locally and online. It
is not meant to be the main way of provisioning the endpoint, which should be done with
Terraform through the CD pipeline; rather, it is a way to test that everything works
before provisioning.

It also registers the model in the model registry for later CD provisioning.


--- 

**Note**: this notebook must be run outside of the `dev environment` container, because
the SageMaker local development container can't spin up from inside it.

The development workflow is as follows:
- All development happens inside the dev container.
- Only when the notebook needs to be run, it is run from another VS Code window
connected over SSH only.
- The functions in the `inference.py` script should be tested individually, e.g. as
shown in the `aws/endpoint/src/tests/` folder. Only once these work as expected should
you execute the notebook. This is a huge time-saver, because the notebook can be very
slow to run.
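
The per-function testing described above can be sketched roughly as follows. The four handler names follow the override convention of the SageMaker Hugging Face inference toolkit; whether this project's `inference.py` defines all of them is an assumption, and the bodies below are hypothetical stand-ins (including the `"inputs"` key), not the real implementation.

```python
import json


# Hypothetical stand-ins for the handlers in `inference.py`; the real ones
# would load and call the actual model.
def model_fn(model_dir):
    """Load the model; here a trivial callable that upper-cases text."""
    return lambda text: text.upper()


def input_fn(input_data, content_type):
    """Deserialize the request body."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    return json.loads(input_data)


def predict_fn(data, model):
    """Run the model on the deserialized payload."""
    return {"output": model(data["inputs"])}


def output_fn(prediction, accept):
    """Serialize the prediction for the response."""
    return json.dumps(prediction)


def run_handlers(body, model_dir="unused"):
    """Chain the handlers in the order a request would pass through them."""
    model = model_fn(model_dir)
    data = input_fn(body, "application/json")
    prediction = predict_fn(data, model)
    return output_fn(prediction, "application/json")


result = run_handlers(json.dumps({"inputs": "hello"}))
print(result)
```

Exercising the handlers this way takes milliseconds and needs no container, which is why it is worth doing before every notebook run.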

---

Before running the cells, make sure you log in to AWS using either:

- `aws configure sso` → for the first login
- `aws sso login` → for all subsequent logins
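
To confirm that the login actually worked before running the slower cells, something like the sketch below could be used. It shells out to `aws sts get-caller-identity`; the helper name `current_aws_identity` and the injectable `run` argument are illustrative and not part of this repository.

```python
import json
import subprocess


def current_aws_identity(run=subprocess.run):
    """Return the caller identity as a dict, or None if the AWS CLI call fails."""
    try:
        result = run(
            ["aws", "sts", "get-caller-identity", "--output", "json"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return None  # the AWS CLI is not installed or not on PATH
    if result.returncode != 0:
        return None  # typically an expired or missing SSO session
    return json.loads(result.stdout)
```

Calling `current_aws_identity()` in the first cell and checking that it is not `None` gives an early failure instead of an error halfway through a deployment.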

In [2]:
# general settings, shared between local and online deployments
# `model_image_uri` from https://github.com/aws/deep-learning-containers/blob/master/available_images.md 

model_name = "ai-module-model"
model_entry_point = "../src/code/inference.py"
model_data = "../model/model.tar.gz"
model_image_uri = "<image-uri>"

endpoint_name = "endpoint-ai-module-0001-dev"
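
Since both `model_entry_point` and `model_data` are relative local paths here, a quick pre-flight check can catch a wrong working directory before the SDK is involved. The helper below is a sketch with a hypothetical name; it assumes `model_data` is a local file rather than an S3 URI.

```python
import tarfile
from pathlib import Path


def check_artifacts(entry_point, model_data):
    """Return a list of problems found with the local deployment inputs."""
    problems = []
    if not Path(entry_point).is_file():
        problems.append(f"entry point not found: {entry_point}")
    if not Path(model_data).is_file():
        problems.append(f"model archive not found: {model_data}")
    elif not tarfile.is_tarfile(model_data):
        problems.append(f"not a valid tar archive: {model_data}")
    return problems
```

In the notebook this would be called as `check_artifacts(model_entry_point, model_data)`, with an empty list meaning that both files look usable.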

In [None]:
# set a local temp folder to keep /tmp from filling up
import os
from pathlib import Path

repo_root_dir = Path(os.getcwd()).parents[2].resolve()
local_temp_folder_path = str(repo_root_dir / ".temp" / "sagemaker_local")
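
The cell above only builds the path. Whether the SDK creates the folder itself is not stated in this notebook, so creating it up front and checking the free space is a harmless precaution. A minimal sketch, with a hypothetical helper name and threshold parameter:

```python
import shutil
from pathlib import Path


def prepare_temp_folder(path, min_free_gb=0.0):
    """Create `path` if needed and return the free space on its disk in GiB."""
    folder = Path(path)
    folder.mkdir(parents=True, exist_ok=True)
    free_gb = shutil.disk_usage(folder).free / 1024**3
    if free_gb < min_free_gb:
        raise RuntimeError(
            f"Only {free_gb:.1f} GiB free under {folder}, need {min_free_gb}"
        )
    return free_gb
```

For example, `prepare_temp_folder(local_temp_folder_path, min_free_gb=20)` would fail early when the disk is too full for a large container image; the 20 GiB figure is only illustrative.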

## Local

In [None]:
!pip install "sagemaker[local]"

In [21]:
import sagemaker
from sagemaker.local import LocalSession
from sagemaker.huggingface.model import HuggingFaceModel

session = LocalSession()

session.config = {
    "local": {
        "local_code": True,
        "container_root": local_temp_folder_path,
    }
}

session.settings = sagemaker.session_settings.SessionSettings(
    local_download_dir=local_temp_folder_path
)

role = sagemaker.get_execution_role()

print("Role:", role)
print("Local temp folder path:", local_temp_folder_path)

Role: arn:aws:iam::138140302683:role/aws-reserved/sso.amazonaws.com/AWSReservedSSO_AdministratorAccess_6f1d7369dc867f6b
Local temp folder path: /home/ubuntu/aws-ex-05/.temp/sagemaker_local


In [None]:
model = HuggingFaceModel(
    name=model_name,
    role=role,
    entry_point=model_entry_point,
    model_data=model_data,
    image_uri=model_image_uri,
    sagemaker_session=session,
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="local_gpu",
    endpoint_name=endpoint_name,
    sagemaker_session=session,
)
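
Once the local container is running, its health route can be probed without the SDK. The sketch below assumes the container listens on `http://localhost:8080`, which is the usual default serving port in SageMaker local mode but may differ in your configuration, and that it exposes the standard `/ping` health check. The function name and the `opener` parameter are illustrative.

```python
import urllib.error
import urllib.request


def is_container_healthy(
    base_url="http://localhost:8080",  # assumed default local serving address
    opener=urllib.request.urlopen,
    timeout=5,
):
    """Return True if GET <base_url>/ping answers with HTTP 200."""
    try:
        with opener(f"{base_url}/ping", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx HTTP error.
        return False
```

A `False` result right after `model.deploy` usually means the container exited during startup, in which case the container logs are the next place to look.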

## Online

In [1]:
import boto3
import sagemaker
from sagemaker.huggingface.model import HuggingFaceModel

boto_session = boto3.Session()
client = boto3.client(service_name="sagemaker")

sagemaker_session = sagemaker.Session()

sagemaker_session.settings = sagemaker.session_settings.SessionSettings(
    local_download_dir=local_temp_folder_path
)

role = "arn:aws:iam::138140302683:role/service-role/AmazonSageMaker-ExecutionRole-20230522T162566"

In [13]:
# step 1: create the model

model = HuggingFaceModel(
    name=model_name,
    role=role,
    entry_point=model_entry_point,
    model_data=model_data,
    image_uri=model_image_uri,
    sagemaker_session=sagemaker_session,
)

In [14]:
# step 2: register the model

model.register(
    model_package_group_name="ai-module-group-name",
    content_types=["application/json"],
    response_types=["application/json"],
    inference_instances=["ml.g4dn.xlarge"],
    approval_status="Approved",
)

<sagemaker.model.ModelPackage at 0x7f6bf15d3af0>
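
To check that the registration is visible to whatever the CD pipeline will read, the registry can be queried through the boto3 SageMaker client. The helper below is a sketch with a hypothetical name; it takes the client as an argument so that it does not depend on how the session was created.

```python
def latest_approved_package_arn(sm_client, group_name):
    """Return the ARN of the newest approved package in the group, or None."""
    response = sm_client.list_model_packages(
        ModelPackageGroupName=group_name,
        ModelApprovalStatus="Approved",
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    summaries = response["ModelPackageSummaryList"]
    return summaries[0]["ModelPackageArn"] if summaries else None
```

With the `client` created in the first cell of this section, `latest_approved_package_arn(client, "ai-module-group-name")` should return the ARN of the package that was just registered.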

In [5]:
# step 3: create endpoint

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    endpoint_name=endpoint_name,
)

-------------!
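
The progress line above ends with `!` when the deployment succeeds. To check the endpoint state explicitly, for instance when coming back to the notebook later, the endpoint can be described through the boto3 SageMaker client. This is a sketch with a hypothetical helper name:

```python
def require_in_service(sm_client, name):
    """Return the endpoint description, raising if it is not InService."""
    description = sm_client.describe_endpoint(EndpointName=name)
    status = description["EndpointStatus"]
    if status != "InService":
        reason = description.get("FailureReason", "no failure reason reported")
        raise RuntimeError(f"Endpoint {name} is {status}: {reason}")
    return description
```

Here it would be called as `require_in_service(client, endpoint_name)`, using the `client` created at the top of this section.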

# Predictor tests

In [None]:
input_data = None  # fill this in with a payload accepted by `inference.py`

response = predictor.predict(data=input_data)

In [None]:
import boto3
import json

runtime_client = boto3.client('sagemaker-runtime')

# `endpoint_name` is reused from the general settings cell so that the request
# goes to the endpoint deployed above

response = runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Accept="application/json",
    Body=json.dumps(input_data),
)

response_body = json.loads(response["Body"].read().decode())

In [12]:
# delete the endpoint
# NOTE: this doesn't delete the model in the S3 bucket, nor does it delete the model
# from the model registry

predictor.delete_model()
predictor.delete_endpoint()