# Triton ModelBuilder

This notebook was tested with the `conda_pytorch_p310` kernel on an Amazon SageMaker notebook instance of type `ml.g4dn.2xlarge`.

In [None]:
!pip install boto3 sagemaker -U -q

## Prerequisites

This notebook requires `tritonclient[http]` to be installed.

In [None]:
!pip install "tritonclient[http]" -q

# SageMaker Model Builder experience

The new experience introduces a few new constructs. This notebook focuses on the following:

1. ModelBuilder
2. SchemaBuilder
3. InferenceSpec

In the following section, we will define these constructs and provide examples to elaborate on each one.

4.1 ModelBuilder:

ModelBuilder is a Python class that takes a framework model (such as XGBoost or PyTorch) or an Inference Spec (more details below) and converts them into a SageMaker deployable model. ModelBuilder provides a `build` function that generates the artifacts for deployment. The model artifact generated is specific to the model server, which is also customizable as one of the inputs.

Class definition:
```python
class ModelBuilder(
    model_path: str | None = '/tmp/sagemaker/model-builder/' + uuid.uuid1().hex,
    role_arn: str | None = None,
    sagemaker_session: Session | None = None,
    name: str | None = 'model-name-' + uuid.uuid1().hex,
    mode: Mode | None = Mode.SAGEMAKER_ENDPOINT,
    shared_libs: List[str] = lambda : [],
    dependencies: Dict[str, Any] | None = lambda : { "auto": False },
    env_vars: Dict[str, str] | None = lambda : {},
    log_level: int | None = logging.DEBUG,
    content_type: str | None = None,
    accept_type: str | None = None,
    s3_model_data_url: str | None = None,
    instance_type: str | None = "ml.c5.xlarge",
    schema_builder: SchemaBuilder | None = None,
    model: Any | None = None,
    inference_spec: InferenceSpec | None = None,
    image_uri: str | None = None,
    model_server: ModelServer | None = None
)
```
Example:

The class definition above lists all the options for customization. However, to deploy a framework model, ModelBuilder only needs the model, a sample input and output, and the role.

```python
model_builder = ModelBuilder(
    model=model,  # Pass in the actual model object. Its "predict" method will be invoked in the endpoint.
    schema_builder=SchemaBuilder(input, output), # Pass in a "SchemaBuilder" which will use the sample test input and output objects to infer the serialization needed.
    role_arn=role, # Pass in the role arn or update intelligent defaults.
    )
```

4.2 SchemaBuilder:

The SchemaBuilder lets you define the input and output for your endpoint. From the sample input and output you provide, it generates the corresponding marshalling functions for serializing and deserializing the payloads. For further details, please consult the notebook or refer to the video.

Class definition:
```python
class SchemaBuilder(
    sample_input: Any,
    sample_output: Any,
    input_translator: CustomPayloadTranslator = None,
    output_translator: CustomPayloadTranslator = None
)
```
Example:

The CustomPayloadTranslator class provides all the options for customization. However, for [common inference data formats](https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-inference.html), you can just provide the sample input and output to the SchemaBuilder.
```python
input = "How is the demo going?"
output = "Comment la démo va-t-elle?"
schema = SchemaBuilder(input, output)
```
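
To make "inferring the serialization from samples" more concrete, here is a toy illustration in plain Python. It is not how SageMaker implements SchemaBuilder; `infer_codec` and its three rules are hypothetical and only show how the type of a sample object can select a matching serializer/deserializer pair.

```python
import json
from typing import Any, Callable, Tuple

Codec = Tuple[Callable[[Any], bytes], Callable[[bytes], Any]]


def infer_codec(sample: Any) -> Codec:
    """Hypothetical helper: pick (serialize, deserialize) from a sample's type."""
    if isinstance(sample, bytes):
        # Raw bytes pass through unchanged.
        return (lambda obj: obj, lambda data: data)
    if isinstance(sample, str):
        # Text travels as UTF-8.
        return (lambda obj: obj.encode("utf-8"), lambda data: data.decode("utf-8"))
    # Anything else is assumed to be JSON-serializable in this toy example.
    return (lambda obj: json.dumps(obj).encode("utf-8"), lambda data: json.loads(data))


# The sample only drives the choice of codec; real requests reuse the pair.
serialize, deserialize = infer_codec("How is the demo going?")
payload = serialize("How is the demo going?")
round_trip = deserialize(payload)
```

The real SchemaBuilder covers far more types (tensors, arrays, images); for anything it cannot infer, the `input_translator` and `output_translator` arguments shown above are the customization hook.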

4.3 InferenceSpec:

If you want to specify custom functions to load and invoke the model instead of relying on the framework model's own function, you can pass an InferenceSpec with your implementation of the `load` and `invoke` functions.

Class definition:
```python
class InferenceSpec(abc.ABC):
    @abc.abstractmethod
    def load(self, model_dir: str):
        pass

    @abc.abstractmethod
    def invoke(self, input_object: object, model: object):
        pass
```
Example:
```python
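from sagemaker.serve.spec.inference_spec import InferenceSpec
from transformers import pipeline


class MyInferenceSpec(InferenceSpec):
    def load(self, model_dir: str):
        # Load a Hugging Face translation pipeline; model_dir is unused in this example.
        return pipeline("translation_en_to_fr", model="t5-small")

    def invoke(self, input, model):
        # Run the loaded pipeline on the request payload.
        return model(input)


inf_spec = MyInferenceSpec()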
```
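
The split between `load` and `invoke` can be illustrated with a toy driver: `load` runs once at start-up, and `invoke` runs for every request with the object that `load` returned. This is a sketch in plain Python, not the model server's actual code; `ToySpec`, `UppercaseSpec` and `serve` are hypothetical names that only mirror the contract shown above.

```python
import abc
from typing import Any, Iterable, List


class ToySpec(abc.ABC):
    """Stand-in that mirrors the load/invoke contract (not the SDK class)."""

    @abc.abstractmethod
    def load(self, model_dir: str) -> Any:
        ...

    @abc.abstractmethod
    def invoke(self, input_object: Any, model: Any) -> Any:
        ...


class UppercaseSpec(ToySpec):
    def __init__(self) -> None:
        self.load_calls = 0

    def load(self, model_dir: str) -> Any:
        self.load_calls += 1
        return str.upper  # the "model" here is just a function

    def invoke(self, input_object: Any, model: Any) -> Any:
        return model(input_object)


def serve(spec: ToySpec, model_dir: str, requests: Iterable[Any]) -> List[Any]:
    model = spec.load(model_dir)  # loaded once, before any request
    return [spec.invoke(request, model) for request in requests]


spec = UppercaseSpec()
responses = serve(spec, "/tmp/model", ["bonjour", "salut"])
```

Keeping expensive work (downloading weights, building a pipeline) in `load` means each `invoke` call only pays for inference.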

### SageMaker ModelBuilder: Local deployment

Next, we use the SageMaker `ModelBuilder` class to prepare the model for local and remote deployment.

We use `ModelBuilder` with the Triton model server to deploy a PyTorch ResNet50 model on the local machine, and then use `Mode` to switch between local testing and deploying to a SageMaker endpoint.

In [None]:
from sagemaker.serve import ModelBuilder, SchemaBuilder, Mode
from sagemaker.serve import ModelServer
from torchvision.transforms import transforms
from torchvision.models import resnet50, ResNet50_Weights
from pathlib import Path
from PIL import Image
import torch

import boto3, sagemaker

# Load model
resnet_model = resnet50(weights=ResNet50_Weights.DEFAULT)
resnet_model.eval()

# Define image transformation
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Get sample input
image_path = Path('./pytorch/zidane.jpeg').resolve()
image = Image.open(str(image_path))
sample_input = transform(image).unsqueeze(0)


with torch.no_grad():
    sample_output = resnet_model(sample_input)
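
In the transform above, `ToTensor` scales 8-bit pixel values to the range [0, 1], and `Normalize` then applies `(x - mean) / std` per channel. The arithmetic for a single RGB pixel can be sketched in plain Python (so it runs without torchvision); `normalize_pixel` is a hypothetical helper, not a torchvision function.

```python
from typing import Sequence, Tuple

# Per-channel statistics used in the transform above.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb: Sequence[int]) -> Tuple[float, ...]:
    """Scale 0-255 values to [0, 1], then standardize each channel."""
    return tuple((value / 255.0 - m) / s for value, m, s in zip(rgb, MEAN, STD))


black = normalize_pixel((0, 0, 0))        # most negative value per channel
white = normalize_pixel((255, 255, 255))  # most positive value per channel
```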

In [None]:
from sagemaker import get_execution_role, Session, image_uris
import boto3
sagemaker_session = Session()
region = boto3.Session().region_name

# Use the notebook's execution role when available; otherwise replace the placeholder with your role ARN.
try:
    execution_role = get_execution_role()
except ValueError:
    execution_role = "your-role-arn"

In [None]:
runtime_sm_client = boto3.client("sagemaker-runtime")
sagemaker_session = sagemaker.Session(boto_session=boto3.Session())
triton_working_dir = str(Path('./triton_working_dir').resolve())

triton_model = ModelBuilder(
    model=resnet_model,
    schema_builder=SchemaBuilder(sample_input=sample_input, sample_output=sample_output),
    model_path=triton_working_dir, # Optional
    role_arn=execution_role,
    sagemaker_session=sagemaker_session,
    model_server=ModelServer.TRITON,
    mode=Mode.LOCAL_CONTAINER
).build()

In [None]:
triton_local_predictor = triton_model.deploy()

In [None]:
triton_local_predictor.predict(sample_input)
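
ResNet50 produces raw logits (one score per ImageNet class), so a prediction is easier to read once it is converted to probabilities. The sketch below does this in plain Python and assumes the logits have already been flattened into a list of floats; the notebook does not show the predictor's return type, so that conversion step is an assumption. `softmax` and `top_k_probs` are hypothetical helpers.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_probs(logits: Sequence[float], k: int = 5) -> List[Tuple[int, float]]:
    """Return the k most likely (class_index, probability) pairs."""
    ranked = sorted(enumerate(softmax(logits)), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Four made-up logits; class 3 has the highest score, class 0 the second highest.
example = top_k_probs([2.0, 0.5, -1.0, 3.0], k=2)
```

For a PyTorch tensor of shape `(1, 1000)` such as `sample_output`, `sample_output[0].tolist()` yields a list in the expected form.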

### SageMaker ModelBuilder: Deploy to a SageMaker Endpoint

Now that we have tested the model prediction locally, we can deploy the model to a SageMaker endpoint. We do that by switching `mode` to `Mode.SAGEMAKER_ENDPOINT`. We also need to supply `role`, `initial_instance_count` and `instance_type`.

In [None]:
triton_predictor = triton_model.deploy(
    mode=Mode.SAGEMAKER_ENDPOINT, 
    role=execution_role,
    initial_instance_count=1,
    instance_type="ml.g4dn.2xlarge"
)

In [None]:
triton_predictor.predict(sample_input)
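
Because the same model backs both deployments, the endpoint's output should agree with the local container's output up to floating-point noise. A minimal sketch of such a check follows, assuming both predictions have been flattened to lists of floats; `max_abs_diff` is a hypothetical helper, and the sample values and the `1e-4` tolerance are illustrative only.

```python
from typing import Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


# Made-up values standing in for the local and remote predictions.
local_scores = [0.12, -1.50, 3.25]
remote_scores = [0.12, -1.50, 3.2500001]
matches = max_abs_diff(local_scores, remote_scores) < 1e-4
```

Differences in hardware between the local machine and the endpoint instance can widen the gap, so the tolerance may need adjusting.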

## Clean up

In [None]:
triton_local_predictor.delete_predictor()
triton_predictor.delete_model()
triton_predictor.delete_endpoint()
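
If one clean-up call fails (for example, because a resource was already deleted), the calls after it never run and billable resources may be left behind. One way to guard against that is to run each step independently and collect the failures. This is a sketch with stand-in callables; `run_cleanup` is a hypothetical helper, not part of the SageMaker SDK.

```python
from typing import Callable, Iterable, List, Tuple


def run_cleanup(steps: Iterable[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run every step even if earlier ones fail; return failure messages."""
    failures = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:  # keep going so later resources still get deleted
            failures.append(f"{name}: {exc}")
    return failures


# Stand-ins for the real delete calls, so the sketch runs anywhere.
calls = []


def failing_step() -> None:
    raise RuntimeError("already deleted")


failures = run_cleanup([
    ("delete_model", lambda: calls.append("model")),
    ("delete_endpoint", failing_step),
    ("delete_local", lambda: calls.append("local")),
])
```

In this notebook the steps would be the bound methods used above, for example `("delete_endpoint", triton_predictor.delete_endpoint)`.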