## Creating the model-handler

Input interfaces vary from one model to another, so TorchServe lets you define how the data sent to the model is processed into the tensors the model expects. This component is called the model handler. TorchServe provides a base class from which to write a model handler.

TorchServe provides default handlers for image_classifier, image_segmenter, object_detector and text_classifier. It is instructive to read the [code](https://github.com/pytorch/serve/blob/master/ts/torch_handler) for these handlers to better understand the mechanism.

In this example, we will create a simple handler from scratch for illustration purposes.
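The handler flow described above (load the model, turn request payloads into model inputs, run inference, map outputs to responses) can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the handler used for the deployed model: the class name `SimpleTextHandler`, the keyword-based stand-in model, and the `HAPPY` label are assumptions made for the example. A real handler would typically subclass the TorchServe base handler, load the fine-tuned weights in `initialize`, and tokenize the text into tensors in `preprocess`.

```python
class SimpleTextHandler:
    """Hypothetical handler sketch with the usual four stages plus handle()."""

    def __init__(self):
        self.initialized = False
        self.model = None
        # "NOT_HAPPY" appears in the prediction output; "HAPPY" is an assumed counterpart.
        self.labels = ["NOT_HAPPY", "HAPPY"]

    def initialize(self, context):
        # A real handler would load model weights using the serving context.
        # Here a trivial keyword rule stands in for the model (assumption).
        self.model = lambda texts: [1 if "good" in t.lower() else 0 for t in texts]
        self.initialized = True

    def preprocess(self, requests):
        # Each request is a dict carrying the payload under "data" or "body".
        texts = []
        for request in requests:
            payload = request.get("data")
            if payload is None:
                payload = request.get("body")
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            texts.append(payload.strip())
        return texts

    def inference(self, inputs):
        # Run the (stand-in) model on the preprocessed batch.
        return self.model(inputs)

    def postprocess(self, outputs):
        # Map class indices to label strings, one per request.
        return [self.labels[index] for index in outputs]

    def handle(self, data, context):
        # Entry point: initialize lazily, then chain the three stages.
        if not self.initialized:
            self.initialize(context)
        return self.postprocess(self.inference(self.preprocess(data)))


handler = SimpleTextHandler()
results = handler.handle(
    [{"data": b"It was long and boring."}, {"body": "A good film"}], None
)
print(results)
```

Keeping the stages as separate methods means each one can be overridden or tested on its own, which is the main reason to write a handler as a class rather than a single function.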

## Package Model

## Build Image

## Test Locally

## Deploy

In [1]:
from google.cloud import aiplatform

In [37]:
project='pytorch-tpu-nfs'
staging_bucket='automl-samples'
aiplatform.init(project=project, staging_bucket=staging_bucket)

In [38]:
display_name = 'pt-txt-cls'
serving_container_image_uri = 'us-central1-docker.pkg.dev/pytorch-tpu-nfs/pytorch-models/bert-base'
artifact_uri = None
serving_container_predict_route = '/predictions/bert-base'
serving_container_health_route = '/ping'
description = "bert-base finetuned on IMDB"
serving_container_command = None
serving_container_args = None
serving_container_environment_variables = None
serving_container_ports = [7080]
instance_schema_uri = None
parameters_schema_uri = None
prediction_schema_uri = None
explanation_metadata = None
explanation_parameters = None
sync = True

In [39]:
model = aiplatform.Model.upload(
    display_name=display_name,
    artifact_uri=artifact_uri,
    serving_container_image_uri=serving_container_image_uri,
    serving_container_predict_route=serving_container_predict_route,
    serving_container_health_route=serving_container_health_route,
    serving_container_ports=serving_container_ports,
    sync=sync,
)

model.wait()

print(model.display_name)
print(model.resource_name)

pt-txt-cls
projects/64701051322/locations/us-central1/models/7620635927278256128


In [40]:
endpoint = aiplatform.Endpoint.create(
    project=project,
    display_name=display_name,
)

In [41]:
deployed_model_display_name = display_name
machine_type = 'n1-standard-16'
min_replica_count = 1
max_replica_count = 3
# `endpoint` and `sync` are reused from the cells above.
model.deploy(
    endpoint=endpoint,
    deployed_model_display_name=deployed_model_display_name,
    machine_type=machine_type,
    min_replica_count=min_replica_count,
    max_replica_count=max_replica_count,
    sync=sync,
)

model.wait()

In [51]:
%%writefile sample_review.txt
It was long and boring movie. I could not watch for 5 hours.

Overwriting sample_review.txt


In [52]:
import base64

sample_review = 'sample_review.txt'
# Read the review as bytes and base64-encode it so it can travel inside JSON.
with open(sample_review, "rb") as f:
    b64_encode = base64.b64encode(f.read())
instance = {
    "data": {
        "b64": b64_encode.decode("utf-8")
    }
}
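To make the payload shape concrete, the sketch below builds the same kind of instance from a string and decodes it again, which is roughly what the serving side has to do before the handler sees plain text. The helper names `build_instance` and `decode_instance` are hypothetical and exist only for this illustration. The `{"instances": [...]}` wrapper is the JSON request body that the prediction endpoint receives.

```python
import base64
import json


def build_instance(text: str) -> dict:
    # Base64-encode the UTF-8 bytes and wrap them under the "b64" key.
    encoded = base64.b64encode(text.encode("utf-8")).decode("utf-8")
    return {"data": {"b64": encoded}}


def decode_instance(instance: dict) -> str:
    # Reverse of build_instance: recover the original text.
    return base64.b64decode(instance["data"]["b64"]).decode("utf-8")


review = "It was long and boring movie. I could not watch for 5 hours."
example_instance = build_instance(review)

# JSON body as it would be sent to the prediction route.
request_body = json.dumps({"instances": [example_instance]})

# The round trip returns the original review unchanged.
assert decode_instance(example_instance) == review
```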


In [53]:
prediction = endpoint.predict([instance])

In [54]:
prediction

Prediction(predictions=['NOT_HAPPY'], deployed_model_id='7278854936807342080')