# Amazon SageMaker Semantic Segmentation Algorithm - v9: deploy

post migration from [roofAI](https://github.com/kamangir/roofAI/tree/main/notebooks/sagemaker)

In [1]:
# !pip3 install 'sagemaker>=2,<3'

In [2]:
import sys
sys.path.append("../../")

from blueflow import notebooks

import sagemaker

from blue_options import string

from blue_sandbox import fullname
from blue_sandbox.logger import logger

logger.info(f"{fullname()}, built on {string.pretty_date()}")



sagemaker.config      - Not applying SDK defaults from location: /Library/Application Support/sagemaker/config.yaml
sagemaker.config      - Not applying SDK defaults from location: /Users/kamangir/Library/Application Support/sagemaker/config.yaml


🌀  blue_sandbox-5.108.1, built on 12 January 2025, 13:36:05


In [3]:
# attach to a completed training job
ss_estimator = sagemaker.estimator.Estimator.attach("sagesemseg-model-2025-01-11-22-00-08-bW-2025-01-12-06-00-08-928")


2025-01-12 06:05:43 Starting - Preparing the instances for training
2025-01-12 06:05:43 Downloading - Downloading the training image
2025-01-12 06:05:43 Training - Training image download completed. Training in progress.
2025-01-12 06:05:43 Uploading - Uploading generated training model
2025-01-12 06:05:43 Completed - Training job completed


## Deployment

Once the training is done, we can deploy the trained model as an Amazon SageMaker hosted endpoint. This will allow us to make predictions (or inference) from the model.

Note that we don't have to host on the same number or type of instances that we used to train: we can choose any SageMaker-supported instance type. Training is a compute-heavy job that may have different infrastructure requirements than inference/hosting. In our case we chose the GPU-accelerated `ml.p3.2xlarge` instance to train, but will host the model on the cheaper-per-hour `ml.c5.xlarge` type, because we'll only be serving occasional requests.

The endpoint deployment can be accomplished as follows:

In [None]:
ss_predictor = ss_estimator.deploy(initial_instance_count=1, instance_type="ml.c5.xlarge")

sagemaker Creating model with name: sagesemseg-model-2025-01-11-22-00-08-bW-2025-01-12-21-36-11-294
sagemaker Creating endpoint-config with name sagesemseg-model-2025-01-11-22-00-08-bW-2025-01-12-21-36-11-294
sagemaker Creating endpoint with name sagesemseg-model-2025-01-11-22-00-08-bW-2025-01-12-21-36-11-294


---------

In [None]:
# As with Estimators & training jobs, we can instead attach to an existing Endpoint:
# ss_predictor = sagemaker.predictor.Predictor("ss-notebook-demo-2020-10-29-07-23-03-086")

## Inference

Now that the trained model is deployed to an endpoint, we can use this endpoint for inference.

To test it out, let us download an image from the web that the algorithm has not seen so far.

In [None]:
filename_raw = "data/test.jpg"

!wget -O $filename_raw https://upload.wikimedia.org/wikipedia/commons/b/b4/R1200RT_in_Hongkong.jpg

The scale of the input image may affect the prediction results and latency, so we'll down-scale the raw image before sending it to our endpoint. You could experiment with different input resolutions (and aspect ratios) and see how the results change:

In [None]:
from matplotlib import pyplot as plt
import PIL.Image

%matplotlib inline

filename = "data/test_resized.jpg"
width = 800

im = PIL.Image.open(filename_raw)

aspect = im.size[0] / im.size[1]

# `ANTIALIAS` was removed in Pillow 10; `LANCZOS` is the same filter under its current name
im.thumbnail([width, int(width / aspect)], PIL.Image.LANCZOS)
im.save(filename, "JPEG")

plt.imshow(im)
plt.show()
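The target size for a given width follows directly from the aspect ratio. Below is a minimal sketch of that arithmetic, handy when experimenting with several resolutions; `fit_to_width` is a hypothetical helper name, not part of Pillow or the SageMaker SDK:

```python
def fit_to_width(size, width):
    """Return (width, height) scaled to the requested width, keeping aspect ratio.

    `size` is a (width, height) tuple such as PIL's `Image.size`.
    """
    orig_w, orig_h = size
    if orig_w <= 0 or orig_h <= 0 or width <= 0:
        raise ValueError("dimensions must be positive")
    # Round to the nearest pixel, but never collapse to a zero-height image.
    height = max(1, round(orig_h * width / orig_w))
    return width, height


print(fit_to_width((1600, 1200), 800))  # (800, 600)
```

Keep in mind that `Image.thumbnail` only ever shrinks an image, so a raw image narrower than `width` is left at its original size.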

The endpoint accepts images in formats similar to those found in the training dataset. The input `Content-Type` should be `image/jpeg`, and the output `Accept` type can be either:

- `image/png`, which produces an indexed-PNG segmentation mask as used in training (one predicted class ID per pixel), or
- `application/x-protobuf`, which produces a 3D matrix giving the *confidence of each class*, for each pixel.

In the SageMaker SDK, a `Predictor` has an associated **serializer** and **deserializer**, which control how data gets translated into our API call and loaded back into a Python result object.

There are pre-built [serializers](https://sagemaker.readthedocs.io/en/stable/api/inference/serializers.html) and [deserializers](https://sagemaker.readthedocs.io/en/stable/api/inference/deserializers.html) offered by the SDK, and we're free to define custom ones so long as they offer the same API.


### Basic inference - class IDs PNG

In our first example, we'll request the simple PNG response and map it into a pixel array (the assigned class for each pixel), so we'll write a custom deserializer for that:

In [None]:
from PIL import Image
import numpy as np


class ImageDeserializer(sagemaker.deserializers.BaseDeserializer):
    """Deserialize a PIL-compatible stream of Image bytes into a numpy pixel array"""

    def __init__(self, accept="image/png"):
        self.accept = accept

    @property
    def ACCEPT(self):
        return (self.accept,)

    def deserialize(self, stream, content_type):
        """Read a stream of bytes returned from an inference endpoint.
        Args:
            stream (botocore.response.StreamingBody): A stream of bytes.
            content_type (str): The MIME type of the data.
        Returns:
            mask: The numpy array of class labels per pixel
        """
        try:
            return np.array(Image.open(stream))
        finally:
            stream.close()


ss_predictor.deserializer = ImageDeserializer(accept="image/png")

For the input, our data is already stored as a JPEG file, so we'll use the built-in `IdentitySerializer` and feed it the file bytes:

In [None]:
ss_predictor.serializer = sagemaker.serializers.IdentitySerializer("image/jpeg")

with open(filename, "rb") as imfile:
    imbytes = imfile.read()

# Extension exercise: Could you write a custom serializer which takes a filename as input instead?
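One possible answer to the extension exercise, as a sketch only. `FileSerializer` is a hypothetical name, and it is written as a plain class exposing `CONTENT_TYPE` and `serialize` so that it runs without the SDK installed; in the notebook you would more likely subclass `sagemaker.serializers.BaseSerializer`, mirroring the deserializer above:

```python
from pathlib import Path


class FileSerializer:
    """Hypothetical serializer: takes a file path and returns the file's raw bytes."""

    def __init__(self, content_type="image/jpeg"):
        self.content_type = content_type

    @property
    def CONTENT_TYPE(self):
        # MIME type to send along with the request body.
        return self.content_type

    def serialize(self, data):
        """Read the file at path `data` and return its contents as bytes."""
        return Path(data).read_bytes()


# Intended usage (not run here, since it needs a live endpoint):
# ss_predictor.serializer = FileSerializer("image/jpeg")
# cls_mask = ss_predictor.predict(filename)
```

The trade-off is that the predictor then only accepts paths; keeping `IdentitySerializer` is simpler when the image bytes are already in memory.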

With that configured, calling our endpoint is now simple!

In [None]:
%%time

cls_mask = ss_predictor.predict(imbytes)

print(type(cls_mask))
print(cls_mask.shape)

Let us display the segmentation mask.

Since the raw value of each pixel is a small number (the class ID), we'll apply a [colormap](https://matplotlib.org/3.3.2/tutorials/colors/colormaps.html) to make it a bit more human readable and not just a black square!

In [None]:
plt.imshow(cls_mask, cmap="jet")
plt.show()
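A continuous colormap such as `jet` gives each class an arbitrary colour. If a fixed colour per class is preferred, a palette can be generated from the class ID itself. The sketch below assumes the model's class IDs follow the PASCAL VOC numbering (which the motorbike ID of 14 used later in this notebook suggests, but which is not confirmed here) and reproduces the conventional VOC bit-spreading palette; `voc_palette` is a hypothetical helper name:

```python
def voc_palette(num_classes=21):
    """Build the conventional PASCAL VOC colour for each class ID as (R, G, B)."""
    palette = []
    for class_id in range(num_classes):
        r = g = b = 0
        bits = class_id
        # Spread the ID's bits across the three channels, most significant first.
        for shift in range(7, -1, -1):
            r |= (bits & 1) << shift
            g |= ((bits >> 1) & 1) << shift
            b |= ((bits >> 2) & 1) << shift
            bits >>= 3
        palette.append((r, g, b))
    return palette


palette = voc_palette()
print(palette[0], palette[14])  # (0, 0, 0) (64, 128, 128)
```

With numpy, indexing an array built from the palette by the mask, for example `np.array(voc_palette(), dtype=np.uint8)[cls_mask]`, should yield an RGB image, provided every value in the mask is a valid class ID.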

### Advanced inference - class probabilities matrix

The second `Accept` type allows us to request all the class probabilities for each pixel.

Our input processing will be unchanged, but we'll define a new custom Deserializer to unpack the *RecordIO-wrapped protobuf* content returned by the endpoint.

This format takes a little more effort to convert into an array than the basic PNG response. In the code below, we:

- Make use of `mxnet` to open the [RecordIO](http://mesos.apache.org/documentation/latest/recordio/) wrapping
- Use the `record_pb2` utility from the SageMaker SDK to load the Record contents in [protocol buffers](https://github.com/protocolbuffers/protobuf) format
- Find that the record contains two fields `shape` (the shape of the matrix) and `target` (the probability predictions).
- Load the `target` matrix in usable numpy array format, and map its shape appropriately.


What we receive back is a recordio-protobuf of probabilities, sent as binary data. `mxnet` can read the RecordIO framing, which lets us convert the incoming bytes into a numpy array.

The protobuf record has two parts: the first contains the shape of the output and the second contains the probability values. Using the output shape, we can reshape the flat list of probabilities into per-class maps with the same height and width as the image. There is typically a singleton batch dimension, since we are only inferring on one image; we remove it with `squeeze`.

In [None]:
import io
import tempfile

import mxnet as mx
from sagemaker.amazon.record_pb2 import Record


class SSProtobufDeserializer(sagemaker.deserializers.BaseDeserializer):
    """Deserialize protobuf semantic segmentation response into a numpy array"""

    def __init__(self, accept="application/x-protobuf"):
        self.accept = accept

    @property
    def ACCEPT(self):
        return (self.accept,)

    def deserialize(self, stream, content_type):
        """Read a stream of bytes returned from an inference endpoint.
        Args:
            stream (botocore.response.StreamingBody): A stream of bytes.
            content_type (str): The MIME type of the data.
        Returns:
            mask: The numpy array of class confidences per pixel
        """
        try:
            rec = Record()
            # mxnet.recordio can only read from files, not in-memory file-like objects, so we buffer the
            # response stream to a file on disk and then read it back:
            with tempfile.NamedTemporaryFile(mode="w+b") as ftemp:
                ftemp.write(stream.read())
                ftemp.seek(0)
                recordio = mx.recordio.MXRecordIO(ftemp.name, "r")
                # ParseFromString fills `rec` in place, so its return value is not needed
                rec.ParseFromString(recordio.read())
                recordio.close()
            values = list(rec.features["target"].float32_tensor.values)
            shape = list(rec.features["shape"].int32_tensor.values)
            # We 'squeeze' away extra dimensions introduced by the fact that the model can operate on batches
            # of images at a time:
            shape = np.squeeze(shape)
            mask = np.reshape(np.array(values), shape)
            return np.squeeze(mask, axis=0)
        finally:
            stream.close()


ss_predictor.deserializer = SSProtobufDeserializer()
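The RecordIO framing itself is simple enough to parse in memory, which would avoid both the temporary file and the `mxnet` dependency. The following is a sketch under stated assumptions, not a drop-in replacement: it assumes little-endian headers and unsplit records (continuation flag of zero), and `read_recordio` here is a hypothetical helper defined for illustration:

```python
import io
import struct

RECORDIO_MAGIC = 0xCED7230A  # marker that starts every MXNet RecordIO record


def read_recordio(stream):
    """Yield the payload of each record in a binary file-like object.

    Each record is: 4-byte magic, 4-byte length word, payload, then zero
    padding up to a 4-byte boundary. The top 3 bits of the length word are
    a continuation flag, which must be 0 for the unsplit records handled here.
    """
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of stream
        magic, length_word = struct.unpack("<II", header)
        if magic != RECORDIO_MAGIC:
            raise ValueError("not a RecordIO stream")
        if length_word >> 29:
            raise NotImplementedError("split records are not handled in this sketch")
        length = length_word & ((1 << 29) - 1)
        payload = stream.read(length)
        stream.read(-length % 4)  # skip the alignment padding
        yield payload


# Build a two-record buffer by hand and read it back.
buf = io.BytesIO()
for item in (b"abc", b"12345678"):
    buf.write(struct.pack("<II", RECORDIO_MAGIC, len(item)))
    buf.write(item + b"\x00" * (-len(item) % 4))
buf.seek(0)
print(list(read_recordio(buf)))  # [b'abc', b'12345678']
```

Inside `deserialize`, something like `rec.ParseFromString(next(read_recordio(io.BytesIO(stream.read()))))` could then stand in for the temporary file, on the assumption that the endpoint returns a single unsplit record.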

In [None]:
%%time
prob_mask = ss_predictor.predict(imbytes)

print(type(prob_mask))
print(prob_mask.shape)

The assigned class labels from the previous method are equivalent to the *index of the maximum-confidence class*, for each pixel - so we should be able to reconstruct the same image as before by taking the `argmax` over the classes dimension:

In [None]:
cls_mask_2 = np.argmax(prob_mask, axis=0)

plt.imshow(cls_mask_2, cmap="jet")
plt.show()
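To see what the `argmax` is doing, here is a toy example on a hand-written tensor rather than a real endpoint response. The numbers are made up for illustration; only the `(class, row, column)` layout matches `prob_mask`:

```python
import numpy as np

# Toy confidences for 3 classes over a 2x2 image, laid out as (class, row, column).
toy_probs = np.array(
    [
        [[0.7, 0.1], [0.2, 0.3]],  # class 0
        [[0.2, 0.8], [0.3, 0.3]],  # class 1
        [[0.1, 0.1], [0.5, 0.4]],  # class 2
    ]
)

toy_classes = np.argmax(toy_probs, axis=0)  # winning class ID per pixel
toy_confidence = np.max(toy_probs, axis=0)  # confidence of that winner

print(toy_classes.tolist())  # [[0, 1], [2, 2]]
print(toy_confidence.tolist())  # [[0.7, 0.8], [0.5, 0.4]]
```

`argmax` resolves ties in favour of the lowest class ID, so two equally confident classes always map to the same label.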

But this time, we can also view the *probabilities* for a particular class:

In [None]:
target_cls_id = 14  # (motorbike)
plt.imshow(prob_mask[target_cls_id, :, :], cmap="inferno")
plt.show()

...And perhaps generate an overlay image for easy human review:

In [None]:
imarray = np.array(PIL.Image.open(filename)) / 255.0  # Convert image pixels from 0-255 to 0-1
hilitecol = np.array((0.0, 1.0, 1.0, 1.0))  # Cyan, 100% opacity (RGBAlpha 0-1 range)

# Red-shift our image to make the cyan highlights more obvious:
imshifted = imarray.copy()
imshifted[:, :, 1] *= 0.6
imshifted[:, :, 2] *= 0.5

# Construct a mask with alpha channel taken from the model result:
hilitemask = np.tile(hilitecol[np.newaxis, np.newaxis, :], list(imarray.shape[:2]) + [1])
hilitemask[:, :, 3] = prob_mask[target_cls_id, :, :]

# Overlay the two images:
fig, (ax0, ax1, ax2) = plt.subplots(1, 3, figsize=(16, 6))

ax0.imshow(imarray)
ax0.axis("off")
ax0.set_title("Original Image")
ax2.imshow(hilitemask)
ax2.axis("off")
ax2.set_title("Highlight Mask")

ax1.imshow(imshifted)
ax1.imshow(hilitemask)
ax1.axis("off")
ax1.set_title("Color-shifted Overlay")

plt.show()
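In the middle panel, matplotlib does the blending for us by drawing one image over the other. To save the overlay as an image file instead, the same result can be computed directly with standard alpha compositing over an opaque background. A minimal sketch; `blend_highlight` is a hypothetical helper name:

```python
import numpy as np


def blend_highlight(image, alpha, color=(0.0, 1.0, 1.0)):
    """Composite a solid highlight colour over an RGB image.

    image: float array of shape (H, W, 3) with values in 0-1.
    alpha: float array of shape (H, W), the per-pixel highlight opacity.
    """
    image = np.asarray(image, dtype=float)
    # Clip to a valid opacity and add a trailing axis so it broadcasts over RGB.
    alpha = np.clip(np.asarray(alpha, dtype=float), 0.0, 1.0)[..., np.newaxis]
    color = np.asarray(color, dtype=float)
    return alpha * color + (1.0 - alpha) * image


# Toy check on a black 2x2 image.
demo = blend_highlight(np.zeros((2, 2, 3)), [[0.0, 1.0], [0.5, 0.25]])
print(demo[1, 0].tolist())  # [0.0, 0.5, 0.5]
```

In this notebook's terms, the call would presumably be `blend_highlight(imshifted, prob_mask[target_cls_id])`, assuming the resized image is RGB and the mask has the same height and width.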

## Delete the Endpoint

Deployed endpoints are backed by infrastructure (1x`ml.c5.xlarge` in our case, as we requested above) - so we should delete the endpoint when we're finished with it, to avoid incurring continued costs.

In [None]:
ss_predictor.delete_endpoint()
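If a cell fails partway through the notebook, the delete call above may never run. One way to guard against that in a script is a context manager that always cleans up. This is a sketch of the pattern, assuming only that the predictor exposes `delete_endpoint()` as used above; `temporary_endpoint` and `FakePredictor` are hypothetical names, the latter standing in for a real predictor so the example runs offline:

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(predictor):
    """Yield the predictor, then delete its endpoint even if the body raises."""
    try:
        yield predictor
    finally:
        predictor.delete_endpoint()


class FakePredictor:
    """Stand-in for a real predictor; only records whether cleanup happened."""

    def __init__(self):
        self.deleted = False

    def delete_endpoint(self):
        self.deleted = True


fake = FakePredictor()
try:
    with temporary_endpoint(fake):
        raise RuntimeError("inference failed")
except RuntimeError:
    pass
print(fake.deleted)  # True
```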
