# Black Magic AI Detectron2 Instance Segmentation Cloud Vision API Tutorial

<img src="../images/blackmagicailogo.png">

This tutorial demonstrates how to create a Detectron2 instance segmentation cloud API on AWS by deploying a pre-trained Detectron2 model to an Amazon SageMaker endpoint and exposing it as a REST API through Amazon API Gateway.

You can make a copy of this tutorial by "File -> Open in playground mode" and make changes there. __DO NOT__ request access to this tutorial.



# Install detectron2

In [None]:
# Versions: https://github.com/pytorch/vision/
# Pin the PyTorch build that the detectron2 wheel below was compiled against (torch 1.10, CUDA 11.3)
!pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
# Install detectron2 that matches the above pytorch version
# See https://detectron2.readthedocs.io/tutorials/install.html for instructions
!pip install detectron2==0.6 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
exit(0)  # After installation, you need to "restart runtime" in Colab. This line can also restart runtime

In [None]:
# import some common libraries
import numpy as np
import cv2, json
import torch, torchvision
# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2.modeling import build_model
import detectron2.data.transforms as T
from detectron2.checkpoint import DetectionCheckpointer

In [None]:
# check pytorch installation: 
import sys
print(torch.__version__, torch.cuda.is_available())
print(torchvision.__version__)
print(sys.version_info)
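
The version string printed above determines which pre-built detectron2 wheel index to use. As a minimal sketch, the helper below (`detectron2_wheel_index` is a hypothetical name, not part of detectron2) rebuilds the index URL used in the install cell from a string such as `1.10.2+cu113`. It only formats the URL and does not check that wheels were actually published for that combination.

```python
import re

def detectron2_wheel_index(torch_version: str) -> str:
    """Map a version string such as '1.10.2+cu113' to a wheel index URL."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.\d+\+(cu\d+|cpu)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized torch version string: {torch_version!r}")
    major, minor, build = match.groups()
    # Same URL pattern as the install cell: .../wheels/<build>/torch<major>.<minor>/index.html
    return (
        "https://dl.fbaipublicfiles.com/detectron2/wheels/"
        f"{build}/torch{major}.{minor}/index.html"
    )

wheel_index = detectron2_wheel_index("1.10.2+cu113")
print(wheel_index)
```

In the notebook, `torch.__version__` can be passed instead of the literal string.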

# Run a pre-trained Detectron2 Instance Segmentation model
["...instance segmentation, we care about detection and segmentation of the instances of objects separately"](https://kharshit.github.io/blog/2019/08/23/quick-intro-to-instance-segmentation)

In other words, segmentation is performed only on the objects found by object detection, within each object's bounding box.
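
To make the distinction concrete, here is a toy sketch on a one-row "image" of six pixels. The data and the helper `to_semantic_map` are made up for illustration. Two people are kept as separate masks in instance segmentation, while a semantic label map merges them into one "person" region.

```python
# Two hypothetical 'person' instances on a 1x6 strip of pixels
# (1 = the pixel belongs to that instance).
instance_masks = [
    {"label": "person", "mask": [1, 1, 0, 0, 0, 0]},
    {"label": "person", "mask": [0, 0, 0, 1, 1, 1]},
]

def to_semantic_map(instances, background="background"):
    """Collapse per-instance masks into one label per pixel (instance identity is lost)."""
    width = len(instances[0]["mask"])
    semantic = [background] * width
    for inst in instances:
        for x, on in enumerate(inst["mask"]):
            if on:
                semantic[x] = inst["label"]
    return semantic

semantic_map = to_semantic_map(instance_masks)
instance_count = len(instance_masks)
print(semantic_map, instance_count)
```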

Define the source image.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image, ImageDraw, ImageFont

input_image="../images/" + "city-scene.jpg"
image_src = Image.open(input_image)
np_image = np.array(image_src, dtype='float32')

image_src.show()

Then, we create a detectron2 config and a detectron2 `DefaultPredictor` to run inference on this image.

In [None]:
# Step 1
cfg = get_cfg()

In [None]:
# Instance Segmentation
# add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
# Find a model from detectron2's model zoo. You can use the https://dl.fbaipublicfiles... url as well
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
modelSeg = build_model(cfg)
checkpointer = DetectionCheckpointer(modelSeg)
checkpointer.load(cfg.MODEL.WEIGHTS)
modelSeg.eval()

In [None]:
# Instance Segmentation Visualizer
predictor = DefaultPredictor(cfg)
# np_image comes from PIL (RGB); DefaultPredictor expects BGR input when cfg.INPUT.FORMAT is "BGR" (the default)
predictions = predictor(np.ascontiguousarray(np_image[:, :, ::-1]))["instances"]
# We can use `Visualizer` to draw the predictions on the image. It expects an RGB image.
v = Visualizer(np_image, MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(predictions.to("cpu"))
# out.get_image() is already RGB, which is what matplotlib expects
plt.imshow(out.get_image())
plt.show()

### Export Model

In [None]:
# Export model
torch.save(modelSeg, "../models/model-seg.pth", _use_new_zipfile_serialization=True)

In [None]:
%%bash
# Create the model archive required for SageMaker endpoint deployment. Copy it to the S3 bucket's model folder as model.tar.gz.
# Rename model-seg.pth to model.pth inside the archive, as required by the SageMaker endpoint specs
tar --transform='flags=r;s|models/model-seg.pth|model.pth|' -czvf ../models/model.tar-seg.gz ../models/model-seg.pth ../code/inference-seg.py
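
For readers without GNU tar, the same packaging step can be sketched with Python's standard `tarfile` module. The helper name `package_model` and the placeholder files are assumptions for illustration. Storing the script under `code/` reflects what the tar command above appears to produce, so adjust the archive names to whatever your serving container expects.

```python
import tarfile
import tempfile
from pathlib import Path

def package_model(model_path, script_path, output_path):
    """Write a gzip-compressed tar with the model stored as model.pth at the archive root."""
    with tarfile.open(output_path, "w:gz") as archive:
        archive.add(model_path, arcname="model.pth")
        # Assumption: the inference script belongs under code/ inside the archive
        archive.add(script_path, arcname=f"code/{Path(script_path).name}")
    return output_path

# Demo with placeholder files so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    model_file = Path(tmp) / "model-seg.pth"
    script_file = Path(tmp) / "inference-seg.py"
    model_file.write_bytes(b"placeholder weights")
    script_file.write_text("# placeholder handler\n")
    archive_path = package_model(model_file, script_file, Path(tmp) / "model.tar.gz")
    with tarfile.open(archive_path, "r:gz") as archive:
        member_names = sorted(archive.getnames())

print(member_names)
```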

In [None]:
# Load model
saved_object_model = torch.load("../models/model-seg.pth")
saved_object_model.eval()

In [None]:
# Load image input and get predictions
original_image = cv2.imread(input_image)  # OpenCV loads images in BGR order

# Hard-coded equivalents of [cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MIN_SIZE_TEST], cfg.INPUT.MAX_SIZE_TEST
aug = T.ResizeShortestEdge([800, 800], 1333)
with torch.no_grad():
    # Apply pre-processing to image.
    # If cfg.INPUT.FORMAT were "RGB", the channels would need to be reversed first:
    #     original_image = original_image[:, :, ::-1]
    height, width = original_image.shape[:2]
    image = aug.get_transform(original_image).apply_image(original_image)
    image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))

    inputs = {"image": image, "height": height, "width": width}
    predictions = saved_object_model([inputs])
#     print(predictions)
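
The resize step above scales the image so its shorter side becomes 800 pixels, then shrinks it further if the longer side would exceed 1333. The function below is a plain-Python sketch of that size rule as I understand it from detectron2's `ResizeShortestEdge`. The name `shortest_edge_output_size` is hypothetical, and the rounding details are an assumption that may differ between detectron2 versions.

```python
def shortest_edge_output_size(height, width, short_edge=800, max_size=1333):
    """Return (new_height, new_width) after shortest-edge resizing with a long-side cap."""
    scale = short_edge / min(height, width)
    if height < width:
        new_h, new_w = float(short_edge), scale * width
    else:
        new_h, new_w = scale * height, float(short_edge)
    # Cap the longer side at max_size while keeping the aspect ratio
    if max(new_h, new_w) > max_size:
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    # Round to the nearest integer pixel count
    return int(new_h + 0.5), int(new_w + 0.5)

print(shortest_edge_output_size(480, 640))   # landscape image, no cap needed
print(shortest_edge_output_size(1500, 500))  # tall image, long side capped at 1333
```

Passing `height` and `width` of the original image to the model, as the cell above does, lets detectron2 rescale the predicted boxes and masks back to the original resolution.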

### Use Detectron2 Visualizer on saved model output

In [None]:
# Visualizer for loaded model
predictionsSeg=predictions[0]["instances"]
# We can use `Visualizer` to draw the predictions on the image. np_image is already RGB, as Visualizer expects.
v = Visualizer(np_image, MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(predictionsSeg.to("cpu"))
# out.get_image() is already RGB, which is what matplotlib expects
plt.imshow(out.get_image())
plt.show()

### Display results using a custom Semantic Visualizer that does not depend on Detectron2

In [None]:
import random
# Define color map
# ref; copied from https://github.com/facebookresearch/detectron2/blob/224cd2318fdb45b5e22bbb861ee9711ee52c8b75/detectron2/utils/colormap.py
# RGB:
_COLORS = np.array(
    [
        0.000, 0.447, 0.741,
        0.850, 0.325, 0.098,
        0.929, 0.694, 0.125,
        0.494, 0.184, 0.556,
        0.466, 0.674, 0.188,
        0.301, 0.745, 0.933,
        0.635, 0.078, 0.184,
        0.300, 0.300, 0.300,
        0.600, 0.600, 0.600,
        1.000, 0.000, 0.000,
        1.000, 0.500, 0.000,
        0.749, 0.749, 0.000,
        0.000, 1.000, 0.000,
        0.000, 0.000, 1.000,
        0.667, 0.000, 1.000,
        0.333, 0.333, 0.000,
        0.333, 0.667, 0.000,
        0.333, 1.000, 0.000,
        0.667, 0.333, 0.000,
        0.667, 0.667, 0.000,
        0.667, 1.000, 0.000,
        1.000, 0.333, 0.000,
        1.000, 0.667, 0.000,
        1.000, 1.000, 0.000,
        0.000, 0.333, 0.500,
        0.000, 0.667, 0.500,
        0.000, 1.000, 0.500,
        0.333, 0.000, 0.500,
        0.333, 0.333, 0.500,
        0.333, 0.667, 0.500,
        0.333, 1.000, 0.500,
        0.667, 0.000, 0.500,
        0.667, 0.333, 0.500,
        0.667, 0.667, 0.500,
        0.667, 1.000, 0.500,
        1.000, 0.000, 0.500,
        1.000, 0.333, 0.500,
        1.000, 0.667, 0.500,
        1.000, 1.000, 0.500,
        0.000, 0.333, 1.000,
        0.000, 0.667, 1.000,
        0.000, 1.000, 1.000,
        0.333, 0.000, 1.000,
        0.333, 0.333, 1.000,
        0.333, 0.667, 1.000,
        0.333, 1.000, 1.000,
        0.667, 0.000, 1.000,
        0.667, 0.333, 1.000,
        0.667, 0.667, 1.000,
        0.667, 1.000, 1.000,
        1.000, 0.000, 1.000,
        1.000, 0.333, 1.000,
        1.000, 0.667, 1.000,
        0.333, 0.000, 0.000,
        0.500, 0.000, 0.000,
        0.667, 0.000, 0.000,
        0.833, 0.000, 0.000,
        1.000, 0.000, 0.000,
        0.000, 0.167, 0.000,
        0.000, 0.333, 0.000,
        0.000, 0.500, 0.000,
        0.000, 0.667, 0.000,
        0.000, 0.833, 0.000,
        0.000, 1.000, 0.000,
        0.000, 0.000, 0.167,
        0.000, 0.000, 0.333,
        0.000, 0.000, 0.500,
        0.000, 0.000, 0.667,
        0.000, 0.000, 0.833,
        0.000, 0.000, 1.000,
        0.000, 0.000, 0.000,
        0.143, 0.143, 0.143,
        0.857, 0.857, 0.857,
        1.000, 1.000, 1.000
    ]
).astype(np.float32).reshape(-1, 3)
indices = random.sample(range(len(_COLORS)), len(_COLORS))  # shuffled row indices into the 74-color _COLORS palette

all_classed_list = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']

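ret = [[int(c) for c in (_COLORS[i] * 255)] for i in indices]  # RGB int values (0-255) per palette entry
# zip stops at the shorter input: with 74 colors, the last 6 of the 80 class names get no entry
colorMap=dict(zip(all_classed_list, tuple(ret)))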
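
If every class needs a color even when the palette is shorter than the class list, one option is to cycle through the palette. This is a minimal sketch with a hypothetical helper `build_color_map` and a two-color demo palette. Unlike the cell above, it rounds to the nearest integer instead of truncating and does not shuffle the colors.

```python
from itertools import cycle

def build_color_map(class_names, palette):
    """Assign each class an RGB tuple (0-255), reusing palette entries when classes outnumber colors."""
    if not palette:
        raise ValueError("palette must contain at least one color")
    colors = cycle(palette)
    return {
        name: tuple(round(channel * 255) for channel in next(colors))
        for name in class_names
    }

demo_palette = [(0.000, 0.447, 0.741), (0.850, 0.325, 0.098)]
demo_classes = ["person", "bicycle", "car"]
color_map = build_color_map(demo_classes, demo_palette)
print(color_map)
```

With three classes and two colors, the third class reuses the first color.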

Define custom Semantic Instance visualizer

In [None]:
# Opening JSON file
with open('pan_metadata.json') as json_file:
    metadata = json.load(json_file)

In [None]:
def semanitc_visualizer(predictionsSegs, instance_list, image_src, thing_classes, thing_colors, boxes, opacity):
    imageSize = np.append(tuple(reversed(image_src.size)), 4) # (height, width, 4) for an RGBA overlay
    rgba = np.zeros(imageSize, dtype = np.uint8)
    rgba[:, :] = [255, 255, 255, 0] # start fully transparent
    font = ImageFont.truetype('fonts/FreeSerif.ttf', 8)

    # Paint each instance mask in its class color at the requested opacity
    for idx,binary_mask in enumerate(predictionsSegs):
        color=thing_colors[instance_list[idx]]
        rgba[(binary_mask == 1), :] = np.append(color, opacity)

    maskXImg = Image.fromarray(np.asarray(rgba),mode='RGBA')
    draw = ImageDraw.Draw(maskXImg)
    # Draw boxes and instance labels
    for seg_info, box in zip(instance_list, boxes):
        # The score is a hard-coded placeholder (85%), not the model's confidence
        text = f"{thing_classes[seg_info]} {.85:.0%}"
        text_width = draw.textlength(text, font=font) # named text_width to avoid shadowing the built-in len
        bbox = draw.textbbox((box[0], box[1]), text, font=font)
        h = bbox[3] - bbox[1]
        draw.rectangle([(box[0], box[1]-h), (box[0] + text_width, box[1])], fill=(0,0,0)) # text background rectangle
        draw.text((box[0], box[1]-h), text, fill=(255, 255, 255), font=font) # draw text on instance
        draw.rectangle([(box[0], box[1]), (box[2], box[3])], outline=(0,255,0)) # green rectangle
    return Image.alpha_composite(image_src, maskXImg)

In [None]:
# Define function inputs    
opacity=150
predictionsSegMasks=predictions[0]["instances"].pred_masks.cpu() # predictions[0]["instances"].pred_masks.cpu().numpy()
prediction_classes=predictions[0]["instances"].pred_classes
boxes=predictions[0]['instances'].pred_boxes
image_src = image_src.convert('RGBA')

# Call function
semanitc_visualizer(predictionsSegMasks, prediction_classes, image_src, metadata["thing_classes"], metadata["thing_colors"], boxes, opacity)
# out.show()
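
The `opacity` value controls how strongly each mask color covers the photo. As a rough sketch of the arithmetic for a single pixel over a fully opaque background, the hypothetical helper `blend_pixel` below mixes overlay and base in proportion to `opacity / 255`. Pillow's `Image.alpha_composite` uses its own integer arithmetic, so its results may differ by a rounding step.

```python
def blend_pixel(base_rgb, overlay_rgb, opacity):
    """Blend an overlay color onto an opaque base pixel; opacity runs from 0 (invisible) to 255 (opaque)."""
    if not 0 <= opacity <= 255:
        raise ValueError("opacity must be in the range 0-255")
    alpha = opacity / 255
    return tuple(
        round(o * alpha + b * (1 - alpha)) for b, o in zip(base_rgb, overlay_rgb)
    )

# A white mask at opacity 150 over a black pixel
blended = blend_pixel((0, 0, 0), (255, 255, 255), 150)
print(blended)
```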

### Deploy Semantic/Instance Detection Model to Endpoint

Upload the model.tar.gz file to the model folder of the S3 bucket.

In [None]:
import boto3
import sagemaker
from sagemaker.pytorch import PyTorchModel
from sagemaker.predictor import Predictor
from sagemaker import get_execution_role, Session

sess = Session(default_bucket='<INSERT-AWS-S3-BUCKET-NAME-HERE>')

# print(model_data)
role = get_execution_role()

# Connect to S3 bucket and upload file to s3 bucket
s3 = boto3.resource('s3')
s3.Bucket('<INSERT-AWS-S3-BUCKET-NAME-HERE>').upload_file("../models/model.tar-seg.gz", "model/model.tar.gz")

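# The archive was uploaded to the fixed key "model/model.tar.gz", so build its S3 URI directly
# instead of relying on the position of the key in a bucket listing
model_data = sagemaker.s3.s3_path_join('s3://', sess.default_bucket(), 'model/model.tar.gz')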

In [None]:
# Semantic Instance detection
region = sess.boto_region_name
serve_image_uri = "<INSERT-AWS-ELASTIC-CONTAINER-REGISTRY-REPOSITORY-NAME-HERE>"  # custom serving image in Amazon ECR
pyModel = PyTorchModel(
    entry_point="inference-seg.py",
    source_dir="../code",
    role=role,
    model_data=model_data,
    image_uri=serve_image_uri,
    framework_version="1.10.2",
    py_version="py38"
)

predictorEndpt = pyModel.deploy(instance_type='ml.p3.2xlarge', initial_instance_count=1)

Validate Endpoint - perform inference

In [None]:
# Ref:
# https://aws.amazon.com/blogs/compute/handling-binary-data-using-amazon-api-gateway-http-apis/
# https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings.html
# https://aws.amazon.com/premiumsupport/knowledge-center/api-gateway-binary-data-lambda/
# https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-configure-with-control-service-api.html
# https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-configure-with-console.html
# https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html
import boto3
import io
from base64 import b64encode,b64decode
from io     import BytesIO
from PIL import Image, ImageDraw, ImageFont
endpoint = '<INSERT_ENDPOINT_NAME_HERE>'
runtime= boto3.client('runtime.sagemaker')

image_src = Image.open(input_image)
# resize image
# size = 640, 480
# size = 320, 240
size = 250, 170
# size = 160, 120
image_src.thumbnail(size, Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
    
imgByteArr = io.BytesIO()

image_src.save(imgByteArr, format=image_src.format)
imgByteArr = imgByteArr.getvalue()

# Send image via InvokeEndpoint API
response = runtime.invoke_endpoint(EndpointName=endpoint, ContentType='application/x-image', Body=imgByteArr)
result = response['Body'].read().decode()
res = json.loads(result) # convert json string to Python dict for parsing

Display Detectron2 Semantic Instance segmentation inference results

In [None]:
# Define function inputs    
opacity=150
predictionsSegMasks=np.array(res[0]['sematic_seg']) # res[0]['sematic_seg']
prediction_classes=res[0]["pred_classes"]
boxes=res[0]['pred_boxes']

image_src = image_src.convert('RGBA')

# Call function
semanitc_visualizer(predictionsSegMasks, prediction_classes, image_src, metadata["thing_classes"], metadata["thing_colors"], boxes, opacity)
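
The cell above reads `res[0]['sematic_seg']`, `res[0]['pred_classes']`, and `res[0]['pred_boxes']`, so the endpoint must return a JSON list with one dictionary per image. The inference script `inference-seg.py` is not shown in this tutorial. The sketch below is only an assumption about how such a payload could be assembled from plain Python lists, and `format_response` is a hypothetical helper name.

```python
import json

def format_response(masks, classes, boxes):
    """Serialize one image's predictions into the JSON layout the notebook parses."""
    if not (len(masks) == len(classes) == len(boxes)):
        raise ValueError("masks, classes, and boxes need one entry per instance")
    payload = [{
        # Key spelling follows the notebook ('sematic_seg'), not the dictionary
        "sematic_seg": [[[int(v) for v in row] for row in mask] for mask in masks],
        "pred_classes": [int(c) for c in classes],
        "pred_boxes": [[float(v) for v in box] for box in boxes],
    }]
    return json.dumps(payload)

# One 2x2 mask for a single instance of class 0
body = format_response([[[1, 0], [1, 1]]], [0], [[0.0, 0.0, 2.0, 2.0]])
decoded = json.loads(body)
print(decoded[0]["pred_classes"])
```

Sending full-resolution masks as nested lists makes the response large, which is probably why the notebook shrinks the image before invoking the endpoint.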

### Call Semantic Instance API using the Python Requests library

Create the AWS API Gateway endpoint before performing this step.

In [None]:
# import some common libraries
# Using Python Request library
import requests
import json
import numpy as np
import time
import io
from PIL import Image, ImageDraw, ImageFont

# Define Constants
API_INVOKE_URL="<INSERT_API_INVOKE_URL_HERE>"

# define variables
url=API_INVOKE_URL

def cloud_api_predict(headers, payload):
    # send POST request to url
    return requests.request("POST", url, headers=headers, data=payload).text

# Read image into memory - needed because of image size reduction
image_src = Image.open(input_image)
# resize image
# size = 640, 480
# size = 320, 240
size = 250, 170
# size = 160, 120
image_src.thumbnail(size, Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter

imgByteArr = io.BytesIO()

image_src.save(imgByteArr, format=image_src.format)
payload = imgByteArr.getvalue()

# with open(input_image, 'rb') as f:
#     payload = f.read()

headers = {
  'Accept': 'image/jpeg',
  'Content-Type': 'image/jpeg'
}

predictions=cloud_api_predict(headers, payload)

Display Detectron2 instance segmentation inference results from the API

In [None]:
res = json.loads(predictions) # convert json string to Python dict for parsing
# Define function inputs    
opacity=150
predictionsSegMasks=np.array(res[0]['sematic_seg']) # res[0]['sematic_seg']
prediction_classes=res[0]["pred_classes"]
boxes=res[0]['pred_boxes']

image_src = image_src.convert('RGBA')

# Call function
semanitc_visualizer(predictionsSegMasks, prediction_classes, image_src, metadata["thing_classes"], metadata["thing_colors"], boxes, opacity)
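
When the API call fails, the response text is usually not the expected list, and `res[0]` then raises a confusing error. A small guard like the hypothetical `parse_predictions` below can surface the problem earlier. It assumes error responses arrive as a JSON object with a `message` field, which is how API Gateway commonly reports errors, and it checks only the three keys this notebook reads.

```python
import json

def parse_predictions(response_text):
    """Return the first image's prediction dict, or raise a descriptive error."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {response_text[:80]!r}") from exc
    if isinstance(data, dict) and "message" in data:
        raise RuntimeError(f"API returned an error: {data['message']}")
    if not isinstance(data, list) or not data or not isinstance(data[0], dict):
        raise ValueError("expected a non-empty JSON list of per-image results")
    missing = {"sematic_seg", "pred_classes", "pred_boxes"} - set(data[0])
    if missing:
        raise ValueError(f"result is missing keys: {sorted(missing)}")
    return data[0]

ok_text = '[{"sematic_seg": [], "pred_classes": [], "pred_boxes": []}]'
first_result = parse_predictions(ok_text)
print(sorted(first_result))
```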

Create the AWS Lambda deployment zip package.
This is required because Pillow is not included in the AWS Lambda runtime environment by default.

References:

https://docs.aws.amazon.com/lambda/latest/dg/lambda-deploy-functions.html

https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-create-package-no-dependency

In [None]:
# Build the Lambda deployment package (Pillow plus lambda_function.py)
%cd lamdaenv
!python3 -m venv myvenv
# Each "!" command runs in its own subshell, so activating the venv would not persist;
# call the venv's own pip instead
!myvenv/bin/pip install Pillow
# !ls -l myvenv/lib/python3.6/site-packages
# !ls -l myvenv/lib64/python3.6/site-packages
# Zip the installed packages; adjust python3.6 to the venv's Python version
%cd myvenv/lib/python3.6/site-packages
!zip -r ../../../../my-deployment-package1.zip .
%cd ../../../../
# Add the handler at the root of the same archive
!zip -g my-deployment-package1.zip lambda_function.py
# !ls -l
# !pwd
# %cd SageMaker

# !pip show Pillow | grep Location:
# %cd /home/ec2-user/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages
# !zip -r ../../../../../../SageMaker/lamdaenv/my-deployment-package.zip PIL Pillow-8.4.0.dist-info Pillow.libs
# %cd ../../../../../../SageMaker/lamdaenv
# !zip -g my-deployment-package.zip lambda_function.py

# !pwd
# !ls -l
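
The same package can also be built without shell commands. The sketch below uses Python's standard `zipfile` module. `build_lambda_package` is a hypothetical helper, and the demo runs against placeholder files in a temporary directory rather than a real virtual environment. It places the dependencies and the handler at the root of the archive, which is the layout the AWS packaging guide linked above describes.

```python
import tempfile
import zipfile
from pathlib import Path

def build_lambda_package(site_packages, handler, output_zip):
    """Zip every file under site_packages plus the handler, all at the archive root."""
    site_packages, handler = Path(site_packages), Path(handler)
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(site_packages.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(site_packages).as_posix())
        archive.write(handler, handler.name)
    return output_zip

# Demo with placeholder files standing in for the venv and the handler
with tempfile.TemporaryDirectory() as tmp:
    packages_dir = Path(tmp) / "site-packages"
    (packages_dir / "PIL").mkdir(parents=True)
    (packages_dir / "PIL" / "__init__.py").write_text("# placeholder\n")
    handler_file = Path(tmp) / "lambda_function.py"
    handler_file.write_text("def lambda_handler(event, context):\n    return {}\n")
    zip_path = build_lambda_package(packages_dir, handler_file, Path(tmp) / "package.zip")
    with zipfile.ZipFile(zip_path) as archive:
        names = sorted(archive.namelist())

print(names)
```

Keep the output zip outside the `site_packages` directory so the archive does not try to include itself.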
