# Compiling TF2.4 Models with SageMaker Neo

Run this notebook on a SageMaker Studio instance for the complete experience.

**SageMaker Studio Kernel**: Data Science

TF2.4 model zoo: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md

This example uses an **EfficientDet model**, but the same strategy applies to any model from the list. Depending on the model, you only need to customize the testing code in the last section.

In this exercise you'll:
   - Get a pre-trained EfficientDet model from the model zoo
   - Prepare the model to compile it with Neo
   - Compile the model for the target: **X86_64**
   - Get the optimized model and run a simple local test

In [None]:
# required for local tests
!pip install dlr

In [None]:
from sagemaker import get_execution_role

sagemaker_role = get_execution_role()

## 1) Get the pre-trained model and upload it to S3

In [None]:
import urllib.request
import io
import tarfile
import shutil
import sagemaker

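net_version = 0  # EfficientDet variant: 0 (D0) to 7 (D7)
assert 0 <= net_version <= 7
# input resolution (square side, in pixels) for each variant, D0..D7
input_shapes = [512, 640, 768, 896, 1024, 1280, 1280, 1536]
img_size = input_shapes[net_version]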

url=f"http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d{net_version}_coco17_tpu-32.tar.gz"
print(f"Downloading the model: {url}")

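# start from a clean folder; ignore_errors avoids a failure when 'export' does not exist yet
shutil.rmtree('export', ignore_errors=True)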
with urllib.request.urlopen(url) as f:
    with tarfile.open(fileobj=io.BytesIO(f.read()), mode="r:gz") as tar:        
        tar.extractall(path='export')
        
print("Create a model package and upload it to S3")
sagemaker_session = sagemaker.Session()
model_name=f'efficientdet-d{net_version}'
with tarfile.open("model.tar.gz", "w:gz") as f:
    f.add(f"export/efficientdet_d{net_version}_coco17_tpu-32/saved_model/", "export/1")
    f.list()

s3_uri = sagemaker_session.upload_data('model.tar.gz', key_prefix=f'{model_name}/model')

print(f"Done\nS3 uri: {s3_uri}")
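
The cell above repackages the SavedModel under `export/1` inside `model.tar.gz`. A quick check of the archive layout before uploading can catch a wrong path early. The following is a minimal sketch using only the standard library; the helper name `find_saved_model_dirs` is hypothetical and not part of the SageMaker SDK.

```python
import posixpath
import tarfile


def find_saved_model_dirs(archive_path):
    """Return the directories inside the archive that directly contain saved_model.pb."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(
            posixpath.dirname(member.name)
            for member in tar.getmembers()
            if member.isfile() and posixpath.basename(member.name) == "saved_model.pb"
        )


# Usage, once model.tar.gz has been written by the cell above:
# print(find_saved_model_dirs("model.tar.gz"))
```

With the packaging used in this notebook, the returned list should contain `export/1`.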

## 2) Compile the model with SageMaker Neo (X86_64)

**ATTENTION:** Compiling an EfficientDet model takes around 30 minutes.
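
The `DataInputConfig` field in the request below is a JSON string that maps the input tensor name to its shape. Building it with `json.dumps` avoids the doubled braces that an f-string needs. This is a small sketch: the input name `input_tensor` and the NHWC shape come from this notebook, while the helper name `build_data_input_config` is hypothetical.

```python
import json


def build_data_input_config(img_size, input_name="input_tensor"):
    """Return the DataInputConfig JSON string for one NHWC image input of batch size 1."""
    return json.dumps({input_name: [1, img_size, img_size, 3]})


print(build_data_input_config(512))  # {"input_tensor": [1, 512, 512, 3]}
```

The result can be passed directly as the `DataInputConfig` value.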

In [None]:
import time
import boto3
import sagemaker

arch='X86_64' # use 'ARM64' for a Jetson device

role = sagemaker.get_execution_role()
sm_client = boto3.client('sagemaker')
compilation_job_name = f'{model_name}-tf2-{int(time.time()*1000)}'
sm_client.create_compilation_job(
    CompilationJobName=compilation_job_name,
    RoleArn=role,
    InputConfig={
        'S3Uri': s3_uri,
        'DataInputConfig': f'{{"input_tensor": [1,{img_size},{img_size},3]}}',
        'Framework': 'TENSORFLOW',
        'FrameworkVersion': '2.4'
    },
    OutputConfig={
        'S3OutputLocation': f's3://{sagemaker_session.default_bucket()}/{model_name}-tf2/optimized/',
        'TargetPlatform': { 
            'Os': 'LINUX', 
            'Arch': arch,
            #'Accelerator': 'NVIDIA'  # uncomment if the target device has an Nvidia GPU
        },
        # For a Jetson target, uncomment the following line and adjust it to your device
        # Jetson Xavier: sm_72; Jetson Nano: sm_53
        #'CompilerOptions': '{"trt-ver": "7.1.3", "cuda-ver": "10.2", "gpu-code": "sm_72"}' # JetPack 4.4.1
    },
    StoppingCondition={ 'MaxRuntimeInSeconds': 18000 }
)
while True:
    resp = sm_client.describe_compilation_job(CompilationJobName=compilation_job_name)    
    if resp['CompilationJobStatus'] in ['STARTING', 'INPROGRESS']:
        print('Running...')
    else:
        print(resp['CompilationJobStatus'], compilation_job_name)
        break
    time.sleep(5)
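
The loop above polls every 5 seconds with no upper bound and does not show why a job failed. A possible variant, sketched below, adds a deadline and returns the `FailureReason` field when the response contains one. The helper name `wait_for_compilation` and its default intervals are assumptions; the only SageMaker call used is `describe_compilation_job`, as in the cell above.

```python
import time


def wait_for_compilation(sm_client, job_name, poll_seconds=30, timeout_seconds=3600,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until the job leaves STARTING/INPROGRESS; raise TimeoutError past the deadline.

    Returns a (status, failure_reason) tuple; failure_reason is None when absent.
    """
    deadline = clock() + timeout_seconds
    while True:
        resp = sm_client.describe_compilation_job(CompilationJobName=job_name)
        status = resp["CompilationJobStatus"]
        if status not in ("STARTING", "INPROGRESS"):
            return status, resp.get("FailureReason")
        if clock() >= deadline:
            raise TimeoutError(f"{job_name} is still {status} after {timeout_seconds}s")
        sleep(poll_seconds)


# Usage with the names defined in the cell above:
# status, reason = wait_for_compilation(sm_client, compilation_job_name)
```

The `sleep` and `clock` parameters exist only so the helper can be exercised without waiting in real time.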
    

## 3) Download the compiled model
**ATTENTION:** The following steps only apply to a model compiled for X86_64 without a GPU.

In [None]:
output_model_path = f's3://{sagemaker_session.default_bucket()}/{model_name}-tf2/optimized/model-LINUX_{arch}.tar.gz'
!aws s3 cp $output_model_path /tmp/model.tar.gz
!rm -rf compiled_model && mkdir compiled_model
!tar -xzvf /tmp/model.tar.gz -C compiled_model
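
The shell commands above rely on the AWS CLI being installed. As an alternative, the same download and extraction can be sketched in Python with `boto3` and `tarfile`. The helper names `split_s3_uri` and `download_and_extract` are hypothetical; the default paths mirror the ones used in the cell above.

```python
import shutil
import tarfile
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/some/key' into a (bucket, key) tuple."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {s3_uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def download_and_extract(s3_uri, local_archive="/tmp/model.tar.gz", target_dir="compiled_model"):
    """Download the compiled archive with boto3 and unpack it into a fresh target_dir."""
    import boto3  # imported here so split_s3_uri can be used without AWS dependencies

    bucket, key = split_s3_uri(s3_uri)
    boto3.client("s3").download_file(bucket, key, local_archive)
    shutil.rmtree(target_dir, ignore_errors=True)  # same effect as `rm -rf compiled_model`
    with tarfile.open(local_archive, "r:gz") as tar:
        tar.extractall(path=target_dir)


# Usage with the variable defined in the cell above:
# download_and_extract(output_model_path)
```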

## 4) Run the model locally

### Download a sample image

In [None]:
%matplotlib inline
import numpy as np
import cv2
import matplotlib.pyplot as plt
import os
import urllib.request

image_url='https://sagemaker-examples.readthedocs.io/en/latest/_images/cat2.jpg'
if not os.path.exists('cat.jpg'):
    urllib.request.urlretrieve(image_url, 'cat.jpg')

### Load the model using the DLR runtime

In [None]:
import dlr
# load the model (CPU x86_64)
model = dlr.DLRModel('compiled_model', 'cpu')

In [None]:
# load the image (OpenCV reads BGR, so convert to RGB) and pad it to a square if needed
img = cv2.cvtColor(cv2.imread('cat.jpg'), cv2.COLOR_BGR2RGB)
h, w, c = img.shape
if w != h:  # pad with zeros on the bottom/right so the image becomes square
    sqr_size = max(h, w)
    sqr_img = np.zeros((sqr_size, sqr_size, c), dtype=np.uint8)
    sqr_img[:h, :w] = img
    img = sqr_img
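
Because the padding is added only on the bottom and right, the top-left corner of the original image stays at the origin of the square image. A box expressed in normalized coordinates on the square can therefore be mapped back to the original image by multiplying by the square's side and clipping. The sketch below assumes the TensorFlow Object Detection API convention of normalized `[ymin, xmin, ymax, xmax]` boxes; the helper name `box_to_original_pixels` is hypothetical.

```python
def box_to_original_pixels(box, orig_h, orig_w):
    """Map a normalized [ymin, xmin, ymax, xmax] box on the padded square
    to pixel coordinates in the original (unpadded) image."""
    side = max(orig_h, orig_w)  # side of the zero-padded square
    ymin, xmin, ymax, xmax = (v * side for v in box)

    def clip(value, upper):
        # keep the coordinate inside the original image area
        return min(max(value, 0.0), float(upper))

    return clip(ymin, orig_h), clip(xmin, orig_w), clip(ymax, orig_h), clip(xmax, orig_w)
```

The later resize to `img_size` does not change normalized coordinates, so only the padding needs to be undone.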

In [None]:
# resize the image to the size the model was compiled for and add the batch dimension.
# The model was compiled with an NHWC input, [1, img_size, img_size, 3], so the image
# stays in HWC order (no transpose to CHW). TF2 Detection Zoo models take raw uint8
# pixels and normalize internally, so no ImageNet mean/std scaling is applied here.
x = cv2.resize(img, (img_size, img_size))
x = np.expand_dims(x, axis=0) # HWC --> NHWC

In [None]:
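y = model.run(x)
# EfficientDet is an object detector, so `y` is a list of output arrays (boxes, classes,
# scores, ...) rather than one vector of class scores. Print each output's shape to
# identify them before interpreting the results.
for i, out in enumerate(y):
    print(f"Output {i}: shape={np.asarray(out).shape}")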
plt.imshow(img)
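
EfficientDet is an object detector, so a useful result is a list of boxes with class ids and scores rather than a single class. Which entries of the model output hold these values depends on the compiled model and is not confirmed here. Assuming you have identified the three arrays for one image, a minimal sketch for keeping only confident detections could look like this; the helper name `filter_detections` and the 0.5 threshold are hypothetical choices.

```python
def filter_detections(boxes, classes, scores, min_score=0.5):
    """Keep detections with score >= min_score, sorted by descending score.

    Each result is a (score, class_id, box) tuple; inputs are parallel sequences
    (lists or 1-D/2-D arrays) describing the detections of a single image.
    """
    kept = [
        (float(score), int(class_id), tuple(float(v) for v in box))
        for box, class_id, score in zip(boxes, classes, scores)
        if score >= min_score
    ]
    return sorted(kept, key=lambda detection: detection[0], reverse=True)
```

The class ids can then be looked up in the label map that matches the dataset the model was trained on.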

# Done! :)