# Deploying a model into the AWS cloud environment using SageMaker

Steps:
- [ ] Upload model to S3 bucket
- [ ] Convert the Keras model from HDF5 (`.h5`) to protocol buffer format
- [ ] Deploy an endpoint in SageMaker
- [ ] Establish a Lambda function triggered by API Gateway (code provided)
- [ ] Notebook entry to generate S3 signed URLs
- [ ] Upload file to S3 using signed URLs
- [ ] Invoke the endpoint to run inference on the uploaded file
- [ ] Visualize the image and results
- [ ] Add descriptions to the notebook
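
The signed URL steps in the list above can be sketched as follows. This is a minimal sketch, not the workshop's own code: the helper names `make_upload_url` and `build_put_request` and the file name `input.tif` are hypothetical, and the S3 client is passed in so the helpers can be tried without AWS credentials. The only AWS call used is boto3's `generate_presigned_url`.

```python
import urllib.request


def make_upload_url(s3_client, bucket, key, expires_in=3600):
    """Ask S3 for a time-limited URL that accepts a PUT for `bucket/key`."""
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


def build_put_request(url, payload):
    """Build (but do not send) the HTTP PUT that uploads `payload` to the signed URL."""
    return urllib.request.Request(url, data=payload, method="PUT")


# Possible use inside the notebook (needs AWS credentials, so it is left commented out;
# `input.tif` is a hypothetical file name):
# import boto3
# url = make_upload_url(boto3.client("s3"), "smoke-dataset-bucket", "<username>/input.tif")
# with open("input.tif", "rb") as f:
#     urllib.request.urlopen(build_put_request(url, f.read()))
```

Anyone holding the URL can upload to that key until it expires, so a short `expires_in` is usually the safer choice.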

In [None]:
# Upgrade sagemaker to the latest available version and pin TensorFlow to 2.9.1

!pip install -U sagemaker tensorflow==2.9.1 numpy

In [None]:
# Import sagemaker and TensorFlowModel, which is used later to define the model stored in S3.

import sagemaker
from sagemaker.tensorflow import TensorFlowModel


In [None]:
# Tensorflow imports needed for model conversion.
import tarfile
import tensorflow as tf
import tensorflow.compat.v1.keras.backend as K

from tensorflow.python.saved_model import builder as model_builder
from tensorflow.python.saved_model.signature_def_utils import predict_signature_def
from tensorflow.python.saved_model import tag_constants


from sagemaker import get_execution_role
from sagemaker import Session
# TensorFlowModel (imported above) replaces the older sagemaker.tensorflow.serving.Model


In [None]:
# Account related constants

BUCKET = "smoke-dataset-bucket"
EXECUTION_ROLE = 'arn:aws:iam::574347909231:role/igarss-sagemaker-role'


In [None]:
model_filename = "<username>_smoke_wmts_ref_3layer.h5"  # model filename, e.g. smoke_wmts_ref_3layer.h5
framework_version = "2.9.1"  # TensorFlow version


In [None]:
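# Look up the IAM role attached to this notebook
role = get_execution_role()
print(role)
# Download the trained Keras model from the S3 bucket into the working directory
session = Session()
session.download_data('.', BUCKET, key_prefix=f"<username>/{model_filename}", extra_args=None)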

## Convert Keras HDF5 format to protocol buffer format

We used Keras to train the model and saved it as an HDF5 (`.h5`) file. SageMaker does not recognize this as a native model format, so we need to convert it to TensorFlow's protocol buffer based SavedModel format, which SageMaker can read.

The following functions convert the model to protocol buffer format and package it into an archive that can then be deployed to a SageMaker endpoint.
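
To make the package layout concrete, here is a small sketch that builds an archive with the same `export/Servo/<version>` structure and lists its contents. It uses only the standard library and empty placeholder files, so no real model is involved; the helper names `make_archive` and `list_archive` are hypothetical.

```python
import os
import tarfile
import tempfile


def make_archive(root, archive_path):
    """Pack `root/export` into a gzip tarball with `export/` as the top-level entry."""
    with tarfile.open(archive_path, mode="w:gz") as archive:
        archive.add(os.path.join(root, "export"), arcname="export")
    return archive_path


def list_archive(archive_path):
    """Return the sorted member names of the archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(archive.getnames())


# Build a stand-in directory tree with empty placeholder files.
workdir = tempfile.mkdtemp()
version_dir = os.path.join(workdir, "export", "Servo", "1")
os.makedirs(os.path.join(version_dir, "variables"))
for name in ("saved_model.pb", os.path.join("variables", "variables.index")):
    open(os.path.join(version_dir, name), "w").close()

members = list_archive(make_archive(workdir, os.path.join(workdir, "model.tar.gz")))
print("\n".join(members))
```

Listing the real archive the same way before uploading is a quick check that `saved_model.pb` sits under a numbered version directory.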

In [None]:
from tensorflow.keras.models import load_model

MODEL_VERSION = '1'
EXPORT_DIRECTORY = 'export/Servo/{}'
MODEL_ARCHIVE = '<name>model.tar.gz'


def upload_model(model_archive=MODEL_ARCHIVE):
    """
    Package the exported model and upload the archive to S3.
    Args:
        model_archive(str): name of the model archive.
    Returns:
        str: S3 URI of the uploaded model archive.
    """
    # Bundle the `export` directory into a gzip-compressed tarball
    with tarfile.open(model_archive, mode='w:gz') as archive:
        archive.add('export', recursive=True)

    # Upload the model artifacts to the session's default bucket under `model/`
    sess = Session()
    return sess.upload_data(
        path=model_archive,
        key_prefix='model'
    )


def convert_to_proto_buf(model_name):
    """
    Converts a Keras model to the Protocol Buffer format recognized by TensorFlow.
    Args:
        model_name: File path to the model we want to convert.
    Returns:
        signature: Input/Output signature of the loaded model
    """
    # Switch to graph mode before the model is loaded, so that its tensors and
    # variables live in the session that gets exported.
    tf.compat.v1.disable_eager_execution()
    loaded_model = load_model(model_name)
    export_dir = EXPORT_DIRECTORY.format(MODEL_VERSION)
    builder = model_builder.SavedModelBuilder(export_dir)
    signature = predict_signature_def(
        inputs={"inputs": loaded_model.input},
        outputs={"score": loaded_model.output},
    )
    builder.add_meta_graph_and_variables(
        sess=K.get_session(),
        tags=[tag_constants.SERVING],
        signature_def_map={"serving_default": signature},
    )
    builder.save()
    return signature



In [None]:
INSTANCE_TYPE = 'ml.t2.medium'

convert_to_proto_buf(model_filename)
model_data = upload_model()


### Define SageMaker model based on uploaded model

Once the model is uploaded, we need to tell SageMaker where to find it and provide an execution role that has access to the S3 bucket the model was pushed to.

In [None]:
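sagemaker_model = TensorFlowModel(
    model_data=model_data,      # S3 URI returned by upload_model()
    framework_version='2.8',    # serving container version; note the notebook installs TensorFlow 2.9.1
    role=EXECUTION_ROLE
)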


### Deploy the model

Now we deploy the model to an endpoint, which is how SageMaker hosts models. Once the endpoint is established, we can add a Lambda function that interacts with it and serves as the API backend.

Here, we use an `ml.t2.medium` instance to host the SageMaker endpoint. If the model needs more memory, we can switch to a different instance type.

_Note: Replace `<name>` in the endpoint name with your name._

In [None]:
%%time
uncompiled_predictor = sagemaker_model.deploy(
        initial_instance_count=1, 
        instance_type=INSTANCE_TYPE,
        endpoint_name="<name>-prediction-endpoint"
    )   
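
The Lambda function mentioned above is provided separately. The sketch below only illustrates one possible shape for it and is not that code. It assumes an API Gateway proxy integration (the request body arrives as a JSON string under `event["body"]`) and a hypothetical `instances` field in that body. The optional `runtime_client` argument is also an addition for illustration, so the handler can be exercised with a stand-in client.

```python
import json

ENDPOINT_NAME = "<name>-prediction-endpoint"  # same placeholder as in the deploy call


def handler(event, context, runtime_client=None):
    """Hypothetical Lambda entry point: forward `instances` to the endpoint, return its JSON."""
    if runtime_client is None:
        import boto3  # imported lazily so the function can be exercised without AWS

        runtime_client = boto3.client("sagemaker-runtime")

    # Assumption: API Gateway proxy integration, so the request body is a JSON string.
    payload = json.loads(event["body"])
    response = runtime_client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps({"instances": payload["instances"]}),
    )
    # The serving container answers with a JSON document containing "predictions".
    predictions = json.loads(response["Body"].read())["predictions"]
    return {"statusCode": 200, "body": json.dumps({"predictions": predictions})}
```

Keeping the endpoint name in one constant makes it easier to point the function at a redeployed endpoint.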

### Compare the local and deployed versions of the model

The endpoint should be available for us to use in a couple of minutes. Once the model is deployed, we can compare the inferences between the Keras version and the deployed protocol buffer version.

Here, we create a random input and run inference on it with both versions of the model. The difference between the two sets of predictions should be negligible.

In [None]:
# Locally load the model for comparison
loaded_model = load_model(model_filename)

In [None]:
import numpy as np

# Random batch of one 256 x 256 input with 6 channels
data = np.random.randn(1, 256, 256, 6)

In [None]:
deployed_model_preds = uncompiled_predictor.predict(data)
original_model_preds = loaded_model.predict(data)

In [None]:
# The endpoint returns JSON, so convert the nested list to an array before subtracting
difference = np.asarray(deployed_model_preds['predictions']) - original_model_preds
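
One way to turn the comparison into a single pass/fail check is sketched below. The helper name `summarize_difference` and the tolerance of `1e-5` are assumptions for illustration, not values from the original notebook. The arrays are stand-ins for `deployed_model_preds['predictions']` and `original_model_preds`.

```python
import numpy as np


def summarize_difference(deployed, local, atol=1e-5):
    """Return the largest absolute gap and whether every element agrees within `atol`."""
    deployed = np.asarray(deployed, dtype=np.float64)
    local = np.asarray(local, dtype=np.float64)
    max_gap = float(np.max(np.abs(deployed - local)))
    return max_gap, bool(np.allclose(deployed, local, rtol=0.0, atol=atol))


# Stand-in values only; in the notebook these would be
# deployed_model_preds['predictions'] and original_model_preds.
max_gap, close = summarize_difference(
    [[0.25, 0.75]], np.array([[0.25, 0.75]], dtype=np.float32)
)
print(max_gap, close)
```

A small nonzero gap is expected because the endpoint serializes predictions through JSON, so the tolerance may need adjusting for a given model.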