# Deploy ESM Embeddings Server on Amazon SageMaker

Copyright 2024 Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: MIT-0

---
## 1. Setup

### 1.1. Create clients

In [None]:
import boto3
import sagemaker

boto_session = boto3.session.Session()
sagemaker_session = sagemaker.session.Session(boto_session)
s3 = boto_session.resource("s3")
region = boto_session.region_name
role = sagemaker.get_execution_role()

### 1.2. Build BioNeMo-Inference Container Image

If you don't already have access to the BioNeMo-SageMaker container image, run the following cell to build it and push it to Amazon ECR in your AWS account. Take note of the image URI: you'll need it when you create the model in section 2.1.

Here is an example shell script you can use in your environment (including SageMaker Notebook Instances) to build the container.

Once you have built and pushed the container, we strongly recommend using [ECR image scanning](https://docs.aws.amazon.com/AmazonECR/latest/userguide/image-scanning.html) to ensure that it meets your security requirements.

In [None]:
%%bash

# The name of our algorithm
algorithm_name=bionemo-inference

pushd container/inference

account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration (default to us-west-2 if none defined)
region=$(aws configure get region)
region=${region:-us-west-2}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

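# Log Docker in to ECR. `aws ecr get-login` was removed in AWS CLI v2,
# so pipe the token from `get-login-password` into `docker login` instead.
aws ecr get-login-password --region "${region}" | docker login --username AWS --password-stdin "${account}.dkr.ecr.${region}.amazonaws.com"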

# Build the docker image locally with the image name and then push it to ECR
# with the full name.

docker build -t ${algorithm_name} .
docker tag ${algorithm_name} ${fullname}

docker push ${fullname}

popd
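
The `fullname` variable in the script above is the image URI to note down. As a convenience, the same string can be rebuilt in Python. This is a minimal sketch: `ecr_image_uri` and `current_account_id` are hypothetical helper names, the defaults mirror the script (`bionemo-inference`, `latest`), and the host suffix assumes a standard commercial AWS region (`amazonaws.com`).

```python
def ecr_image_uri(account_id, region, repository="bionemo-inference", tag="latest"):
    """Compose an ECR image URI using the same pattern as the build script."""
    # Assumption: standard AWS partition; other partitions use a different host suffix.
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def current_account_id():
    """Look up the caller's AWS account ID (requires AWS credentials)."""
    import boto3  # imported lazily so ecr_image_uri works without boto3 installed

    return boto3.client("sts").get_caller_identity()["Account"]


# Example with a made-up account ID:
print(ecr_image_uri("123456789012", "us-west-2"))
# prints 123456789012.dkr.ecr.us-west-2.amazonaws.com/bionemo-inference:latest
```

In the notebook, `ecr_image_uri(current_account_id(), region)` would reuse the `region` variable from section 1.1, provided the script pushed the image to that same region (the script falls back to `us-west-2` when the CLI has no region configured).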

---
## 2. Deploy Real-Time Inference Endpoint

### 2.1. Create esm1nv model

In [None]:
from sagemaker.model import Model

# Replace this with your ECR repository URI from above
BIONEMO_IMAGE_URI = (
    "<ACCOUNT ID>.dkr.ecr.<REGION>.amazonaws.com/bionemo-inference:latest"
)

esm_embeddings = Model(
    image_uri=BIONEMO_IMAGE_URI,
    name="esm-embeddings",
    model_data=None,
    role=role,
    predictor_cls=sagemaker.predictor.Predictor,
    sagemaker_session=sagemaker_session,
    env={"SM_SECRET_NAME": "NVIDIA_NGC_CREDS", "MODEL_NAME": "esm1nv"},
)
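
Before deploying, it can help to check that the placeholder in `BIONEMO_IMAGE_URI` has actually been replaced. The following sketch uses a hypothetical helper, `looks_like_ecr_uri`, with a regular expression that only approximates the private ECR URI format. Treat it as a sanity check under that assumption, not an authoritative validator.

```python
import re

# Approximate pattern: <12-digit account>.dkr.ecr.<region>.amazonaws.com/<repo>[:<tag>]
# Assumption: private ECR repository in a standard (or .cn) AWS partition.
_ECR_URI = re.compile(
    r"\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com(\.cn)?/[a-z0-9._/-]+(:[\w.-]+)?"
)


def looks_like_ecr_uri(uri):
    """Return True if the string resembles a private ECR image URI."""
    return _ECR_URI.fullmatch(uri) is not None


placeholder = "<ACCOUNT ID>.dkr.ecr.<REGION>.amazonaws.com/bionemo-inference:latest"
print(looks_like_ecr_uri(placeholder))  # prints False
```

A guard such as `assert looks_like_ecr_uri(BIONEMO_IMAGE_URI)` placed before the `Model(...)` call would stop the notebook early if the template value is still in place.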

### 2.2. Deploy model to SageMaker endpoint

In [None]:
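from sagemaker.deserializers import NumpyDeserializer
from sagemaker.serializers import CSVSerializer

# Requests are sent as CSV text; responses are parsed into NumPy arrays.
esm_embeddings_predictor = esm_embeddings.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge",
    serializer=CSVSerializer(),
    deserializer=NumpyDeserializer(),
)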

### 2.3. Test model

In [None]:
esm_embeddings_predictor.predict("MSLKRKNIALIPAAGIGVRFGADKPKQYVEIGSKTVLEHVL,MIQSQINRNIRLDLADAILLSKAKKDLSFAEIADGTGLA")
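
The test call above sends two protein sequences as one comma-separated string. When the sequences live in a Python list, a small helper can assemble that string. This is a sketch that assumes the endpoint expects exactly the format shown in the test call; `build_payload` is a hypothetical name, not part of the SageMaker SDK or BioNeMo.

```python
def build_payload(sequences):
    """Join protein sequences into one comma-separated string."""
    cleaned = []
    for seq in sequences:
        seq = seq.strip()  # drop surrounding whitespace and newlines
        if not seq:
            raise ValueError("empty sequence")
        # A comma or inner whitespace would be ambiguous in the joined string.
        if "," in seq or any(ch.isspace() for ch in seq):
            raise ValueError(f"sequence contains a separator character: {seq!r}")
        cleaned.append(seq)
    if not cleaned:
        raise ValueError("at least one sequence is required")
    return ",".join(cleaned)


sequences = [
    "MSLKRKNIALIPAAGIGVRFGADKPKQYVEIGSKTVLEHVL",
    "MIQSQINRNIRLDLADAILLSKAKKDLSFAEIADGTGLA",
]
payload = build_payload(sequences)
```

With the endpoint from section 2.2 running, `esm_embeddings_predictor.predict(payload)` would then send the same request as the test cell.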