# Deploy ESM3-open Model Package from AWS Marketplace 


--------------------

## <font color='orange'>Important:</font>

Please visit the model detail page at <a href="https://aws.amazon.com/marketplace/pp/prodview-xbvra5ylcu4xq">https://aws.amazon.com/marketplace/pp/prodview-xbvra5ylcu4xq</a> to learn more. <font color='orange'>If you do not have access to the link, please contact your account admin for help.</font>

You will find details about the model, including pricing, supported regions, and the end user license agreement. To use the model, click “<font color='orange'>Continue to Subscribe</font>” on the detail page, then come back here to learn how to deploy the model and run inference.

-------------------


ESM3 is a frontier generative model for biology, able to jointly reason across three fundamental biological properties of proteins: sequence, structure, and function. These three data modalities are represented as tracks of discrete tokens at the input and output of ESM3. You can present the model with a combination of partial inputs across the tracks, and ESM3 will provide output predictions for all the tracks.
ESM3 is a generative masked language model. You can prompt it with partial sequence, structure, and function keywords, and iteratively sample masked positions until all positions are unmasked.
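
The iterative sampling loop described above can be illustrated with a toy sketch. This is not ESM3's actual sampling procedure: `toy_predict` is a hypothetical stand-in that returns a random residue where the real model would predict one, and the `_` mask symbol and the even unmasking schedule are assumptions made for this example only.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
MASK = "_"  # mask symbol used by this toy example (an assumption)


def toy_predict(tokens, position, rng):
    """Hypothetical stand-in for the model: returns a random residue."""
    return rng.choice(AMINO_ACIDS)


def iterative_unmask(prompt, num_steps, seed=0):
    """Fill masked positions over several rounds, a few positions per round."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for step in range(num_steps):
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        if not masked:
            break
        # Spread the remaining masked positions evenly over the remaining steps.
        remaining_steps = num_steps - step
        batch_size = -(-len(masked) // remaining_steps)  # ceiling division
        for i in rng.sample(masked, batch_size):
            tokens[i] = toy_predict(tokens, i, rng)
    return "".join(tokens)


# Residues given in the prompt stay fixed; only masked positions are sampled.
completed = iterative_unmask("QYAP____SGRT____LFEW", num_steps=4)
print(completed)
```

Each round conditions on everything unmasked so far, which is why the real model can refine a partial prompt step by step rather than predicting all positions at once.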

<img src="images/esm3-architecture.png" alt="ESM3 Architecture" style="width:80%;"/>


The ESM3 architecture is highly scalable due to its transformer backbone and all-to-all reasoning over discrete token sequences. At its largest scale, ESM3 was trained with 1.07e24 FLOPs on 2.78 billion proteins and 771 billion unique tokens, and has 98 billion parameters.
Here we present esm3-open-small. With 1.4B parameters it is the smallest and fastest model in the family, trained specifically to be open sourced. ESM3-open is available under a non-commercial license.

This sample notebook shows you how to deploy [EvolutionaryScale - ESM3](https://aws.amazon.com/marketplace/pp/prodview-xbvra5ylcu4xq) using Amazon SageMaker.

> **Note**: This is a reference notebook. It will not run until you make the changes suggested in the notebook.

> The ESM3 model package supports SageMaker Realtime Inference but not SageMaker Batch Transform.

## Pre-requisites:
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open this notebook from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role used has **AmazonSageMakerFullAccess**.
1. To deploy this ML model successfully, ensure that:
    1. Either your IAM role has these three permissions and you have authority to make AWS Marketplace subscriptions in the AWS account used: 
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**  
    2. or your AWS account has a subscription to [ESM3](https://aws.amazon.com/marketplace/pp/prodview-xbvra5ylcu4xq). If so, skip step: [Subscribe to the model package](#1.-Subscribe-to-the-model-package)

## Contents:
1. [Subscribe to the model package](#1.-Subscribe-to-the-model-package)
2. [Create an endpoint and perform real-time inference](#2.-Create-an-endpoint-and-perform-real-time-inference)
   1. [Create a model from the subscribed model package](#A.-Create-a-model-from-the-subscribed-model-package)
   2. [Create an endpoint configuration](#B.-Create-an-endpoint-configuration)
   3. [Create a real time endpoint](#C.-Create-a-real-time-endpoint)
   4. [Create an input payload](#D.-Create-an-input-payload)
   5. [Perform real-time inference](#E.-Perform-real-time-inference)
4. [Clean-up](#4.-Clean-up)
    1. [Delete the endpoint](#A.-Delete-the-endpoint)
    2. [Unsubscribe to the listing (optional)](#B.-Unsubscribe-to-the-listing-(optional))
    

## Usage instructions
You can run this notebook one cell at a time by pressing Shift+Enter.

## 1. Subscribe to the model package

To subscribe to the model package:
1. Open the model package listing page: [EvolutionaryScale ESM3 Model](https://aws.amazon.com/marketplace/pp/prodview-xbvra5ylcu4xq)
1. On the AWS Marketplace listing, click the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the terms and click **Accept Offer** if you and your organization agree with the EULA, pricing, and support terms.
1. Once you click the **Continue to configuration** button and choose a **region**, you will see a **Product Arn** displayed. This is the model package ARN that you need to specify when creating a deployable model using Boto3. Copy the ARN corresponding to your region and keep it for creating the model in section 2.A.
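
Because the Product ARN is region-specific, it can help to check the copied value before using it. The sketch below is a minimal check, assuming the general ARN shape `arn:aws:sagemaker:<region>:<account-id>:model-package/<name>`. The helper name `check_model_package_arn` and the example ARN are placeholders, not values from the listing.

```python
import re

# Assumed general shape of a SageMaker model package ARN.
ARN_PATTERN = re.compile(
    r"^arn:(?P<partition>aws[a-z-]*):sagemaker:(?P<region>[a-z0-9-]+):"
    r"(?P<account>\d{12}):model-package/(?P<name>.+)$"
)


def check_model_package_arn(arn, expected_region):
    """Parse a model package ARN and confirm it belongs to the expected region."""
    match = ARN_PATTERN.match(arn.strip())
    if match is None:
        raise ValueError(f"Not a SageMaker model package ARN: {arn!r}")
    if match["region"] != expected_region:
        raise ValueError(
            f"ARN is for region {match['region']!r}, expected {expected_region!r}"
        )
    return match.groupdict()


# Placeholder ARN for illustration only; use the one shown on your listing page.
example_arn = "arn:aws:sagemaker:us-east-1:123456789012:model-package/example-esm3-package"
info = check_model_package_arn(example_arn, "us-east-1")
print(info["region"], info["name"])
```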

In [None]:
!pip install --upgrade boto3 sagemaker
# if you upgrade the package, you need to restart the kernel

In [None]:
import boto3
import time
import json
import sagemaker

## 2. Create an endpoint and perform real-time inference

If you want to understand how real-time inference with Amazon SageMaker works, see [Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-hosting.html).

### A. Create a model from the subscribed model package

Create a model from your SageMaker marketplace subscriptions. 
<img src="images/model-from-marketplace.png" alt="Subscribed model - ESM3" style="width:80%;"/>

The ESM3 model package can be used to create a model that is hosted behind a real-time inference endpoint.
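
As an alternative to the console, the model could also be created with Boto3. The following is a sketch under assumptions rather than the notebook's own method: `create_model_from_package` is a hypothetical helper, the client is expected to be a Boto3 SageMaker client, and the ARNs in the usage comment are placeholders you would replace with your own.

```python
def create_model_from_package(sagemaker_client, model_name, model_package_arn, role_arn):
    """Create a SageMaker model backed by a subscribed Marketplace model package."""
    return sagemaker_client.create_model(
        ModelName=model_name,
        ExecutionRoleArn=role_arn,
        # Point the container at the model package instead of an image URI.
        PrimaryContainer={"ModelPackageName": model_package_arn},
        # Marketplace model packages are typically deployed with network isolation.
        EnableNetworkIsolation=True,
    )


# Example usage (requires AWS credentials; both ARNs are placeholders):
# import boto3
# client = boto3.client("sagemaker", region_name="us-east-1")
# create_model_from_package(client, "esm3", "<product ARN from step 1>", "<IAM role ARN>")
```

Passing the client in as an argument keeps the helper independent of how credentials and regions are configured.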


### B. Create an endpoint configuration

ESM3 allows inference only on `ml.g5.2xlarge` or `ml.g5.4xlarge` instance types. Let's create an endpoint configuration that uses one of these instance types.

In [None]:
g5_instance_type = "2xlarge" # or 4xlarge
instance_type = f"ml.g5.{g5_instance_type}" 
region = "us-east-1" # Replace with your desired region
endpoint_config_name = f"ESM3-{g5_instance_type}-config"
model_name = "esm3"  # this is the name of the model you've created following the steps above.

In [None]:
# List the models available in this region as a sanity check
sagemaker_client = boto3.client('sagemaker', region_name=region)
for model in sagemaker_client.list_models()['Models']:
    print(model['ModelName'])

In [None]:
sagemaker_client = boto3.client('sagemaker', region_name=region)
response = sagemaker_client.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants=[
        {
            'VariantName': 'AllTraffic',
            'ModelName': model_name,
            'InitialInstanceCount': 1,
            'InstanceType': instance_type
        }
    ]
)
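
Since only two instance types are supported, a small guard can catch a typo before the API call is made. This is a minimal sketch: `build_endpoint_config` is a hypothetical helper that only assembles the request arguments, and the names passed to it mirror the values used in this notebook.

```python
SUPPORTED_INSTANCE_TYPES = ("ml.g5.2xlarge", "ml.g5.4xlarge")


def build_endpoint_config(config_name, model_name, instance_type, instance_count=1):
    """Return keyword arguments for create_endpoint_config, validating the instance type."""
    if instance_type not in SUPPORTED_INSTANCE_TYPES:
        raise ValueError(
            f"{instance_type!r} is not supported; choose one of {SUPPORTED_INSTANCE_TYPES}"
        )
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InitialInstanceCount": instance_count,
                "InstanceType": instance_type,
            }
        ],
    }


config = build_endpoint_config("ESM3-2xlarge-config", "esm3", "ml.g5.2xlarge")
print(config)
```

The returned dictionary can be unpacked into the Boto3 call, for example `sagemaker_client.create_endpoint_config(**config)`.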

Once the endpoint configuration has been created, you can create the endpoint and then perform real-time inference.

### C. Create a real time endpoint

In [None]:
endpoint_name = "esm3-test"

#### You can now create an endpoint using Boto3

This will take ~10 minutes given the size of the model.

In [None]:
# Create the endpoint
response = sagemaker_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name
)

# Wait for the endpoint to be in service
while True:
    response = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    status = response['EndpointStatus']
    print(f"Endpoint status: {status}")
    if status == 'InService':
        break
    elif status == 'Failed':
        raise Exception("Endpoint creation failed")
    time.sleep(30)

print("Endpoint is now in service and ready to use")
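
The polling loop above runs until the endpoint reaches a final state. A variant with an upper time limit and the reported failure reason might look like the sketch below. `wait_for_endpoint` is a hypothetical helper, and the 30-minute default timeout is an arbitrary assumption rather than a documented limit.

```python
import time


def wait_for_endpoint(sagemaker_client, endpoint_name, timeout_s=1800, poll_s=30,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll describe_endpoint until the endpoint is InService, fails, or times out."""
    deadline = clock() + timeout_s
    while True:
        description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return description
        if status == "Failed":
            reason = description.get("FailureReason", "no reason reported")
            raise RuntimeError(f"Endpoint {endpoint_name!r} failed: {reason}")
        if clock() >= deadline:
            raise TimeoutError(
                f"Endpoint {endpoint_name!r} still {status!r} after {timeout_s} seconds"
            )
        sleep(poll_s)


# Example usage (requires AWS credentials):
# wait_for_endpoint(sagemaker_client, "esm3-test")
```

The `sleep` and `clock` parameters exist only so the helper can be exercised without real waiting.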

### D. Create an input payload

In [None]:
input_sequence = "QYAPQTQSGRTSIVHLFEWRWVDIALECERYLGPKGFGGVQVSPPNENVVVTNPSRPWWERYQPVSYKLCTRSGNENEFRDMVTRCNNVGVRIYVDAVINHMCGSGAAAGTGTTCGSYCNPGSREFPAVPYSAWDFNDGKCKTASGGIESYNDPYQVRDCQLVGLLDLALEKDYVRSMIADYLNKLIDIGVAGFRIDASKHMWPGDIKAVLDKLHNLNTNWFPAGSRPFIFQEVIDLGGEAIKSSEYFGNGRVTEFKYGAKLGTVVRKWSGEKMSYLKNWGEGWGFMPSDRALVFVDNHDNQRGHGAGGSSILTFWDARLYKVAVGFMLAHPYGFTRVMSSYRWARNFVNGEDVNDWIGPPNNNGVIKEVTINADTTCGNDWVCEHRWREIRNMVWFRNVVDGEPFANWWDNGSNQVAFGRGNRGFIVFNNDDWQLSSTLQTGLPGGTYCDVISGDKVGNSCTGIKVYVSSDGTAQFSISNSAEDPFIAIHAESKL"

In [None]:
payload = json.dumps({"model": "esm3-sm-open-v1", "sequence": input_sequence, "sequence_logprobs": True})

In [None]:
print(payload)
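
To reuse the payload format for other sequences, the construction can be wrapped in a helper that checks the input first. This is a sketch: `build_payload` is a hypothetical function, and restricting input to the 20 standard amino acid letters is an assumption of this example, since the endpoint may accept additional symbols.

```python
import json

STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def build_payload(sequence, model="esm3-sm-open-v1", sequence_logprobs=True):
    """Return the JSON request body for a protein sequence."""
    sequence = sequence.strip().upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    # Assumption of this sketch: only the 20 standard residues are allowed.
    unexpected = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {unexpected}")
    return json.dumps(
        {"model": model, "sequence": sequence, "sequence_logprobs": sequence_logprobs}
    )


print(build_payload("QYAPQTQSGRTSIVHLFEW"))
```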

### E. Perform real-time inference

In [None]:
# Create a SageMaker Runtime client in the same region as the endpoint
sagemaker_runtime = boto3.client('sagemaker-runtime', region_name=region)

# Invoke the endpoint
response = sagemaker_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='application/json',
    Body=payload
)

# Read and parse the response
result = json.loads(response['Body'].read().decode())

In [None]:
result
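
The parsed response can be large, so a compact overview of its structure may be easier to read than the raw output. The sketch below makes no assumption about the response schema. `summarize` is a hypothetical helper, and the dictionary passed to it here is made-up example data, not an actual ESM3 response.

```python
def summarize(value, max_depth=2, depth=0):
    """Return a shortened view of nested JSON data: lists and long strings become sizes."""
    if isinstance(value, dict):
        if depth >= max_depth:
            return f"dict with {len(value)} keys"
        return {key: summarize(item, max_depth, depth + 1) for key, item in value.items()}
    if isinstance(value, list):
        return f"list of {len(value)} items"
    if isinstance(value, str) and len(value) > 40:
        return f"str of length {len(value)}"
    return value


# Made-up data for illustration; replace with the parsed `result` from the endpoint.
example = {"values": [0.1, 0.2, 0.3], "info": {"label": "demo"}, "text": "A" * 100}
print(summarize(example))
```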

## 4. Clean-up

### A. Delete the endpoint

Now that you have successfully performed a real-time inference, you no longer need the endpoint. Delete it to avoid being charged.

In [None]:
# Delete the endpoint
try:
    response = sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
    print(f"Endpoint '{endpoint_name}' deletion initiated.")
except sagemaker_client.exceptions.ClientError as e:
    print(f"Error deleting endpoint: {e}")
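
Deleting the endpoint leaves the endpoint configuration and the model in the account. A sketch of removing all three in order is shown below. `delete_endpoint_resources` is a hypothetical helper that expects a Boto3 SageMaker client, and it catches any exception so that one failed deletion does not block the others.

```python
def delete_endpoint_resources(sagemaker_client, endpoint_name, endpoint_config_name, model_name):
    """Delete the endpoint, its configuration, and the model; return what was deleted."""
    steps = [
        ("endpoint", sagemaker_client.delete_endpoint,
         {"EndpointName": endpoint_name}),
        ("endpoint configuration", sagemaker_client.delete_endpoint_config,
         {"EndpointConfigName": endpoint_config_name}),
        ("model", sagemaker_client.delete_model,
         {"ModelName": model_name}),
    ]
    deleted = []
    for label, call, kwargs in steps:
        try:
            call(**kwargs)
            deleted.append(label)
        except Exception as error:  # for example, a botocore ClientError
            print(f"Error deleting {label}: {error}")
    return deleted


# Example usage (requires AWS credentials):
# delete_endpoint_resources(sagemaker_client, "esm3-test", "ESM3-2xlarge-config", "esm3")
```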

### B. Unsubscribe to the listing (optional)

If you would like to unsubscribe from the model package, follow these steps. Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note: you can find this information by looking at the container name associated with the model.

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust)
2. Locate the listing whose subscription you want to cancel, and then choose __Cancel Subscription__.

