# Deploy MSA-Search NIM from AWS Marketplace

The MSA-Search NIM is powered by GPU MMSeqs2, a GPU-accelerated toolkit for protein database search and multiple sequence alignment (MSA). Although MMSeqs2 is not a deep learning model, it requires large protein databases for sequence similarity search.


The MSA-Search NIM supports GPU-accelerated MSA of a query amino acid sequence against a set of protein sequence databases. These databases are searched for sequences similar to the query, and the resulting collection of sequences is then aligned to identify similar regions, even when the proteins have different lengths and motifs.

The outputs of the MSA process inform structural prediction models such as AlphaFold2 and OpenFold. This tends to improve structural prediction accuracy because similar sequences often have similar structures. MSA-Search is also used by evolutionary biologists to look for homology between protein sequences that may indicate a common evolutionary origin.

The MSA-Search NIM implements two search styles.

- The AlphaFold2 search type was first used in the [AlphaFold2 paper](https://www.nature.com/articles/s41586-021-03819-2) and performs a single-pass search per database.

- The ColabFold search type was first introduced in [ColabFold](https://github.com/sokrypton/ColabFold) and implements a cascaded search of generated profiles, providing even higher sensitivity and generally better throughput.

Both methods use GPU-accelerated MMSeqs2 to improve accuracy and reduce latency. Combined with AlphaFold2 or OpenFold, the MSA-Search NIM enables a sensitive and high-throughput protein structure prediction pipeline.

In general, NIMs offer an easy-to-deploy and straightforward route for self-hosted AI applications. Two significant advantages that NIMs offer for system administrators and developers are:

- Increased productivity: NIMs enable developers to build generative AI applications quickly, in minutes rather than weeks, by providing a standardized way to add AI capabilities to their applications.

- Simplified deployment: NIMs provide containers that can be easily deployed on various platforms, including clouds, data centers, or workstations, making it convenient for developers to test and deploy their applications.

Please check out the [MSA-Search NIM docs](https://docs.nvidia.com/nim/bionemo/msa-search/latest/overview.html) and [NIM LLM docs](https://docs.nvidia.com/nim/large-language-models/latest/introduction.html) for more information.

## ⚠️ Disclaimer

- Use this NIM with GPUs that have at least 48 GB of VRAM. The NIM also requires roughly 1.3 TB (1,300 GB) of fast NVMe SSD storage for the databases.
- Because the sequence databases are large, the endpoint deployment can take up to about 1.5 hours. You can monitor the deployment through CloudWatch or the SageMaker endpoint status.

## Pre-requisites:
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open this notebook from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role used has the **AmazonSageMakerFullAccess** policy attached.
1. To deploy this ML model successfully, ensure that:
    1. Either your IAM role has these three permissions and you have the authority to make AWS Marketplace subscriptions in the AWS account used:
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**
    2. or your AWS account already has a subscription to this model package.
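
As a rough sketch, the three AWS Marketplace permissions listed above could be expressed as an IAM policy document like the one built below. The layout follows the standard IAM JSON policy format. The helper name `build_marketplace_policy` and the unscoped `"Resource": "*"` are assumptions made for illustration, so adapt them to your organization's conventions.

```python
import json

# The three AWS Marketplace actions named in the prerequisites.
MARKETPLACE_ACTIONS = [
    "aws-marketplace:ViewSubscriptions",
    "aws-marketplace:Unsubscribe",
    "aws-marketplace:Subscribe",
]


def build_marketplace_policy(actions=MARKETPLACE_ACTIONS):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: no resource scoping; tighten this if your setup allows it.
                "Resource": "*",
            }
        ],
    }


# Serialize the policy so it can be pasted into the IAM console or passed to an API call.
policy_json = json.dumps(build_marketplace_policy(), indent=2)
print(policy_json)
```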


## Subscribe to the model package
To subscribe to the model package:
1. Open the model package listing page.
1. On the AWS Marketplace listing, click the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the terms and click **Accept Offer** if you and your organization agree with the EULA, pricing, and support terms.
1. Once you click the **Continue to configuration** button and choose a **region**, a **Product Arn** is displayed. This is the model package ARN that you need to specify when creating a deployable model. Copy the ARN corresponding to your region and specify it in the cells that follow.

In [None]:
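import json
import os
import time

import boto3
import sagemaker
from botocore.config import Config
from sagemaker import ModelPackage, get_execution_role

# Raise the read timeout (in seconds) well above botocore's 60-second default
# so that long-running requests are not cut off on the client side
config = Config(read_timeout=10000)
sess = boto3.Session()
sm = sess.client("sagemaker")  # control-plane client: create, describe, and delete resources
sagemaker_session = sagemaker.Session(boto_session=sess)
role = get_execution_role()  # IAM role of the notebook, passed to the model below
client = boto3.client("sagemaker-runtime", config=config)  # runtime client for inference calls
region = sess.region_name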

In [None]:
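# Replace the placeholder below with the model package name from your Product ARN,
# i.e. the part that follows "model-package/" (the map below adds the regional prefix)
nim_package = "MSA_SEARCH_NIM_PRODUCT_MODEL_PACKAGE_ARN_FROM_YOUR_AWS_MARKETPLACE_SUBSCRIPTION"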

# Mapping for Model Packages
model_package_map = {
    "us-east-1": f"arn:aws:sagemaker:us-east-1:865070037744:model-package/{nim_package}",
    "us-east-2": f"arn:aws:sagemaker:us-east-2:057799348421:model-package/{nim_package}",
    "us-west-1": f"arn:aws:sagemaker:us-west-1:382657785993:model-package/{nim_package}",
    "us-west-2": f"arn:aws:sagemaker:us-west-2:594846645681:model-package/{nim_package}",
    "ca-central-1": f"arn:aws:sagemaker:ca-central-1:470592106596:model-package/{nim_package}",
    "eu-central-1": f"arn:aws:sagemaker:eu-central-1:446921602837:model-package/{nim_package}",
    "eu-west-1": f"arn:aws:sagemaker:eu-west-1:985815980388:model-package/{nim_package}",
    "eu-west-2": f"arn:aws:sagemaker:eu-west-2:856760150666:model-package/{nim_package}",
    "eu-west-3": f"arn:aws:sagemaker:eu-west-3:843114510376:model-package/{nim_package}",
    "eu-north-1": f"arn:aws:sagemaker:eu-north-1:136758871317:model-package/{nim_package}",
    "ap-southeast-1": f"arn:aws:sagemaker:ap-southeast-1:192199979996:model-package/{nim_package}",
    "ap-southeast-2": f"arn:aws:sagemaker:ap-southeast-2:666831318237:model-package/{nim_package}",
    "ap-northeast-2": f"arn:aws:sagemaker:ap-northeast-2:745090734665:model-package/{nim_package}",
    "ap-northeast-1": f"arn:aws:sagemaker:ap-northeast-1:977537786026:model-package/{nim_package}",
    "ap-south-1": f"arn:aws:sagemaker:ap-south-1:077584701553:model-package/{nim_package}",
    "sa-east-1": f"arn:aws:sagemaker:sa-east-1:270155090741:model-package/{nim_package}",
}

# Look up the ARN for the region of the current boto3 session
region = boto3.Session().region_name
if region not in model_package_map:
    raise ValueError(f"Current boto3 session region {region} is not supported.")

model_package_arn = model_package_map[region]
model_package_arn
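
Before creating the model, it can help to sanity-check the ARN that the mapping produces, for instance to catch a placeholder that was never replaced. The helper below is a minimal sketch and not part of the SageMaker SDK: `parse_model_package_arn` is a hypothetical name, the package name in the example is made up, and the pattern assumes the `aws` partition and the `arn:aws:sagemaker:<region>:<account>:model-package/<name>` layout used in the mapping above.

```python
import re

# Layout used by the mapping: arn:aws:sagemaker:<region>:<account>:model-package/<name>
# Assumption: package names contain only letters, digits, and hyphens.
_ARN_PATTERN = re.compile(
    r"^arn:aws:sagemaker:(?P<region>[a-z0-9-]+):(?P<account>\d{12}):"
    r"model-package/(?P<name>[A-Za-z0-9-]+)$"
)


def parse_model_package_arn(arn, expected_region=None):
    """Split a model package ARN into region, account, and name.

    Raises ValueError if the ARN does not match the expected layout or,
    when expected_region is given, if the regions differ.
    """
    match = _ARN_PATTERN.match(arn)
    if match is None:
        raise ValueError(f"Not a SageMaker model package ARN: {arn!r}")
    parts = match.groupdict()
    if expected_region is not None and parts["region"] != expected_region:
        raise ValueError(
            f"ARN is for region {parts['region']}, expected {expected_region}"
        )
    return parts


# Example with a made-up package name (illustration only).
example_arn = "arn:aws:sagemaker:us-east-1:865070037744:model-package/example-package-name"
print(parse_model_package_arn(example_arn, expected_region="us-east-1"))
```

In the notebook, this could be called as `parse_model_package_arn(model_package_arn, expected_region=region)` right after the lookup.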

## Create the SageMaker Endpoint

We first define a SageMaker model using the specified model package ARN.

In [None]:
# Define the model details
sm_model_name = "MSA-Search-NIM-v1-0-0"

# Create the SageMaker model
create_model_response = sm.create_model(
    ModelName=sm_model_name,
    PrimaryContainer={
        'ModelPackageName': model_package_arn
    },
    ExecutionRoleArn=role,
    EnableNetworkIsolation=True
)
print("Model Arn: " + create_model_response["ModelArn"])

Next, we create an endpoint configuration that specifies the instance type.

In [None]:
# Create the endpoint configuration
endpoint_config_name = sm_model_name

create_endpoint_config_response = sm.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            'VariantName': 'AllTraffic',
            'ModelName': sm_model_name,
            'InitialInstanceCount': 1,
            'InstanceType': 'ml.g6e.12xlarge',
            'InferenceAmiVersion': "al2-ami-sagemaker-inference-gpu-3-1",
            'RoutingConfig': {'RoutingStrategy': 'LEAST_OUTSTANDING_REQUESTS'},
            'ModelDataDownloadTimeoutInSeconds': 3600,  # allow up to 1 hour for the model data download
            'ContainerStartupHealthCheckTimeoutInSeconds': 3600,  # allow up to 1 hour for the container to pass its startup health check
        }
    ]
)
print("Endpoint Config Arn: " + create_endpoint_config_response["EndpointConfigArn"])

Using the above endpoint configuration, we create a new SageMaker endpoint and wait for the deployment to finish. The status changes to InService once the deployment succeeds.

In [None]:
# Create the endpoint
endpoint_name = endpoint_config_name
create_endpoint_response = sm.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name
)

print("Endpoint Arn: " + create_endpoint_response["EndpointArn"])

In [None]:
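# Poll the endpoint status once a minute until it leaves the "Creating" state
resp = sm.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
print("Status: " + status)

while status == "Creating":
    time.sleep(60)
    resp = sm.describe_endpoint(EndpointName=endpoint_name)
    status = resp["EndpointStatus"]
    print("Status: " + status)

print("Arn: " + resp["EndpointArn"])
# Surface the failure reason if the endpoint did not reach InService
if status != "InService":
    print("Failure reason: " + resp.get("FailureReason", "not reported"))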
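
The polling loop above waits for as long as the endpoint stays in the `Creating` state. A variant with an explicit timeout and a failure check could look like the sketch below. `wait_for_endpoint` is a hypothetical helper, not a SageMaker API. It takes the describe call as a parameter so that it can be exercised without AWS access, and the 2-hour default timeout is an assumption based on the deployment time of about 1.5 hours mentioned earlier.

```python
import time


def wait_for_endpoint(describe, endpoint_name, poll_seconds=60,
                      timeout_seconds=2 * 60 * 60, sleep=time.sleep):
    """Poll describe(EndpointName=...) until the endpoint is InService.

    Raises RuntimeError if creation ends in any other state and
    TimeoutError if the endpoint is still Creating after timeout_seconds.
    """
    waited = 0
    while True:
        resp = describe(EndpointName=endpoint_name)
        status = resp["EndpointStatus"]
        if status == "InService":
            return resp
        if status != "Creating":
            reason = resp.get("FailureReason", "no reason reported")
            raise RuntimeError(f"Endpoint ended in status {status}: {reason}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Endpoint still Creating after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Offline demo with a fake describe function (no AWS calls are made).
_fake_states = iter(["Creating", "Creating", "InService"])


def _fake_describe(EndpointName):
    return {"EndpointStatus": next(_fake_states), "EndpointArn": "arn:example"}


final = wait_for_endpoint(_fake_describe, "demo-endpoint", sleep=lambda seconds: None)
print(final["EndpointStatus"])
```

In the notebook, this would be called as `wait_for_endpoint(sm.describe_endpoint, endpoint_name)`. boto3 also provides a built-in waiter, `sm.get_waiter("endpoint_in_service")`, which covers the same case with its own polling defaults.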

### Run Inference

Once the model is deployed, we can send a sample payload as an inference request. The endpoint accepts a JSON request body like the ones in the examples below. For an explanation of the supported parameters, see the [MSA-Search NIM docs](https://docs.nvidia.com/nim/bionemo/msa-search/latest/overview.html).

### MSA-Search NIM Inference Example-1

In [None]:
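# Reuse the long read timeout configured earlier so slow searches are not cut off on the client side
sm_runtime = boto3.client("sagemaker-runtime", region_name=region, config=config)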

payload = {
    "sequence": (
        "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVE"
        "QCCTSICSLYQLENYCN"
    ),
    "e_value": 0.0001,
    "iterations": 1,
    "databases": ["Uniref30_2302", "colabfold_envdb_202108", "PDB70_220313"],
    "search_type": "alphafold2",
    "output_alignment_formats": ["fasta", "a3m"],
    "max_msa_sequences": 1000,
}

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=json.dumps(payload),
)

result = json.loads(response["Body"].read())
print(json.dumps(result, indent=2)[:1000], "...")

### MSA-Search NIM Inference Example-2

In [None]:
payload = {
    "sequence": (
        "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN"
    ),
    "e_value": 0.0001,
    "iterations": 2,
    "databases": ["Uniref30_2302", "colabfold_envdb_202108", "PDB70_220313"],
}

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=json.dumps(payload),
)

result = json.loads(response["Body"].read())
print(json.dumps(result, indent=2)[:1000], "...")
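
Both examples send the same kind of JSON body. A small helper for assembling and checking it before calling the endpoint might look like the following sketch. `build_msa_payload` is a hypothetical name. The field names are taken from the examples above, while the validation rule (only the 20 standard amino-acid letters) and the default values are assumptions, since the NIM itself may accept a wider alphabet or use different defaults.

```python
import json

# Assumption: restrict queries to the 20 standard amino-acid one-letter codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def build_msa_payload(sequence, databases, e_value=0.0001, iterations=1, **extra):
    """Return a request body dict after basic validation of the query sequence.

    Whitespace is removed and the sequence is upper-cased. Any extra keyword
    arguments (for example search_type) are copied into the payload unchanged.
    """
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("sequence is empty")
    invalid = sorted(set(seq) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"sequence contains unsupported characters: {invalid}")
    if not databases:
        raise ValueError("at least one database is required")
    payload = {
        "sequence": seq,
        "e_value": e_value,
        "iterations": iterations,
        "databases": list(databases),
    }
    payload.update(extra)
    return payload


# Same query sequence and databases as in the examples above.
query = (
    "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVE"
    "QCCTSICSLYQLENYCN"
)
payload = build_msa_payload(
    query,
    ["Uniref30_2302", "colabfold_envdb_202108", "PDB70_220313"],
    search_type="alphafold2",
)
body = json.dumps(payload)  # ready to pass as Body= to invoke_endpoint
print(body[:80], "...")
```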

### Terminate endpoint and clean up artifacts

In [None]:
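# Delete the endpoint first so the hosting instance is released, then its configuration and the model
sm.delete_endpoint(EndpointName=endpoint_name)
sm.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
sm.delete_model(ModelName=sm_model_name)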
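
If one of the delete calls fails, for example because a resource was never created, the calls after it do not run and the remaining resources are left behind. A best-effort variant could look like the sketch below. `cleanup` is a hypothetical helper that relies only on the three `delete_*` client methods used in this notebook. It deletes the endpoint first and catches exceptions broadly for brevity, which is an assumption you may want to narrow to `botocore.exceptions.ClientError`.

```python
def cleanup(sm_client, endpoint_name, endpoint_config_name, model_name):
    """Try to delete the endpoint, its configuration, and the model, in that order.

    Each step runs even if an earlier one failed. Returns a dict that maps the
    label of every failed step to its error message (empty if all succeeded).
    """
    steps = [
        ("endpoint",
         lambda: sm_client.delete_endpoint(EndpointName=endpoint_name)),
        ("endpoint config",
         lambda: sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)),
        ("model",
         lambda: sm_client.delete_model(ModelName=model_name)),
    ]
    errors = {}
    for label, action in steps:
        try:
            action()
            print(f"Deleted {label}")
        except Exception as exc:  # assumption: broad catch, kept for brevity
            errors[label] = str(exc)
            print(f"Could not delete {label}: {exc}")
    return errors
```

In this notebook, it would be called as `cleanup(sm, endpoint_name, endpoint_config_name, sm_model_name)`.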