# Deploy the voyage-code-02 model package from AWS Marketplace

Embedding models are neural networks that convert documents into numerical vectors. They are a crucial building block for retrieval systems, semantic search, and retrieval-augmented generation (RAG). Voyage-code-02 is an embedding model trained specifically for semantic retrieval of code and code-related text from both natural-language and code queries. It is suited to code-related AI applications, including semantic code search and retrieval, code completion, and the various functions of general code assistants. Across 11 code retrieval tasks, voyage-code-02 improves on the alternatives, including models from OpenAI and Cohere, by 16.94%. It also shows a consistent gain, averaging 4.93%, on general-purpose corpora.
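
To make the retrieval idea concrete, here is a minimal sketch of how embedding vectors are commonly compared. It is an illustration, not part of the original notebook: the toy three-dimensional vectors are made up, `cosine_similarity` and `rank_documents` are hypothetical helper names, and cosine similarity is assumed as the comparison metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors for illustration only; real embeddings have many more dimensions.
query = [1.0, 0.0, 1.0]
docs = [[0.0, 1.0, 0.0], [1.0, 0.1, 0.9], [0.5, 0.5, 0.5]]
print(rank_documents(query, docs))  # prints [1, 2, 0]
```

In a retrieval system, the query and document vectors would come from the embedding model rather than being written by hand.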

## Prerequisites
- Confirm that the IAM role has the **AmazonSageMakerFullAccess** policy attached.
- To deploy this ML model successfully, one of the following must hold:
    1. Your IAM role has the following three permissions, and you are authorized to make AWS Marketplace subscriptions in the AWS account you are using:
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**
    2. Your AWS account already has a subscription to the voyage-code-02 model package.
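
One way to check the three Marketplace permissions ahead of time is the IAM policy simulator. The sketch below is an illustration under stated assumptions, not part of the original notebook: `REQUIRED_ACTIONS` restates the list above, `missing_actions` and `simulate_role_permissions` are hypothetical helper names, and calling the simulator itself requires the `iam:SimulatePrincipalPolicy` permission, which your role may not have.

```python
# Actions listed in the prerequisites above.
REQUIRED_ACTIONS = [
    "aws-marketplace:ViewSubscriptions",
    "aws-marketplace:Unsubscribe",
    "aws-marketplace:Subscribe",
]


def missing_actions(decisions):
    """Return the required actions whose simulated decision is not "allowed".

    `decisions` maps an action name to an IAM decision string such as
    "allowed", "implicitDeny", or "explicitDeny".
    """
    return [a for a in REQUIRED_ACTIONS if decisions.get(a) != "allowed"]


def simulate_role_permissions(role_arn):
    """Ask the IAM policy simulator about each required action (hypothetical helper)."""
    import boto3  # imported here so missing_actions stays usable without AWS access

    iam = boto3.client("iam")
    result = iam.simulate_principal_policy(
        PolicySourceArn=role_arn,
        ActionNames=REQUIRED_ACTIONS,
    )
    return {r["EvalActionName"]: r["EvalDecision"] for r in result["EvaluationResults"]}
```

With a role ARN in hand, `missing_actions(simulate_role_permissions(role_arn))` returns an empty list when all three actions are allowed.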
    

## 1. Subscribe to the model package

To subscribe to the voyage-code-02 model package:
1. Navigate to the model package listing page.
2. Click **Continue to subscribe** on the AWS Marketplace listing.
3. On the **Subscribe to this software** page, review the details carefully. If you and your organization agree with the End-User License Agreement (EULA), pricing, and support terms, click **Accept Offer**.
4. After you select **Continue to configuration** and choose a **region**, the page displays a **Product ARN**. This is the model package ARN required to create a deployable model with Boto3. Copy the ARN that corresponds to your selected region and use it in the following cell.
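
Because a model package ARN is tied to a single region, a quick sanity check on the copied value can catch a mismatch before deployment. The following is a minimal sketch: `parse_model_package_arn` is a hypothetical helper, and the pattern assumes the `arn:aws:sagemaker:<region>:<account>:model-package/<name>` layout used in the mapping in the next cell, with the standard `aws` partition and a 12-digit account ID.

```python
import re

# Assumed layout: arn:aws:sagemaker:<region>:<12-digit account>:model-package/<name>
_ARN_PATTERN = re.compile(
    r"^arn:aws:sagemaker:(?P<region>[a-z0-9-]+):(?P<account>\d{12}):"
    r"model-package/(?P<name>.+)$"
)


def parse_model_package_arn(arn):
    """Split a model package ARN into region, account, and name.

    Raises ValueError when the string does not match the assumed layout,
    for example when a placeholder such as "TBD" was left in place.
    """
    match = _ARN_PATTERN.match(arn.strip())
    if match is None:
        raise ValueError(f"Not a SageMaker model package ARN: {arn!r}")
    return match.groupdict()
```

Comparing the returned `region` with `boto3.Session().region_name` shows whether the ARN you copied belongs to the region your session is using.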


In [None]:
!pip install --upgrade boto3 sagemaker

In [None]:
import boto3

# Specify the voyage-code-02 package identifier
voyage_code_02_identifier = "TBD"

# ARN mapping for model packages by Region
model_package_arn_mapping = {
    "us-east-2": f"arn:aws:sagemaker:us-east-2:TBD:model-package/{voyage_code_02_identifier}",
}

# Determine the current AWS region of the boto3 session
current_region = boto3.Session().region_name

# Validate if the current region is supported
if current_region not in model_package_arn_mapping:
    raise ValueError(f"The region {current_region} of the current boto3 session is not supported.")

# Retrieve the model package ARN for the current region
model_package_arn = model_package_arn_mapping[current_region]

## 2. Create an endpoint for real-time inference

In [None]:
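import json

import sagemaker as sage
from sagemaker import get_execution_role, ModelPackage

# SageMaker session, execution role, and runtime client used for inference
session = sage.Session()
role = get_execution_role()
sm_runtime = boto3.client("sagemaker-runtime")

# Create a deployable model from the subscribed model package
model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=session,
)

# Deploy to a real-time endpoint; by default this call waits until the endpoint is in service
model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name="voyage-code-02",
)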

Once the endpoint is in service, you can use it for real-time inference, as in the following example.

In [None]:
input_json = '''{
    "input": ["Sample text 1", "Sample text 2"],
    "input_type": "query", 
    "truncation": "true"
  }'''

print(input_json)

response = sm_runtime.invoke_endpoint(
    EndpointName=model.endpoint_name,
    ContentType="application/json",
    Accept="application/json",
    Body=input_json,
)

print(json.load(response["Body"]))
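
Hand-written JSON strings are easy to break once the inputs contain quotes or newlines, which is common for code snippets. The sketch below builds the request body with `json.dumps` instead. It assumes the request fields shown above (`input`, `input_type`, `truncation`) and keeps `truncation` as the string used there. `build_payload` is a hypothetical helper, and the set of accepted `input_type` values is an assumption that this notebook does not confirm.

```python
import json

# Assumption: the endpoint accepts these two values; the notebook only shows "query".
ALLOWED_INPUT_TYPES = {"query", "document"}


def build_payload(texts, input_type="query", truncation="true"):
    """Serialize one or more texts into the JSON body used in the example above."""
    if isinstance(texts, str):
        texts = [texts]  # accept a single string for convenience
    if input_type not in ALLOWED_INPUT_TYPES:
        raise ValueError(f"Unsupported input_type: {input_type!r}")
    return json.dumps(
        {"input": list(texts), "input_type": input_type, "truncation": truncation}
    )
```

The result can be passed as `Body=build_payload(["def add(a, b): return a + b"])` in the `invoke_endpoint` call.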

## 3. Clean-up

After running real-time inference, you no longer need the endpoint. Delete it, along with its endpoint configuration and the model, to avoid further charges.

In [None]:
model.sagemaker_session.delete_endpoint(model.endpoint_name)
model.sagemaker_session.delete_endpoint_config(model.endpoint_name)
model.delete_model()
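
If one of the deletions fails, for example because a resource was already removed, the remaining resources may be left behind and keep incurring charges. A minimal sketch of a more tolerant clean-up follows. `cleanup` is a hypothetical helper, it assumes the endpoint configuration shares the endpoint's name as in the cell above, and it catches a broad `Exception` for brevity where `botocore.exceptions.ClientError` would be the narrower choice.

```python
def cleanup(sm_client, endpoint_name, model_name):
    """Delete the endpoint, its configuration, and the model.

    Every step is attempted even if an earlier one fails. Returns a list of
    (step label, error message) pairs for the steps that failed.
    """
    steps = [
        ("endpoint", lambda: sm_client.delete_endpoint(EndpointName=endpoint_name)),
        (
            "endpoint config",
            # Assumption: the endpoint config has the same name as the endpoint.
            lambda: sm_client.delete_endpoint_config(EndpointConfigName=endpoint_name),
        ),
        ("model", lambda: sm_client.delete_model(ModelName=model_name)),
    ]
    failed = []
    for label, step in steps:
        try:
            step()
        except Exception as exc:  # broad on purpose in this sketch
            failed.append((label, str(exc)))
    return failed
```

For example, `cleanup(boto3.client("sagemaker"), model.endpoint_name, model.name)` returns an empty list when all three resources were deleted.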