# Deploy the voyage-finance-2 model package from AWS Marketplace

Embedding models are neural networks that convert documents into numerical vectors. They are a crucial building block for retrieval systems, semantic search, and retrieval-augmented generation (RAG). voyage-finance-2 is optimized for finance-domain retrieval and RAG. On financial retrieval datasets it outperforms competing models, with an average gain of 7% over OpenAI and 12% over Cohere. voyage-finance-2 supports a 32K context length.
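
To make the role of these vectors in retrieval concrete, here is a minimal sketch of ranking documents by cosine similarity to a query. The three-dimensional vectors are made up for illustration and do not come from voyage-finance-2; the helper names `cosine_similarity` and `rank_documents` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Made-up toy vectors (assumption): real embeddings have far more dimensions.
toy_query = [1.0, 0.0, 0.0]
toy_docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
toy_ranking = rank_documents(toy_query, toy_docs)
print(toy_ranking)
```

In a retrieval system the same ranking step would run over vectors returned by the embedding model, usually through a vector index rather than a full scan.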

## Prerequisites
- Confirm that your IAM role has the **AmazonSageMakerFullAccess** policy attached.
- To deploy this ML model successfully, one of the following must hold:
    1. Your IAM role has the following three permissions, and you are authorized to make AWS Marketplace subscriptions in the AWS account you are using:
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**
    2. Alternatively, your AWS account already has a subscription to the voyage-finance-2 model package.
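
As a rough illustration, the three Marketplace permissions could be expressed as an IAM policy document like the one built below. This is a sketch under assumptions: the single-statement layout and the `"Resource": "*"` scope are choices made for this example, and `build_marketplace_policy` is a hypothetical helper name. Your organization may require a narrower policy.

```python
import json

# The three Marketplace actions listed in the prerequisites.
MARKETPLACE_ACTIONS = [
    "aws-marketplace:ViewSubscriptions",
    "aws-marketplace:Unsubscribe",
    "aws-marketplace:Subscribe",
]


def build_marketplace_policy(actions=MARKETPLACE_ACTIONS):
    """Build an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",  # assumption: not scoped to specific resources
            }
        ],
    }


policy_json = json.dumps(build_marketplace_policy(), indent=2)
print(policy_json)
```

The printed JSON can be reviewed with whoever administers IAM in your account before it is attached to the role.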
    

## 1. Subscribe to the model package

To subscribe to the voyage-finance-2 model package:
1. Navigate to the model package listing page.
2. Click on the **Continue to subscribe** button found on the AWS Marketplace listing.
3. On the **Subscribe to this software** page, carefully review the details. If you and your organization agree with the End-User License Agreement (EULA), pricing, and support terms, click on **"Accept Offer"**.
4. After selecting **Continue to configuration** and choosing a **region**, you will be presented with a **Product Arn**. This is the model package ARN required for creating a deployable model using Boto3. Copy the ARN that corresponds to your selected region and use it in the subsequent cell.


In [None]:
!pip install boto3 --upgrade

In [None]:
import boto3

# Specify the voyage-finance-2 package identifier
voyage_finance_2_identifier = "voyage-finance-2-1adbefe1db413d249a1e44270d748140"

# Arn mapping for model packages by Region
model_package_arn_mapping = {
    "us-east-1": f"arn:aws:sagemaker:us-east-1:865070037744:model-package/{voyage_finance_2_identifier}",
    "us-east-2": f"arn:aws:sagemaker:us-east-2:057799348421:model-package/{voyage_finance_2_identifier}",
    "us-west-1": f"arn:aws:sagemaker:us-west-1:382657785993:model-package/{voyage_finance_2_identifier}",
    "us-west-2": f"arn:aws:sagemaker:us-west-2:594846645681:model-package/{voyage_finance_2_identifier}",
    "ca-central-1": f"arn:aws:sagemaker:ca-central-1:470592106596:model-package/{voyage_finance_2_identifier}",
    "eu-central-1": f"arn:aws:sagemaker:eu-central-1:446921602837:model-package/{voyage_finance_2_identifier}",
    "eu-west-1": f"arn:aws:sagemaker:eu-west-1:985815980388:model-package/{voyage_finance_2_identifier}",
    "eu-west-2": f"arn:aws:sagemaker:eu-west-2:856760150666:model-package/{voyage_finance_2_identifier}",
    "eu-west-3": f"arn:aws:sagemaker:eu-west-3:843114510376:model-package/{voyage_finance_2_identifier}",
    "eu-north-1": f"arn:aws:sagemaker:eu-north-1:136758871317:model-package/{voyage_finance_2_identifier}",
    "ap-southeast-1": f"arn:aws:sagemaker:ap-southeast-1:192199979996:model-package/{voyage_finance_2_identifier}",
    "ap-southeast-2": f"arn:aws:sagemaker:ap-southeast-2:666831318237:model-package/{voyage_finance_2_identifier}",
    "ap-northeast-2": f"arn:aws:sagemaker:ap-northeast-2:745090734665:model-package/{voyage_finance_2_identifier}",
    "ap-northeast-1": f"arn:aws:sagemaker:ap-northeast-1:977537786026:model-package/{voyage_finance_2_identifier}",
    "ap-south-1": f"arn:aws:sagemaker:ap-south-1:077584701553:model-package/{voyage_finance_2_identifier}",
    "sa-east-1": f"arn:aws:sagemaker:sa-east-1:270155090741:model-package/{voyage_finance_2_identifier}",
}

# Determine the current AWS region of the boto3 session
current_region = boto3.Session().region_name

# Validate if the current region is supported
if current_region not in model_package_arn_mapping:
    raise ValueError(f"The region {current_region} of the current boto3 session is not supported.")

# Retrieve the model package Arn for the current region
model_package_arn = model_package_arn_mapping[current_region]

## 2. Create an endpoint for real-time inference

In [None]:
import json
import sagemaker as sage
from sagemaker import get_execution_role, ModelPackage
import time

session = sage.Session()
role = get_execution_role()
sm_runtime = boto3.client("sagemaker-runtime")

model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=session,
)
# The following step deploys the model endpoint to SageMaker and may take up to 10 minutes.
model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge",
    endpoint_name="voyage-finance-2",
)

You can use the deployed endpoint for real-time inference, as the following example shows.
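
Before sending requests, it can help to confirm that the endpoint reports the `InService` status. The sketch below wraps the boto3 SageMaker `describe_endpoint` call; the helper names `endpoint_status` and `is_in_service` are hypothetical, and the client is passed in so the functions can be exercised without AWS credentials.

```python
def endpoint_status(sagemaker_client, endpoint_name):
    """Return the EndpointStatus string reported by SageMaker for the endpoint."""
    response = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    return response["EndpointStatus"]


def is_in_service(sagemaker_client, endpoint_name):
    """True when the endpoint is ready to serve inference requests."""
    return endpoint_status(sagemaker_client, endpoint_name) == "InService"


# Usage in this notebook (needs AWS credentials, so it is left commented out):
# sm_client = boto3.client("sagemaker")
# print(is_in_service(sm_client, model.endpoint_name))
```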

For details on the request parameters (input, input_type, truncation), the input limits (batch size limit, maximum context length requirement), and the expected throughput, see the [AWS Marketplace listing](https://aws.amazon.com/marketplace/pp/prodview-erofjpgna7gtq?sr=0-3&ref_=beagle&applicationId=AWSMPContessa).
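
One way to avoid hand-written JSON strings is to build the request body with `json.dumps`. The sketch below is an assumption-laden helper, not part of the model package: `build_request_body` and `KNOWN_INPUT_TYPES` are hypothetical names, the accepted `input_type` values are limited to the two used in this notebook's examples, and `truncation` mirrors the string value `"true"` used in those examples. The listing page remains the authority on what the endpoint accepts.

```python
import json

# Input types used in this notebook's examples; the listing may document others.
KNOWN_INPUT_TYPES = {"query", "document"}


def build_request_body(texts, input_type, truncation="true"):
    """Serialize an embedding request with the three fields named above."""
    if isinstance(texts, str):
        texts = [texts]  # allow a single string for convenience
    texts = list(texts)
    if not texts:
        raise ValueError("texts must contain at least one string")
    if input_type not in KNOWN_INPUT_TYPES:
        raise ValueError(f"unexpected input_type: {input_type!r}")
    return json.dumps(
        {"input": texts, "input_type": input_type, "truncation": truncation}
    )


example_body = build_request_body(["Sample text 1", "Sample text 2"], "query")
print(example_body)
```

The resulting string can be passed as the `Body` argument of `invoke_endpoint`.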

In [None]:
input_json = '''{
    "input": ["Sample text 1", "Sample text 2"],
    "input_type": "query", 
    "truncation": "true"
  }'''

print(input_json)

response = sm_runtime.invoke_endpoint(
    EndpointName=model.endpoint_name,
    ContentType="application/json",
    Accept="application/json",
    Body=input_json,
)

print(json.load(response["Body"]))

The following example shows how to embed a dataset one document at a time, as you would when indexing a larger corpus.

In [None]:
# Example dataset: an array of documents
dataset = [
    "The effectiveness of machine learning in predictive analytics.",
    "A comparative study of deep learning models for natural language processing.",
    "Innovative approaches to data encryption in cybersecurity.",
    "Exploring the impact of blockchain technology on supply chain management.",
    "The role of artificial intelligence in enhancing user experience design."
]
encoded_data = []

for text_data in dataset:
    input_json = json.dumps({
        "input": [text_data],
        "input_type": "document",
        "truncation": "true"
    })
    response = sm_runtime.invoke_endpoint(
        EndpointName=model.endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=input_json,
    )
    encoded_data.append(json.load(response["Body"]))

print("Encoding completed.")
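
For larger datasets, sending several documents per request reduces the number of endpoint calls, since the `input` field accepts a list. The sketch below only prepares the request bodies. The helper names `chunked` and `build_batch_bodies` are hypothetical, and `batch_size=2` is an arbitrary value chosen for illustration; the actual batch size limit is documented on the listing page.

```python
import json


def chunked(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def build_batch_bodies(documents, batch_size):
    """Return one JSON request body per batch of documents."""
    return [
        json.dumps({"input": batch, "input_type": "document", "truncation": "true"})
        for batch in chunked(documents, batch_size)
    ]


# Placeholder documents for illustration only.
example_docs = [f"Document {i}" for i in range(5)]
batch_bodies = build_batch_bodies(example_docs, batch_size=2)
print(len(batch_bodies))
```

Each body can then be sent with `sm_runtime.invoke_endpoint` in the same way as the single-document loop.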

## 3. Clean-up

Once you have finished running real-time inference, you no longer need the endpoint. Delete it, along with its endpoint configuration and the model, to avoid further charges.

In [None]:
model.sagemaker_session.delete_endpoint(model.endpoint_name)
model.sagemaker_session.delete_endpoint_config(model.endpoint_name)
model.delete_model()