# Real-time Inference with Jina ColBERT Model Package

This notebook shows you how to deploy Jina ColBERT ([jina-colbert](https://aws.amazon.com/marketplace/seller-profile?id=seller-stch2ludm6vgy) / [jina-colbert-reranker](https://aws.amazon.com/marketplace/seller-profile?id=seller-stch2ludm6vgy)) using Amazon SageMaker and perform inference with it.

## Pre-requisites:
1. Ensure that the IAM role used has **AmazonSageMakerFullAccess**.
1. To deploy this ML model successfully, ensure that:
    1. Either your IAM role has these three permissions and you have the authority to make AWS Marketplace subscriptions in the AWS account used:
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**
    1. Or your AWS account already has a subscription to [jina-colbert](https://aws.amazon.com/marketplace/seller-profile?id=seller-stch2ludm6vgy) / [jina-colbert-reranker](https://aws.amazon.com/marketplace/seller-profile?id=seller-stch2ludm6vgy).

## Contents:
1. [Subscribe to the model package](#1.-Model-package-setup)
2. [Embedding](#2.-Embedding)
3. [Reranking](#3.-Reranking)
4. [Clean-up](#4.-Clean-up)
    1. [Delete the model](#A.-Delete-the-model)
    2. [Unsubscribe to the listing (optional)](#B.-Unsubscribe-to-the-listing-(optional))

# 1. Model package setup

Please subscribe to the model package from AWS Marketplace [here](https://aws.amazon.com/marketplace/pp/prodview-5iljbegvoi66w).

Install the `jina-sagemaker` package:


```bash
pip install --upgrade jina-sagemaker
```

In [None]:
# Specify the role as required by SageMaker
role = ""

In [None]:
import boto3

region = "us-east-1"

# Specify the model name
colbert_model_name = ""
colbert_reranker_model_name = ""

# Mapping for Model Packages
def get_arn_for_model(region_name, model_name):
    model_package_map = {
        "us-east-1": f"arn:aws:sagemaker:us-east-1:253352124568:model-package/{model_name}",
        "us-east-2": f"arn:aws:sagemaker:us-east-2:057799348421:model-package/{model_name}",
        "us-west-1": f"arn:aws:sagemaker:us-west-1:382657785993:model-package/{model_name}",
        "us-west-2": f"arn:aws:sagemaker:us-west-2:594846645681:model-package/{model_name}",
        "ca-central-1": f"arn:aws:sagemaker:ca-central-1:470592106596:model-package/{model_name}",
        "eu-central-1": f"arn:aws:sagemaker:eu-central-1:446921602837:model-package/{model_name}",
        "eu-west-1": f"arn:aws:sagemaker:eu-west-1:985815980388:model-package/{model_name}",
        "eu-west-2": f"arn:aws:sagemaker:eu-west-2:856760150666:model-package/{model_name}",
        "eu-west-3": f"arn:aws:sagemaker:eu-west-3:843114510376:model-package/{model_name}",
        "eu-north-1": f"arn:aws:sagemaker:eu-north-1:136758871317:model-package/{model_name}",
        "ap-southeast-1": f"arn:aws:sagemaker:ap-southeast-1:192199979996:model-package/{model_name}",
        "ap-southeast-2": f"arn:aws:sagemaker:ap-southeast-2:666831318237:model-package/{model_name}",
        "ap-northeast-2": f"arn:aws:sagemaker:ap-northeast-2:745090734665:model-package/{model_name}",
        "ap-northeast-1": f"arn:aws:sagemaker:ap-northeast-1:977537786026:model-package/{model_name}",
        "ap-south-1": f"arn:aws:sagemaker:ap-south-1:077584701553:model-package/{model_name}",
        "sa-east-1": f"arn:aws:sagemaker:sa-east-1:270155090741:model-package/{model_name}",
    }

    return model_package_map[region_name]

colbert_model_package_arn = get_arn_for_model(region, colbert_model_name)
colbert_reranker_model_package_arn = get_arn_for_model(region, colbert_reranker_model_name)
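
The model names in the cell above are left blank and must be filled in before deployment; an empty name produces an ARN that ends in `model-package/`. A minimal sketch of a sanity check follows. The helper `check_model_package_arn` is hypothetical (it is not part of `jina-sagemaker` or boto3), and its pattern only mirrors the ARN shape built by `get_arn_for_model`.

```python
import re

# Hypothetical helper for illustration; not part of jina-sagemaker or boto3.
# The pattern mirrors the ARN shape produced by get_arn_for_model.
_ARN_PATTERN = re.compile(
    r"^arn:aws:sagemaker:[a-z0-9-]+:\d{12}:model-package/(?P<name>.+)$"
)


def check_model_package_arn(arn: str) -> str:
    """Return the model package name, or raise ValueError if the ARN is incomplete."""
    match = _ARN_PATTERN.match(arn)
    if match is None:
        raise ValueError(f"Incomplete or malformed model package ARN: {arn!r}")
    return match.group("name")
```

Calling `check_model_package_arn(colbert_model_package_arn)` before `create_endpoint` would surface a forgotten model name early, rather than as a deployment error.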

---

# 2. Embedding

To learn about real-time inference capabilities in Amazon SageMaker, please refer to the [Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints.html).

Let's create an endpoint that uses Jina ColBERT for embedding.

In [None]:
from jina_sagemaker import Client, InputType

client = Client(region_name=region)
embedding_endpoint_name = "my-embedding-endpoint"

We can create a new endpoint by calling the `create_endpoint` method with the required parameters, such as `instance_type` and `n_instances`.

In [None]:
client.create_endpoint(
    arn=colbert_model_package_arn, 
    role=role, 
    endpoint_name=embedding_endpoint_name, 
    instance_type="ml.g4dn.xlarge", 
    n_instances=1,
)

Or, we can connect to an existing endpoint using the `connect_to_endpoint` method by passing the endpoint name.

In [None]:
client.connect_to_endpoint(endpoint_name=embedding_endpoint_name)
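
Before sending requests, it can help to confirm that the endpoint has reached the `InService` state. The sketch below uses boto3's `describe_endpoint` call, which reports an `EndpointStatus` field. The helper names `endpoint_status` and `is_in_service` are hypothetical, and the SageMaker client is passed in as an argument, so the snippet makes no assumption about how `jina-sagemaker` tracks endpoints.

```python
def endpoint_status(sagemaker_client, endpoint_name):
    """Return the endpoint's status string, e.g. 'Creating', 'InService' or 'Failed'."""
    response = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    return response["EndpointStatus"]


def is_in_service(sagemaker_client, endpoint_name):
    """True once the endpoint is ready to serve inference requests."""
    return endpoint_status(sagemaker_client, endpoint_name) == "InService"
```

For example, `is_in_service(boto3.client("sagemaker", region_name=region), embedding_endpoint_name)` would check the endpoint created in this section.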

## Perform real-time inference

### Usage with the `jina-sagemaker` SDK

#### Embed the documents

In [None]:
result = client.embed(texts=[
    "How is the weather today?", 
    "what's the color of an orange",
], use_colbert=True, input_type=InputType.DOCUMENT)
print(result)

#### Embed the query

In [None]:
result = client.embed(texts="How is the weather today?",
                      use_colbert=True, input_type=InputType.QUERY)
print(result)
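
ColBERT-style models produce one vector per token rather than a single vector per text, and relevance is typically scored by late interaction (MaxSim): for each query token, take the highest similarity against any document token, then sum over the query tokens. The sketch below shows that scoring on plain Python lists. The function name `maxsim_score` and the toy vectors are illustrative assumptions; extracting token vectors from `result` depends on the response format, which this notebook does not document.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Late-interaction score: sum over query tokens of the best-matching document token."""
    if not doc_tokens:
        return 0.0
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy 2-dimensional token vectors (made up for illustration, not model output).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5]]
doc_b = [[0.0, 0.5]]

# Rank documents by descending MaxSim score.
scores = {"doc_a": maxsim_score(query, doc_a), "doc_b": maxsim_score(query, doc_b)}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

This is why documents and queries are embedded separately above: document token vectors can be computed once and stored, while the query is embedded at search time.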

### Usage with the AWS CLI

Create an input file `input.json` with the following content.

```json
{
  "data": [
    {
      "text": "How is the weather today?"
    },
    {
      "text": "what's the color of an orange"
    }
  ],
  "parameters": {
    "input_type": "document"
  }
}
```

Run the AWS CLI `invoke-endpoint` command.

In [None]:
aws sagemaker-runtime invoke-endpoint \
--endpoint-name <endpoint-name> \
--content-type 'application/json' \
--body fileb://input.json \
output.json
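
The same call can be made from Python through boto3's `sagemaker-runtime` client. The sketch below is a rough counterpart of the CLI command: it sends the bytes of `input.json` as the request body and writes the raw reply to `output.json`. The helper name `invoke_with_file` is hypothetical, and the runtime client is passed in as an argument.

```python
from pathlib import Path


def invoke_with_file(runtime_client, endpoint_name,
                     input_path="input.json", output_path="output.json"):
    """Send a JSON file to a SageMaker endpoint and save the raw response body."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=Path(input_path).read_bytes(),
    )
    # response["Body"] is a stream; read it once and persist the bytes.
    Path(output_path).write_bytes(response["Body"].read())
    return output_path
```

A call might look like `invoke_with_file(boto3.client("sagemaker-runtime", region_name=region), embedding_endpoint_name)`.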

---

# 3. Reranking

To learn about real-time inference capabilities in Amazon SageMaker, please refer to the [Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints.html).

Let's create an endpoint that uses Jina ColBERT for reranking.

In [None]:
from jina_sagemaker import Client, InputType

client = Client(region_name=region)
reranking_endpoint_name = "my-reranking-endpoint"

We can create a new endpoint by calling the `create_endpoint` method with the required parameters, such as `instance_type` and `n_instances`.

In [None]:
client.create_endpoint(
    arn=colbert_reranker_model_package_arn, 
    role=role, 
    endpoint_name=reranking_endpoint_name, 
    instance_type="ml.g4dn.xlarge", 
    n_instances=1,
)

Or, we can connect to an existing endpoint using the `connect_to_endpoint` method by passing the endpoint name.

In [None]:
client.connect_to_endpoint(endpoint_name=reranking_endpoint_name)

## Perform real-time inference

### Usage with the `jina-sagemaker` SDK

In [None]:
result = client.rerank(
    documents=["the dog is in my house", "he likes dog", "hello world"],
    query="where is the dog",
    top_n=2,
)

print(result)

### Usage with the AWS CLI

Create an input file `input.json` with the following content.

```json
{
    "data": {
        "documents": [{"text": "the dog is in my house"},
                      {"text": "he likes dog"},
                      {"text": "hello world"}],
        "query": "where is the dog",
        "top_n": 2
    }
}
```

Run the AWS CLI `invoke-endpoint` command.

In [None]:
aws sagemaker-runtime invoke-endpoint \
--endpoint-name <endpoint-name> \
--content-type 'application/json' \
--body fileb://input.json \
output.json
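
For callers who prefer to stay in Python, the reranking request can also be built in memory and sent with boto3. The sketch below assumes the payload shape shown in `input.json` above and assumes the endpoint replies with JSON. The helper names `build_rerank_payload` and `invoke_rerank` are hypothetical, and the parsed reply is returned as-is because its fields are not documented in this notebook.

```python
import json


def build_rerank_payload(query, documents, top_n):
    """Build a request body in the same shape as the reranking input.json."""
    return {
        "data": {
            "documents": [{"text": text} for text in documents],
            "query": query,
            "top_n": top_n,
        }
    }


def invoke_rerank(runtime_client, endpoint_name, payload):
    """Send the payload to the endpoint and return the parsed JSON reply."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    return json.loads(response["Body"].read())
```

Here `runtime_client` would be created with `boto3.client("sagemaker-runtime", region_name=region)`.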

---

# 4. Clean-up

## A. Delete the model

In [None]:
client.delete_endpoint()
client.close()
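
This notebook creates two endpoints but rebinds `client` in the reranking section, so the cell above may only remove the endpoint that `client` is currently attached to (an assumption about `jina-sagemaker`, not confirmed here). A minimal sketch that deletes endpoints by name through boto3's `delete_endpoint` call follows; the helper name `delete_endpoints` is hypothetical.

```python
def delete_endpoints(sagemaker_client, endpoint_names):
    """Delete each named endpoint; return the names that could not be deleted."""
    failed = []
    for name in endpoint_names:
        try:
            sagemaker_client.delete_endpoint(EndpointName=name)
        except Exception as error:  # e.g. a botocore ClientError for a missing endpoint
            print(f"Could not delete {name}: {error}")
            failed.append(name)
    return failed
```

For example: `delete_endpoints(boto3.client("sagemaker", region_name=region), [embedding_endpoint_name, reranking_endpoint_name])`. Note that `delete_endpoint` removes only the endpoint itself; the associated endpoint configuration and model objects are separate resources.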

## B. Unsubscribe to the listing (optional)

If you would like to unsubscribe from the model package, follow these steps. Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note: you can find this information by looking at the container name associated with the model.

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on the [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust).
2. Locate the listing whose subscription you want to cancel, then choose __Cancel Subscription__.
