# Using Jurassic-2 Light on SageMaker through Model Packages

This sample notebook shows you how to deploy **Jurassic-2 Light** using Amazon SageMaker.


--------------------
## <font color='orange'>Important:</font>
Visit the model detail page at <a href="https://aws.amazon.com/marketplace/pp/prodview-roz6zicyvi666">https://aws.amazon.com/marketplace/pp/prodview-roz6zicyvi666</a> to learn more. <font color='orange'>If you do not have access to the link, contact your account administrator for help.</font>

There you will find details about the model, including pricing, supported regions, and the end-user license agreement. To use the model, click “<font color='orange'>Continue to Subscribe</font>” on the detail page, then come back here to learn how to deploy the model and run inference.


-------------------

Jurassic-2 Light is the quickest large language model (LLM) by AI21 Labs. Small but mighty, Jurassic-2 Light is ideal for simple language tasks that require maximum affordability and minimal latency in your private environment. Common use cases include keyword extraction, sentence classification, named entity recognition (NER), short-form copy generation, sentiment analysis, and more. It follows natural language instructions and supports non-English languages including Spanish, French, German, Portuguese, Italian and Dutch.


## Pre-requisites:
1. Before running this notebook, make sure you obtained it from the model catalog in the Amazon SageMaker section of the AWS Management Console.
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open this notebook from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role used has the **AmazonSageMakerFullAccess** policy attached.
1. This notebook is intended to work with **boto3 v1.25.4** or higher.

## Contents:
1. [Select model package](#1.-Select-model-package)
1. [Create an endpoint and perform real-time inference](#2.-Create-an-endpoint-and-perform-real-time-inference)
   1. [Create an endpoint](#A.-Create-an-endpoint)
   1. [Interact with the model](#B.-Interact-with-the-model)
   1. [Prompt with instructions](#C.-Prompt-with-instructions)
   1. [Prompt with examples](#D.-Prompt-with-examples)
1. [Clean-up](#3.-Clean-up)
   1. [Delete the endpoint](#A.-Delete-the-endpoint)
   1. [Delete the model](#B.-Delete-the-model)
    

## Usage instructions
You can run this notebook one cell at a time (press Shift+Enter to run a cell).

## Imports

In [None]:
import json
from sagemaker import ModelPackage
from sagemaker import get_execution_role
import sagemaker as sage
import boto3

### Check the version of boto3 - must be v1.25.4 or higher
If you see a lower version number, switch to a kernel that runs Python 3.8 or above.

In [None]:
boto3.__version__
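
The check can also be done programmatically instead of by eye. The following is a minimal sketch: `parse_version` and `meets_minimum` are hypothetical helper names, and the code assumes plain dotted numeric versions such as `1.25.4` (no pre-release suffixes). Comparing tuples of integers avoids the pitfall of comparing version strings character by character.

```python
def parse_version(version):
    """Turn a dotted version string such as '1.25.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum="1.25.4"):
    """Return True if `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


# In the notebook you would pass boto3.__version__ instead of a literal.
print(meets_minimum("1.26.0"))
print(meets_minimum("1.9.12"))
```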

### Install ai21 python SDK

In [None]:
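# Assumption: the cells below use the 1.x SDK interface (ai21.Completion.execute), so the install is capped below 2.0.
! pip install -U "ai21[AWS]>=1.2.4,<2"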
import ai21

### Check the version of ai21 - must be 1.2.4 or higher

In [None]:
ai21.__version__

## 1. Select model package
Confirm that you received this notebook from the model catalog in the Amazon SageMaker section of the AWS Management Console.

In [None]:
region = boto3.Session().region_name

# Get the updated ARN
model_package_arn = ai21.SageMaker.get_model_package_arn(model_name="j2-light", region=region)

In [None]:
role = get_execution_role()
sagemaker_session = sage.Session()

runtime_sm_client = boto3.client("runtime.sagemaker")

## 2. Create an endpoint and perform real-time inference

### <span style='color:Blue'> How to choose the best instance for my use case?</span>
<span style='color:#00178E'> When you create your endpoint, you need to choose the instance type to run the model on. Choosing the right instance is mainly a matter of economics. Depending on your use case, you probably want the most cost-effective instance possible. In this notebook we use one of the supported instances.</span>

<span style='color:#00178E'>Looking for the list of all supported instances? See</span> [here](https://docs.ai21.com/docs/choosing-the-right-instance-type-for-amazon-sagemaker-models#jurassic-2-light).

If you want to understand how real-time inference with Amazon SageMaker works, see [Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html).

In [None]:
endpoint_name = "j2-light-internal"

content_type = "application/json"

real_time_inference_instance_type = (
    "ml.g5.12xlarge"    # Recommended instance
)

### A. Create an endpoint

In [None]:
# create a deployable model from the model package.
model = ModelPackage(
    role=role, model_package_arn=model_package_arn, sagemaker_session=sagemaker_session
)

# Deploy the model
predictor = model.deploy(
    initial_instance_count=1,
    instance_type=real_time_inference_instance_type,
    endpoint_name=endpoint_name,
    model_data_download_timeout=3600,  # seconds
    container_startup_health_check_timeout=600,  # seconds
)

Once the endpoint has been created, you can perform real-time inference.

### B. Interact with the model

You can think of Jurassic-2 Light as a smart auto-completion algorithm: give it some text as input and it will generate relevant text to naturally complete your input.

These two helpful concepts are worth being familiar with:
- **Prompt** - the input you provide to the model.
- **Completion** - the output text the model generates.

Enter a simple prompt, "To be, or", and let the model complete it:

In [None]:
response = ai21.Completion.execute(destination=ai21.SageMakerDestination(endpoint_name),
                                   prompt="To be, or",
                                   maxTokens=4,
                                   temperature=0,
                                   numResults=1)

print(response['completions'][0]['data']['text'])

As you can see, the model identifies the beginning of a famous quote, and completes it correctly.
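
The same request can also be sent without the SDK, through the SageMaker runtime client created earlier. The following is a sketch under stated assumptions, not the SDK's actual implementation: `complete_via_runtime` is a hypothetical helper, the request body is assumed to use the same field names as the SDK keyword arguments, and the response is assumed to have the shape read above (`completions[i].data.text`). `FakeRuntimeClient` and its canned reply are stand-ins so that the sketch runs offline.

```python
import io
import json


def complete_via_runtime(client, endpoint_name, prompt, max_tokens=4, temperature=0.0):
    """Send a completion request straight to the endpoint and return the generated texts."""
    # Assumption: the request body mirrors the SDK keyword arguments.
    payload = {
        "prompt": prompt,
        "maxTokens": max_tokens,
        "temperature": temperature,
        "numResults": 1,
    }
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    # invoke_endpoint returns the body as a stream; read it and decode the JSON.
    body = json.loads(response["Body"].read())
    return [completion["data"]["text"] for completion in body["completions"]]


class FakeRuntimeClient:
    """Stand-in for boto3.client("runtime.sagemaker") so the sketch runs offline."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        self.last_request = json.loads(Body)  # keep the request for inspection
        canned = {"completions": [{"data": {"text": " not to be"}}]}  # made-up reply
        return {"Body": io.BytesIO(json.dumps(canned).encode("utf-8"))}


print(complete_via_runtime(FakeRuntimeClient(), "j2-light-internal", "To be, or"))
```

In the notebook, you would pass `runtime_sm_client` and `endpoint_name` instead of the fake client and the literal name.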

### C. Prompt with instructions

**Why?** This model was specifically trained to follow natural language instructions. It is the most natural way to interact with large language models: simply tell the model what you want it to do, and it will follow.

**When?** When drafting, seeking inspiration, or when the format and guidelines are still a work in progress.

**How?** Just provide an instruction.

For this notebook, we will apply the model to extract entities from news headlines. We will start by giving the model a simple instruction.

In [None]:
headline = "Homer Simpson to sign executive order on January 2024 to increase the number of gun background checks in Springfield"

instruction = f"""Extract the entities from the following headline.
Headline: {headline}
"""

In [None]:
response = ai21.Completion.execute(destination=ai21.SageMakerDestination(endpoint_name),
                                   prompt=instruction,
                                   maxTokens=10,
                                   temperature=0,
                                   numResults=1)

print(response['completions'][0]['data']['text'])

#### Adjust the parameters
A useful parameter is the temperature. **You can increase creativity by raising the temperature.** With temperature 0, the model always chooses the most probable completion, so the output is the same every time. Increasing the temperature produces varying completions that may differ with every generation.
*Note: in tasks such as NER, you should use a low temperature, somewhere between 0 and 0.2*.
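
To build intuition for what the temperature does, the sketch below applies the standard temperature-scaled softmax to a few made-up scores. It illustrates the general technique only and makes no claim about how Jurassic-2 Light implements sampling internally; `softmax_with_temperature` and the example scores are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Temperature 0 is treated as greedy decoding: all mass on the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
for t in (0, 0.2, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

The lower the temperature, the more probability mass concentrates on the top-scoring token, which is why low values suit tasks with a single correct answer.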

In [None]:
response = ai21.Completion.execute(destination=ai21.SageMakerDestination(endpoint_name),
                                   prompt=instruction,
                                   maxTokens=100,
                                   temperature=0.2,
                                   numResults=2)  # ask the model for 2 alternative completions

for comp in response['completions']:
    print(comp['data']['text'].strip())
    print("=============")

#### Be specific in your prompt
You may want to extract specific entities. You can ask the model for them explicitly.

In [None]:
specific_instruction = f"""Extract the entities from the following headline.
Headline: {headline}
Name, Date, Location:"""

response = ai21.Completion.execute(destination=ai21.SageMakerDestination(endpoint_name),
                                   prompt=specific_instruction,
                                   maxTokens=100,
                                   temperature=0)

print(response['completions'][0]['data']['text'])

### D. Prompt with examples

**Why?** Examples are helpful in assisting the model to comprehend and generate responses that adhere to the intended format.

**When?** Examples are particularly useful when there are stringent format constraints, a well-defined objective, and an overall structure to be maintained.

**How?** To establish a pattern for the model to follow, present a few instances (“shots”) of input-output pairs in the prompt. This enables the model to mimic the pattern. Then, provide the input for a query example and allow the model to generate a suitable completion. This approach is commonly referred to as a "*few-shot prompt*".

#### Create a few-shot prompt

We will build a few-shot prompt comprised of the following:

1. A prefix with 3 examples. Each example contains the relevant input (a news headline) and the output (the entity, time, and location extracted from it). The examples are separated by "##".

2. The query input. An unseen headline for which we would like the model to extract the same properties. It should follow the same format as the inputs in the prefix.

First, we collect some example data for the prompt prefix:

In [None]:
EXAMPLES_DATA = [
    {"headline": "Inflation cooled to 6% in February 2023 as the Federal Reserve weighs next steps on interest rates", 
     "entity": "the Federal Reserve", 
     "time": "February 2023",
     "location": "NA"},
    {"headline": "Novo Nordisk to lower list price of some of its insulin by up to 75% in the U.S.", 
     "entity": "Novo Nordisk", 
     "time": "NA",
     "location": "the U.S."},
    {"headline": "John Snow says protecting Winterfell is not a 'vital' national interest", 
     "entity": "John Snow", 
     "time": "NA",
     "location": "Winterfell"}
]

Then, we use the following helper functions to construct the prefix:

In [None]:
def make_single_example(headline, entity, time, location):
    """Format one headline as a prompt block; omit the answer lines when `entity` is empty."""
    example = "Extract from the following headline these properties: Entity, Time, Location. In the case where it doesn't appear in the sentence, write NA.\n"
    example += f"Headline: {headline}\n"
    # For the query headline, entity is blank, so the model is left to generate the answer lines.
    if entity:
        example += f"Entity: {entity}\n"
        example += f"Time: {time}\n"
        example += f"Location: {location}"

    return example

SEPARATOR = "\n##\n"

FEW_SHOT_PREFIX = SEPARATOR.join(
    make_single_example(x["headline"], x["entity"], x["time"], x["location"]) for x in EXAMPLES_DATA
)

And finally, we create a function to handle query inputs and create the full prompt:

In [None]:
def create_ner_prompt(headline):
    """
    Create a few-shot prompt to extract named entities with Jurassic-2 Light given a headline
    The prompt contains a preset sequence of examples followed by the query headline
    """
    return FEW_SHOT_PREFIX + SEPARATOR + make_single_example(headline, '', '', '')  # keep the entities blank and let the model generate

Let's see how this looks for the headline from before:

In [None]:
few_shot_prompt = create_ner_prompt(headline=headline)

print(few_shot_prompt)

In [None]:
response = ai21.Completion.execute(destination=ai21.SageMakerDestination(endpoint_name),
                                   prompt=few_shot_prompt,
                                   maxTokens=30,
                                   temperature=0,
                                   stopSequences=['##'],
                                   numResults=1)

print(response['completions'][0]['data']['text'])

As you can see, the completion follows the same pattern as the examples in the few-shot prompt.
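
Because the completion follows a fixed `Key: value` layout, it can be turned into structured data with a few lines of code. The sketch below is a minimal illustration: `parse_entities` is a hypothetical helper, and the sample text is written by hand to match the few-shot format rather than copied from actual model output.

```python
def parse_entities(completion_text):
    """Parse 'Key: value' lines from a completion into a dictionary."""
    fields = {}
    for line in completion_text.strip().splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that do not contain a colon
            fields[key.strip().lower()] = value.strip()
    return fields


# Hypothetical completion text, written by hand to match the few-shot format.
sample = "Entity: Homer Simpson\nTime: January 2024\nLocation: Springfield\n"
print(parse_entities(sample))
```

In the notebook, you would pass `response['completions'][0]['data']['text']` instead of `sample`.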

### Interested in learning more?
Take a look at our [blog post](https://www.ai21.com/blog/building-cv-profile-generator-using-ai21-studio) to understand the process of building a good prompt.

## 3. Clean-up

### A. Delete the endpoint

Now that you have successfully performed real-time inference, you no longer need the endpoint. Delete it to avoid being charged.

In [None]:
model.sagemaker_session.delete_endpoint(endpoint_name)
model.sagemaker_session.delete_endpoint_config(endpoint_name)

### B. Delete the model

In [None]:
model.delete_model()