# llm-serving: AI21 Contextual Answers
* Container: Data Science 3.0 (studio, python 3.10)
* **Paid model**

## 0. Install packages

In [24]:
install_needed = True  # should only be True once

In [25]:
import sys
import IPython

if install_needed:
    print("installing deps and restarting kernel")
    !{sys.executable} -m pip install -U pip
    !{sys.executable} -m pip install -U sagemaker
    !{sys.executable} -m pip install -U "ai21[SM]"
    
    IPython.Application.instance().kernel.do_shutdown(True)

installing deps and restarting kernel
Collecting ai21[SM]
  Using cached ai21-1.1.4-py3-none-any.whl
Installing collected packages: ai21
Successfully installed ai21-1.1.4

## 1. Model map
Confirm that you obtained this notebook from the model catalog in the Amazon SageMaker console (AWS Management Console).

In [1]:
model_package_map = {
    "us-east-1": "arn:aws:sagemaker:us-east-1:865070037744:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "us-east-2": "arn:aws:sagemaker:us-east-2:057799348421:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "us-west-1": "arn:aws:sagemaker:us-west-1:382657785993:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "us-west-2": "arn:aws:sagemaker:us-west-2:594846645681:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "ca-central-1": "arn:aws:sagemaker:ca-central-1:470592106596:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "eu-central-1": "arn:aws:sagemaker:eu-central-1:446921602837:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "eu-west-1": "arn:aws:sagemaker:eu-west-1:985815980388:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "eu-west-2": "arn:aws:sagemaker:eu-west-2:856760150666:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "eu-west-3": "arn:aws:sagemaker:eu-west-3:843114510376:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "eu-north-1": "arn:aws:sagemaker:eu-north-1:136758871317:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "ap-southeast-1": "arn:aws:sagemaker:ap-southeast-1:192199979996:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "ap-southeast-2": "arn:aws:sagemaker:ap-southeast-2:666831318237:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "ap-northeast-2": "arn:aws:sagemaker:ap-northeast-2:745090734665:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "ap-northeast-1": "arn:aws:sagemaker:ap-northeast-1:977537786026:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "ap-south-1": "arn:aws:sagemaker:ap-south-1:077584701553:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "sa-east-1": "arn:aws:sagemaker:sa-east-1:270155090741:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae"
}
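
Each key in the map is expected to match the region embedded in its ARN. The following is a minimal sketch of a consistency check, assuming the standard `arn:partition:service:region:account:resource` layout. The helper names `arn_region` and `mismatched_regions` are hypothetical and not part of any SDK, and `sample_map` holds two entries copied from the map above as sample data.

```python
def arn_region(arn: str) -> str:
    """Return the region field of an ARN (arn:partition:service:region:account:resource)."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"Not a valid ARN: {arn!r}")
    return parts[3]


def mismatched_regions(package_map: dict) -> list:
    """Return the keys whose ARN points at a different region than the key itself."""
    return [key for key, arn in package_map.items() if arn_region(arn) != key]


# Two entries copied from the map above, used here only as sample data.
sample_map = {
    "us-east-1": "arn:aws:sagemaker:us-east-1:865070037744:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
    "ap-northeast-2": "arn:aws:sagemaker:ap-northeast-2:745090734665:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae",
}
print(mismatched_regions(sample_map))  # an empty list means every key matches its ARN
```

In the notebook, `mismatched_regions(model_package_map)` could be run once after editing the map by hand.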

## 2. Check settings

In [2]:
import json
import ai21
import boto3
import sagemaker
from sagemaker import ModelPackage
from sagemaker import get_execution_role

#### Check the version of boto3 - must be v1.25.4 or higher
If you see a lower version number, switch to a kernel that runs Python 3.8 or above.

In [3]:
region = boto3.Session().region_name
# Fail early with a clear error if the model package is not offered in this region
if region not in model_package_map:
    raise ValueError(f"Unsupported region: {region}")
model_package_arn = model_package_map[region]
role = get_execution_role()
sagemaker_session = sagemaker.Session()
runtime_sm_client = boto3.client("runtime.sagemaker")

In [4]:
print(f'boto3 ver.: {boto3.__version__}')
print(f'model_package_arn: {model_package_arn}')
print(f'role: {role}')
print(f'region: {region}')
print(f'sagemaker_session: {sagemaker_session}')

boto3 ver.: 1.26.155
model_package_arn: arn:aws:sagemaker:us-east-1:865070037744:model-package/contextual-answers-1-0-001-a85d7d493b3a39e3a8a8ec734f2befae
role: arn:aws:iam::843077212694:role/PepStack-CrossAccountRole55335AA5-1AMOSWE2L94A
region: us-east-1
sagemaker_session: <sagemaker.session.Session object at 0x7fd151b0b550>
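
The boto3 version requirement (v1.25.4 or higher) can also be checked in code rather than by eye. This is a rough sketch that assumes the version string consists of purely numeric dotted components. `version_tuple` and `meets_minimum` are hypothetical helper names.

```python
def version_tuple(version: str) -> tuple:
    """Turn a dotted version such as '1.26.155' into (1, 26, 155)."""
    return tuple(int(part) for part in version.split(".")[:3])


def meets_minimum(installed: str, minimum: str = "1.25.4") -> bool:
    """Compare versions numerically, so that '1.26.155' counts as newer than '1.25.4'."""
    return version_tuple(installed) >= version_tuple(minimum)


print(meets_minimum("1.26.155"))  # the version printed in the output above
```

In the notebook, the call would be `meets_minimum(boto3.__version__)`. Comparing the raw strings instead would be unreliable, because string comparison works character by character rather than by numeric value.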


## 3. Create an endpoint and perform real-time inference
If you want to understand how real-time inference with Amazon SageMaker works, see [Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html).

In [6]:
from datetime import datetime

In [7]:
time_stamp = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
endpoint_name = f"contextual-answers-{time_stamp}"
content_type = "application/json"
real_time_inference_instance_type = "ml.g5.12xlarge"

In [8]:
print(f'endpoint_name: {endpoint_name}')
print(f'real_time_inference_instance_type: {real_time_inference_instance_type}')

endpoint_name: contextual-answers-2023-06-19-01-57-17
real_time_inference_instance_type: ml.g5.12xlarge


### 3.1. Create an endpoint
Once the endpoint has been created, you can perform real-time inference.

In [9]:
# create a deployable model from the model package.
model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=sagemaker_session
)

# Deploy the model
predictor = model.deploy(
    1,
    real_time_inference_instance_type,
    endpoint_name=endpoint_name, 
    model_data_download_timeout=3600,
    container_startup_health_check_timeout=600,
)

---------------!

### 3.2. Interact with the model (**examples**)

**AI21 Studio Contextual Answers model** allows you to access our high-quality question answering technology. It was designed to answer questions based on a specific document context provided by the customer. This avoids any factual issues that language models may have and makes sure the answers it provides are grounded in that context document.

This model receives document text, serving as a context, and a question and returns an answer based entirely on this context. This means that if the answer to your question is not in the document, the model will indicate it (instead of providing a false answer).

To get a sense of the model's behavior, let's use a toy example: asking how tall the Eiffel Tower is. Most language models will simply answer according to their training data.

This model, however, bases its answer solely on the context you provide. Let's use the following [Wikipedia paragraph](https://en.wikipedia.org/wiki/Eiffel_Tower#:~:text=The%20Eiffel%20Tower%20(%2F%CB%88a%C9%AA,from%20the%20Champ%20de%20Mars) as context, with small modifications:

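### Example for **Korean**

In this example, Korean input is translated to English with Amazon Translate before it is sent to the model, and the answer is translated back to Korean.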

In [5]:
translate = boto3.client("translate")

In [6]:
def trans(text, target="en"):
    """Translate text into the target language with Amazon Translate."""
    response = translate.translate_text(
        Text=text,
        SourceLanguageCode="auto",  # let Amazon Translate detect the source language
        TargetLanguageCode=target
    )
    return response["TranslatedText"]

In [26]:
# Actual paragraph
context = "The tower is 330 metres (1,083 ft) tall,[6] about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest human-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure in the world to surpass both the 200-metre and 300-metre mark in height. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."
context = trans(context, target="ko")

# The paragraph with manual changes of the height
false_context = "The tower is 3 metres (10 ft) tall,[6] about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest human-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure in the world to surpass both the 200-metre and 300-metre mark in height. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."
false_context = trans(false_context, target="ko")

# The paragraph with the height omitted
partial_context = "Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest human-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure in the world to surpass both the 200-metre and 300-metre mark in height. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."
partial_context = trans(partial_context, target="ko")

Here is what the model will say when asked the same question in each context.

In [27]:
question = "에펠탑의 높이는 얼마입니까?"  # "How tall is the Eiffel Tower?"

True context

In [12]:
endpoint_name = "contextual-answers-2023-06-19-01-57-17"
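
When reusing an endpoint by name like this, it can help to confirm that it is actually in service before sending requests. Below is a minimal polling sketch. It assumes `sm_client` is a `boto3.client("sagemaker")` object, which this notebook does not create. The function name and the `poll_seconds` and `max_polls` defaults are hypothetical.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, max_polls=40):
    """Poll DescribeEndpoint until the endpoint reports the InService status.

    Assumption: sm_client is boto3.client("sagemaker").
    """
    for _ in range(max_polls):
        status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} is in status Failed")
        # Any other status (for example Creating or Updating): wait and ask again.
        time.sleep(poll_seconds)
    raise TimeoutError(f"Endpoint {endpoint_name} was not InService after {max_polls} polls")
```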

In [28]:
# True context
response = ai21.Answer.execute(
    context=trans(context, target="en"),
    question=trans(question, target="en"),
    sm_endpoint=endpoint_name
)
response = trans(response.answer, target="ko")
print(response)

에펠탑의 높이는 1,083피트입니다.


False context

In [14]:
response = ai21.Answer.execute(
    context=trans(false_context, target="en"),
    question=trans(question, target="en"),
    sm_endpoint=endpoint_name
)
response = trans(response.answer, target="ko")
print(response)

에펠탑의 높이는 3미터 (10피트) 입니다.


Partial context (height omitted)

In [15]:
response = ai21.Answer.execute(
    context=trans(partial_context, target="en"),
    question=trans(question, target="en"),
    sm_endpoint=endpoint_name
)
response = trans(response.answer, target="ko")
print(response)

송신기를 제외한 에펠탑은 현재 크라이슬러 빌딩보다 5.2미터 (17피트) 더 높아 프랑스에서 두 번째로 높은 독립형 건축물이 되었습니다.
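
The three cells above repeat the same pattern: translate the inputs to English, ask the model, then translate the answer back to Korean. As a sketch, that round trip could be factored into one helper. `answer_in_language` and its parameters are hypothetical names, and the two callables are supplied by the caller, so the helper itself makes no AWS calls.

```python
def answer_in_language(context, question, answer_fn, translate_fn,
                       user_lang="ko", model_lang="en"):
    """Translate the inputs to the model language, ask, and translate the answer back.

    answer_fn(context, question) -> str and translate_fn(text, target) -> str
    are hypothetical callables provided by the caller.
    """
    answer = answer_fn(
        translate_fn(context, model_lang),
        translate_fn(question, model_lang),
    )
    return translate_fn(answer, user_lang)


# Possible use in this notebook (not executed here because it needs the live endpoint):
# answer_in_language(
#     context, question,
#     answer_fn=lambda c, q: ai21.Answer.execute(
#         context=c, question=q, sm_endpoint=endpoint_name).answer,
#     translate_fn=trans,
# )
```

Passing the translation and answering steps in as functions also makes it possible to try the flow with stand-ins before paying for an endpoint.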


In [17]:
context = """||E-Mart Yeonsu Branch Phone Number: 032-820-1234||\n||E-Mart Manchon Branch Phone Number: 053-602-1234||\n||E-Mart Iksan Branch Phone Number: 063-860-1234||\n||E-Mart Zayang phone number: 02-6742-1234||\n||E-Mart Bongseon Branch Phone Number: 062-611-1234||\n||E-Mart Yeoju phone number: 031-887-6123||\n||E-Mart Dongtan Branch Phone Number: 031-647-1234||\n||E-Mart Wangsimni phone number: 02-6438-1234||\n||E-Mart Mia Branch Phone Number: 02-944-1234||\n||E-Mart Hanam Branch Phone Number: 031-524-1234||\n||E-Mart private store phone number: 051-329-1234||\n||E-Mart Ulsan Branch Phone Number: 052-220-1234||\n||E-Mart Yangjae Branch phone number: 02-6040-1234||\n||E-Mart Suncheon Branch Phone Number: 061-749-1234||\n||E-Mart Seosuwon phone number: 031-895-1234||\n||E-Mart Yangsan Branch Phone Number: 055-371-8123||\n||E-Mart Laurel Branch phone number: 02-2092-1234||\n||E-Mart Chuncheon Branch Phone Number: 033-240-1234||\n||E-Mart Yeongdeungpo phone number: 02-3 468-1234||"""

In [21]:
question = "I want to know the phone number of E-Mart Ulsan"

In [22]:
response = ai21.Answer.execute(
    context=context,
    question=question,
    sm_endpoint=endpoint_name
)
response = trans(response.answer, target="ko")
print(response)

이마트 울산점 전화번호는 052-220-1234입니다
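
The calls above go through the `ai21` SDK. The notebook also creates `runtime_sm_client` and `content_type` without using them, so here is a hedged sketch of calling the same endpoint through the SageMaker runtime API directly. The request and response shapes are assumptions based on how the SDK is used above: a JSON body with `context` and `question`, and a JSON response with an `answer` field. They are not confirmed by this notebook. `ask_endpoint` is a hypothetical helper name.

```python
import json


def ask_endpoint(runtime_client, endpoint_name, context, question):
    """Send one question to the endpoint through the SageMaker runtime API.

    Assumption: the container accepts {"context", "question"} as JSON and
    returns a JSON object that contains an "answer" field.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps({"context": context, "question": question}),
    )
    payload = json.loads(response["Body"].read())
    return payload.get("answer")
```

In this notebook the call would look like `ask_endpoint(runtime_sm_client, endpoint_name, context, question)`.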


## 4. Clean-up

### 4.1. Delete the endpoint
Now that you have successfully performed real-time inference, you no longer need the endpoint. Delete it to avoid being charged.

In [88]:
model.sagemaker_session.delete_endpoint(endpoint_name)
model.sagemaker_session.delete_endpoint_config(endpoint_name)

ClientError: An error occurred (ValidationException) when calling the DeleteEndpoint operation: Could not find endpoint "arn:aws:sagemaker:us-east-1:843077212694:endpoint/contextual-answers-2023-06-16-09-57-18".
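
The `ClientError` above is raised because no endpoint with that name exists. One way to avoid guessing is to list the endpoints that do exist before deleting. This is a minimal sketch, assuming `sm_client` is a `boto3.client("sagemaker")` object, which this notebook does not create. `find_endpoints` is a hypothetical helper name.

```python
def find_endpoints(sm_client, name_contains):
    """Return the names of all endpoints whose name contains the given text.

    Assumption: sm_client is boto3.client("sagemaker").
    """
    names = []
    kwargs = {"NameContains": name_contains}
    while True:
        page = sm_client.list_endpoints(**kwargs)
        names.extend(item["EndpointName"] for item in page.get("Endpoints", []))
        token = page.get("NextToken")
        if not token:
            return names
        # More results are available: request the next page.
        kwargs["NextToken"] = token
```

For example, `find_endpoints(sm_client, "contextual-answers-")` would return the endpoints created by this notebook, and each returned name could then be passed to `delete_endpoint`.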

### 4.2. Delete the model

In [89]:
model.delete_model()