# Deploy the Model

The pipeline run created a Model Package version within the specified Model Package Group. Note that the model was registered with the approval status `PendingManualApproval`.

With SageMaker Pipelines, data scientists can register the model as either approved or pending manual approval as part of the CI/CD workflow.

We can also approve the model using the SageMaker Studio UI or programmatically as shown below.
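
A minimal sketch of the programmatic route, assuming a boto3 SageMaker client such as the `sm` client created in the setup cell of this notebook. The helper name `approve_latest_model_package` and the group name in the usage comment are hypothetical; `list_model_packages` and `update_model_package` are the SageMaker API operations it relies on.

```python
def approve_latest_model_package(sm_client, model_package_group_name):
    """Approve the most recently created model package version in a group.

    Returns the ARN of the approved package, or None if the group is empty.
    """
    # Ask only for the newest version in the group
    response = sm_client.list_model_packages(
        ModelPackageGroupName=model_package_group_name,
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    summaries = response.get("ModelPackageSummaryList", [])
    if not summaries:
        return None

    model_package_arn = summaries[0]["ModelPackageArn"]

    # Move the package from PendingManualApproval to Approved
    sm_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
    )
    return model_package_arn


# Hypothetical usage (the group name is a placeholder, not taken from this notebook):
# approve_latest_model_package(sm, "my-model-package-group")
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before pointing it at a real account.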

In [24]:
import psutil

notebook_memory = psutil.virtual_memory()
print(notebook_memory)

if notebook_memory.total < 32 * 1000 * 1000 * 1000:
    print('*******************************************')    
    print('YOU ARE NOT USING THE CORRECT INSTANCE TYPE')
    print('PLEASE CHANGE INSTANCE TYPE TO  m5.2xlarge ')
    print('*******************************************')
else:
    correct_instance_type=True

svmem(total=33242578944, available=15628574720, percent=53.0, used=17139732480, free=13134417920, active=17031217152, inactive=2117357568, buffers=0, cached=2968428544, shared=2420736, slab=402223104)


In [25]:
import logging
import os

import boto3
import botocore.config
import pandas as pd
import sagemaker
from botocore.exceptions import ClientError

sess = sagemaker.Session()
bucket = sess.default_bucket()
region = boto3.Session().region_name

# Tag API calls made from this notebook with a custom user-agent suffix
config = botocore.config.Config(user_agent_extra='dsoaws/2.0')

sm = boto3.Session().client(service_name="sagemaker",
                            region_name=region,
                            config=config)

# Retrieve model endpoint


In [26]:
%store -r pipeline_endpoint_name

In [27]:
try:
    pipeline_endpoint_name
except NameError:
    print("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++")
    print("[ERROR] Please run previous notebooks before you continue.")
    print("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++")

In [28]:
print(pipeline_endpoint_name)

model-from-registry-ep-1679861015


In [29]:
%store -r dataset_templates_name

In [30]:
try:
    dataset_templates_name
except NameError:
    print("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++")
    print("[ERROR] Please run previous notebooks before you continue.")
    print("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++")

In [31]:
print(dataset_templates_name)

amazon_us_reviews/Wireless_v1_00


In [32]:
%store -r prompt_template_name

In [33]:
try:
    prompt_template_name
except NameError:
    print("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++")
    print("[ERROR] Please run previous notebooks before you continue.")
    print("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++")

In [34]:
print(prompt_template_name)

Given the review body return a categorical rating


In [35]:
from IPython.display import display, HTML

display(
    HTML(
        '<b>Review <a target="_blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/endpoints/{}">SageMaker REST Endpoint</a></b>'.format(
            region, pipeline_endpoint_name
        )
    )
)

# _Wait Until the Endpoint is Deployed_
_Note:  This will take a few minutes.  Please be patient._

In [36]:
%%time

waiter = sm.get_waiter("endpoint_in_service")
waiter.wait(EndpointName=pipeline_endpoint_name)

CPU times: user 25.2 ms, sys: 390 µs, total: 25.6 ms
Wall time: 153 ms
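
The waiter above returns as soon as the endpoint is `InService`. As a rough sketch of what such a wait involves, the loop below polls `describe_endpoint` and surfaces the `FailureReason` if deployment fails. The function name `wait_for_endpoint` and its default delay and attempt count are assumptions chosen for illustration, not values taken from the waiter.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, delay_seconds=30, max_attempts=60):
    """Poll describe_endpoint until the endpoint is InService.

    Raises RuntimeError if the endpoint reports Failed,
    TimeoutError if it is still not ready after max_attempts polls.
    """
    for attempt in range(max_attempts):
        description = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]

        if status == "InService":
            return description
        if status == "Failed":
            reason = description.get("FailureReason", "no reason reported")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")

        # Still Creating or Updating: pause before the next poll
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)

    raise TimeoutError(
        f"Endpoint {endpoint_name} not InService after {max_attempts} polls"
    )


# Hypothetical usage with the client and endpoint name from this notebook:
# wait_for_endpoint(sm, pipeline_endpoint_name)
```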


# _Wait Until the Endpoint ^^ Above ^^ is Deployed_

# Load sample test data

In [37]:
import csv

import pandas as pd
from datasets import Dataset

file = './data-tsv/amazon_reviews_us_Digital_Video_Games_v1_00.tsv.gz'

# Read the gzipped TSV; QUOTE_NONE treats quote characters in review text literally
df = pd.read_csv(file, delimiter="\t", quoting=csv.QUOTE_NONE, compression="gzip")

# Drop rows with missing values and renumber the index
df = df.dropna()
df = df.reset_index(drop=True)

print("Shape of dataframe {}".format(df.shape))

# Convert the pandas DataFrame into a Hugging Face Dataset (Arrow-backed)
dataset = Dataset.from_pandas(df)
df.head()

Shape of dataframe (145427, 15)


| | marketplace | customer_id | review_id | product_id | product_parent | product_title | product_category | star_rating | helpful_votes | total_votes | vine | verified_purchase | review_headline | review_body | review_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | US | 21269168 | RSH1OZ87OYK92 | B013PURRZW | 603406193 | Madden NFL 16 - Xbox One Digital Code | Digital_Video_Games | 2 | 2 | 3 | N | N | A slight improvement from last year. | I keep buying madden every year hoping they ge... | 2015-08-31 |
| 1 | US | 133437 | R1WFOQ3N9BO65I | B00F4CEHNK | 341969535 | Xbox Live Gift Card | Digital_Video_Games | 5 | 0 | 0 | N | Y | Five Stars | Awesome | 2015-08-31 |
| 2 | US | 45765011 | R3YOOS71KM5M9 | B00DNHLFQA | 951665344 | Command & Conquer The Ultimate Collection [Ins... | Digital_Video_Games | 5 | 0 | 0 | N | Y | Hail to the great Yuri! | If you are prepping for the end of the world t... | 2015-08-31 |
| 3 | US | 113118 | R3R14UATT3OUFU | B004RMK5QG | 395682204 | Playstation Plus Subscription | Digital_Video_Games | 5 | 0 | 0 | N | Y | Five Stars | Perfect | 2015-08-31 |
| 4 | US | 22151364 | RV2W9SGDNQA2C | B00G9BNLQE | 640460561 | Saints Row IV - Enter The Dominatrix [Online G... | Digital_Video_Games | 5 | 0 | 0 | N | Y | Five Stars | Awesome! | 2015-08-31 |


In [38]:
# Apply prompt    
from promptsource.templates import DatasetTemplates
prompt_templates = DatasetTemplates(dataset_templates_name) 

print('*** Available prompts:')

for template in prompt_templates.templates.values():
    print(template.get_name())

*** Available prompts:
Generate review headline based on review body
Generate review based on rating and category
Given the review headline return a categorical rating
Generate review headline based on rating
Given the review body return a categorical rating


In [39]:
from pprint import pprint

prompt = prompt_templates[prompt_template_name]
print('** Selected prompt name: {}'.format(prompt_template_name))

** Selected prompt name: Given the review body return a categorical rating


In [40]:
print('** Available prompt answers: {}'.format(prompt.answer_choices))

** Available prompt answers: 1 ||| 2 ||| 3 ||| 4 ||| 5


In [41]:
print('** Selected prompt template:')
pprint(prompt.__dict__)

** Selected prompt template:
{'answer_choices': '1 ||| 2 ||| 3 ||| 4 ||| 5',
 'id': 'e6a1bbde-715d-4dad-9178-e2bcfaf5c646',
 'jinja': 'Given the following review:\n'
          '{{review_body}}\n'
          'predict the associated rating from the following choices (1 being '
          'lowest and 5 being highest)\n'
          "- {{ answer_choices | join('\\n- ') }} \n"
          '|||\n'
          '{{answer_choices[star_rating-1]}}',
 'metadata': <promptsource.templates.Template.Metadata object at 0x7f030a454e90>,
 'name': 'Given the review body return a categorical rating',
 'reference': 'Given the review body, return a categorical rating. '}
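
To make the template above concrete, here is a simplified re-implementation in plain Python of how it turns one row into an input/target pair. This is an illustrative sketch, not the promptsource rendering code: the function name `render_rating_prompt` is hypothetical, and it assumes the review text itself does not contain the `|||` separator.

```python
ANSWER_CHOICES = "1 ||| 2 ||| 3 ||| 4 ||| 5"


def render_rating_prompt(review_body, star_rating):
    """Mimic the 'Given the review body return a categorical rating' template.

    Returns (prompt_text, label), both stripped of surrounding whitespace.
    """
    # The answer choices are stored as one string separated by '|||'
    choices = [choice.strip() for choice in ANSWER_CHOICES.split("|||")]
    choice_lines = "\n- ".join(choices)

    # Fill in the template; '|||' separates the model input from the target
    rendered = (
        f"Given the following review:\n{review_body}\n"
        "predict the associated rating from the following choices "
        "(1 being lowest and 5 being highest)\n"
        f"- {choice_lines} \n|||\n{choices[star_rating - 1]}"
    )

    prompt_text, label = (part.strip() for part in rendered.split("|||"))
    return prompt_text, label


prompt_text, label = render_rating_prompt("Perfect", 5)
print(prompt_text)
print("label:", label)
```

The target is looked up with `star_rating - 1` because the ratings run from 1 to 5 while the list of choices is zero-indexed.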


# Prepare zero-shot, one-shot, and few-shot prompts for inference

In [42]:
def apply_prompt(row):
    # prompt.apply() returns [input_text, target_text]; call it once per row
    applied = prompt.apply(row)
    return {'prompt': applied[0], 'label': applied[1]}

dataset = dataset.select(range(10)).map(apply_prompt)

prompt0, prompt1, prompt2, prompt3 = (dataset[i] for i in range(4))

zero_shot_prompt = 'PROMPT: ' + prompt2['prompt'] + '\nRESPONSE:'
one_shot_prompt = (
    'PROMPT: ' + prompt0['prompt'] + '\nRESPONSE: ' + prompt0['label']
    + '\n\n' + zero_shot_prompt
)
few_shot_prompt = (
    'PROMPT: ' + prompt0['prompt'] + '\nRESPONSE: ' + prompt0['label']
    + '\n\nPROMPT: ' + prompt1['prompt'] + '\nRESPONSE: ' + prompt1['label']
    + '\n\nPROMPT: ' + prompt3['prompt'] + '\nRESPONSE: ' + prompt3['label']
    + '\n\n' + zero_shot_prompt
)


In [21]:
prompt2['prompt']

'Given the following review:\nIf you are prepping for the end of the world this is one of those things that you should have installed on your-end-of-the-world-proof PC.  Hail to the great Yuri!\npredict the associated rating from the following choices (1 being lowest and 5 being highest)\n- 1\n- 2\n- 3\n- 4\n- 5'

In [48]:
prompt3['prompt']

'Given the following review:\nPerfect\npredict the associated rating from the following choices (1 being lowest and 5 being highest)\n- 1\n- 2\n- 3\n- 4\n- 5'
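
The zero-shot, one-shot, and few-shot prompts share one pattern: zero or more solved `PROMPT:`/`RESPONSE:` pairs followed by the query with an empty `RESPONSE:`. A small helper can express that pattern for any number of shots. The name `build_prompt` is hypothetical; it assumes each example is a dict with `prompt` and `label` keys, as in the dataset rows above.

```python
def build_prompt(query, examples=()):
    """Assemble a PROMPT/RESPONSE prompt.

    query:    dict with a 'prompt' key; its RESPONSE is left empty for the model.
    examples: iterable of dicts with 'prompt' and 'label' keys (the solved shots).
    """
    blocks = [
        "PROMPT: " + example["prompt"] + "\nRESPONSE: " + example["label"]
        for example in examples
    ]
    # The query always comes last, with nothing after 'RESPONSE:'
    blocks.append("PROMPT: " + query["prompt"] + "\nRESPONSE:")
    return "\n\n".join(blocks)


# Tiny stand-in rows so the sketch runs on its own
shot = {"prompt": "Rate: Awesome", "label": "5"}
query = {"prompt": "Rate: Perfect", "label": "5"}
print(build_prompt(query, [shot]))
```

With the notebook's variables, `build_prompt(prompt2)` corresponds to the zero-shot prompt and `build_prompt(prompt2, [prompt0, prompt1, prompt3])` to the few-shot prompt.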

# Zero-shot

In [None]:
import json

from sagemaker import Predictor

predictor = Predictor(
    endpoint_name=pipeline_endpoint_name,
    sagemaker_session=sess,
)

In [None]:
response = predictor.predict(
    zero_shot_prompt,
    initial_args={
        "ContentType": "application/x-text",
        "Accept": "application/json",
    },
)

response_json = json.loads(response.decode('utf-8'))
print(response_json['generated_text'])

print('** EXPECTED RESPONSE **: {}'.format(prompt2['label']))

# Make many predictions and find the range of labels returned from this probabilistic (non-deterministic) generative model

## _THIS MAY TAKE A FEW MINUTES.  PLEASE BE PATIENT._

In [None]:
# Collect the distinct labels the model returns for the same prompt
set_of_responses_for_prompt = {}
set_of_responses_for_prompt[zero_shot_prompt] = set()

for _ in range(100):
    response = predictor.predict(
        zero_shot_prompt,
        initial_args={
            "ContentType": "application/x-text",
            "Accept": "application/json",
        },
    )

    response_json = json.loads(response.decode('utf-8'))
    response_label = response_json['generated_text']

    set_of_responses_for_prompt[zero_shot_prompt].add(response_label)

print('Distinct responses from the model for prompt: {}'.format(zero_shot_prompt))
print(set_of_responses_for_prompt[zero_shot_prompt])
print('\n')
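
A set records which labels appeared but not how often. As a possible variation, the sketch below counts each label with `collections.Counter`. The names `label_distribution` and `predict_label` are hypothetical; `predict_label` stands for any callable that sends one prompt to the endpoint and returns the generated label as a string.

```python
from collections import Counter


def label_distribution(predict_label, prompt_text, num_trials=100):
    """Call predict_label(prompt_text) num_trials times and count each label."""
    counts = Counter()
    for _ in range(num_trials):
        counts[predict_label(prompt_text)] += 1
    return counts


# Hypothetical usage with the predictor defined above:
# def predict_label(prompt_text):
#     response = predictor.predict(
#         prompt_text,
#         initial_args={"ContentType": "application/x-text", "Accept": "application/json"},
#     )
#     return json.loads(response.decode("utf-8"))["generated_text"]
#
# print(label_distribution(predict_label, zero_shot_prompt).most_common())
```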

In [None]:
%store set_of_responses_for_prompt

# Advanced inference parameters

* **max_length:** Model generates text until the output length (which includes the input context length) reaches `max_length`. If specified, it must be a positive integer.
* **num_return_sequences:** Number of output sequences returned. If specified, it must be a positive integer.
* **num_beams:** Number of beams used in beam search. If specified, it must be an integer greater than or equal to `num_return_sequences`.
* **no_repeat_ngram_size:** Model ensures that no sequence of `no_repeat_ngram_size` words is repeated in the output sequence. If specified, it must be a positive integer greater than 1.
* **temperature:** Controls the randomness of the output. A higher temperature makes low-probability words more likely to appear in the output sequence, and a lower temperature favors high-probability words. As `temperature` approaches 0, decoding becomes greedy. If specified, it must be a positive float.
* **early_stopping:** If True, text generation finishes when all beam hypotheses reach the end-of-sentence token. If specified, it must be a boolean.
* **do_sample:** If True, sample the next word according to its likelihood. If specified, it must be a boolean.
* **top_k:** In each step of text generation, sample from only the `top_k` most likely words. If specified, it must be a positive integer.
* **top_p:** In each step of text generation, sample from the smallest possible set of words with cumulative probability `top_p`. If specified, it must be a float between 0 and 1.
* **seed:** Fix the randomized state for reproducibility. If specified, it must be an integer.
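
To give a feel for `temperature`, `top_k`, and `top_p`, here is a toy illustration in plain Python of what each does to a next-token distribution. It is a sketch of the general technique, not the endpoint's implementation. The function names are hypothetical and the logits and probabilities are made up for the example.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn {token: logit} into probabilities; lower temperature sharpens them."""
    scaled = {token: logit / temperature for token, logit in logits.items()}
    highest = max(scaled.values())  # subtract the max for numerical stability
    exps = {token: math.exp(value - highest) for token, value in scaled.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


def _renormalize(pairs):
    total = sum(p for _, p in pairs)
    return [(token, p / total) for token, p in pairs]


def filter_top_k_top_p(probs, top_k=None, top_p=None):
    """Keep the top_k most likely tokens, then the smallest prefix whose
    cumulative probability reaches top_p, renormalizing after each step."""
    ranked = _renormalize(sorted(probs.items(), key=lambda item: item[1], reverse=True))
    if top_k is not None:
        ranked = _renormalize(ranked[:top_k])
    if top_p is not None:
        kept, cumulative = [], 0.0
        for token, p in ranked:
            kept.append((token, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = _renormalize(kept)
    return dict(ranked)


# Made-up distribution over rating tokens
probs = {"5": 0.5, "4": 0.3, "3": 0.15, "2": 0.05}
print(filter_top_k_top_p(probs, top_k=3))
print(filter_top_k_top_p(probs, top_p=0.75))
print(softmax_with_temperature({"5": 2.0, "4": 1.0, "3": 0.0}, temperature=0.5))
```

With `do_sample` enabled, the next token would be drawn at random from whatever distribution remains after these filters, which is why repeated calls can return different labels.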

We may specify any subset of the parameters listed above when invoking an endpoint. Next, we show an example of how to invoke the endpoint with these arguments.

***

In [None]:
import json

payload = {
    "text_inputs": zero_shot_prompt,
    "num_return_sequences": 1,
    "top_k": 50,
    "top_p": 0.9,
    "do_sample": True,
}


def query_endpoint_with_json_payload(predictor, payload):
    """Query the model predictor with json payload."""

    encoded_payload = json.dumps(payload).encode("utf-8")

    query_response = predictor.predict(
        encoded_payload,
        {
            "ContentType": "application/json",
            "Accept": "application/json",
        },
    )
    return query_response


def parse_response_multiple_texts(query_response):
    """Parse response and return the generated texts."""

    model_predictions = json.loads(query_response)
    generated_texts = model_predictions["generated_texts"]
    return generated_texts


query_response = query_endpoint_with_json_payload(predictor, payload)
generated_texts = parse_response_multiple_texts(query_response)

newline, bold, unbold = "\n", "\033[1m", "\033[0m"
print(f"Input text: {zero_shot_prompt}{newline}" f"Generated text: {bold}{generated_texts}{unbold}{newline}")

# zero_shot_prompt is built from prompt2, so compare against prompt2's label
print('** EXPECTED RESPONSE **: {}'.format(prompt2['label']))
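
Before sending a payload, the constraints from the parameter list in this section can be checked on the client side. The sketch below is one way to do that. The function name `validate_generation_payload` is hypothetical, and treating the bounds of `top_p` as inclusive is an assumption, since the list only says "between 0 and 1".

```python
def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, int) and not isinstance(value, bool)


def _is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_generation_payload(payload):
    """Return a list of problems found in the inference parameters (empty if none)."""
    problems = []

    for name in ("max_length", "num_return_sequences", "top_k"):
        if name in payload and not (_is_int(payload[name]) and payload[name] > 0):
            problems.append(f"{name} must be a positive integer")

    if "no_repeat_ngram_size" in payload:
        value = payload["no_repeat_ngram_size"]
        if not (_is_int(value) and value > 1):
            problems.append("no_repeat_ngram_size must be an integer greater than 1")

    if "num_beams" in payload:
        beams = payload["num_beams"]
        sequences = payload.get("num_return_sequences")
        if not _is_int(beams):
            problems.append("num_beams must be an integer")
        elif _is_int(sequences) and beams < sequences:
            problems.append("num_beams must be >= num_return_sequences")

    if "temperature" in payload:
        value = payload["temperature"]
        if not (_is_number(value) and value > 0):
            problems.append("temperature must be a positive float")

    for name in ("early_stopping", "do_sample"):
        if name in payload and not isinstance(payload[name], bool):
            problems.append(f"{name} must be a boolean")

    if "top_p" in payload:
        value = payload["top_p"]
        # Assumption: both bounds are treated as inclusive
        if not (_is_number(value) and 0 <= value <= 1):
            problems.append("top_p must be a float between 0 and 1")

    if "seed" in payload and not _is_int(payload["seed"]):
        problems.append("seed must be an integer")

    return problems


example = {"text_inputs": "...", "num_return_sequences": 1, "top_k": 50, "top_p": 0.9, "do_sample": True}
print(validate_generation_payload(example))
```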

# Release Resources

In [None]:
# sm.delete_endpoint(
#      EndpointName=pipeline_endpoint_name
# )
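
Deleting the endpoint alone leaves its endpoint configuration behind. A fuller cleanup might look like the sketch below, which uses the SageMaker API operations `describe_endpoint`, `describe_endpoint_config`, `delete_endpoint`, `delete_endpoint_config`, and `delete_model`. The function name `delete_endpoint_resources` is hypothetical. Model deletion is off by default on the assumption that a model deployed from the registry may still be needed.

```python
def delete_endpoint_resources(sm_client, endpoint_name, delete_models=False):
    """Delete an endpoint and its endpoint config; optionally delete its models.

    Returns a dict naming what was deleted.
    """
    # Look up the config and model names before anything is deleted
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [
        variant["ModelName"]
        for variant in config.get("ProductionVariants", [])
        if "ModelName" in variant
    ]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)

    deleted_models = []
    if delete_models:
        for model_name in model_names:
            sm_client.delete_model(ModelName=model_name)
            deleted_models.append(model_name)

    return {
        "endpoint": endpoint_name,
        "endpoint_config": config_name,
        "models": deleted_models,
    }


# Hypothetical usage with the client and endpoint name from this notebook:
# delete_endpoint_resources(sm, pipeline_endpoint_name)
```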

In [None]:
# %%html

# <p><b>Shutting down your kernel for this notebook to release resources.</b></p>
# <button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>

# <script>
# try {
#     els = document.getElementsByClassName("sm-command-button");
#     els[0].click();
# }
# catch(err) {
#     // NoOp
# }
# </script>