# Deploy the Model

The pipeline run created a Model Package version within the specified Model Package Group. Note that the model was registered with the approval status `PendingManualApproval`.

With SageMaker Pipelines, data scientists can register a model as either approved or pending manual approval as part of the CI/CD workflow.

We can also approve the model using the SageMaker Studio UI or programmatically.
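Programmatic approval could look roughly like the sketch below. It assumes a boto3 SageMaker client (such as the `sm` client created later in this notebook) and relies on the `list_model_packages` and `update_model_package` calls. The helper name `approve_latest_model_package` and the group name in the usage note are hypothetical and are not part of the original notebook.

```python
def approve_latest_model_package(sm_client, model_package_group_name):
    """Approve the newest Model Package version in a group and return its ARN.

    `sm_client` is assumed to be a boto3 SageMaker client.
    Returns None when the group has no registered versions.
    """
    # Ask for the most recently created version only.
    response = sm_client.list_model_packages(
        ModelPackageGroupName=model_package_group_name,
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    summaries = response["ModelPackageSummaryList"]
    if not summaries:
        return None

    model_package_arn = summaries[0]["ModelPackageArn"]

    # Flip the status from PendingManualApproval to Approved.
    sm_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
    )
    return model_package_arn
```

A call such as `approve_latest_model_package(sm, "my-model-package-group")` would then approve the latest version, where the group name is a placeholder for the one used by your pipeline.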

In [2]:
import psutil

notebook_memory = psutil.virtual_memory()

# psutil reports bytes, so compare against 32 GB (32 * 10^9 bytes).
if notebook_memory.total < 32 * 1000 * 1000 * 1000:
    print('*******************************************')
    print('YOU ARE NOT USING THE CORRECT INSTANCE TYPE')
    print('PLEASE CHANGE INSTANCE TYPE TO  m5.2xlarge ')
    print('*******************************************')
else:
    correct_instance_type=True
    print(notebook_memory)

svmem(total=32890294272, available=16511807488, percent=49.8, used=15906140160, free=11552395264, active=16947970048, inactive=3259547648, buffers=0, cached=5431758848, shared=1474560, slab=467062784)


In [3]:
from botocore.exceptions import ClientError

import os
import logging
import boto3
import sagemaker
import pandas as pd

sess = sagemaker.Session()
bucket = sess.default_bucket()
region = boto3.Session().region_name

import botocore.config

config = botocore.config.Config(
    user_agent_extra='dsoaws/2.0'
)

sm = boto3.Session().client(service_name="sagemaker", 
                            region_name=region,
                            config=config)

# List Pipeline Execution Steps


In [4]:
%store -r pipeline_endpoint_name

In [5]:
print(pipeline_endpoint_name)

gpt3-model-from-registry-ep-1677609355


In [6]:
from IPython.display import display, HTML

display(
    HTML(
        '<b>Review <a target="_blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/endpoints/{}">SageMaker REST Endpoint</a></b>'.format(
            region, pipeline_endpoint_name
        )
    )
)

# _Wait Until the Endpoint is Deployed_
_Note:  This will take a few minutes.  Please be patient._

In [7]:
%%time

waiter = sm.get_waiter("endpoint_in_service")
waiter.wait(EndpointName=pipeline_endpoint_name)

CPU times: user 19.2 ms, sys: 3.7 ms, total: 22.9 ms
Wall time: 158 ms
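The waiter above raises only after its built-in retry budget is exhausted. As an alternative, a hand-rolled polling loop makes the failure handling explicit. The sketch below assumes a boto3 SageMaker client and uses its `describe_endpoint` call. The function name `wait_for_endpoint` and its default delay and attempt count are illustrative choices, not values from the original notebook.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, delay_seconds=30,
                      max_attempts=60, sleep=time.sleep):
    """Poll an endpoint until it is InService.

    Raises RuntimeError if the endpoint reports Failed, and TimeoutError
    if it is still not InService after `max_attempts` polls.
    """
    for _ in range(max_attempts):
        description = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            # FailureReason is only present when the deployment failed.
            reason = description.get("FailureReason", "unknown reason")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        sleep(delay_seconds)
    raise TimeoutError(f"Endpoint {endpoint_name} not InService after {max_attempts} polls")
```

If the stock waiter is sufficient, its polling interval can instead be tuned by passing `WaiterConfig={"Delay": 30, "MaxAttempts": 60}` to `waiter.wait`.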


# _Wait Until the Endpoint ^^ Above ^^ is Deployed_

# Generate a sample review

In [8]:
import json

from sagemaker import Predictor

predictor = Predictor(
    endpoint_name=pipeline_endpoint_name,
    sagemaker_session=sess,
)

### Advanced text generation features

***
This model also supports many advanced parameters while performing inference. They include:

* **max_length:** Model generates text until the output length (which includes the input context length) reaches `max_length`. If specified, it must be a positive integer.
* **num_return_sequences:** Number of output sequences returned. If specified, it must be a positive integer.
* **num_beams:** Number of beams used in beam search. If specified, it must be an integer greater than or equal to `num_return_sequences`.
* **no_repeat_ngram_size:** Model ensures that a sequence of words of `no_repeat_ngram_size` is not repeated in the output sequence. If specified, it must be a positive integer greater than 1.
* **temperature:** Controls the randomness in the output. Higher temperature results in output sequence with low-probability words and lower temperature results in output sequence with high-probability words. If `temperature` -> 0, it results in greedy decoding. If specified, it must be a positive float.
* **early_stopping:** If True, text generation is finished when all beam hypotheses reach the end-of-sentence token. If specified, it must be a boolean.
* **do_sample:** If True, sample the next word according to its likelihood. If specified, it must be a boolean.
* **top_k:** In each step of text generation, sample from only the `top_k` most likely words. If specified, it must be a positive integer.
* **top_p:** In each step of text generation, sample from the smallest possible set of words with cumulative probability `top_p`. If specified, it must be a float between 0 and 1.
* **seed:** Fix the randomized state for reproducibility. If specified, it must be an integer.
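To make `top_k` and `top_p` concrete, the following is a simplified, pure-Python sketch of how the two filters narrow a next-word distribution before sampling. It is an illustration only: the function name `filter_top_k_top_p` and the toy probabilities are invented here, and the deployed model's actual implementation may differ in detail.

```python
def filter_top_k_top_p(probs, top_k=None, top_p=None):
    """Return the renormalized distribution left after top-k, then top-p filtering.

    `probs` maps candidate words to probabilities that sum to 1.
    """
    # Rank candidate words from most to least likely.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    # top_k: keep only the k most likely words.
    if top_k is not None:
        ranked = ranked[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for word, p in ranked:
            kept.append((word, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept

    # Renormalize so the surviving probabilities sum to 1 again.
    total = sum(p for _, p in ranked)
    return {word: p / total for word, p in ranked}


next_word_probs = {"great": 0.5, "good": 0.3, "slow": 0.15, "buggy": 0.05}
filtered = filter_top_k_top_p(next_word_probs, top_k=3, top_p=0.7)
print(sorted(filtered))
```

With these toy numbers, `top_k=3` drops "buggy", and `top_p=0.7` then keeps only "great" and "good", since together they already cover at least 70% of the probability mass.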

We may specify any subset of the parameters above when invoking an endpoint. The next example shows how to invoke the endpoint with these arguments.

***
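Hand-written JSON strings are fragile once a prompt contains quotes or newlines. A minimal sketch of building the request body with `json.dumps` is shown below. The helper name `build_generation_payload` is hypothetical, and the parameter names are the ones listed above.

```python
import json


def build_generation_payload(text_inputs, **generation_params):
    """Serialize a prompt plus any subset of generation parameters to JSON."""
    payload = {"text_inputs": text_inputs}
    # Skip parameters left as None so the endpoint falls back to its defaults.
    payload.update({k: v for k, v in generation_params.items() if v is not None})
    # json.dumps escapes quotes and newlines and writes Python True as JSON true.
    return json.dumps(payload)


body = build_generation_payload(
    "Write a review for Norton Antivirus",
    max_length=100,
    top_k=50,
    top_p=0.9,
    do_sample=True,
)
print(body)
```

The resulting string can be passed to `predictor.predict` in place of a hand-typed JSON literal.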

In [14]:
prompt =  '{"text_inputs": "Write a review for Norton Antivirus", "max_length": 100, "top_k": 50, "top_p": 0.9, "do_sample": true}'
            
response = predictor.predict(prompt,
        {
            "ContentType": "application/json",
            "Accept": "application/json",
        },
)

print("Response: {}".format(response.decode('utf-8')))

Response: {"generated_texts": ["Write a review for Norton Antivirus software that has not been improved much and is still a great product.The Norton Antivirus program is a good alternative to any other virus removal software out there.  With all the security and antivirus patches that I've come across so far, it has become very hard for me to keep track of the information.<br />Not that I had it with all the protection I've been using on the Mac since I upgraded to Windows.  The process is still easy, and"]}


In [15]:
prompt =  '{"text_inputs": "Write a review for Turbo Tax", "max_length": 100, "top_k": 50, "top_p": 0.9, "do_sample": true}'
            
response = predictor.predict(prompt,
        {
            "ContentType": "application/json",
            "Accept": "application/json",
        },
)

print("Response: {}".format(response.decode('utf-8')))

Response: {"generated_texts": ["Write a review for Turbo Tax is easy. I have purchased my return in 3 versions and they do everything I could ask for without ever getting in my way. I have a great return. I recommend TurboTax because I feel like I can depend on them to provide me with my return and I am very happy. The price was reasonable and the returns were extremely easy to make. I have been using TurboTax since the beginning and my wife's parents are pleased with the service. I am also very"]}
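Both responses above echo the prompt at the start of the generated text. A small sketch for parsing the JSON body and stripping that echo follows. It assumes the `generated_texts` key seen in the outputs above, and the helper name `extract_completions` is made up for illustration.

```python
import json


def extract_completions(response_body, prompt_text):
    """Parse the endpoint response and drop the echoed prompt from each text.

    `response_body` may be the raw bytes returned by the predictor or a str.
    """
    texts = json.loads(response_body)["generated_texts"]
    completions = []
    for text in texts:
        # Only strip the prompt when the model actually echoed it verbatim.
        if text.startswith(prompt_text):
            text = text[len(prompt_text):]
        completions.append(text.strip())
    return completions
```

For the cell above, `extract_completions(response, "Write a review for Turbo Tax")` would return only the continuation the model wrote.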


In [26]:
from promptsource.templates import DatasetTemplates

prompt_templates = DatasetTemplates('amazon_us_reviews/Wireless_v1_00')

In [27]:
for template in prompt_templates.templates.values():
    print(template.get_name())

Generate review headline based on review body
Generate review based on rating and category
Given the review headline return a categorical rating
Generate review headline based on rating
Given the review body return a categorical rating


In [28]:
prompt = prompt_templates["Given the review body return a categorical rating"]
print(prompt.answer_choices)

1 ||| 2 ||| 3 ||| 4 ||| 5


In [29]:
print(prompt.__dict__)

{'answer_choices': '1 ||| 2 ||| 3 ||| 4 ||| 5', 'id': 'e6a1bbde-715d-4dad-9178-e2bcfaf5c646', 'jinja': "Given the following review:\n{{review_body}}\npredict the associated rating from the following choices (1 being lowest and 5 being highest)\n- {{ answer_choices | join('\\n- ') }} \n|||\n{{answer_choices[star_rating-1]}}", 'metadata': <promptsource.templates.Template.Metadata object at 0x7f0fe9382610>, 'name': 'Given the review body return a categorical rating', 'reference': 'Given the review body, return a categorical rating. '}
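The `jinja` field above has two parts separated by `|||`: the model input and the target. The answer choices are likewise a `|||`-separated string, and the target is picked with `answer_choices[star_rating-1]`. The sketch below mimics that behavior in plain Python, without jinja or promptsource, purely to illustrate what the template does. The function name `render_rating_prompt` is hypothetical.

```python
def render_rating_prompt(example, answer_choices="1 ||| 2 ||| 3 ||| 4 ||| 5"):
    """Mimic the rating template: return (model_input, target) for one review."""
    # Split the "1 ||| 2 ||| ..." string into a list of choices.
    choices = [choice.strip() for choice in answer_choices.split("|||")]

    model_input = (
        "Given the following review:\n"
        f"{example['review_body']}\n"
        "predict the associated rating from the following choices "
        "(1 being lowest and 5 being highest)\n"
        "- " + "\n- ".join(choices)
    )

    # star_rating is 1-based, list indices are 0-based.
    target = choices[example["star_rating"] - 1]
    return model_input, target


model_input, target = render_rating_prompt(
    {"review_body": "Needs a little more work.....", "star_rating": 3}
)
print(model_input)
print(target)
```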


In [30]:
from datasets import load_dataset
dataset = load_dataset("amazon_us_reviews", "Digital_Software_v1_00", split="train")

Found cached dataset amazon_us_reviews (/root/.cache/huggingface/datasets/amazon_us_reviews/Digital_Software_v1_00/0.1.0/17b2481be59723469538adeb8fd0a68b0ba363bbbdd71090e72c325ee6c7e563)


In [31]:
example = dataset[1]
print(example)

{'marketplace': 'US', 'customer_id': '10956619', 'review_id': 'R1W5OMFK1Q3I3O', 'product_id': 'B00HRJMOM4', 'product_parent': '162269768', 'product_title': 'ResumeMaker Professional Deluxe 18', 'product_category': 'Digital_Software', 'star_rating': 3, 'helpful_votes': 0, 'total_votes': 0, 'vine': 0, 'verified_purchase': 1, 'review_headline': 'Three Stars', 'review_body': 'Needs a little more work.....', 'review_date': '2015-08-31'}


In [32]:
result = prompt.apply(example)

In [33]:
print("INPUT: ", result[0])

INPUT:  Given the following review:
Needs a little more work.....
predict the associated rating from the following choices (1 being lowest and 5 being highest)
- 1
- 2
- 3
- 4
- 5


In [34]:
print("RESPONSE: ", result[1])

RESPONSE:  3


In [35]:
dataset.select([10, 20]).map(lambda row : {'prompt': '\n\nPROMPT: ' + prompt.apply(row)[0] + '\nRESPONSE: ' + prompt.apply(row)[1]})

Loading cached processed dataset at /root/.cache/huggingface/datasets/amazon_us_reviews/Digital_Software_v1_00/0.1.0/17b2481be59723469538adeb8fd0a68b0ba363bbbdd71090e72c325ee6c7e563/cache-fed6625f945cc43f.arrow


Dataset({
    features: ['marketplace', 'customer_id', 'review_id', 'product_id', 'product_parent', 'product_title', 'product_category', 'star_rating', 'helpful_votes', 'total_votes', 'vine', 'verified_purchase', 'review_headline', 'review_body', 'review_date', 'prompt'],
    num_rows: 2
})

In [36]:
prompt0 = prompt.apply(dataset[0])
prompt1 = prompt.apply(dataset[1])
prompt2 = prompt.apply(dataset[2])
prompt3 = prompt.apply(dataset[3])

In [61]:
# Three solved examples followed by one unanswered query.
few_shot_prompt = (
    'PROMPT: ' + prompt0[0] + '\nRESPONSE: ' + prompt0[1]
    + '\n\nPROMPT: ' + prompt1[0] + '\nRESPONSE: ' + prompt1[1]
    + '\n\nPROMPT: ' + prompt2[0] + '\nRESPONSE: ' + prompt2[1]
    + '\n\nPROMPT: ' + prompt3[0] + '\nRESPONSE: '
)
print(few_shot_prompt)



PROMPT: Given the following review:
So far so good
predict the associated rating from the following choices (1 being lowest and 5 being highest)
- 1
- 2
- 3
- 4
- 5
RESPONSE: 4

PROMPT: Given the following review:
Needs a little more work.....
predict the associated rating from the following choices (1 being lowest and 5 being highest)
- 1
- 2
- 3
- 4
- 5
RESPONSE: 3

PROMPT: Given the following review:
Please cancel.
predict the associated rating from the following choices (1 being lowest and 5 being highest)
- 1
- 2
- 3
- 4
- 5
RESPONSE: 1

PROMPT: Given the following review:
Works as Expected!
predict the associated rating from the following choices (1 being lowest and 5 being highest)
- 1
- 2
- 3
- 4
- 5
RESPONSE: 
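The same few-shot layout can be produced for any number of solved examples with a small helper. This is a sketch only: the name `build_few_shot_prompt` is hypothetical, and it assumes each solved example is an `(input, target)` pair such as the two-element result of `prompt.apply(...)`.

```python
def build_few_shot_prompt(solved_examples, query_input):
    """Join solved (input, target) pairs and end with an unanswered query."""
    blocks = [
        f"PROMPT: {example_input}\nRESPONSE: {example_target}"
        for example_input, example_target in solved_examples
    ]
    # The final block leaves the response empty for the model to complete.
    blocks.append(f"PROMPT: {query_input}\nRESPONSE: ")
    return "\n\n".join(blocks)
```

With the variables from the previous cell, `build_few_shot_prompt([prompt0, prompt1, prompt2], prompt3[0])` would yield the same string as the concatenation above.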


In [65]:
# print(len(few_shot_prompt))
#max_length = len(few_shot_prompt) + 2

#prompt = '{"text_inputs": "{}", "max_length": {}, "top_k": 50, "top_p": 0.9, "do_sample": true}'.format(few_shot_prompt, max_length)

#prompt = '{"text_inputs": "Write a review for Turbo Tax", "max_length": 100, "top_k": 50, "top_p": 0.9, "do_sample": true}'

prompt = """
PROMPT: Given the following review:
So far so good
predict the associated rating from the following choices (1 being lowest and 5 being highest)
- 1
- 2
- 3
- 4
- 5
RESPONSE: 4

PROMPT: Given the following review:
Needs a little more work.....
predict the associated rating from the following choices (1 being lowest and 5 being highest)
- 1
- 2
- 3
- 4
- 5
RESPONSE: 3

PROMPT: Given the following review:
Please cancel.
predict the associated rating from the following choices (1 being lowest and 5 being highest)
- 1
- 2
- 3
- 4
- 5
RESPONSE: 1

PROMPT: Given the following review:
This is great!
predict the associated rating from the following choices (1 being lowest and 5 being highest)
- 1
- 2
- 3
- 4
- 5
RESPONSE:
"""

response = predictor.predict(prompt,
        {
            "ContentType": "application/x-text",
            "Accept": "application/json",
        },
)

print("Response: {}".format(response.decode('utf-8')))


Response: {"generated_text": "\nPROMPT: Given the following review:\nSo far so good\npredict the associated rating from the following choices (1 being lowest and 5 being highest)\n- 1\n- 2\n- 3\n- 4\n- 5\nSTAR_RATING: 4\n\nPROMPT: Given the following review:\nNeeds a little more work.....\npredict the associated rating from the following choices (1 being lowest and 5 being highest)\n- 1\n- 2\n- 3\n- 4\n- 5\nSTAR_RATING: 3\n\nPROMPT: Given the following review:\nPlease cancel.\npredict the associated rating from the following choices (1 being lowest and 5 being highest)\n- 1\n- 2\n- 3\n- 4\n- 5\nSTAR_RATING: 1\n\nPROMPT: Given the following review:\nThis is great!\npredict the associated rating from the following choices (1 being lowest and 5 being highest)\n- 1\n- 2\n- 3\n- 4\n- 5\nSTAR_RATING:\n1"}
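Because the endpoint returns the whole prompt followed by its continuation, the predicted rating has to be pulled out of the text after the last answer marker. The sketch below shows one way to do that. It assumes the `generated_text` key seen in the response above, and the helper name `extract_predicted_rating` is hypothetical. The marker is a parameter because it must match whatever label the prompt used (`RESPONSE:` in the cell above, `STAR_RATING:` in the recorded output).

```python
import json
import re


def extract_predicted_rating(response_body, marker="RESPONSE:"):
    """Return the first 1-5 digit the model wrote after the last marker, or None."""
    generated = json.loads(response_body)["generated_text"]
    if marker not in generated:
        return None
    # Everything after the final marker is the model's answer to the query.
    completion = generated.rsplit(marker, 1)[-1]
    match = re.search(r"[1-5]", completion)
    return int(match.group()) if match else None
```

A single completion is a weak signal, so in practice the rating would be checked over many reviews before drawing conclusions about the model.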


# Release Resources

In [None]:
# sm.delete_endpoint(
#      EndpointName=pipeline_endpoint_name
# )
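Deleting the endpoint alone leaves its endpoint configuration and model objects behind. A fuller cleanup might look like the sketch below, which assumes a boto3 SageMaker client and uses its `describe_endpoint`, `describe_endpoint_config`, `delete_endpoint`, `delete_endpoint_config`, and `delete_model` calls. The helper name `delete_endpoint_resources` is hypothetical. Whether the model objects should be removed as well depends on your workflow, so treat that step as optional.

```python
def delete_endpoint_resources(sm_client, endpoint_name, delete_models=True):
    """Delete an endpoint, its endpoint config, and optionally its models."""
    # Look up the dependent resources before anything is deleted.
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    if delete_models:
        for model_name in model_names:
            sm_client.delete_model(ModelName=model_name)

    return {
        "endpoint": endpoint_name,
        "endpoint_config": config_name,
        "models": model_names if delete_models else [],
    }
```

As with the commented-out cell above, run this only when the endpoint is no longer needed, for example `delete_endpoint_resources(sm, pipeline_endpoint_name)`.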

In [None]:
# %%html

# <p><b>Shutting down your kernel for this notebook to release resources.</b></p>
# <button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>

# <script>
# try {
#     els = document.getElementsByClassName("sm-command-button");
#     els[0].click();
# }
# catch(err) {
#     // NoOp
# }
# </script>