# Deploy the Model

The pipeline run created a Model Package version in the specified Model Package Group. Note that the model was registered with the approval status `PendingManualApproval`.

With SageMaker Pipelines, data scientists can register a model as approved or as pending manual approval as part of the CI/CD workflow.

We can also approve the model through the SageMaker Studio UI or programmatically through the SageMaker API.
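
A minimal sketch of the programmatic route, assuming a boto3 SageMaker client and the name of the Model Package Group. The helper `approve_latest_model_package` and its arguments are hypothetical names introduced here for illustration; only `list_model_packages` and `update_model_package` are boto3 calls.

```python
def approve_latest_model_package(sm_client, model_package_group_name):
    """Approve the newest model package version in a group and return its ARN.

    `sm_client` is assumed to be a boto3 SageMaker client. The function name
    and arguments are illustrative and not part of the original pipeline.
    """
    # Ask for the single most recently created version in the group.
    response = sm_client.list_model_packages(
        ModelPackageGroupName=model_package_group_name,
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    packages = response["ModelPackageSummaryList"]
    if not packages:
        raise ValueError(f"No model packages found in {model_package_group_name!r}")

    model_package_arn = packages[0]["ModelPackageArn"]

    # Change the status from PendingManualApproval to Approved.
    sm_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
    )
    return model_package_arn
```

With the `sm` client created later in this notebook, a call would look like `approve_latest_model_package(sm, "<your-model-package-group>")`, where the group name is whatever the pipeline was configured with.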

In [2]:
import psutil

notebook_memory = psutil.virtual_memory()

# Compare against 32 GB (decimal); 32 * 1024 * 1024 would only be 32 MB.
if notebook_memory.total < 32 * 1000 * 1000 * 1000:
    print('*******************************************')
    print('YOU ARE NOT USING THE CORRECT INSTANCE TYPE')
    print('PLEASE CHANGE INSTANCE TYPE TO  m5.2xlarge ')
    print('*******************************************')
else:
    correct_instance_type=True
    print(notebook_memory)

svmem(total=32890294272, available=16411623424, percent=50.1, used=16006258688, free=11512184832, active=17034067968, inactive=3217072128, buffers=0, cached=5371850752, shared=1355776, slab=464343040)


In [3]:
from botocore.exceptions import ClientError

import os
import logging
import boto3
import sagemaker
import pandas as pd

sess = sagemaker.Session()
bucket = sess.default_bucket()
region = boto3.Session().region_name

import botocore.config

config = botocore.config.Config(
    user_agent_extra='dsoaws/1.0'
)

sm = boto3.Session().client(service_name="sagemaker", 
                            region_name=region,
                            config=config)

# Retrieve the Stored Endpoint Name


In [4]:
%store -r pipeline_endpoint_name

In [5]:
print(pipeline_endpoint_name)

gpt3-model-from-registry-ep-1677516871


In [6]:
from IPython.display import display, HTML

display(
    HTML(
        '<b>Review <a target="_blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/endpoints/{}">SageMaker REST Endpoint</a></b>'.format(
            region, pipeline_endpoint_name
        )
    )
)

# _Wait Until the Endpoint is Deployed_
_Note:  This will take a few minutes.  Please be patient._

In [7]:
%%time

waiter = sm.get_waiter("endpoint_in_service")
waiter.wait(EndpointName=pipeline_endpoint_name)

CPU times: user 21.1 ms, sys: 0 ns, total: 21.1 ms
Wall time: 155 ms
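
The short wall time above suggests the endpoint was already in service when the waiter ran. As a rough sketch of what such a wait amounts to, the helper below polls `describe_endpoint` until the status is `InService` or `Failed`. The name `wait_for_endpoint`, its default delay, and its attempt limit are assumptions made for illustration and are not the waiter's actual configuration.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, delay_seconds=30, max_attempts=60, sleep=time.sleep):
    """Poll an endpoint until it is InService; raise if it fails or times out.

    `sm_client` is assumed to be a boto3 SageMaker client. `sleep` is
    injectable so the loop can be exercised without real waiting.
    """
    for _ in range(max_attempts):
        description = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            reason = description.get("FailureReason", "no reason reported")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        # Any other status (for example Creating or Updating): wait and retry.
        sleep(delay_seconds)
    raise TimeoutError(f"Endpoint {endpoint_name} not in service after {max_attempts} checks")
```

The built-in waiter is usually the simpler choice. An explicit loop like this is mainly useful when you want to log each status or surface the failure reason yourself.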


# _Wait Until the Endpoint ^^ Above ^^ is Deployed_

# Generate a sample review

In [8]:
import json

from sagemaker import Predictor

predictor = Predictor(
    endpoint_name=pipeline_endpoint_name,
    sagemaker_session=sess,
)

### Advanced text generation features

***
This model also supports many advanced parameters while performing inference. They include:

* **max_length:** Model generates text until the output length (which includes the input context length) reaches `max_length`. If specified, it must be a positive integer.
* **num_return_sequences:** Number of output sequences returned. If specified, it must be a positive integer.
* **num_beams:** Number of beams used in beam search. If specified, it must be an integer greater than or equal to `num_return_sequences`.
* **no_repeat_ngram_size:** Model ensures that no sequence of `no_repeat_ngram_size` words is repeated in the output sequence. If specified, it must be a positive integer greater than 1.
* **temperature:** Controls the randomness of the output. A higher temperature makes low-probability words more likely to appear, and a lower temperature favors high-probability words. As `temperature` approaches 0, the result is greedy decoding. If specified, it must be a positive float.
* **early_stopping:** If True, text generation finishes when all beam hypotheses reach the end-of-sentence token. If specified, it must be a boolean.
* **do_sample:** If True, sample the next word according to its likelihood. If specified, it must be a boolean.
* **top_k:** In each step of text generation, sample from only the `top_k` most likely words. If specified, it must be a positive integer.
* **top_p:** In each step of text generation, sample from the smallest possible set of words with cumulative probability `top_p`. If specified, it must be a float between 0 and 1.
* **seed:** Fix the randomized state for reproducibility. If specified, it must be an integer.

We may specify any subset of the parameters above when invoking an endpoint. The cells below show how to invoke the endpoint with these arguments.

***
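
The constraints listed above can be checked on the client before a request is sent. The sketch below covers a few of them. The helper `build_generation_payload` is a hypothetical name, and treating a missing `num_return_sequences` as 1 is an assumption rather than documented endpoint behavior.

```python
import json

POSITIVE_INT_PARAMS = ("max_length", "num_return_sequences", "top_k")
BOOLEAN_PARAMS = ("early_stopping", "do_sample")


def build_generation_payload(text_inputs, **params):
    """Validate a subset of the generation parameters and return a JSON string."""
    for name in POSITIVE_INT_PARAMS:
        if name in params:
            value = params[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")

    for name in BOOLEAN_PARAMS:
        if name in params and not isinstance(params[name], bool):
            raise ValueError(f"{name} must be a boolean, got {params[name]!r}")

    if "top_p" in params and not 0 <= params["top_p"] <= 1:
        raise ValueError(f"top_p must be between 0 and 1, got {params['top_p']!r}")

    if "temperature" in params and not params["temperature"] > 0:
        raise ValueError(f"temperature must be positive, got {params['temperature']!r}")

    # Assumption: one sequence is returned when num_return_sequences is omitted.
    if "num_beams" in params and params["num_beams"] < params.get("num_return_sequences", 1):
        raise ValueError("num_beams must be >= num_return_sequences")

    return json.dumps({"text_inputs": text_inputs, **params})
```

For example, `build_generation_payload("Write a review for Norton Antivirus", max_length=100, top_k=50, top_p=0.9, do_sample=True)` produces a request body equivalent to the one used in the next cell.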

In [9]:
import json

payload = json.dumps(
    {"text_inputs": "Write a review for Norton Antivirus", "max_length": 100, "top_k": 50, "top_p": 0.9, "do_sample": True}
)

# initial_args is forwarded to the underlying InvokeEndpoint call.
response = predictor.predict(
    payload,
    initial_args={
        "ContentType": "application/json",
        "Accept": "application/json",
    },
)

print("Response: {}".format(response.decode('utf-8')))



In [10]:
import json

payload = json.dumps(
    {"text_inputs": "Write a review for Turbo Tax", "max_length": 100, "top_k": 50, "top_p": 0.9, "do_sample": True}
)

# initial_args is forwarded to the underlying InvokeEndpoint call.
response = predictor.predict(
    payload,
    initial_args={
        "ContentType": "application/json",
        "Accept": "application/json",
    },
)

print("Response: {}".format(response.decode('utf-8')))

Response: {"generated_texts": ["Write a review for Turbo Tax in no time!  It was the easiest way to get them to pay their invoices.This was a great idea, it was convenient. We loved it.  I am going to use it in the future too.Great productEasy to install and use.I downloaded and installed the upgrade and installed. Now i can see my payments right through and on my computer. I was very pleased with the program. I use my credit cards for all my bills so far.Excel"]}
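
The response above is a JSON document with a `generated_texts` list, and the generated text begins with the prompt itself. A small sketch of pulling the text out and optionally removing the echoed prompt follows. The helper name `extract_generated_texts` is hypothetical.

```python
import json


def extract_generated_texts(response_body, prompt=None):
    """Return the generated strings from a response body (bytes or str).

    If `prompt` is given and a generated string starts with it, the echoed
    prompt is removed from the front of that string.
    """
    if isinstance(response_body, bytes):
        response_body = response_body.decode("utf-8")
    texts = json.loads(response_body)["generated_texts"]
    if prompt is None:
        return texts
    return [
        text[len(prompt):].lstrip() if text.startswith(prompt) else text
        for text in texts
    ]
```

Applied to the cell above, `extract_generated_texts(response, prompt="Write a review for Turbo Tax")` would return a list containing only the continuation of the review.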


# Release Resources

In [11]:
# sm.delete_endpoint(
#      EndpointName=pipeline_endpoint_name
# )
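
Deleting the endpoint alone leaves its endpoint configuration and model behind. The sketch below removes all three, assuming a boto3 SageMaker client. The helper name `delete_endpoint_resources` is hypothetical, and the calls are destructive, so run it only when the endpoint is no longer needed.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint, its endpoint configuration, and the models it serves.

    `sm_client` is assumed to be a boto3 SageMaker client. Returns the names
    of the resources that were deleted.
    """
    # Look up the dependent resources before anything is deleted.
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [
        variant["ModelName"]
        for variant in config["ProductionVariants"]
        if variant.get("ModelName")
    ]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sm_client.delete_model(ModelName=model_name)

    return {"endpoint": endpoint_name, "endpoint_config": config_name, "models": model_names}
```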

In [12]:
# %%html

# <p><b>Shutting down your kernel for this notebook to release resources.</b></p>
# <button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>

# <script>
# try {
#     els = document.getElementsByClassName("sm-command-button");
#     els[0].click();
# }
# catch(err) {
#     // NoOp
# }
# </script>