# Part 4a - Testing the deployed model

In this notebook we will deploy the model that we have trained to a SageMaker endpoint. This keeps the model live and running, so we can create summaries at any time. It also allows us to access the model via HTTP requests, if we want to.

First, we define the variables we need for our SageMaker setup:

In [1]:
import boto3
import sagemaker

sess = sagemaker.Session()
# role = sagemaker.get_execution_role()
role = 'arn:aws:iam::595714217589:role/service-role/AmazonSageMaker-ExecutionRole-20220331T161122'
bucket = sess.default_bucket()
client = boto3.client('sagemaker')

print(f"IAM role arn used for running training: {role}")
print(f"S3 bucket used for storing artifacts: {sess.default_bucket()}")

IAM role arn used for running training: arn:aws:iam::595714217589:role/service-role/AmazonSageMaker-ExecutionRole-20220331T161122
S3 bucket used for storing artifacts: sagemaker-us-east-1-595714217589


The code below lets us access the details of the last training job. In particular, we are interested in the S3 location of the model.

In [2]:
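# Request the jobs newest-first explicitly so that index 0 is the most recent training job.
training_job = client.list_training_jobs(SortBy='CreationTime', SortOrder='Descending')['TrainingJobSummaries'][0]['TrainingJobName']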
model_data = sess.describe_training_job(training_job)['ModelArtifacts']['S3ModelArtifacts']
model_data

's3://sagemaker-us-east-1-595714217589/huggingface-pytorch-training-2022-04-06-00-28-08-102/output/model.tar.gz'

Now we can deploy the model to a SageMaker endpoint. Note that we use our own inference code for this example, because it lets us control how the summaries are generated.

In [3]:
from sagemaker.huggingface import HuggingFaceModel

model_for_deployment = HuggingFaceModel(entry_point='inference.py',
                                        source_dir='inference_code',
                                        model_data=model_data,
                                        role=role,
                                        pytorch_version='1.7.1',
                                        py_version='py36',
                                        transformers_version='4.6.1',
                                        )
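The contents of `inference_code/inference.py` are not shown in this notebook. As a rough sketch only, a custom script for the Hugging Face inference toolkit can override `model_fn` and `predict_fn`. The version below assumes a sequence-to-sequence model and guesses the request shape from the payloads used later in this notebook (`inputs` plus a `parameters_list` of generation settings). The actual script may differ.

```python
# Hypothetical sketch of inference_code/inference.py (not the author's actual file).

def model_fn(model_dir):
    """Load the model and tokenizer from the unpacked model.tar.gz."""
    # Imported lazily so predict_fn can be exercised without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)
    return model, tokenizer


def predict_fn(data, model_and_tokenizer):
    """Generate one list of summaries per entry in data["parameters_list"]."""
    model, tokenizer = model_and_tokenizer
    text = data["inputs"]
    # Fall back to a single run with default generation settings.
    parameters_list = data.get("parameters_list") or [{}]

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    results = []
    for parameters in parameters_list:
        output_ids = model.generate(**inputs, **parameters)
        results.append(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
    return results
```

With this shape, the response is a list of lists, which is consistent with reading the first summary as `candidate[0][0]` in the cells below.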

In [4]:
predictor = model_for_deployment.deploy(initial_instance_count=1,
                                        instance_type='ml.g4dn.xlarge',
                                        serializer=sagemaker.serializers.JSONSerializer(),
                                        deserializer=sagemaker.deserializers.JSONDeserializer()
                                        )

ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
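A 404 on `HeadObject` means that S3 could not find an object that was looked up. One plausible cause (an assumption, since the full traceback is not shown) is that the artifact at `model_data` no longer exists or lives under a different key. The sketch below checks whether an S3 URI points to an existing object before deploying. The helper names `split_s3_uri` and `s3_object_exists` are made up for this example.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def s3_object_exists(s3_client, uri):
    """Return True if the object exists, False on a 404, and re-raise other errors."""
    # Imported lazily so split_s3_uri can be used without botocore installed.
    from botocore.exceptions import ClientError

    bucket, key = split_s3_uri(uri)
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
        return True
    except ClientError as err:
        if err.response["Error"]["Code"] in ("404", "NoSuchKey", "NotFound"):
            return False
        raise


# Usage (requires AWS credentials):
# s3_object_exists(boto3.client("s3"), model_data)
```

If the check returns `False`, the training job's output location is worth inspecting before retrying the deployment.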

Now it's time to test the model:

In [None]:
import pandas as pd
df_test = pd.read_csv('data/test.csv')
ref_summaries = list(df_test['summary'])
texts = list(df_test['text'])

In [None]:
data = {"inputs":texts[0], "parameters_list":[{"min_length": 5, "max_length": 20}]}
predictor.predict(data)
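The introduction mentioned that the endpoint can also be reached over HTTP. Outside of the `predictor` object, one way to do that is the `invoke_endpoint` call of the `sagemaker-runtime` boto3 client. The following is a minimal sketch: the helper names `build_payload` and `invoke_summarizer` are invented for this example, and the payload shape simply mirrors the one used in the cell above.

```python
import json


def build_payload(text, **generation_parameters):
    """Build the request body in the shape used throughout this notebook."""
    return {"inputs": text, "parameters_list": [generation_parameters]}


def invoke_summarizer(runtime_client, endpoint_name, payload):
    """Send a JSON request to the endpoint and decode the JSON response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(payload),
    )
    # The response body is a stream of bytes containing the JSON result.
    return json.loads(response["Body"].read())


# Usage (requires AWS credentials and a running endpoint):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# payload = build_payload(texts[0], min_length=5, max_length=20)
# invoke_summarizer(runtime, predictor.endpoint_name, payload)
```

Passing the client in as an argument keeps the helper easy to test and lets the caller choose the region and credentials.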

In [None]:
ref_summaries[0]

In [None]:
candidate_summaries = []

for i, text in enumerate(texts):
    if i % 100 == 0:
        print(i)
    data = {"inputs":text, "parameters_list":[{"min_length": 5, "max_length": 20}]}
    candidate = predictor.predict(data)
    candidate_summaries.append(candidate[0][0])

In [None]:
# The context manager closes the file even if a write fails.
with open("summaries/model-summaries.txt", "w") as file:
    for s in candidate_summaries:
        file.write(s + "\n")

In [None]:
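# Note: load_metric is deprecated in newer releases of `datasets`; the ROUGE metric now lives in the separate `evaluate` library.
from datasets import load_metric
metric = load_metric("rouge")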

In [None]:
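def calc_rouge_scores(candidates, references):
    """Return the ROUGE F-measures (mid estimate) as percentages rounded to one decimal."""
    result = metric.compute(predictions=candidates, references=references, use_stemmer=True)
    # Each value is an aggregate with low/mid/high estimates; keep the mid F-measure.
    result = {key: round(value.mid.fmeasure * 100, 1) for key, value in result.items()}
    return result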

In [None]:
calc_rouge_scores(candidate_summaries, ref_summaries)
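To build some intuition for what these scores mean, here is a deliberately simplified ROUGE-1 F-measure for a single candidate/reference pair. It is only an illustration: it tokenizes on whitespace and applies no stemming, so its numbers will not match the `metric` object used above. The function name `rouge1_f` is made up for this sketch.

```python
from collections import Counter


def rouge1_f(candidate, reference):
    """Simplified ROUGE-1 F-measure based on clipped unigram overlap."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per word (clipped overlap).
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

Precision rewards candidates that contain little besides reference words, while recall rewards covering more of the reference. The F-measure balances the two.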

As mentioned above, we can also refine the summaries by passing text generation parameters such as beam search and sampling settings. You can learn more about them in this blog post: https://huggingface.co/blog/how-to-generate. Let's try it out:

In [None]:
candidate_summaries_refined = []

for i, text in enumerate(texts):
    if i % 100 == 0:
        print(i)
    data = {"inputs":text, "parameters_list":[{"min_length": 5, "max_length": 20, "num_beams": 50, "top_p": 0.9, "do_sample": True}]}
    candidate = predictor.predict(data)
    candidate_summaries_refined.append(candidate[0][0])

In [None]:
# The context manager closes the file even if a write fails.
with open("summaries/model-summaries_refined.txt", "w") as file:
    for s in candidate_summaries_refined:
        file.write(s + "\n")

In [None]:
calc_rouge_scores(candidate_summaries_refined, ref_summaries)