# Text summarization of small files with Anthropic Claude

## Overview

In this example, you send a small amount of text (string data) directly to the Amazon Bedrock API, using an Anthropic Claude model, and instruct the model to summarize it.


In this notebook:

1. A small piece of text (or small file) is loaded
1. A foundation model processes the input data
1. The model returns a response with a summary of the ingested text
1. The same steps are repeated with a longer text to test the model's capability

### Use case

This approach can be used to summarize call transcripts, meeting transcripts, books, articles, blog posts, and other relevant content.

## Setup

In [10]:
%pip install --no-build-isolation --force-reinstall \
    "boto3>=1.28.57" \
    "awscli>=1.29.57" \
    "botocore>=1.31.57" \
    "anthropic"

Collecting boto3>=1.28.57
  Using cached boto3-1.34.70-py3-none-any.whl.metadata (6.6 kB)
Collecting awscli>=1.29.57
  Using cached awscli-1.32.70-py3-none-any.whl.metadata (11 kB)
Collecting botocore>=1.31.57
  Using cached botocore-1.34.70-py3-none-any.whl.metadata (5.7 kB)
Collecting anthropic
  Using cached anthropic-0.21.3-py3-none-any.whl.metadata (17 kB)
Collecting datasets
  Downloading datasets-2.18.0-py3-none-any.whl.metadata (20 kB)
Collecting jmespath<2.0.0,>=0.7.1 (from boto3>=1.28.57)
  Using cached jmespath-1.0.1-py3-none-any.whl.metadata (7.6 kB)
Collecting s3transfer<0.11.0,>=0.10.0 (from boto3>=1.28.57)
  Using cached s3transfer-0.10.1-py3-none-any.whl.metadata (1.7 kB)
Collecting docutils<0.17,>=0.10 (from awscli>=1.29.57)
  Using cached docutils-0.16-py2.py3-none-any.whl.metadata (2.7 kB)
Collecting PyYAML<6.1,>=3.10 (from awscli>=1.29.57)
  Using cached PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting colorama<0.4

In [39]:
%pip install --upgrade "PyMuPDF"

Collecting PyMuPDF
  Downloading PyMuPDF-1.24.0-cp310-none-manylinux2014_x86_64.whl.metadata (3.4 kB)
Collecting PyMuPDFb==1.24.0 (from PyMuPDF)
  Downloading PyMuPDFb-1.24.0-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.4 kB)
Downloading PyMuPDF-1.24.0-cp310-none-manylinux2014_x86_64.whl (3.9 MB)
Downloading PyMuPDFb-1.24.0-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (30.8 MB)
Installing collected packages: PyMuPDFb, PyMuPDF
Successfully installed PyMuPDF-1.24.0 PyMuPDFb-1.24.0
Note: you may need to restart the kernel to use updated packages.


In [40]:
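# Restart the kernel so the newly installed packages are picked up
# (this script works in the classic Jupyter Notebook UI)
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")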

In [41]:
import json
import os
import sys

import boto3

module_path = ".."
sys.path.append(os.path.abspath(module_path))
from utils import bedrock, print_ww

# ---- ⚠️ Un-comment and edit the below lines as needed for your AWS setup ⚠️ ----

# os.environ["AWS_DEFAULT_REGION"] = "<REGION_NAME>"  # E.g. "us-east-1"
# os.environ["AWS_PROFILE"] = "<YOUR_PROFILE>"
# os.environ["BEDROCK_ASSUME_ROLE"] = "<YOUR_ROLE_ARN>"  # E.g. "arn:aws:..."


boto3_bedrock = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", None)
)

Create new client
  Using region: us-east-1
boto3 Bedrock client successfully created!
bedrock-runtime(https://bedrock-runtime.us-east-1.amazonaws.com)
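The `utils.bedrock` helper used above is specific to this repository. If it is not available, a client can be created directly with `boto3`. The following is a minimal sketch: the function names `resolve_region` and `make_bedrock_runtime_client` are hypothetical, the `us-east-1` fallback is an assumption that mirrors the output above, and role assumption is not handled.

```python
import os


def resolve_region(env=None, default="us-east-1"):
    """Pick the AWS region from the environment, falling back to a default.

    The default value is an assumption for illustration only.
    """
    env = os.environ if env is None else env
    return env.get("AWS_DEFAULT_REGION") or default


def make_bedrock_runtime_client(region=None):
    """Create a Bedrock runtime client using the default credential chain."""
    import boto3  # imported lazily so resolve_region works without boto3

    return boto3.client("bedrock-runtime", region_name=region or resolve_region())
```

Calling `make_bedrock_runtime_client()` requires valid AWS credentials and access to Amazon Bedrock in the chosen region.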


## Summarizing a short text with boto3

In [33]:
# Short sample text to summarize (the story of Little Red Riding Hood)
small_text="""Once upon a time, there was a little girl who lived in a village near the forest.  Whenever she went out, the little girl wore a red riding cloak, so everyone in the village called her Little Red Riding Hood.
One morning, Little Red Riding Hood asked her mother if she could go to visit her grandmother as it had been awhile since they'd seen each other.
"That's a good idea," her mother said.  So they packed a nice basket for Little Red Riding Hood to take to her grandmother. 
When the basket was ready, the little girl put on her red cloak and kissed her mother goodbye. "Remember, go straight to Grandma's house," her mother cautioned.
"Don't dawdle along the way and please don't talk to strangers! The woods are dangerous."  "Don't worry, mommy," said Little Red Riding Hood, "I'll be careful." Little Red Riding Hood StoryBut when Little Red Riding Hood noticed some lovely flowers in the woods, she forgot her promise to her mother.
She picked a few, watched the butterflies flit about for awhile, listened to the frogs croaking and then picked a few more.
Little Red Riding Hood was enjoying the warm summer day so much, that she didn't notice a dark shadow approaching out of the forest behind her...  Suddenly, the wolf appeared beside her.  "What are you doing out here, little girl?" the wolf asked in a voice as friendly as he could muster.  "I'm on my way to see my Grandma who lives through the forest, near the brook,"  Little Red Riding Hood replied.  Then she realized how late she was and quickly excused herself, rushing down the path to her Grandma's house.   The wolf, in the meantime, took a shortcut...  The wolf, a little out of breath from running, arrived at Grandma's and knocked lightly at the door.  "Oh thank goodness dear!  Come in, come in!  I was worried sick that something had happened to you in the forest," said Grandma thinking that the knock was her granddaughter.  The wolf let himself in.  Poor Granny did not have time to say another word, before the wolf gobbled her up!  The wolf let out a satisfied burp, and then poked through Granny's wardrobe to find a nightgown that he liked.  He added a frilly sleeping cap, and for good measure, dabbed some of Granny's perfume behind his pointy ears.  A few minutes later, Red Riding Hood knocked on the door.  The wolf jumped into bed and pulled the covers over his nose.  "Who is it?" he called in a cackly voice.  "It's me, Little Red Riding Hood."  "Oh how lovely!  Do come in, my dear," croaked the wolf.  When Little Red Riding Hood entered the little cottage, she could scarcely recognize her Grandmother.  "Grandmother!  Your voice sounds so odd.  Is something the matter?" she asked.  "Oh, I just have touch of a cold," squeaked the wolf adding a cough at the end to prove the point.  "But Grandmother!  What big ears you have," said Little Red Riding Hood as she edged closer to the bed.  "The better to hear you with, my dear," replied the wolf.  "But Grandmother!  What big eyes you have," said Little Red Riding Hood.  
"The better to see you with, my dear," replied the wolf.  "But Grandmother!  What big teeth you have," said Little Red Riding Hood her voice quivering slightly.  "The better to eat you with, my dear," roared the wolf and he leapt out of the bed and began to chase the little girl.  Almost too late, Little Red Riding Hood realized that the person in the bed was not her Grandmother, but a hungry wolf. 
She ran across the room and through the door, shouting, "Help!  Wolf!" as loudly as she could.
A woodsman who was chopping logs nearby heard her cry and ran towards the cottage as fast as he could.
He grabbed the wolf and made him spit out the poor Grandmother who was a bit frazzled by the whole experience,
but still in one piece."Oh Grandma, I was so scared!" sobbed Little Red Riding Hood, "I'll never speak to strangers or dawdle in the forest again."
"There, there, child.  You've learned an important lesson.  Thank goodness you shouted loud enough for this kind woodsman to hear you!"
The woodsman knocked out the wolf and carried him deep into the forest where he wouldn't bother people any longer.
Little Red Riding Hood and her Grandmother had a nice lunch and a long chat."""

Let's measure the length of this text in tokens using the Anthropic client:

In [34]:
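from anthropic import Anthropic

# `count_tokens` is provided by the anthropic release installed above (0.21.3);
# newer SDK releases may not expose this method.
client = Anthropic()

def count_tokens(text):
    return client.count_tokens(text)

print(count_tokens(small_text))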

1034
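When the Anthropic client is not at hand, a rough character-based estimate can serve as a sanity check. This is only a sketch: `estimate_tokens` is a hypothetical helper, and the ratio of four characters per token is an assumed rule of thumb for English text, not the behavior of the real tokenizer.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Return a rough token estimate for text.

    chars_per_token is an assumed heuristic, not a tokenizer property.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(len(text) / chars_per_token)
```

Prefer the exact count from the tokenizer whenever the result is used to enforce a limit.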


Now let's build the prompt that requests a summary of the text above:

In [35]:
prompt = f"""

Human: Please provide a summary of the following text.
<text>
{small_text}
</text>

Assistant:"""
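Because the same prompt layout is used again later for the longer document, it can be convenient to wrap it in a function. The sketch below reproduces the `Human:` / `Assistant:` layout and the `<text>` tags from the cell above; `build_summary_prompt` is a hypothetical name.

```python
def build_summary_prompt(text, instruction="Please provide a summary of the following text."):
    """Wrap text in <text> tags using the Human/Assistant turn format."""
    return f"\n\nHuman: {instruction}\n<text>\n{text}\n</text>\n\nAssistant:"


# Usage with a short stand-in string
example_prompt = build_summary_prompt("Once upon a time...")
```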

## Creating the request body with prompt and inference parameters

Following the request syntax of `invoke_model`, create a request body from the prompt above and the inference parameters.

In [36]:
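body = json.dumps({
    "prompt": prompt,
    "max_tokens_to_sample": 4096,  # upper bound on the number of generated tokens
    "temperature": 0.5,            # lower values give more deterministic output
    "top_k": 250,
    "top_p": 0.5,
    "stop_sequences": [],
})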
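The request body above can also be produced by a small helper that checks the sampling parameters before serializing them. This is a sketch under assumptions: `build_completion_body` is a hypothetical name, and the 0 to 1 ranges enforced for `temperature` and `top_p` should be confirmed against the model documentation.

```python
import json


def build_completion_body(prompt, max_tokens=4096, temperature=0.5,
                          top_k=250, top_p=0.5, stop_sequences=None):
    """Serialize a Claude text-completion request body as JSON."""
    # Assumed valid ranges; verify them for the model you use.
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0 and 1")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return json.dumps({
        "prompt": prompt,
        "max_tokens_to_sample": max_tokens,
        "temperature": temperature,
        "top_k": top_k,
        "top_p": top_p,
        "stop_sequences": list(stop_sequences or []),
    })
```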

## Invoke foundation model via Boto3

This cell sends the API request to Amazon Bedrock, specifying the request parameters `modelId`, `accept`, and `contentType`. Given the prompt, the foundation model in Amazon Bedrock returns a summary of the text.

In [37]:
modelId = 'anthropic.claude-v2' # change this to use a different version from the model provider
accept = 'application/json'
contentType = 'application/json'

response = boto3_bedrock.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
response_body = json.loads(response.get('body').read())

print_ww(response_body.get('completion'))

 Here is a summary of the story:

Little Red Riding Hood is a young girl who lives in a village near the forest. One day her mother
asks her to visit her sick grandmother and bring her a basket of food. On her way through the woods,
Little Red Riding Hood is distracted by the scenery and wildlife. A wolf approaches her and learns
that she is going to her grandmother's house. The wolf takes a shortcut and devours the grandmother
before disguising himself as her when Little Red Riding Hood arrives. Little Red Riding Hood senses
something is wrong when she interacts with the "grandmother" but it is too late when she realizes it
is a wolf. The wolf chases her but a woodsman hears her cries for help and rescues her and her
grandmother. Little Red Riding Hood learns not to talk to strangers or dawdle on the way to her
grandmother's house again.
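The response handling above can be wrapped in a helper that also reports whether the summary was cut off. The sketch below assumes the response body carries a `stop_reason` field next to `completion`, and that a value of `max_tokens` means the output hit the token limit; `parse_completion` is a hypothetical name, and the response used here is a local stand-in rather than a real Bedrock call.

```python
import io
import json


def parse_completion(response):
    """Return (summary_text, truncated) from an invoke_model style response."""
    payload = json.loads(response["body"].read())
    text = payload.get("completion", "").strip()
    # Assumption: "max_tokens" signals that generation stopped at the limit.
    truncated = payload.get("stop_reason") == "max_tokens"
    return text, truncated


# Stand-in object that mimics the shape of a real response
fake_response = {
    "body": io.BytesIO(json.dumps({
        "completion": " Here is a summary.",
        "stop_reason": "stop_sequence",
    }).encode("utf-8"))
}
summary_text, was_truncated = parse_completion(fake_response)
```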


### Let's check the length of the summary in tokens:

In [None]:
print(f"Summary length: {count_tokens(response_body.get('completion'))} tokens")

In the example above, Bedrock returns the entire summary for the given prompt in a single response. This can feel slow when the output contains a large number of tokens.

Bedrock can also stream the output so that the user can start consuming it while the model is still generating. For this, Bedrock provides the `invoke_model_with_response_stream` API, which returns a `ResponseStream` that delivers the output in chunks.

In [None]:
response = boto3_bedrock.invoke_model_with_response_stream(body=body, modelId=modelId, accept=accept, contentType=contentType)
stream = response.get('body')
output = list(stream)
output

Instead of returning the entire output at once, Bedrock sends it in smaller chunks as the model generates them. These chunks can also be rendered incrementally so that the text appears as it arrives.

In [None]:
from IPython.display import display_markdown,Markdown,clear_output

In [None]:
response = boto3_bedrock.invoke_model_with_response_stream(body=body, modelId=modelId, accept=accept, contentType=contentType)
stream = response.get('body')
output = []
if stream:
    for event in stream:
        chunk = event.get('chunk')
        if chunk:
            # Each chunk carries a JSON payload with the next piece of the completion
            chunk_obj = json.loads(chunk.get('bytes').decode())
            text = chunk_obj['completion']
            output.append(text)
            # Re-render the accumulated text so the summary grows in place
            clear_output(wait=True)
            display_markdown(Markdown(''.join(output)))
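The chunk handling can be separated from the display logic with a generator, which also makes it testable without calling Bedrock. In this sketch `iter_completion_text` is a hypothetical helper, the events are local stand-ins shaped like the ones processed above, and events without a `chunk` key are simply skipped.

```python
import json


def iter_completion_text(stream):
    """Yield completion fragments from response-stream events."""
    for event in stream or []:
        chunk = event.get("chunk")
        if not chunk:
            continue  # this sketch ignores non-chunk events
        payload = json.loads(chunk["bytes"].decode("utf-8"))
        fragment = payload.get("completion", "")
        if fragment:
            yield fragment


# Stand-in events that mimic the shape of the real stream
fake_stream = [
    {"chunk": {"bytes": json.dumps({"completion": " Little Red"}).encode("utf-8")}},
    {"chunk": {"bytes": json.dumps({"completion": " Riding Hood"}).encode("utf-8")}},
    {"other": {}},
]
streamed_summary = "".join(iter_completion_text(fake_stream))
```

With a real stream, each yielded fragment can be appended to a list and re-rendered, as in the cell above.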

## Let's try a longer text and see whether the model can still perform the task well

### We can extract text from a PDF document (here, the paper "Attention Is All You Need") that is considerably larger in tokens

In [42]:
import fitz
doc = fitz.open("../data/attention_is_all_you_need.pdf")

In [50]:
# Concatenate the text of the first 10 pages of the document
total_text = ""
for num, page in enumerate(doc):
    if num == 10:
        break
    total_text += page.get_text()
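The page loop above can be written more compactly with `itertools.islice`. The sketch below works with any iterable of objects that expose a `get_text()` method, which is the only assumption it makes about the pages; `extract_text` and `FakePage` are hypothetical names, and `FakePage` stands in for real PDF pages so that the example runs without a file.

```python
from itertools import islice


def extract_text(pages, max_pages=10):
    """Concatenate the text of the first max_pages page objects."""
    return "".join(page.get_text() for page in islice(pages, max_pages))


class FakePage:
    """Stand-in for a PDF page object; only get_text() is needed."""

    def __init__(self, text):
        self._text = text

    def get_text(self):
        return self._text


sample_text = extract_text([FakePage("a"), FakePage("b"), FakePage("c")], max_pages=2)
```

With the document opened above, the equivalent call would be `extract_text(doc, max_pages=10)`.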

In [51]:
print(total_text)

Provided proper attribution is provided, Google hereby grants permission to
reproduce the tables and figures in this paper solely for use in journalistic or
scholarly works.
Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.com
Noam Shazeer∗
Google Brain
noam@google.com
Niki Parmar∗
Google Research
nikip@google.com
Jakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.com
Aidan N. Gomez∗†
University of Toronto
aidan@cs.toronto.edu
Łukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and decoder through an attention
mechanism. We propose a new simple network architecture, the Transformer,
based solely on attention mechanisms, dispensing with recurrence and convolutions
entirely. Exper

In [52]:
print(f"Doc length: {count_tokens(total_text)} tokens")

Doc length: 7541 tokens
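Before sending a longer document, it helps to check that the prompt plus the requested output fits the model's context window. The sketch below is a simple budget check: `fits_context` is a hypothetical helper, and the default window of 100,000 tokens is an assumption for `anthropic.claude-v2` that should be verified against the model documentation.

```python
def fits_context(prompt_tokens, max_output_tokens, context_window=100_000):
    """Return True if prompt and requested output fit in the context window.

    context_window is an assumed value; check it for the model in use.
    """
    return prompt_tokens + max_output_tokens <= context_window


# 7541 document tokens plus up to 4096 generated tokens
document_fits = fits_context(7541, 4096)
```

The prompt wrapper adds a few tokens on top of the document itself, so counting the full prompt gives a more accurate check.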


In [53]:
prompt = f"""

Human: Please provide a summary of the following text.
<text>
{total_text}
</text>

Assistant:"""

In [54]:
body = json.dumps({
    "prompt": prompt,
    "max_tokens_to_sample": 4096,
    "temperature": 0.5,
    "top_k": 250,
    "top_p": 0.5,
    "stop_sequences": [],
})

In [None]:
modelId = 'anthropic.claude-v2' # change this to use a different version from the model provider
accept = 'application/json'
contentType = 'application/json'

response = boto3_bedrock.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
response_body = json.loads(response.get('body').read())

print_ww(response_body.get('completion'))
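The cells above use the text-completion request format of `anthropic.claude-v2`. If you change `modelId` to a Claude 3 model (for example `anthropic.claude-3-sonnet-20240229-v1:0`, assuming it is enabled in your account), the request body follows the Messages format instead. The sketch below builds such a body and extracts the text from a response body of that shape; both function names are hypothetical, and the sample response is a local stand-in.

```python
import json


def build_messages_body(text, max_tokens=1024, temperature=0.5):
    """Build a Messages-format request body asking for a summary of text."""
    prompt = f"Please provide a summary of the following text.\n<text>\n{text}\n</text>"
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": prompt}]}
        ],
    })


def extract_messages_text(response_body):
    """Join the text blocks of a Messages-format response body."""
    return "".join(
        block.get("text", "")
        for block in response_body.get("content", [])
        if block.get("type") == "text"
    )


# Stand-in for a parsed response body
sample_response_body = {"content": [{"type": "text", "text": "A summary."}]}
messages_summary = extract_messages_text(sample_response_body)
```

The body returned by `build_messages_body` is passed to `invoke_model` in the same way as before; only the payload and the response parsing change.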

### Let's check the length of the summary in tokens:

In [None]:
print(f"Summary length: {count_tokens(response_body.get('completion'))} tokens")

## Conclusion
You have now experimented with the `boto3` SDK, which provides direct access to the Amazon Bedrock API. Using this API, you generated summaries of both a short text and a longer document.

### Takeaways
- Adapt this notebook to experiment with different models available through Amazon Bedrock such as Amazon Titan and AI21 Labs Jurassic models.
- Change the prompts to your specific use case and evaluate the output of different models.
- Play with the token length to understand the latency and responsiveness of the service.
- Apply different prompt engineering principles to get better outputs.
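To compare latency across token lengths or models, a small timing wrapper is often enough. The sketch below measures wall-clock time around any callable; `time_call` is a hypothetical helper, and the lambda stands in for a real `invoke_model` call.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload; replace with a function that calls the model
result, elapsed = time_call(lambda x: x * 2, 21)
print(f"Result: {result}, elapsed: {elapsed:.6f} s")
```

Repeating each measurement several times and comparing medians gives a steadier picture than a single call.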