# Task 2: Caching Bedrock responses
In this assignment, you will explore the Amazon Bedrock Converse API through an interactive chat session. You will learn how to cache responses and conserve tokens to reduce the overall cost of using Bedrock.

## Task 2.1: Setting up Bedrock
In the Terminal application, run the following command: `jupyter notebook`

In the Jupyter window, select File, New, and then Notebook.

Select Untitled and enter genai-exercise7-caching.ipynb as the name for your newly created notebook.

Choose Rename.

In the first cell, paste the following code. Then select the cell and choose the play button at the top of the notebook to run it. This sets up the Bedrock runtime client and the model ID that later steps use.

In [1]:
import boto3
import json
from datetime import datetime

bedrock=boto3.client('bedrock-runtime',region_name='us-east-1')

MODEL_ID = "amazon.nova-micro-v1:0"

## Task 2.2: Training Bedrock
Create a new cell in your Jupyter notebook.

Paste the following code in the newly created cell. Then select the cell and choose the play button at the top of the notebook to run it. This defines a system prompt containing two example review and summary pairs, along with the review that you want summarized.

In [2]:
system_prompt = """
 You are an assistant that summarizes music reviews for a record company.
 Here are examples:

 Review: The latest album by The New Wave Band is a masterpiece! Every track is a hit.
 Summary: Reviewer praises the latest album as a masterpiece with hit tracks.

 Review: I was disappointed with the new single; it lacked the energy of their previous work.
 Summary: Reviewer expresses disappointment, noting a lack of energy compared to previous work.
 """

user_input_review = "This EP is a solid effort with a few standout songs, though some tracks feel repetitive."
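
The system prompt above follows a few-shot pattern: a role line followed by labeled example pairs. As a minimal sketch, the same string could be assembled from a list of pairs so that examples are easier to add or swap. The helper name `build_few_shot_prompt` and its parameters are hypothetical and not part of the lab code.

```python
def build_few_shot_prompt(role, examples, input_label="Review", output_label="Summary"):
    """Assemble a system prompt from a role line and (input, output) example pairs."""
    lines = [role, "Here are examples:", ""]
    for example_input, example_output in examples:
        lines.append(f"{input_label}: {example_input}")
        lines.append(f"{output_label}: {example_output}")
        lines.append("")  # blank line between examples
    # Drop the trailing blank line and end with a single newline.
    return "\n".join(lines).rstrip() + "\n"


# The two example pairs from the system prompt in this task.
examples = [
    ("The latest album by The New Wave Band is a masterpiece! Every track is a hit.",
     "Reviewer praises the latest album as a masterpiece with hit tracks."),
    ("I was disappointed with the new single; it lacked the energy of their previous work.",
     "Reviewer expresses disappointment, noting a lack of energy compared to previous work."),
]

generated_prompt = build_few_shot_prompt(
    "You are an assistant that summarizes music reviews for a record company.",
    examples,
)
print(generated_prompt)
```

Passing different labels (for example `input_label="Description"`) would produce a prompt in the same shape for another domain.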

## Task 2.3: No caching
Create a new cell in your Jupyter notebook.

Paste the following code in the newly created cell:

In [None]:
# Define the payload for the API request without caching
no_cache_payload = {
     "system": [ # Start of the system prompt list
         {"text": system_prompt} # The system prompt text
     ],
     "messages": [ # Start of the user messages list
         {
             "role": "user", # Role of the message sender
             "content": [ # Content of the user message
                 {"text": user_input_review + "\nSummary:"} # The user's review text followed by "Summary:"
             ]
         }
     ]
 }

# Make an API call to the bedrock service using the defined payload
no_cache_response = bedrock.converse(
     modelId=MODEL_ID, # Specify the model ID to use
     system=no_cache_payload["system"], # Pass the system prompt from the payload
     messages=no_cache_payload["messages"] # Pass the user messages from the payload
 )

# Extract the generated text output from the API response
no_cache_output = no_cache_response['output']['message']['content'][0]['text']
# Extract the number of input tokens used from the API response
no_cache_input_tokens = no_cache_response['usage']['inputTokens']
# Extract the number of output tokens generated from the API response
no_cache_output_tokens = no_cache_response['usage']['outputTokens']
# Calculate the total number of tokens used (input + output)
no_cache_tokens = no_cache_input_tokens + no_cache_output_tokens

# Print a header indicating the summary is without caching
print("[No Caching] Generated Summary:")
# Print the generated summary text
print(no_cache_output)
# Print the total number of tokens used for this request
print(f"Total Tokens Used (No Cache): {no_cache_tokens}")

[No Caching] Generated Summary:
Reviewer acknowledges the EP as a solid effort with a few standout songs, but notes that some tracks feel repetitive.
Total Tokens Used (No Cache): 128
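
The token bookkeeping in the cell above could be wrapped in a small helper so that later cells report usage the same way. This is a sketch only: `summarize_usage` is a hypothetical name, and `sample_response` is a hand-written stand-in with placeholder numbers, not real Bedrock output.

```python
def summarize_usage(response):
    """Return input, output, and total token counts from a Converse-style response dict."""
    usage = response["usage"]
    input_tokens = usage["inputTokens"]
    output_tokens = usage["outputTokens"]
    return {
        "input": input_tokens,
        "output": output_tokens,
        "total": input_tokens + output_tokens,
    }


# Hand-written stand-in with the same shape the cell above reads from.
# The token counts are placeholders chosen for illustration.
sample_response = {
    "output": {"message": {"content": [{"text": "Reviewer finds the EP solid."}]}},
    "usage": {"inputTokens": 100, "outputTokens": 20},
}

summary = summarize_usage(sample_response)
print(f"Total Tokens Used: {summary['total']}")
```

In the notebook, `summarize_usage(no_cache_response)` would take the place of the four extraction lines.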


## Task 2.4: With caching
In this task, you will add a cache point to the request and compare its token usage with that of the uncached API call from Task 2.3.

Create a new cell in your Jupyter notebook.

Paste the following code in the newly created cell:

In [None]:
# Define a dictionary for caching configuration.
# This specifies that a default cache point should be used for the current request.
cache_point = {"cachePoint": {"type": "default"}}

# Construct the payload for the bedrock.converse API call, including caching.
payload_with_caching = {
    "system": [
        # The system prompt provides context or instructions to the model.
        {"text": system_prompt},
        # Place the cache point after the system prompt.
        # This marks the content before it as a prefix that Bedrock can cache and reuse.
        cache_point
    ],
    "messages": [
        {
            "role": "user",
            "content": [
                # The user's input, which includes the review to be summarized.
                # Adding "Summary:" at the end prompts the model for a summary.
                {"text": user_input_review + "\nSummary:"}
            ]
        }
    ]
}

# Call the Amazon Bedrock Converse API with the model ID and the constructed payload.
# Because of `cache_point`, Bedrock can reuse the cached system prompt on later matching requests.
response = bedrock.converse(
    modelId=MODEL_ID,
    system=payload_with_caching["system"],
    messages=payload_with_caching["messages"]
)

# Extract the generated summary text from the model's response.
cache_output = response['output']['message']['content'][0]['text']
# Extract the number of input tokens used for the request.
cache_input_tokens = response['usage']['inputTokens']
# Extract the number of output tokens generated by the model.
cache_output_tokens = response['usage']['outputTokens']
# Calculate the total number of tokens (input + output) used for this interaction.
cache_tokens = cache_input_tokens + cache_output_tokens

# Print a header indicating that the following output is with caching enabled.
print("[With Caching] Generated Summary:")
# Print the summary generated by the model.
print(cache_output)
# Print the total number of tokens consumed by this cached interaction.
print(f"Total Tokens Used: {cache_tokens}")

[With Caching] Generated Summary:
Reviewer acknowledges the EP as a solid effort with a few standout songs, but notes that some tracks feel repetitive.
Total Tokens Used: 44
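
The cached and uncached requests differ only in the extra `cachePoint` entry in the `system` list. One way to make that explicit is a single builder with a flag, sketched here. The function name `build_converse_kwargs`, the `use_cache` parameter, and the short placeholder prompt strings are assumptions for illustration.

```python
def build_converse_kwargs(model_id, system_prompt, user_text, use_cache=False):
    """Build keyword arguments for bedrock.converse, optionally adding a cache point."""
    system = [{"text": system_prompt}]
    if use_cache:
        # Same cache point dictionary used in the cell above.
        system.append({"cachePoint": {"type": "default"}})
    return {
        "modelId": model_id,
        "system": system,
        "messages": [
            {"role": "user", "content": [{"text": user_text + "\nSummary:"}]}
        ],
    }


# Placeholder prompt and review text, used only to show the request shape.
plain = build_converse_kwargs("amazon.nova-micro-v1:0", "You summarize reviews.", "Great EP.")
cached = build_converse_kwargs(
    "amazon.nova-micro-v1:0", "You summarize reviews.", "Great EP.", use_cache=True
)

print(len(plain["system"]), len(cached["system"]))
```

In the notebook, the call would then be `bedrock.converse(**cached)`, with `MODEL_ID`, `system_prompt`, and `user_input_review` passed in place of the placeholder strings.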


## Task 3: Changing your Bedrock prompt
In this task, you'll modify your Bedrock instructions to observe how caching behaves when you change your original request.

Return to the Jupyter cell that contains this text: system_prompt.

Replace the content of that cell with the following text:

In [7]:
system_prompt = """
 You are a pet expert that will tell the user what type of pet they have based on a description. You'll keep your responses short and generalized.
 Here are examples:

 Description: My pet barks and has four legs.
 Summary: Your pet is a dog.

 Description: My pet meows and uses a litter box.
 Summary: Your pet is a cat.
 """

user_input_review = "My pet sings and has wings."

Run this cell, and then rerun the caching code from Task 2.4:

In [12]:
cache_point = {"cachePoint": {"type": "default"}}

payload_with_caching = {
     "system": [
         {"text": system_prompt},
         cache_point
     ],
     "messages": [
         {
             "role": "user",
             "content": [
                 {"text": user_input_review + "\nSummary:"}
             ]
         }
     ]
 }

response = bedrock.converse(
     modelId=MODEL_ID,
     system=payload_with_caching["system"],
     messages=payload_with_caching["messages"]
 )

cache_output = response['output']['message']['content'][0]['text']
cache_input_tokens = response['usage']['inputTokens']
cache_output_tokens = response['usage']['outputTokens']
cache_tokens = cache_input_tokens + cache_output_tokens

print("[With Caching] Generated Summary:")
print(cache_output)
print(f"Total Tokens Used: {cache_tokens}")

[With Caching] Generated Summary:
Your pet is a bird.
Total Tokens Used: 17
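
The totals printed above add only `inputTokens` and `outputTokens`. To see whether a request wrote to or read from the cache, it may help to inspect the cache counters as well. The sketch below assumes the `usage` object can also carry `cacheReadInputTokens` and `cacheWriteInputTokens`, and that these are counted separately from `inputTokens`; both points are assumptions based on my reading of the Converse API and are not shown in this lab, so check them against an actual response. The sample numbers are placeholders.

```python
def cache_status(usage):
    """Classify a Converse-style usage dict as a cache hit, a cache write, or neither."""
    # Assumed field names; missing fields are treated as zero.
    read = usage.get("cacheReadInputTokens") or 0
    written = usage.get("cacheWriteInputTokens") or 0
    if read > 0:
        return "hit"      # part of the prompt was served from the cache
    if written > 0:
        return "write"    # the prefix was stored for later requests
    return "none"         # no cache activity reported


def total_input_tokens(usage):
    """Sum regular and cache-related input tokens, under the assumption stated above."""
    return (
        usage["inputTokens"]
        + (usage.get("cacheReadInputTokens") or 0)
        + (usage.get("cacheWriteInputTokens") or 0)
    )


# Placeholder usage dicts for illustration only.
first_call = {"inputTokens": 10, "outputTokens": 5, "cacheWriteInputTokens": 90}
repeat_call = {"inputTokens": 10, "outputTokens": 5, "cacheReadInputTokens": 90}
uncached_call = {"inputTokens": 100, "outputTokens": 5}

for name, usage in [("first", first_call), ("repeat", repeat_call), ("uncached", uncached_call)]:
    print(name, cache_status(usage), total_input_tokens(usage))
```

In the notebook, `cache_status(response['usage'])` after changing `system_prompt` would be expected to report a write or no activity instead of a hit, since the cached prefix no longer matches.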
