# Prompt Engineering with Amazon Bedrock

Based on the AWS Samples notebook "Amazon Bedrock Module 1 - Text Generation", this chapter introduces how prompt engineering is done with Amazon Bedrock.

# Introduction

Welcome to this introduction to building conversational AI with Amazon Bedrock's Converse API! The primary goal of this chapter is to provide a comprehensive introduction to Amazon Bedrock APIs for generating text. While we'll explore various use cases like summarization and code generation, our focus is on understanding the API patterns.

In this notebook, you will:
* Learn the basics of the Amazon Bedrock Invoke API
* Explore the more powerful Converse API and its features, such as multi-turn conversation, streaming, and function calling
* Apply these APIs across various foundation models
* Compare results across different state-of-the-art models

Prerequisites
* This notebook requires permissions to access Amazon Bedrock.
* Ensure that you have installed Boto3 (the AWS SDK for Python) and the AWS CLI, and that IAM credentials with access to Amazon Bedrock are configured for your AWS account.
* Before running this notebook, enable the following models in the Amazon Bedrock console (go to Model Access and select the models you want to use):
  * Anthropic Claude 3.7 Sonnet
  * Anthropic Claude 3.5 Sonnet
  * Anthropic Claude 3.5 Haiku
  * Amazon Nova Pro
  * Amazon Nova Micro
  * DeepSeek-R1
  * Meta Llama 3.1 70B Instruct

This notebook uses these models to demonstrate various capabilities of the Amazon Bedrock APIs.

In [1]:
import json
import boto3
import botocore
from IPython.display import display, Markdown
import time

In [2]:
# Initialize Bedrock client
session = boto3.session.Session()
region = session.region_name
bedrock = boto3.client(service_name='bedrock-runtime', region_name=region)
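
One thing to watch in the cell above: `session.region_name` is `None` when no default region is configured, and creating the client then fails. The following is a minimal sketch of a fallback, not part of the original notebook. The helper names `resolve_region` and `make_bedrock_client` and the `us-east-1` default are assumptions chosen for illustration; a US region seems a reasonable fallback because the model IDs used in this notebook carry a `us.` prefix.

```python
def resolve_region(session_region, fallback="us-east-1"):
    """Return the configured region, or a fallback when none is set.

    `fallback` is an illustrative default, not an AWS recommendation.
    """
    return session_region or fallback


def make_bedrock_client(fallback="us-east-1"):
    """Create a bedrock-runtime client that tolerates a missing default region."""
    import boto3  # imported lazily so resolve_region can be used without boto3

    session = boto3.session.Session()
    region = resolve_region(session.region_name, fallback)
    return session.client("bedrock-runtime", region_name=region)


# Usage (requires AWS credentials):
# bedrock = make_bedrock_client()
```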

In [3]:
# Define model IDs that will be used in this module
MODELS = {
    "Claude 3.7 Sonnet": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    "Claude 3.5 Sonnet": "us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    "Claude 3.5 Haiku": "us.anthropic.claude-3-5-haiku-20241022-v1:0",
    "Amazon Nova Pro": "us.amazon.nova-pro-v1:0",
    "Amazon Nova Micro": "us.amazon.nova-micro-v1:0",
    "DeepSeek-R1": "us.deepseek.r1-v1:0",
    "Meta Llama 3.1 70B Instruct": "us.meta.llama3-1-70b-instruct-v1:0"
}

In [4]:
# Utility function to display model responses in a more readable format
def display_response(response, model_name=None):
    if model_name:
        display(Markdown(f"### Response from {model_name}"))
    display(Markdown(response))
    print("\n" + "-"*80 + "\n")

# Text Summarization with Foundation Models

Let's start by exploring how to leverage Amazon Bedrock APIs for text summarization. We'll first use the basic Invoke API, then introduce the more powerful Converse API.

As an example, let's take a paragraph about Amazon Bedrock from an AWS blog post.

In [6]:
text_to_summarize = """
AWS took all of that feedback from customers, and today we are excited to announce Amazon Bedrock, \
a new service that makes FMs from AI21 Labs, Anthropic, Stability AI, and Amazon accessible via an API. \
Bedrock is the easiest way for customers to build and scale generative AI-based applications using FMs, \
democratizing access for all builders. Bedrock will offer the ability to access a range of powerful FMs \
for text and images—including Amazon's Titan FMs, which consist of two new LLMs we're also announcing \
today—through a scalable, reliable, and secure AWS managed service. With Bedrock's serverless experience, \
customers can easily find the right model for what they're trying to get done, get started quickly, privately \
customize FMs with their own data, and easily integrate and deploy them into their applications using the AWS \
tools and capabilities they are familiar with, without having to manage any infrastructure (including integrations \
with Amazon SageMaker ML features like Experiments to test different models and Pipelines to manage their FMs at scale).
"""

In [7]:
# Create prompt for summarization
prompt = f"""Please provide a summary of the following text. Do not add any information that is not mentioned in the text below.
<text>
{text_to_summarize}
</text>
"""

In [8]:
# Create request body for Claude 3.7 Sonnet
claude_body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1000,
    "temperature": 0.5,
    "top_p": 0.9,
    "messages": [
        {
            "role": "user",
            "content": [{"type": "text", "text": prompt}]
        }
    ],
})

In [9]:
# Send request to Claude 3.7 Sonnet
try:
    response = bedrock.invoke_model(
        modelId=MODELS["Claude 3.7 Sonnet"],
        body=claude_body,
        accept="application/json",
        contentType="application/json"
    )
    response_body = json.loads(response.get('body').read())
    
    # Extract and display the response text
    claude_summary = response_body["content"][0]["text"]
    display_response(claude_summary, "Claude 3.7 Sonnet (Invoke Model API)")
    
except botocore.exceptions.ClientError as error:
    if error.response['Error']['Code'] == 'AccessDeniedException':
        print(f"\x1b[41m{error.response['Error']['Message']}\
            \nTo troubleshoot this issue please refer to the following resources.\
            \nhttps://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html\
            \nhttps://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n")
    else:
        raise error

### Response from Claude 3.7 Sonnet (Invoke Model API)

# Summary

Amazon announced Amazon Bedrock, a new service that provides API access to foundation models (FMs) from AI21 Labs, Anthropic, Stability AI, and Amazon. Bedrock aims to democratize access to generative AI by offering an easy way for customers to build and scale applications. The service provides access to various text and image models, including Amazon's new Titan large language models (LLMs). As a serverless AWS managed service, Bedrock allows customers to find appropriate models, customize them with private data, and integrate them into applications using familiar AWS tools without managing infrastructure. It includes integrations with Amazon SageMaker features like Experiments and Pipelines.


--------------------------------------------------------------------------------
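
The cell above only handles `AccessDeniedException`. As a hedged sketch, the same pattern can be extended to a few other error codes that Bedrock Runtime reports. The helper `describe_error` and the hint wording are hypothetical and only illustrate the idea; the function takes a dictionary shaped like `error.response` from a `botocore.exceptions.ClientError`.

```python
# Hints keyed by error code. The hint text is illustrative wording,
# not taken from the AWS documentation.
ERROR_HINTS = {
    "AccessDeniedException": "Check IAM permissions and that model access is enabled.",
    "ThrottlingException": "Request rate or token quota exceeded; retry with backoff.",
    "ValidationException": "Check the request body and the model ID.",
    "ResourceNotFoundException": "The model or inference profile ID was not found.",
    "ModelTimeoutException": "The model took too long to respond; retry or shorten the prompt.",
}


def describe_error(error_response):
    """Turn a ClientError-style response dict into a one-line message."""
    error = error_response.get("Error", {})
    code = error.get("Code", "Unknown")
    message = error.get("Message", "")
    hint = ERROR_HINTS.get(code, "See the Amazon Bedrock documentation.")
    return f"{code}: {message} ({hint})"


# Usage inside an except block:
# except botocore.exceptions.ClientError as error:
#     print(describe_error(error.response))
```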



# Text Summarization using the Converse API (Recommended Approach)
While the Invoke Model API allows direct access to foundation models, it has several limitations:
* it uses different request/response formats for each model family;
* there is no built-in support for multi-turn conversations;
* it requires custom handling for different model capabilities.

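
To make the first limitation concrete, here is a rough sketch of what per-family handling can look like with the Invoke Model API. The Claude body mirrors the one used earlier in this notebook. The Llama field names (`prompt`, `max_gen_len`, `generation`) are an assumption based on the native Meta Llama request format and should be checked against the model's documentation. All four helper names are hypothetical.

```python
import json


def claude_invoke_body(prompt, max_tokens=500):
    """Request body in the Anthropic Messages format used earlier."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": prompt}]}
        ],
    })


def llama_invoke_body(prompt, max_tokens=500):
    """Request body in the (assumed) native Llama format."""
    return json.dumps({"prompt": prompt, "max_gen_len": max_tokens})


def claude_invoke_text(response_body):
    """Claude returns a list of content blocks."""
    return response_body["content"][0]["text"]


def llama_invoke_text(response_body):
    """Llama (assumed) returns a single 'generation' string."""
    return response_body["generation"]
```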
The Converse API addresses these limitations by providing a unified interface. Let's explore it on our text summarization task:

In [10]:
# Create a converse request with our summarization task
converse_request = {
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "text": f"Please provide a concise summary of the following text in 2-3 sentences. Text to summarize: {text_to_summarize}"
                }
            ]
        }
    ],
    "inferenceConfig": {
        "temperature": 0.1,
        "topP": 0.95,
        "maxTokens": 500
    }
}

In [11]:
# Call Claude 3.7 Sonnet with Converse API
try:
    response = bedrock.converse(
        modelId=MODELS["Claude 3.7 Sonnet"],
        messages=converse_request["messages"],
        inferenceConfig=converse_request["inferenceConfig"]
    )
    
    # Extract the model's response
    claude_converse_response = response["output"]["message"]["content"][0]["text"]
    display_response(claude_converse_response, "Claude 3.7 Sonnet (Converse API)")
except botocore.exceptions.ClientError as error:
    if error.response['Error']['Code'] == 'AccessDeniedException':
        print(f"\x1b[41m{error.response['Error']['Code']}: {error.response['Error']['Message']}\x1b[0m")
        print("Please ensure you have the necessary permissions for Amazon Bedrock.")
    else:
        raise error

### Response from Claude 3.7 Sonnet (Converse API)

AWS has launched Amazon Bedrock, a new service providing API access to foundation models (FMs) from various AI companies, including Amazon's own Titan LLMs. The service democratizes generative AI by offering a serverless experience where customers can easily find, customize, and deploy text and image models without managing infrastructure. Bedrock integrates with existing AWS tools, allowing users to test different models and manage FMs at scale.


--------------------------------------------------------------------------------



# Overview of the Converse API

Now that we have used the Converse API, let's take a closer look. To use the Converse API, you call the Converse or ConverseStream (for streaming responses) operations to send messages to a model. While it is possible to use the existing base inference operations (```InvokeModel``` or ```InvokeModelWithResponseStream```) for conversational applications as well, we recommend the Converse API because it provides a consistent API that works with all Amazon Bedrock models that support messages. This means you can write code once and use it with different models.

Should a model have unique inference parameters, the Converse API also allows you to pass those parameters in a model-specific structure. You can use the Amazon Bedrock Converse API to create conversational applications that send and receive messages to and from an Amazon Bedrock model. For example, you can create a chatbot that maintains a conversation over many turns and uses a persona or tone customization that is unique to your needs, such as a helpful technical support assistant. The Converse API also supports other Bedrock capabilities, like tool use and guardrails.
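
As a small sketch of the model-specific structure mentioned above: the Converse operation accepts an `additionalModelRequestFields` argument for parameters outside the common `inferenceConfig`. The helper `build_converse_kwargs` is hypothetical, and `top_k` is used here as an example of an Anthropic-specific sampling parameter; which fields a given model accepts is an assumption to verify in its documentation. The helper also omits optional keys instead of passing `None`, which Boto3 rejects during parameter validation.

```python
def build_converse_kwargs(model_id, prompt, inference_config=None, extra_fields=None):
    """Assemble keyword arguments for bedrock.converse(), skipping unset options."""
    kwargs = {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
    }
    if inference_config:
        kwargs["inferenceConfig"] = inference_config
    if extra_fields:
        # Model-specific parameters that are not part of inferenceConfig
        kwargs["additionalModelRequestFields"] = extra_fields
    return kwargs


kwargs = build_converse_kwargs(
    "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    "Summarize Amazon Bedrock in one sentence.",
    inference_config={"temperature": 0.1, "maxTokens": 200},
    extra_fields={"top_k": 50},  # assumed to be accepted by Anthropic models
)

# Usage (requires AWS credentials):
# response = bedrock.converse(**kwargs)
```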

Let's break down its key components (you can also review the documentation for a full list of parameters):

```
{
  "modelId": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "text": "Your prompt or message here"
        }
      ]
    }
  ],
  "system": [
    {
      "text": "You are a helpful AI assistant."
    }
  ],
  "inferenceConfig": {
    "temperature": 0.7,
    "topP": 0.9,
    "maxTokens": 2000,
    "stopSequences": []
  },
  "toolConfig": {
    "tools": [],
    "toolChoice": {
      "auto": {}
    }
  }
}
```
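
The response has a similarly uniform shape. The sketch below reads it a little more defensively than indexing `content[0]["text"]`: it joins every text block and skips blocks of other kinds (for example, reasoning output from some models). The helper names and the sample response are hypothetical; the sample is hand-written to mirror the documented response fields, not captured from a real call.

```python
def extract_text(response):
    """Concatenate all text blocks of a Converse response, ignoring other block types."""
    blocks = response["output"]["message"]["content"]
    return "".join(block["text"] for block in blocks if "text" in block)


def summarize_usage(response):
    """Collect token counts, latency, and stop reason into one flat dict."""
    usage = response.get("usage", {})
    metrics = response.get("metrics", {})
    return {
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
        "latency_ms": metrics.get("latencyMs"),
        "stop_reason": response.get("stopReason"),
    }


# Hand-written sample that imitates the Converse response structure
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"reasoningContent": {"reasoningText": {"text": "thinking"}}},
                {"text": "Bedrock is "},
                {"text": "a managed service."},
            ],
        }
    },
    "stopReason": "end_turn",
    "usage": {"inputTokens": 12, "outputTokens": 7, "totalTokens": 19},
    "metrics": {"latencyMs": 850},
}

print(extract_text(sample_response))
print(summarize_usage(sample_response))
```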

# Easily switch between models

One of the biggest advantages of the Converse API is the ability to easily switch between models using the exact same request format. Let's compare summaries across different foundation models by looping over the model dictionary we defined above:

In [12]:
# Call different models with the same Converse request
results = {}
for model_name, model_id in MODELS.items():  # loop over all models defined above
    try:
        start_time = time.time()
        response = bedrock.converse(
            modelId=model_id,
            messages=converse_request["messages"],
            inferenceConfig=converse_request["inferenceConfig"]
        )
        end_time = time.time()

        # Extract the model's response text from the first content block
        model_response = response["output"]["message"]["content"][0]["text"]
        response_time = round(end_time - start_time, 2)

        results[model_name] = {
            "response": model_response,
            "time": response_time
        }

        print(f"✅ Successfully called {model_name} (took {response_time} seconds)")

    except Exception as e:
        # Record the failure so the remaining models are still called
        print(f"❌ Error calling {model_name}: {str(e)}")
        results[model_name] = {
            "response": f"Error: {str(e)}",
            "time": None
        }

✅ Successfully called Claude 3.7 Sonnet (took 2.94 seconds)
✅ Successfully called Claude 3.5 Sonnet (took 3.01 seconds)
✅ Successfully called Claude 3.5 Haiku (took 2.46 seconds)
✅ Successfully called Amazon Nova Pro (took 1.64 seconds)
✅ Successfully called Amazon Nova Micro (took 0.74 seconds)
✅ Successfully called DeepSeek-R1 (took 2.55 seconds)
✅ Successfully called Meta Llama 3.1 70B Instruct (took 4.15 seconds)


In [13]:
# Display results in a formatted way
for model_name, result in results.items():
    if result["time"] is not None:  # failed calls are stored with time=None
        display(Markdown(f"### {model_name} (took {result['time']} seconds)"))
        display(Markdown(result["response"]))
        print("-" * 80)

### Claude 3.7 Sonnet (took 2.94 seconds)

AWS has launched Amazon Bedrock, a new service providing API access to foundation models (FMs) from various AI companies, including Amazon's own Titan LLMs. The service enables customers to easily build and scale generative AI applications by finding suitable models, customizing them with private data, and integrating them using familiar AWS tools without managing infrastructure.

--------------------------------------------------------------------------------


### Claude 3.5 Sonnet (took 3.01 seconds)

Amazon has announced Bedrock, a new service that provides API access to foundation models (FMs) from various AI companies and Amazon itself. Bedrock aims to democratize access to generative AI by offering a user-friendly, serverless platform for building and scaling AI applications. The service allows customers to easily find, customize, and deploy FMs for text and image tasks using familiar AWS tools and infrastructure, without the need to manage complex systems.

--------------------------------------------------------------------------------


### Claude 3.5 Haiku (took 2.46 seconds)

AWS has launched Amazon Bedrock, a new service that provides easy access to foundation models from various AI companies through an API, enabling customers to build and scale generative AI applications. Bedrock offers a serverless experience that allows users to access, customize, and deploy foundation models privately and securely without managing infrastructure. The service includes models from AI21 Labs, Anthropic, Stability AI, and Amazon, including two new Amazon Titan large language models.

--------------------------------------------------------------------------------


### Amazon Nova Pro (took 1.64 seconds)

AWS has launched Amazon Bedrock, a new service that provides easy access to Foundation Models (FMs) from various providers like AI21 Labs, Anthropic, Stability AI, and Amazon via an API, enabling customers to build and scale generative AI applications. Bedrock offers a range of powerful text and image FMs, including Amazon's new Titan FMs, through a secure, scalable AWS managed service, allowing users to customize, integrate, and deploy models seamlessly using familiar AWS tools without infrastructure management.

--------------------------------------------------------------------------------


### Amazon Nova Micro (took 0.74 seconds)

AWS has launched Amazon Bedrock, a new service that provides easy access to generative AI models from AI21 Labs, Anthropic, Stability AI, and Amazon via an API, enabling developers to quickly build and scale AI-based applications without managing infrastructure. Bedrock offers scalable, secure access to a range of powerful models for text and images, including Amazon's new Titan models, and integrates seamlessly with AWS tools like SageMaker.

--------------------------------------------------------------------------------


### DeepSeek-R1 (took 2.55 seconds)



Amazon Bedrock is a new AWS service providing API access to powerful foundation models (FMs) from AI21 Labs, Anthropic, Stability AI, and Amazon, including Amazon's new Titan text models. It enables developers to build and scale generative AI applications effortlessly with a serverless, no-infrastructure platform, allowing customization using private data and integration with familiar AWS tools like SageMaker for testing and deployment. Bedrock democratizes access to advanced AI by combining scalable, secure managed services with streamlined model selection and customization.

--------------------------------------------------------------------------------


### Meta Llama 3.1 70B Instruct (took 4.15 seconds)



Here is a 3-sentence summary of the text:

Amazon Web Services (AWS) has launched Amazon Bedrock, a new service that provides access to foundation models (FMs) from leading AI labs via an API. Bedrock makes it easy for customers to build and scale generative AI-based applications using FMs, with a scalable, reliable, and secure managed service. The service offers a serverless experience, allowing customers to easily find, customize, and deploy FMs into their applications without managing infrastructure.

--------------------------------------------------------------------------------
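
The timings collected in `results` can also be ranked. The helper below is a minimal sketch with a hypothetical name, `rank_by_latency`; it relies only on the `results` structure built in the loop above, where failed calls are stored with `time` set to `None`. The sample values are copied from the output shown earlier, plus one made-up failed entry.

```python
def rank_by_latency(results):
    """Return (model_name, seconds) pairs sorted fastest first, skipping failures."""
    timed = [
        (name, result["time"])
        for name, result in results.items()
        if result["time"] is not None
    ]
    return sorted(timed, key=lambda item: item[1])


sample_results = {
    "Claude 3.7 Sonnet": {"response": "...", "time": 2.94},
    "Amazon Nova Micro": {"response": "...", "time": 0.74},
    "Meta Llama 3.1 70B Instruct": {"response": "...", "time": 4.15},
    "Unavailable model": {"response": "Error: example failure", "time": None},
}

for name, seconds in rank_by_latency(sample_results):
    print(f"{name}: {seconds} s")
```

A single call per model is a noisy measurement, so a ranking like this is best read as a rough indication rather than a benchmark.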


# Cross-Regional Inference in Amazon Bedrock

Amazon Bedrock offers Cross-Regional Inference which automatically selects the optimal AWS Region within your geography to process your inference requests. Cross-Regional Inference offers higher throughput limits (up to 2x allocated quotas) and seamlessly manages traffic bursts by dynamically routing requests across multiple AWS regions, enhancing application resilience during peak demand periods without additional routing or data transfer costs. Customers can control where their inference data flows by selecting from a pre-defined set of regions, helping them comply with applicable data residency requirements and sovereignty laws. Moreover, this capability prioritizes the connected Bedrock API source region when possible, helping to minimize latency and improve responsiveness. As a result, customers can enhance their applications' reliability, performance, and efficiency. Please review the list of supported regions and models for inference profiles.

To use Cross-Regional Inference, you simply need to specify a cross-region inference profile as the modelId when making a request. Cross-region inference profiles are identified by including a region prefix (e.g., us. or eu.) before the model name.

For example:
```
{
    "Amazon Nova Pro": "amazon.nova-pro-v1:0",  # Regular model ID
    "Amazon Nova Pro (CRIS)": "us.amazon.nova-pro-v1:0"  # Cross-regional model ID
}
```
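
The prefix convention can be captured in a small helper. This is only a sketch: the function name `to_inference_profile_id` is hypothetical, the list of prefixes is an assumption and not exhaustive, and building an ID this way does not guarantee that a profile exists for that model in that geography.

```python
# Geography prefixes assumed for illustration; check the supported list in the docs
KNOWN_GEO_PREFIXES = ("us", "eu", "apac")


def to_inference_profile_id(model_id, geo="us"):
    """Prefix a base model ID with a geography, leaving already-prefixed IDs unchanged."""
    if geo not in KNOWN_GEO_PREFIXES:
        raise ValueError(f"Unknown geography prefix: {geo}")
    first_segment = model_id.split(".", 1)[0]
    if first_segment in KNOWN_GEO_PREFIXES:
        return model_id  # already a cross-region inference profile ID
    return f"{geo}.{model_id}"


print(to_inference_profile_id("amazon.nova-pro-v1:0"))
print(to_inference_profile_id("us.amazon.nova-pro-v1:0"))
```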
Let's see how easy it is to use Cross-Regional Inference by invoking the Claude 3.5 Sonnet model:

In [14]:
# Regular model invocation (standard region)
standard_response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # Standard model ID
    messages=converse_request["messages"]
)

# Cross-region inference (note the "us." prefix)
cris_response = bedrock.converse(
    modelId="us.anthropic.claude-3-5-sonnet-20240620-v1:0",  # Cross-region model ID with regional prefix
    messages=converse_request["messages"]
)

# Print responses
print("Standard response:", standard_response["output"]["message"]["content"][0]["text"])
print("Cross-region response:", cris_response["output"]["message"]["content"][0]["text"])

Standard response: Amazon has announced Bedrock, a new service providing API access to foundation models (FMs) from various AI companies and Amazon itself. Bedrock aims to democratize access to generative AI by offering a user-friendly, serverless platform for building and scaling AI applications. It allows customers to easily select, customize, and deploy FMs for text and image tasks using familiar AWS tools and infrastructure.
Cross-region response: Amazon has announced Bedrock, a new service that provides easy access to various foundation models (FMs) from different AI companies through an API. This service aims to democratize generative AI by allowing developers to build and scale applications using these models, including Amazon's own newly announced Titan FMs. Bedrock offers a serverless experience, enabling customers to select, customize, and integrate FMs into their applications using familiar AWS tools and without managing infrastructure.


# Multi-turn Conversations

The Converse API makes multi-turn conversations simple. Let's see it in action:

In [15]:
# Example of a multi-turn conversation with Converse API
multi_turn_messages = [
    {
        "role": "user",
        "content": [{"text": f"Please summarize this text: {text_to_summarize}"}]
    },
    {
        "role": "assistant",
        "content": [{"text": results["Claude 3.7 Sonnet"]["response"]}]
    },
    {
        "role": "user",
        "content": [{"text": "Can you make this summary even shorter, just 1 sentence?"}]
    }
]

try:
    response = bedrock.converse(
        modelId=MODELS["Claude 3.7 Sonnet"],
        messages=multi_turn_messages,
        inferenceConfig={"temperature": 0.2, "maxTokens": 500}
    )
    
    # Extract the model's response using the correct structure
    follow_up_response = response["output"]["message"]["content"][0]["text"]
    display_response(follow_up_response, "Claude 3.7 Sonnet (Multi-turn conversation)")
    
except Exception as e:
    print(f"Error: {str(e)}")

### Response from Claude 3.7 Sonnet (Multi-turn conversation)

Amazon Bedrock is AWS's new serverless service that provides API access to foundation models from multiple providers, allowing customers to easily build, customize, and deploy generative AI applications without managing infrastructure.


--------------------------------------------------------------------------------
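
In the example above the message history was assembled by hand. For longer chats it can be convenient to keep the history in an object. The following is a minimal sketch under stated assumptions: the `Conversation` class is hypothetical, and `EchoClient` is a stand-in that imitates only the part of the Converse response used here, so the code can run without AWS access. With real credentials you would pass the `bedrock` client instead.

```python
class Conversation:
    """Keeps the running message history for one multi-turn chat."""

    def __init__(self, client, model_id, inference_config=None):
        self.client = client
        self.model_id = model_id
        self.inference_config = inference_config
        self.messages = []

    def ask(self, text):
        user_message = {"role": "user", "content": [{"text": text}]}
        pending = self.messages + [user_message]
        kwargs = {"modelId": self.model_id, "messages": pending}
        if self.inference_config:
            kwargs["inferenceConfig"] = self.inference_config
        response = self.client.converse(**kwargs)
        reply = response["output"]["message"]
        # Commit both turns only after the call succeeded
        self.messages = pending + [reply]
        return "".join(b["text"] for b in reply["content"] if "text" in b)


class EchoClient:
    """Offline stand-in for the bedrock-runtime client (illustration only)."""

    def __init__(self):
        self.calls = []

    def converse(self, **kwargs):
        self.calls.append(kwargs)
        count = len(kwargs["messages"])
        text = f"reply after {count} message(s)"
        return {"output": {"message": {"role": "assistant", "content": [{"text": text}]}}}


client = EchoClient()
chat = Conversation(client, "example-model-id")
first = chat.ask("Please summarize this text.")
second = chat.ask("Can you make it one sentence?")
print(first)
print(second)
```

Each call sends the full history, which is how the model sees earlier turns; the API itself does not store conversation state between requests.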



# Streaming Responses with ConverseStream API

For longer generations, you might want to receive the content as it's being generated. The ConverseStream API supports streaming, which allows you to process the response incrementally:

In [16]:
# Example of streaming with the ConverseStream API
def stream_converse(model_id, messages, inference_config=None):
    if inference_config is None:
        inference_config = {}
    
    print("Streaming response (chunks will appear as they are received):\n")
    print("-" * 80)
    
    full_response = ""
    
    try:
        response = bedrock.converse_stream(
            modelId=model_id,
            messages=messages,
            inferenceConfig=inference_config
        )
        response_stream = response.get('stream')
        if response_stream:
            for event in response_stream:

                if 'messageStart' in event:
                    print(f"\nRole: {event['messageStart']['role']}")

                if 'contentBlockDelta' in event:
                    # Deltas can carry non-text payloads, so read the text defensively
                    chunk = event['contentBlockDelta']['delta'].get('text', '')
                    print(chunk, end="")
                    full_response += chunk  # accumulate so the full text is returned

                if 'messageStop' in event:
                    print(f"\nStop reason: {event['messageStop']['stopReason']}")

                if 'metadata' in event:
                    metadata = event['metadata']
                    if 'usage' in metadata:
                        print("\nToken usage")
                        print(f"Input tokens: {metadata['usage']['inputTokens']}")
                        print(f"Output tokens: {metadata['usage']['outputTokens']}")
                        print(f"Total tokens: {metadata['usage']['totalTokens']}")
                    if 'metrics' in metadata:
                        print(f"Latency: {metadata['metrics']['latencyMs']} milliseconds")

            print("\n" + "-" * 80)
        return full_response
    
    except Exception as e:
        print(f"Error in streaming: {str(e)}")
        return None
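
The event handling can be tried out without calling a model. The sketch below separates the accumulation logic into a hypothetical helper, `collect_stream_text`, and feeds it hand-written events that imitate the shape of ConverseStream events. The sample events are an assumption made for illustration, not captured output.

```python
def collect_stream_text(events, on_text=None):
    """Join text deltas from stream events and return (text, stop_reason)."""
    parts = []
    stop_reason = None
    for event in events:
        if "contentBlockDelta" in event:
            text = event["contentBlockDelta"]["delta"].get("text")
            if text:
                parts.append(text)
                if on_text:
                    on_text(text)  # e.g. print each chunk as it arrives
        elif "messageStop" in event:
            stop_reason = event["messageStop"].get("stopReason")
    return "".join(parts), stop_reason


# Hand-written events imitating the stream structure
sample_events = [
    {"messageStart": {"role": "assistant"}},
    {"contentBlockDelta": {"delta": {"text": "Amazon "}, "contentBlockIndex": 0}},
    {"contentBlockDelta": {"delta": {"reasoningContent": {"text": "..."}}, "contentBlockIndex": 0}},
    {"contentBlockDelta": {"delta": {"text": "Bedrock"}, "contentBlockIndex": 0}},
    {"messageStop": {"stopReason": "end_turn"}},
]

text, reason = collect_stream_text(sample_events)
print(text, reason)

# Usage with a real stream (requires AWS credentials):
# response = bedrock.converse_stream(modelId=model_id, messages=messages)
# text, reason = collect_stream_text(response["stream"], on_text=lambda t: print(t, end=""))
```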

In [17]:
# Let's try streaming a longer summary
streaming_request = [
    {
        "role": "user",
        "content": [
            {
                "text": f"""Please provide a detailed summary of the following text, explaining its key points and implications:
                
                {text_to_summarize}
                
                Make your summary comprehensive but clear.
                """
            }
        ]
    }
]

In [18]:
# Only run this when you're ready to see streaming output
streamed_response = stream_converse(
    MODELS["Claude 3.7 Sonnet"], 
    streaming_request, 
    inference_config={"temperature": 0.4, "maxTokens": 1000}
)


Streaming response (chunks will appear as they are received):

--------------------------------------------------------------------------------

Role: assistant
# Summary: Amazon Bedrock - Democratizing Access to Foundation Models

## Key Points

Amazon has launched Amazon Bedrock, a new service that provides API access to Foundation Models (FMs) from multiple providers including AI21 Labs, Anthropic, Stability AI, and Amazon itself. This launch responds directly to customer feedback requesting easier access to these powerful AI models.

The service offers:

1. **Diverse Model Access**: Users can access various powerful foundation models for text and image generation, including Amazon's newly announced Titan Large Language Models (LLMs).

2. **Serverless Architecture**: Bedrock provides a fully managed, serverless experience, eliminating the need for customers to manage infrastructure.

3. **Customization Capabilities**: Users can privately customize foundation models with their own da