# AWS Bedrock


AWS Bedrock is a fully managed service that provides access to foundation models (FMs) from leading AI companies. It allows you to build and scale generative AI applications using these models. In this notebook, we will explore how to use AWS Bedrock with the `boto3` library in Python.

## LLMs and Foundation Models
Foundation models are large-scale machine learning models that are trained on vast amounts of data and can be fine-tuned for specific tasks. AWS Bedrock provides access to various foundation models, including text generation, image generation, and more. </br>

* An LLM is a stateless function. When "memory" or "context" is required, it must be passed as part of the prompt.
* Fetching information from a database or another source and adding it to the prompt is called "retrieval-augmented generation" (RAG).
* The data source used to fetch information is called a knowledge base.
* LLM [pricing](https://aws.amazon.com/bedrock/pricing/) depends on the model and on the number of tokens in the prompt and the response (one token is roughly 4 characters of English text).
* The longer the prompt, the more expensive and the slower the request.
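To make the link between prompt length and cost concrete, here is a minimal sketch of a cost estimate. It assumes the rough rule of about 4 characters per token; the function names and the prices are hypothetical and only illustrate the arithmetic. Real token counts come from the model's tokenizer, and real prices from the pricing page.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate, assuming about 4 characters per token."""
    return round(len(text) / chars_per_token)


def estimate_cost(prompt: str, expected_output_tokens: int,
                  input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Estimate the cost of one request, in the same currency as the prices."""
    input_tokens = estimate_tokens(prompt)
    input_cost = (input_tokens / 1000) * input_price_per_1k
    output_cost = (expected_output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Hypothetical prices, for illustration only.
prompt = "Summarize the following meeting transcript in three bullet points."
print("Estimated prompt tokens:", estimate_tokens(prompt))
print("Estimated cost:", estimate_cost(prompt, expected_output_tokens=200,
                                       input_price_per_1k=0.001,
                                       output_price_per_1k=0.002))
```

Input and output tokens are usually priced separately, which is why the sketch takes two prices.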

### Using AWS Bedrock models and regions

Model availability in AWS Bedrock is region-specific. `us-east-1` (N. Virginia) typically offers the widest selection of models. </br>
To use a model from a specific provider, you need to request access to that model in the [Model Access](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess) section. </br>
In this tutorial we'll be using the `us-east-1` region with several different models.

### Cross-region inference
Cross-region inference is an AWS feature that enables LLM requests to be processed in a different geographical region than the one where the request originated.
Instead of being limited to the compute resources available in a single region, cross-region inference can automatically route your inference requests to other available regions.
You can use the `Inference Profile ID` instead of the `Model ID` to specify the model you want to use. An inference profile identifies a model together with the set of regions that requests can be routed to. </br>
For a list of available models and their Inference Profile IDs, please refer to the [AWS Cross-region inference Console](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/inference-profiles).
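The same list can, as far as I know, also be fetched programmatically. The sketch below assumes the `bedrock` control-plane client (not `bedrock-runtime`) exposes `list_inference_profiles()` and that each summary carries an `inferenceProfileId` field; verify both against the boto3 documentation. The helper name is hypothetical, and pagination is ignored for brevity.

```python
def list_inference_profile_ids(bedrock_client) -> list:
    """Return the inference profile IDs visible to the given Bedrock client.

    Assumption: the response has an 'inferenceProfileSummaries' list whose
    items contain 'inferenceProfileId'. Only the first page is read.
    """
    response = bedrock_client.list_inference_profiles()
    return [
        summary["inferenceProfileId"]
        for summary in response.get("inferenceProfileSummaries", [])
    ]


# Usage (requires AWS credentials and Bedrock permissions):
#   import boto3
#   bedrock = boto3.client("bedrock", region_name="us-east-1")
#   for profile_id in list_inference_profile_ids(bedrock):
#       print(profile_id)
```

Any ID returned this way can be passed as `modelId` to the runtime client in place of a plain model ID.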



## Zero Shot
The following example is called `zero-shot` prompting. </br>
There are no examples or context provided to the model. The model is expected to understand the task and generate a response based on its own training. </br>

> Change the `model_id` in the code below to switch between different LLMs. </br>

In [1]:
import boto3
from botocore.exceptions import ClientError

# Create a Bedrock Runtime client in the AWS Region you want to use.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Uncomment one of the following lines to compare different models.
# model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0" # Model ID
# model_id = "us.anthropic.claude-3-5-sonnet-20241022-v2:0" # Inference Profile ID
# model_id = "amazon.titan-text-express-v1"
model_id = "amazon.nova-lite-v1:0"

# Start a conversation with the user message.
user_message = """Meeting transcript:
Miguel: Hi Brant, I want to discuss the workstream  for our new product launch
Brant: Sure Miguel, is there anything in particular you want to discuss?
Miguel: Yes, I want to talk about how users enter into the product.
Brant: Ok, in that case let me add in Namita.
Namita: Hey everyone
Brant: Hi Namita, Miguel wants to discuss how users enter into the product.
Miguel: its too complicated and we should remove friction. for example, why do I need to fill out additional forms?  I also find it difficult to find where to access the product when I first land on the landing page.
Brant: I would also add that I think there are too many steps.
Namita: Ok, I can work on the landing page to make the product more discoverable but brant can you work on the additional forms?
Brant: Yes but I would need to work with James from another team as he needs to unblock the sign up workflow.  Miguel can you document any other concerns so that I can discuss with James only once?
Miguel: Sure.

From the meeting transcript above, Create a list of action items for each person.
"""
conversation = [
    {
        "role": "user",
        "content": [{"text": user_message}],
    }
]

try:
    # Send the message to the model, using a basic inference configuration.
    #  https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html
    response = client.converse(
        modelId=model_id,
        messages=conversation,
        inferenceConfig={"maxTokens": 4096, "stopSequences": ["User:"], "temperature": 0, "topP": 1},
        additionalModelRequestFields={}
    )

    # Extract and print the response text.
    response_text = response["output"]["message"]["content"][0]["text"]
    print(response_text)

except (ClientError, Exception) as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)


Based on the meeting transcript, here is a list of action items for each person:

**Miguel:**
1. Document any other concerns related to the user entry process, specifically focusing on the sign-up workflow and landing page, to share with Brant for a single discussion with James.

**Brant:**
1. Collaborate with James from another team to address the issues with the sign-up workflow.
2. Ensure that the discussion with James includes all concerns documented by Miguel.

**Namita:**
1. Work on improving the landing page to make the product more discoverable for users.

**James (from another team):**
1. Unblock the sign-up workflow in collaboration with Brant.


### Understanding the response

The LLM response is a JSON object that contains important information in addition to the generated text. </br>
When analyzing and comparing LLM responses, look for the following fields:

* `metrics.latencyMs`: The time, in milliseconds, taken to process the request and generate a response.
* `usage`: The number of tokens used in the prompt (`inputTokens`) and in the response (`outputTokens`). This is important for cost estimation.
* `ResponseMetadata.RequestId`: A unique identifier for the request. This can be useful for debugging and tracking purposes.
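When comparing models, it can help to pull just these fields out of the response. The helper below is a minimal sketch; `summarize_response` is a hypothetical name, and the token counts in the sample dictionary are made up for illustration.

```python
def summarize_response(response: dict) -> dict:
    """Extract latency, token usage, and request ID from a Converse response."""
    usage = response.get("usage", {})
    return {
        "latency_ms": response.get("metrics", {}).get("latencyMs"),
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
        "total_tokens": usage.get("totalTokens"),
        "request_id": response.get("ResponseMetadata", {}).get("RequestId"),
    }


# Sample response fragment; the token counts are illustrative, not real output.
sample_response = {
    "ResponseMetadata": {"RequestId": "ca4f75e1-822b-4ef4-920f-0894744c3ec5"},
    "metrics": {"latencyMs": 869},
    "usage": {"inputTokens": 250, "outputTokens": 150, "totalTokens": 400},
}
print(summarize_response(sample_response))
```

Missing fields come back as `None` instead of raising, which keeps the helper usable on partial responses.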


In [11]:
# Print the full response as indented JSON.
import json

print(json.dumps(response, indent=4, sort_keys=True))


{
    "ResponseMetadata": {
        "HTTPHeaders": {
            "connection": "keep-alive",
            "content-length": "859",
            "content-type": "application/json",
            "date": "Sun, 20 Apr 2025 14:33:26 GMT",
            "x-amzn-requestid": "ca4f75e1-822b-4ef4-920f-0894744c3ec5"
        },
        "HTTPStatusCode": 200,
        "RequestId": "ca4f75e1-822b-4ef4-920f-0894744c3ec5",
        "RetryAttempts": 0
    },
    "metrics": {
        "latencyMs": 869
    },
    "output": {
        "message": {
            "content": [
                {
                    "text": "Based on the meeting transcript, here is a list of action items for each person:\n\n**Miguel:**\n1. Document any other concerns related to the user entry process, specifically focusing on the sign-up workflow and landing page, to share with Brant for a single discussion with James.\n\n**Brant:**\n1. Collaborate with James from another team to address the issues with the sign-up workflow.\n2. Ensure t

## One Shot
One-shot prompting is a technique used to provide a single example to the model, helping it understand the task better. </br>

In this example, we will use one-shot prompting to create a meeting summary and action items. </br>
We will give the model a system message that describes its role, the task it needs to perform, and a single example of the expected output format. </br>
In addition, we will provide a user message that contains the meeting transcript. </br>
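Another common way to express a single shot is as a prior exchange in the conversation: one example input as a `user` message, the desired answer as an `assistant` message, and then the real input. The sketch below assumes the message format used by the Converse API in this notebook; `build_one_shot_messages` is a hypothetical helper, not part of boto3.

```python
def build_one_shot_messages(example_input: str, example_output: str,
                            actual_input: str) -> list:
    """Build a message list with one worked example before the real input."""
    return [
        # The single "shot": an example request and the answer we want to see.
        {"role": "user", "content": [{"text": example_input}]},
        {"role": "assistant", "content": [{"text": example_output}]},
        # The real request the model should now answer in the same style.
        {"role": "user", "content": [{"text": actual_input}]},
    ]


messages = build_one_shot_messages(
    example_input="Meeting transcript:\nDana: I will send the report by Friday.",
    example_output="=== Dana ===\n    - Send the report by Friday",
    actual_input="Meeting transcript:\nOmer: I will book the room for Monday.",
)
for message in messages:
    print(message["role"], "->", message["content"][0]["text"][:40])
```

The roles alternate and the list starts and ends with a `user` message, which is the shape the model expects for a new turn.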


In [4]:
import boto3
from botocore.exceptions import ClientError

client = boto3.client("bedrock-runtime", region_name="us-east-1")
model_id = "us.anthropic.claude-3-5-sonnet-20241022-v2:0" # Inference Profile ID

# System message: describes the model's role and shows one example of the expected output format.
system_message = """You are a meeting assistant that helps to summarize the meeting and create action items for each person.
You are given a meeting transcript and you need to create a list of action items for each person in the meeting.
The action items should be in the following format:

=== Miguel ===
    - action item 1
    - action item 2
=== Brant ===
    - action item1
    - action item 2
=== Namita ===
    - action item 1
    - action item 2

The action items should be clear and concise.
The action items should be based on the meeting transcript and should not include any additional information.
In the end of the response, mention the meeting participants and their roles in a JSON format.
In addition rank their involvement in the meeting from 1 to 5, where 5 is the most involved and 1 is the least involved.
{
    {"name": "Miguel", "role": "Product Manager", "involvement": 5},
    {"name": "Brant", "role": "Software Engineer", "involvement": 4},
    {"name": "Namita", "role": "UX Designer", "involvement": 3}
}

"""

user_message = """Meeting transcript:
Miguel: Hi Brant, I want to discuss the workstream  for our new product launch
Brant: Sure Miguel, is there anything in particular you want to discuss?
Miguel: Yes, I want to talk about how users enter into the product.
Brant: Ok, in that case let me add in Namita.
Namita: Hey everyone
Brant: Hi Namita, Miguel wants to discuss how users enter into the product.
Miguel: its too complicated and we should remove friction. for example, why do I need to fill out additional forms?  I also find it difficult to find where to access the product when I first land on the landing page.
Brant: I would also add that I think there are too many steps.
Namita: Ok, I can work on the landing page to make the product more discoverable but brant can you work on the additional forms?
Brant: Yes but I would need to work with James from another team as he needs to unblock the sign up workflow.  Miguel can you document any other concerns so that I can discuss with James only once?
Miguel: Sure.
"""

user_message2 = """Meeting transcript:
Attendees: Shimon (PM), Igor (CTO), Avi (Tech Lead)

Shimon: Right, let's discuss integrating code coverage into the main pipeline. What are the main benefits and drawbacks?

Igor: The major pro is improved code quality and maintainability. It gives us objective data on test effectiveness, reducing future bugs and technical debt. It’s a standard best practice.

Avi: Agreed, Igor. The con is potential friction – longer build times initially, and developers needing to potentially refactor or add more tests, impacting velocity slightly. We need to manage thresholds carefully.

Shimon: So, increased confidence in quality versus a possible short-term slowdown?

Igor: Exactly. A worthwhile investment for long-term stability.

Avi: We can mitigate the impact by starting with warnings, not blockers.
Barak: I dont code coverage is a good idea. It will slow down our development.
Shimon: Barak, can you elaborate on that?
Barak: The master has spoken.
"""

# The Converse API takes system instructions through the `system` parameter;
# the `messages` list should start with a `user` message.
conversation = [
    {
        "role": "user",
        "content": [{"text": user_message2}],
    }
]

try:
    # Send the message to the model, using a basic inference configuration.
    #  https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html
    response = client.converse(
        modelId=model_id,
        system=[{"text": system_message}],
        messages=conversation,
        inferenceConfig={"maxTokens": 4096, "stopSequences": ["User:"], "temperature": 0, "topP": 1},
        additionalModelRequestFields={}
    )

    # Extract and print the response text.
    response_text = response["output"]["message"]["content"][0]["text"]
    print(response_text)

except (ClientError, Exception) as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)


Based on the meeting transcript, here are the action items for each participant:

=== Shimon ===
    - Create a proposal document outlining the code coverage implementation plan
    - Schedule a follow-up meeting to discuss specific threshold values
    - Address Barak's concerns about development slowdown

=== Igor ===
    - Research and recommend code coverage tools suitable for the pipeline
    - Prepare technical documentation for code coverage integration
    - Define initial metrics for measuring the impact on build times

=== Avi ===
    - Create guidelines for test coverage thresholds
    - Prepare documentation for developers about new testing requirements

=== Barak ===
    - Provide detailed feedback about specific concerns regarding code coverage
    - Document specific examples where code coverage might impact development speed

Meeting Participants and Roles:
{
    {"name": "Shimon", "role": "Product Manager", "involvement": 4},
    {"name": "Igor", "role": "CTO", "involv

## Few Shots
Few-shot prompting is a technique used to provide multiple examples to the model, helping it understand the task better. </br>
Each example is called a `shot` and it contains a prompt and a desired response. </br>
The `shots` can be based on real-world examples or synthetic examples that exemplify the task. </br>
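When the shots live in a list rather than in one long string, the prompt can be assembled programmatically. This is a minimal sketch; `build_few_shot_prompt` is a hypothetical helper, and the layout mirrors the feature-request format used in this section.

```python
def build_few_shot_prompt(instructions: str, shots: list, request: str) -> str:
    """Assemble instructions, worked examples (shots), and the real request."""
    parts = [instructions.strip(), "", "Examples:", ""]
    for shot_request, shot_response in shots:
        # Each shot pairs an example request with the desired response.
        parts.append(f"**Feature Request:** {shot_request}")
        parts.append("")
        parts.append(shot_response.strip())
        parts.append("")
        parts.append("---")
        parts.append("")
    parts.append(f"Here is the actual Feature Request: {request}")
    return "\n".join(parts)


shots = [
    ("Add dark mode.", "**Functional Requirement:**\nUsers can switch themes."),
    ("Add undo.", "**Functional Requirement:**\nUsers can revert the last edit."),
]
print(build_few_shot_prompt("You are a product manager.", shots, "Add autosave."))
```

Building the string with f-strings instead of `str.format` means literal braces inside a shot (for example a JSON snippet) do not need to be escaped.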


In [2]:

import boto3
from botocore.exceptions import ClientError

client = boto3.client("bedrock-runtime", region_name="us-east-1")
model_id = "us.anthropic.claude-3-5-sonnet-20241022-v2:0" # Inference Profile ID

# Few-shot prompt: instructions, three worked examples, and a placeholder for the actual request.

prompt = """You are a product manager for a website builder platform (like Wix). For each new feature request, you need to outline the functional requirements, highlight important UI/UX considerations, and explain the business advantages. Respond in the following format for each feature request:

**Functional Requirement:**
[Clearly describe what the feature should do.]

**UI/UX Important Points:**
[List key considerations for the user interface and user experience of this feature.]

**Business Advantage:**
[Explain how this feature benefits the website builder platform.]

----
Examples:

**Feature Request:** Add a built-in image editor with basic cropping and resizing tools.

**Functional Requirement:**
Users should be able to crop and resize images directly within the website builder without needing to upload pre-edited files. Supported formats should include JPG, PNG, and GIF. The editor should offer standard aspect ratio presets and freeform resizing.

**UI/UX Important Points:**
- The image editor should be easily accessible within the image settings panel.
- Controls for cropping and resizing should be intuitive and visually clear.
- Users should see a real-time preview of their edits.
- An option to revert to the original image should be available.

**Business Advantage:**
- Improves user convenience and efficiency by eliminating the need for external image editing tools.
- Can attract users who need quick image adjustments without complex software.
- May reduce support requests related to image sizing issues.

---

**Feature Request:** Implement a library of pre-designed website sections (e.g., headers, footers, contact forms).

**Functional Requirement:**
Users should be able to browse and insert professionally designed website sections into their pages with a single click. The library should include various categories and styles, and users should be able to customize the content and styling of these sections.

**UI/UX Important Points:**
- The section library should be well-organized and easy to navigate, possibly with categories and search functionality.
- Previews of the sections should be clear and representative of the final design.
- The insertion process should be seamless and not disrupt the user's workflow.
- Users should have clear visual cues on how to customize the content of the inserted sections.

**Business Advantage:**
- Speeds up the website creation process for users, making it more appealing to beginners.
- Provides users with professionally designed elements, potentially leading to more visually appealing websites.
- Can encourage users to build more comprehensive websites by offering readily available components.

---

**Feature Request:** Allow users to embed social media feeds (e.g., Instagram, Twitter) directly onto their websites.

**Functional Requirement:**
Users should be able to connect their social media accounts and display their latest posts on their website. They should have options to customize the number of posts displayed and the layout of the feed.

**UI/UX Important Points:**
- The connection process to social media accounts should be secure and straightforward.
- Embedding options should be easily accessible within the website editor.
- Users should have control over the visual presentation of the feed to match their website's design.
- The embedded feed should be responsive and display correctly on different devices.

**Business Advantage:**
- Enhances user engagement by allowing them to showcase their social media presence.
- Can drive traffic between the user's website and their social media profiles.
- Adds dynamic content to websites, making them more lively and up-to-date.

---

Here is the actual Feature Request: {0}
"""

conversation = [
    {
        "role": "user",
        "content": [{"text": prompt.format("Integrate with a third-party payment processor (e.g., Stripe, PayPal) to enable e-commerce functionality.")}],
        # "content": [{"text": prompt.format("Implement A built-in image editor with basic cropping and resizing tools.")}],
    }
]

try:
    # Send the message to the model, using a basic inference configuration.
    #  https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html
    response = client.converse(
        modelId=model_id,
        messages=conversation,
        inferenceConfig={"maxTokens": 4096, "stopSequences": ["User:"], "temperature": 0, "topP": 1},
        additionalModelRequestFields={}
    )

    # Extract and print the response text.
    response_text = response["output"]["message"]["content"][0]["text"]
    print(response_text)

except (ClientError, Exception) as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)


**Functional Requirement:**
Users should be able to set up and manage e-commerce functionality by connecting their Stripe/PayPal accounts to their websites. The integration should support:
- Secure payment processing for multiple currencies
- Product catalog management
- Order tracking and management
- Automatic email notifications for orders
- Basic inventory tracking
- Shopping cart functionality
- Payment status monitoring
- Refund processing capabilities

**UI/UX Important Points:**
- Simple step-by-step wizard for payment processor account connection
- Clear security badges and certifications to build trust
- Intuitive product upload and management interface
- Easy-to-customize checkout page templates
- Mobile-responsive shopping experience
- Clear error messages and payment status indicators
- Dashboard for order management and sales analytics
- Simple refund process interface
- Clear documentation and help resources

**Business Advantage:**
- Opens up a new revenue stream throug

## Tool Use / Function Calling
LLMs can be enhanced with tools and APIs to provide additional functionality. </br>
For example, you can use APIs to fetch data from external sources, perform calculations, or interact with other services. </br>
In this example, we define a `fetch_weather` tool that calls a weather API, and let the model decide when to use it. </br>

**Tool Use Workflow:**

1.  **Define Tools:** Specify tools with names, descriptions, and argument schemas. Include a user prompt (e.g., "What's the weather like in New York today?").
2.  **LLM Decides:** The LLM determines if a tool is necessary and halts text generation if so.
3.  **JSON Call:** The LLM outputs a JSON object containing the selected tool and its parameter values.
4.  **Execute & Return:** The system extracts parameters, runs the tool, and returns the output to the LLM.
5.  **Generate Answer:** The LLM uses the tool output to create a final response.
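Step 4 of the workflow above can be sketched as a small dispatcher that maps tool names to Python functions and wraps the output in a `toolResult` block. The names `TOOLS`, `run_tool`, and `get_time` are hypothetical and exist only for this illustration; the block shapes follow the `toolUse` and `toolResult` structures used later in this section.

```python
def get_time(city: str) -> dict:
    """Hypothetical tool used only in this sketch; returns a fixed value."""
    return {"city": city, "time": "12:00"}


# Registry of available tools: tool name -> Python callable.
TOOLS = {"get_time": get_time}


def run_tool(tool_use: dict) -> dict:
    """Run the tool named in a toolUse block and return a toolResult block."""
    tool_use_id = tool_use["toolUseId"]
    handler = TOOLS.get(tool_use["name"])
    if handler is None:
        return {"toolResult": {
            "toolUseId": tool_use_id,
            "content": [{"text": f"Unknown tool: {tool_use['name']}"}],
            "status": "error",
        }}
    try:
        result = handler(**tool_use["input"])
    except Exception as err:
        # Report the failure to the model instead of crashing the loop.
        return {"toolResult": {
            "toolUseId": tool_use_id,
            "content": [{"text": str(err)}],
            "status": "error",
        }}
    return {"toolResult": {"toolUseId": tool_use_id, "content": [{"json": result}]}}


print(run_tool({"toolUseId": "t-1", "name": "get_time", "input": {"city": "London"}}))
```

Returning an error result rather than raising lets the model see what went wrong and decide how to respond.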

![Tool Calling](../resources/images/tool-call-schema-removebg.webp)

### References

* [AWS Bedrock: Converse API tool use examples](https://docs.aws.amazon.com/bedrock/latest/userguide/tool-use-examples.html)
* [Guide to Tool Calling](https://www.analyticsvidhya.com/blog/2024/08/tool-calling-in-llms/)




In [67]:
import requests
import boto3
import json

client = boto3.client("bedrock-runtime", region_name="us-east-1")
model_id = "us.anthropic.claude-3-5-sonnet-20241022-v2:0" # Inference Profile ID

class LocationNotFoundException(Exception):
    """Raised when a location is not found."""
    pass

# Weather API: Fetch weather data from weatherapi.com, api key from user: mistriela@yopmail.com
def fetch_weather(city):
    print('Fetching weather data for city:', city)
    base_url = "https://api.weatherapi.com/v1/current.json"
    params = {
        "q": city,
        "key": "6e7b99ab0f454283ab9125132252104",
        "aqi": "no"  # Skip air quality data; the temperature is read from temp_c (Celsius)
    }
    try:
        response = requests.get(base_url, params=params, timeout=10)  # Avoid hanging on a slow API
        response.raise_for_status()  # Raise an error for bad HTTP responses
        weather_data = response.json()
        return {
            "city": weather_data["location"]["name"],
            "temperature": weather_data["current"]["temp_c"],
            "description": weather_data["current"]["condition"]["text"]
        }
    except requests.exceptions.RequestException as e:
        raise LocationNotFoundException(f"Error fetching weather data: {e}")


def invoke_bedrock_llm_with_function_calling(prompt):
    """
    Invokes an LLM on AWS Bedrock with function calling to get weather information.
    Uses the module-level `client` and `model_id`.

    Args:
        prompt (str): The user's query (e.g., "What's the weather in London?").

    Returns:
        dict: The model's final message; its `content` list holds the answer
              text, which may include the weather information.
    """

    tool_config = {
        "tools": [
            {
                "toolSpec": {
                    "name": "fetch_weather",
                    "description": "fetch weather information for a given city",
                    "inputSchema": {
                        "json": {
                            "type": "object",
                            "properties": {
                                "city": {
                                    "type": "string",
                                    "description": "The name of the city for which to fetch weather information."
                                }
                            },
                            "required": [
                                "city"
                            ]
                        }
                    }
                }
            }
        ]
    }


    # Note: the prompt contains no `assistant` message. It is the LLM's responsibility to decide that a tool is required.
    input_messages = [{
        "role": "user",
        "content": [{"text": prompt}]
    }]

    # Send the initial message to the model. If a tool is needed, the model stops and returns a tool use request.
    # A single response may contain several tool use requests.
    response = client.converse(
        modelId=model_id,
        messages=input_messages,
        toolConfig=tool_config
    )
    output_message = response['output']['message']
    input_messages.append(output_message)
    stop_reason = response['stopReason']

    # Print the raw response so the tool use request is visible.
    print(f"LLM Message: {json.dumps(response, indent=4)}")

    if stop_reason == 'tool_use':
        # The model may request several tools in one response. Collect every
        # result and send them back together in a single user message.
        tool_results = []
        for content_block in output_message['content']:
            if 'toolUse' not in content_block:
                continue
            tool = content_block['toolUse']
            print(f"Requesting tool {tool['name']} Request: {tool['toolUseId']} ")

            if tool['name'] == 'fetch_weather':
                try:
                    weather_data = fetch_weather(tool['input']['city'])
                    tool_result = {
                        "toolUseId": tool['toolUseId'],
                        "content": [{"json": weather_data}]
                    }
                except LocationNotFoundException as err:
                    tool_result = {
                        "toolUseId": tool['toolUseId'],
                        "content": [{"text": str(err)}],
                        "status": 'error'
                    }
            else:
                tool_result = {
                    "toolUseId": tool['toolUseId'],
                    "content": [{"text": f"Unknown tool: {tool['name']}"}],
                    "status": 'error'
                }
            tool_results.append({"toolResult": tool_result})

        # Send the tool results to the model and read its final answer.
        input_messages.append({"role": "user", "content": tool_results})
        response = client.converse(
            modelId=model_id,
            messages=input_messages,
            toolConfig=tool_config
        )
        output_message = response['output']['message']

    # print the final response from the model.
    # for content in output_message['content']:
    #     print(json.dumps(content, indent=4))

    return output_message

user_query = "What is the weather like in London?"
llm_response = invoke_bedrock_llm_with_function_calling(user_query)
print(llm_response['content'][0]['text'])


LLM Message: {
    "ResponseMetadata": {
        "RequestId": "64d74cf3-4255-447d-8413-55d6ce7e8e1a",
        "HTTPStatusCode": 200,
        "HTTPHeaders": {
            "date": "Sun, 27 Apr 2025 16:57:55 GMT",
            "content-type": "application/json",
            "content-length": "485",
            "connection": "keep-alive",
            "x-amzn-requestid": "64d74cf3-4255-447d-8413-55d6ce7e8e1a"
        },
        "RetryAttempts": 0
    },
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {
                    "text": "I'll help you check the current weather in London using the fetch_weather function."
                },
                {
                    "toolUse": {
                        "toolUseId": "tooluse_4Ol_16JJQjGjvm4X1r4YWQ",
                        "name": "fetch_weather",
                        "input": {
                            "city": "London"
                        }
                    }
   

## Tool Chaining
Tool chaining is a technique used to combine multiple tools or functions in a sequence to achieve a more complex task. </br>
In this example, we will use tool chaining **recursively** to fetch the weather data and then determine the dress code based on the temperature. </br>

### Execution Flow
We will ask the model to provide a dress code for the colder of two cities. </br>
The model will need to understand that it needs to call the `fetch_weather` tool to get the weather data for **both** cities. </br>
Then, it will call the `get_dress_code` tool to determine the dress code based on the temperature. </br>
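The same flow can also be written as a bounded loop instead of recursion, which makes it easy to cap the number of model round trips. This is a minimal sketch under stated assumptions: `run_tool_loop` is a hypothetical helper, `converse` is any callable that takes the message list and returns a Converse-style response, and `tools` maps tool names to Python callables. A scripted fake stands in for the model so the sketch runs without AWS access.

```python
def run_tool_loop(converse, messages: list, tools: dict, max_turns: int = 5) -> dict:
    """Call the model until it stops asking for tools, or max_turns is reached."""
    for _ in range(max_turns):
        response = converse(messages)
        output_message = response["output"]["message"]
        messages.append(output_message)
        if response["stopReason"] != "tool_use":
            return output_message  # Final answer, no more tools needed.

        # Run every requested tool and return all results in one user message.
        results = []
        for block in output_message["content"]:
            if "toolUse" in block:
                call = block["toolUse"]
                result = tools[call["name"]](**call["input"])
                results.append({"toolResult": {
                    "toolUseId": call["toolUseId"],
                    "content": [{"json": result}],
                }})
        messages.append({"role": "user", "content": results})
    raise RuntimeError(f"No final answer after {max_turns} turns")


# Scripted fake model: first asks for a tool, then gives a final answer.
scripted = iter([
    {"stopReason": "tool_use", "output": {"message": {"role": "assistant", "content": [
        {"toolUse": {"toolUseId": "t1", "name": "double", "input": {"x": 2}}}]}}},
    {"stopReason": "end_turn", "output": {"message": {"role": "assistant", "content": [
        {"text": "The result is 4."}]}}},
])
history = [{"role": "user", "content": [{"text": "Double the number 2."}]}]
final = run_tool_loop(lambda msgs: next(scripted), history,
                      {"double": lambda x: {"value": x * 2}})
print(final["content"][0]["text"])
```

With a real client, `converse` could be something like `lambda msgs: client.converse(modelId=model_id, messages=msgs, toolConfig=tool_config)`.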

### Things to Look Out For
* Experiment with different models and check whether `lite` models can be used instead of `pro` models.
* Inspect the input and output of the LLM to see how the conversation is built up step by step.
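To watch the conversation grow, a compact summary of the message list is often easier to read than the full JSON. The helper below is a minimal sketch; `describe_conversation` is a hypothetical name, and it assumes each content block is a dictionary with a single key such as `text`, `toolUse`, or `toolResult`.

```python
def describe_conversation(messages: list) -> list:
    """Return one summary line per message: index, role, and block types."""
    lines = []
    for index, message in enumerate(messages):
        # The first key of each content block names its type.
        block_types = [next(iter(block)) for block in message["content"]]
        lines.append(f"{index}: {message['role']} -> {', '.join(block_types)}")
    return lines


example_messages = [
    {"role": "user", "content": [{"text": "Which city is colder?"}]},
    {"role": "assistant", "content": [
        {"text": "Let me check."},
        {"toolUse": {"toolUseId": "t1", "name": "fetch_weather", "input": {"city": "Moscow"}}},
    ]},
    {"role": "user", "content": [
        {"toolResult": {"toolUseId": "t1", "content": [{"json": {"temperature": -5}}]}},
    ]},
]
for line in describe_conversation(example_messages):
    print(line)
```

Calling it on `input_messages` before each model call shows the alternating pattern of tool requests and tool results.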

In [2]:
import requests
import boto3
import json

client = boto3.client("bedrock-runtime", region_name="us-east-1")
# model_id = "us.anthropic.claude-3-5-sonnet-20241022-v2:0"
# model_id = "amazon.nova-pro-v1:0"
model_id = "amazon.nova-lite-v1:0"

class LocationNotFoundException(Exception):
    """Raised when a location is not found."""
    pass

# Weather API: Fetch weather data from weatherapi.com, api key from user: mistriela@yopmail.com
def fetch_weather(city):
    print('Fetching weather data for city:', city)
    base_url = "https://api.weatherapi.com/v1/current.json"
    params = {
        "q": city,
        "key": "6e7b99ab0f454283ab9125132252104",
        "aqi": "no"  # Skip air quality data; the temperature is read from temp_c (Celsius)
    }
    try:
        response = requests.get(base_url, params=params, timeout=10)  # Avoid hanging on a slow API
        response.raise_for_status()  # Raise an error for bad HTTP responses
        weather_data = response.json()
        return {
            "city": weather_data["location"]["name"],
            "temperature": weather_data["current"]["temp_c"],
            "description": weather_data["current"]["condition"]["text"]
        }
    except requests.exceptions.RequestException as e:
        raise LocationNotFoundException(f"Error fetching weather data: {e}")

def get_dress_code(temperature: float) -> str:
    """
    Determine the dress code based on the temperature.
    Temperature ranges: -20 to 50 degrees Celsius.
    Args:
        temperature (float): The temperature in Celsius.

    Returns:
        str: The recommended dress code.
    """
    print('Getting dress code for temperature:', temperature)
    if temperature < -10:
        return "Wear a heavy winter coat, gloves, and a warm hat."
    elif -10 <= temperature < 0:
        return "Wear a warm coat and gloves."
    elif 0 <= temperature < 10:
        return "Wear a light jacket."
    elif 10 <= temperature < 20:
        return "Wear a long-sleeve shirt."
    elif 20 <= temperature < 30:
        return "Wear a short-sleeve shirt."
    else:
        return "Wear summer clothes."

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "fetch_weather",
                "description": "fetch weather information for a given city",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "city": {
                                "type": "string",
                                "description": "The name of the city for which to fetch weather information."
                            }
                        },
                        "required": ["city"]
                    }
                }
            }
        },
        {
            "toolSpec": {
                "name": "get_dress_code",
                "description": "Get the dress code based on the temperature.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "temperature": {
                                "type": "number",
                                "description": "The temperature in Celsius."
                            }
                        },
                        "required": ["temperature"]
                    }
                }
            }
        }
    ]
}


def invoke_bedrock_llm_with_multiple_function_calling(prompt, input_messages=None):
    if input_messages is None:
        input_messages = [{"role": "user", "content": [{"text": prompt}]}]

    # print('input messages:', json.dumps(input_messages, indent=4))

    response = client.converse(
        modelId=model_id,
        messages=input_messages,
        toolConfig=tool_config
    )
    output_message = response['output']['message']
    # print('output_message:', json.dumps(response['output'], indent=4))
    input_messages.append(output_message)
    stop_reason = response['stopReason']

    if stop_reason == 'tool_use':
        response_contents = output_message['content']
        tool_responses = []
        for response_content in response_contents:
            if 'toolUse' in response_content:  # this content block is a tool use request
                tool_request = response_content['toolUse']
                print(f"Requesting tool '{tool_request['name']}' ID[{tool_request['toolUseId']}] Input: {tool_request['input']}")

                tool_result = {}
                if tool_request['name'] == 'fetch_weather':
                    try:
                        tool_result = fetch_weather(tool_request['input']['city'])

                    except LocationNotFoundException as err:
                        tool_result = {"error": err.message}

                elif tool_request['name'] == 'get_dress_code':
                    tool_result = {"text": get_dress_code(tool_request['input']['temperature'])}


                tool_result_response = {
                    "toolResult": {
                        "toolUseId": tool_request['toolUseId'],
                        "content": [{"json": tool_result}]
                    }
                }

                tool_responses.append(tool_result_response)

        tool_result_message = {
            "role": "user",
            "content": tool_responses
        }

        input_messages.append(tool_result_message)

        # Recursively call the function to process the next step
        return invoke_bedrock_llm_with_multiple_function_calling(prompt, input_messages)

    # If no more tools are required, return the final response
    return output_message


user_query = """I want to travel to a cold location.
I'm thinking Tel-Aviv or Moscow. What is the colder city? and what should I wear there?
Provide concise textual answer.
"""

llm_response = invoke_bedrock_llm_with_multiple_function_calling(user_query)
print("\n")
print(llm_response['content'][0]['text'])


Requesting tool 'fetch_weather' ID[tooluse_8baoSFhITE2qxT0BwoIrfw] Input: {'city': 'Tel-Aviv'}
Fetching weather data for city: Tel-Aviv
Requesting tool 'fetch_weather' ID[tooluse_8QmzWgnlS7iOpR-PVqBLxA] Input: {'city': 'Moscow'}
Fetching weather data for city: Moscow
Requesting tool 'get_dress_code' ID[tooluse_cn7_wbkkT-mv6OQSjFE4jA] Input: {'temperature': 3.2}
Getting dress code for temperature: 3.2


<thinking> I have the dress code for Moscow based on its current temperature. I can now provide the User with the information they requested. </thinking>

Hi there! Based on the current temperatures, Moscow is colder than Tel-Aviv. You should wear a light jacket if you travel to Moscow.
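
The recursive function above has no upper bound on the number of model round-trips, and every new tool needs another `elif` branch. A minimal sketch of one alternative follows: an iterative loop with a turn limit and a name-to-function registry. The names `TOOL_REGISTRY`, `run_tool`, `converse_with_tools`, and `max_turns` are hypothetical, and the two stub tools only stand in for the notebook's `fetch_weather` and `get_dress_code`. The `client`, `model_id`, and `tool_config` arguments are assumed to be the objects defined earlier.

```python
def fetch_weather_stub(city):
    # Stand-in for the notebook's fetch_weather; returns fixed, made-up data.
    return {"city": city, "temperature": 3.2}


def get_dress_code_stub(temperature):
    # Stand-in for the notebook's get_dress_code.
    return {"text": "coat" if temperature < 10 else "t-shirt"}


# Hypothetical registry: tool name -> callable that takes the tool input as keyword arguments.
TOOL_REGISTRY = {
    "fetch_weather": fetch_weather_stub,
    "get_dress_code": get_dress_code_stub,
}


def run_tool(tool_request, registry=TOOL_REGISTRY):
    """Run one toolUse request and wrap the outcome in a toolResult content block."""
    tool = registry.get(tool_request["name"])
    if tool is None:
        result = {"error": f"unknown tool: {tool_request['name']}"}
    else:
        try:
            result = tool(**tool_request["input"])
        except Exception as err:
            # Report the failure to the model instead of aborting the conversation.
            result = {"error": str(err)}
    return {
        "toolResult": {
            "toolUseId": tool_request["toolUseId"],
            "content": [{"json": result}],
        }
    }


def converse_with_tools(client, model_id, tool_config, prompt, max_turns=5):
    """Loop until the model stops requesting tools or max_turns model calls are used."""
    messages = [{"role": "user", "content": [{"text": prompt}]}]
    for _ in range(max_turns):
        response = client.converse(
            modelId=model_id, messages=messages, toolConfig=tool_config
        )
        output_message = response["output"]["message"]
        messages.append(output_message)
        if response["stopReason"] != "tool_use":
            return output_message
        # Answer every tool request from this turn in a single user message.
        tool_results = [
            run_tool(block["toolUse"])
            for block in output_message["content"]
            if "toolUse" in block
        ]
        messages.append({"role": "user", "content": tool_results})
    raise RuntimeError(f"no final answer after {max_turns} model calls")
```

With the real tools registered, a call such as `converse_with_tools(client, model_id, tool_config, user_query)` would play the same role as the recursive call above. The turn limit is a design choice that trades completeness for a bounded cost per query.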


## Structured Output
Structured output constrains the LLM's response to a predefined format, such as JSON or XML. </br>
This makes the response straightforward to parse in code. </br>
It also keeps the output consistent and compliant with the format that downstream consumers expect. </br>

In [19]:
import boto3
from botocore.exceptions import ClientError

# Create a Bedrock Runtime client in the AWS Region you want to use.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Alternative models to compare; uncomment one line to switch.
# model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0" # Model ID
# model_id = "us.anthropic.claude-3-5-sonnet-20241022-v2:0" # Inference Profile ID
# model_id = "amazon.titan-text-express-v1"
model_id = "amazon.nova-lite-v1:0"

system_message = """You are a car expert with the ability to provide technical details about the specific car you are asked about.
The response must be in JSON format."""

# Start a conversation with the user message.
# user_message = """Hi, I would like to know the technical details of the Tesla Model Y."""
user_message = """Hi, I would like to know the technical details of the Toyota Prius 2010."""

assistant_message = """```json
{
    "car": {
        "name": "Car name",
        "type": "Type of the car. E.g: Electric SUV",
        "range": "The range of the car in miles",
        "top_speed": "Top speed of the car in mph",
        "acceleration": "0-60 mph time in seconds",
        "battery_capacity": "For electric cars, mention the battery capacity in kWh",
        "features": Array of features of the car. E.g: "Autopilot","All-wheel drive", "Premium interior", "Panoramic glass roof", "Advanced safety features",  "Over-the-air software updates", "ABS brakes"

    }
}
```"""

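# The instructions are sent as the first user turn, and the JSON template is pre-filled
# as an assistant turn so that the model imitates its structure.
# (The Converse API also accepts instructions through its separate `system` parameter.)
conversation = [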
    {
        "role": "user",
        "content": [{"text": system_message}],
    },
    {
        "role": "assistant",
        "content": [{"text": assistant_message}],
    },
    {
        "role": "user",
        "content": [{"text": user_message}],
    },
]

try:
    # Send the message to the model, using a basic inference configuration.
    #  https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html
    response = client.converse(
        modelId=model_id,
        messages=conversation,
        inferenceConfig={"maxTokens": 4096, "stopSequences": ["User:"], "temperature": 0, "topP": 1},
        additionalModelRequestFields={}
    )

    # Extract and print the response text.
    response_text = response["output"]["message"]["content"][0]["text"]
    print(response_text)

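except Exception as e:  # ClientError is a subclass of Exception, so one clause covers both
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    raise SystemExit(1)  # unlike exit(1), this does not depend on the interactive `exit` helper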


```json
{
    "car": {
        "name": "Toyota Prius 2010",
        "type": "Hybrid Sedan",
        "range": "700 miles",
        "top_speed": "110 mph",
        "acceleration": "9.9 seconds",
        "battery_capacity": "4.4 kWh",
        "features": [
            "Hybrid Synergy Drive",
            "Regenerative braking",
            "Eco, Normal, and Power driving modes",
            "Smart key entry",
            "Bluetooth connectivity",
            "Six airbags",
            "ABS with EBD and BA",
            "Rearview camera",
            "Power moonroof",
            "12V DC power outlet",
            "Six-speaker audio system"
        ]
    }
}
```
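
Because the model answers inside a Markdown `json` code fence (mirroring the template in the assistant turn), the text has to be unwrapped before it can be parsed. The sketch below shows one way to do that and to check that the expected keys are present. `extract_json`, `validate_car`, `REQUIRED_KEYS`, and the sample string are hypothetical; in the notebook, the input would be `response_text`.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to avoid nesting fences in this example

# Hypothetical set of required keys, taken from the template in the assistant turn.
REQUIRED_KEYS = {
    "name", "type", "range", "top_speed",
    "acceleration", "battery_capacity", "features",
}


def extract_json(text):
    """Parse the JSON object in text, unwrapping a Markdown code fence if one is present."""
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)  # raises json.JSONDecodeError on malformed output


def validate_car(data):
    """Return the sorted list of required keys that are missing from data['car']."""
    car = data.get("car", {})
    return sorted(REQUIRED_KEYS - set(car))


# Made-up, deliberately incomplete response used only to exercise the helpers.
sample = (
    FENCE + "json\n"
    + '{"car": {"name": "Toyota Prius 2010", "type": "Hybrid Sedan", "features": ["ABS"]}}\n'
    + FENCE
)
parsed = extract_json(sample)
missing = validate_car(parsed)
print(parsed["car"]["name"])
print("missing keys:", missing)
```

A check like this matters because the format is only requested in the prompt, not enforced, so a response may still omit keys or fail to parse.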


## Prompt Caching
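Prompt caching stores the already-processed part of a prompt, typically a long prefix that repeats across requests (such as instructions or reference documents), so that later requests can reuse it instead of processing it again. </br>
This reduces cost and latency for requests that share the same prefix. </br>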