# Creating a `.env` File and Using Environment Variables

This notebook demonstrates how to create a `.env` file, install required Python packages, and load environment variables using `python-dotenv`.

In [None]:
%%file .env
AWS_ACCESS_KEY_ID=""
AWS_SECRET_ACCESS_KEY=""
AWS_SESSION_TOKEN=""

# Amazon Bedrock boto3 Prerequisites

> This notebook is designed to run seamlessly in **`SageMaker Studio Lab`**.

---

In this demo notebook, we demonstrate how to use the [`boto3` Python SDK](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) to interact with [Amazon Bedrock](https://aws.amazon.com/bedrock/) Foundation Models. SageMaker Studio Lab provides an ideal environment for experimenting with AWS SDKs like `boto3` for machine learning and AI workflows.

---

<div class="alert alert-block alert-info">
⚠️ <b>Before you get started with this workshop...</b> make sure you have manually enabled access to the following models in the "Model access" tab:

<ul>
  <li>All Titan Models</li>
  <li>All Nova Models</li>
  <li>Claude 3 Haiku</li>
  <li>Claude 3 Sonnet</li>
  <li>Claude 3.5 Haiku</li>
  <li>Claude 3.5 Sonnet</li>
  <li>Stable Diffusion</li>
  <li>Llama 2 13B</li>
  <li>Llama 2 70B</li>  
</ul>  

</div>

<p>Remember that you can modify model access at any time, but if you are running this workshop during a <b>Serverless Day event</b>, you will likely have limited access to third-party models.</p>

## Prerequisites

Run the cells in this section to install the required packages for the notebooks in this workshop.

_NOTE:_ ⚠️ You might see pip dependency-resolver warnings or conflicts during installation. They are expected and do not affect the notebooks in this workshop, so you can safely ignore them.

In [2]:
%%capture
%pip install --no-build-isolation --force-reinstall -r requirements.txt

## Create the boto3 Client

Interaction with the Bedrock API is performed via the AWS SDK for Python: [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html).

### Use Different Clients
The `boto3` library provides distinct clients for Amazon Bedrock to perform various operations:
- Actions like [`InvokeModel`](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) and [`InvokeModelWithResponseStream`](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) are handled via the **Amazon Bedrock Runtime** client.
- Operations like [`ListFoundationModels`](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListFoundationModels.html) are supported by the **Amazon Bedrock** client.

### Load Credentials from `.env`

In this notebook, we use a `.env` file to store and load AWS credentials. Use the `dotenv` package to load and access environment variables from the `.env` file.

In [None]:
from dotenv import load_dotenv
import botocore.exceptions
import os
import json
import sys
import boto3

aws_region = "us-east-1"

load_dotenv(".env")

bedrock_runtime_client = boto3.client("bedrock-runtime", region_name=aws_region)
bedrock_management_client = boto3.client('bedrock', region_name=aws_region)
bedrock_agent_client = boto3.client('bedrock-agent', region_name=aws_region)
bedrock_agent_runtime_client = boto3.client('bedrock-agent-runtime', region_name=aws_region)
cloudformation_client = boto3.client('cloudformation', region_name=aws_region)

boto3.__version__
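
The `.env` template written at the top of this notebook ships with empty values, so it can help to confirm the variables are populated before making any calls. The following is a minimal sketch, assuming the three variable names from that template; `missing_credentials` is a hypothetical helper, not part of `boto3` or `python-dotenv`. It reports variable names only and never prints the values.

```python
import os

# Variable names taken from the .env template at the top of this notebook.
# Assumption: temporary credentials are in use, so the session token is required too.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_credentials()
if missing:
    print("Missing or empty:", ", ".join(missing))
else:
    print("All credential variables are set.")
```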

#### Validate the connection

We can check that the client works by calling the `list_foundation_models()` method, which lists all the models available to us.

<details>
<summary>Click here for the solution</summary>
    
```python
for model in bedrock_management_client.list_foundation_models().get('modelSummaries', []):
    print(model.get('modelId', 'Unknown Model ID'))
```
    
</details>

---

## `InvokeModel` body and output

The `invoke_model()` method of the Amazon Bedrock runtime client (`InvokeModel` API) will be the primary method we use for most of our Text Generation and Processing tasks - whichever model we're using.

Although the method is shared, the format of input and output varies depending on the foundation model used - as described below:

### Amazon Titan Large and Premier

#### Input
```json
{   
    "inputText": "<prompt>",
    "textGenerationConfig" : { 
        "maxTokenCount": 512,
        "stopSequences": [],
        "temperature": 0.1,  
        "topP": 0.9
    }
}
```

#### Output

```json
{
    "inputTextTokenCount": 613,
    "results": [{
        "tokenCount": 219,
        "outputText": "<output>"
    }]
}
```
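
To make the Titan formats above concrete, here is a small sketch that builds the request body and reads the generated text back out of a response. The helper names `build_titan_body` and `extract_titan_text` are hypothetical, and the sample response is copied from the output format above rather than returned by a real call.

```python
import json


def build_titan_body(prompt, max_tokens=512, temperature=0.1, top_p=0.9):
    """Serialize a Titan text request using the field names shown above."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_tokens,
            "stopSequences": [],
            "temperature": temperature,
            "topP": top_p,
        },
    })


def extract_titan_text(response_body):
    """Return the first generated text, or an empty string if none is present."""
    results = response_body.get("results") or []
    return results[0].get("outputText", "") if results else ""


# Sample response copied from the output format above (not a real model reply).
sample = {"inputTextTokenCount": 613, "results": [{"tokenCount": 219, "outputText": "<output>"}]}
print(extract_titan_text(sample))
```

Guarding against a missing or empty `results` list avoids an `IndexError` or `TypeError` when a response does not match the expected shape.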

### Anthropic Claude

#### Input

```json
{
    "prompt": "\n\nHuman:<prompt>\n\nAssistant:",
    "max_tokens_to_sample": 300,
    "temperature": 0.5,
    "top_k": 250,
    "top_p": 1,
    "stop_sequences": ["\n\nHuman:"]
}
```

#### Output

```json
{
    "completion": "<output>",
    "stop_reason": "stop_sequence"
}
```

### Stability AI Stable Diffusion XL

#### Input

```json
{
    "text_prompts": [
        {"text": "this is where you place your input text"}
    ],
    "cfg_scale": 10,
    "seed": 0,
    "steps": 50
}
```

#### Output

```json
{ 
    "result": "success", 
    "artifacts": [
        {
            "seed": 123, 
            "base64": "<image in base64>",
            "finishReason": "SUCCESS"
        },
        //...
    ]
}
```

---

## Common inference parameter definitions

### Randomness and Diversity

Foundation models generally support the following parameters to control randomness and diversity in the 
response.

**Temperature** – Large language models use probability to construct the words in a sequence. For any given next word, there is a probability distribution of options for the next word in the sequence. When you set the temperature closer to zero, the model tends to select the higher-probability words. This makes the output more deterministic and often leads to repetitive or predictable responses, as the model consistently chooses the "most likely" option.

When you set the temperature further away from zero, the model is more likely to select lower-probability words. This introduces more variety and creativity into the output, resulting in responses that are less predictable and more diverse.

In technical terms, the temperature modulates the probability distribution over the next tokens, implementing the temperature sampling technique. This parameter can steepen or flatten the distribution curve. A lower value results in a steeper curve with more deterministic responses, and a higher value results in a flatter curve with more random responses.

![images/Capture-2024-12-09-180845.png](<images/Capture-2024-12-09-180845.png>)
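
The effect of temperature on the curve can be sketched with a temperature-scaled softmax. This is an illustration of the general sampling technique under made-up scores for three candidate words, not a description of how any particular Bedrock model is implemented; `softmax_with_temperature` is a hypothetical helper.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; treat 0 as greedy decoding")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next words.
words = ["horses", "zebras", "unicorns"]
logits = [3.0, 1.5, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, {w: round(p, 3) for w, p in zip(words, probs)})
```

At the lowest temperature almost all of the probability mass sits on the top-scoring word, while the highest temperature spreads it more evenly across the three candidates.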

**Top K** – Temperature defines the probability distribution of potential words, and Top K defines the cutoff beyond which the model no longer selects words. For example, if K=50, the model selects from the 50 most probable words that could be next in a given sequence. This reduces the probability that an unusual word gets selected next in a sequence.

In technical terms, Top K is the number of highest-probability vocabulary tokens to keep for Top-K filtering. This limits the distribution of probable tokens, so the model chooses one of the highest-probability tokens.

**Top P** – Top P defines a cutoff based on the sum of probabilities of the potential choices. If you set Top P below 1.0, the model considers the most probable options and ignores less probable ones. Top P is similar to Top K, but instead of capping the number of choices, it caps choices based on the sum of their probabilities.

For the example prompt "I hear the hoof beats of", you may want the model to provide "horses," "zebras," or "unicorns" as the next word. If you set the temperature to its maximum, without capping Top K or Top P, you increase the probability of getting unusual results such as "unicorns." If you set the temperature to 0, you increase the probability of "horses." If you set a high temperature but cap Top K or Top P at a low value, you increase the probability of "horses" or "zebras," and decrease the probability of "unicorns."

![images/Capture-2024-12-09-180610.png](<images/Capture-2024-12-09-180610.png>)
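
The difference between the two cutoffs can be sketched in a few lines. The probabilities below are invented for the hoof-beats example, and `top_k_filter` and `top_p_filter` are hypothetical helpers that illustrate the general technique rather than any model's actual implementation.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    kept, cumulative = [], 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}


# Invented probabilities for the next word after "I hear the hoof beats of".
probs = {"horses": 0.7, "zebras": 0.2, "unicorns": 0.1}
print(top_k_filter(probs, 2))
print(top_p_filter(probs, 0.8))
```

With these numbers both filters drop "unicorns": Top K because it is the third-ranked word, Top P because the first two words already cover the requested probability mass.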

### Length

The following parameters control the length of the generated response.

**Response length** – Configures the minimum and maximum number of tokens to use in the generated 
response.

**Length penalty** – Length penalty optimizes the model to be more concise in its output by penalizing 
longer responses. Length penalty differs from response length as the response length is a hard cut off for 
the minimum or maximum response length.

In technical terms, the length penalty penalizes the model exponentially for lengthy responses. 0.0 
means no penalty. Set a value less than 0.0 for the model to generate longer sequences, or set a value 
greater than 0.0 for the model to produce shorter sequences.

### Repetitions

The following parameters help control repetition in the generated response.

**Repetition penalty (presence penalty)** – Prevents repetitions of the same words (tokens) in responses. 
1.0 means no penalty. Greater than 1.0 decreases repetition.
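
As an illustration, one common formulation found in open-source sampling code lowers the score of every token that has already been generated. The sketch below assumes that formulation; Bedrock models may apply the penalty differently, and `apply_repetition_penalty` and the scores are hypothetical.

```python
def apply_repetition_penalty(logits, generated_tokens, penalty=1.2):
    """Lower the score of every token that already appears in the output."""
    adjusted = dict(logits)
    for tok in set(generated_tokens):
        if tok in adjusted:
            score = adjusted[tok]
            # Dividing a positive score or multiplying a negative one both reduce it.
            adjusted[tok] = score / penalty if score > 0 else score * penalty
    return adjusted


# Invented scores; "the" and "ran" have already been generated.
logits = {"the": 2.0, "fox": 1.0, "ran": -0.5}
print(apply_repetition_penalty(logits, ["the", "ran"], penalty=1.5))
```

A penalty of 1.0 leaves every score unchanged, which matches the "no penalty" setting described above.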

---

## Try out the models

With some theory out of the way, let's see the models in action! Run the cells below to see basic, synchronous example invocations for each model:

### Amazon Titan Text

In [None]:
import ipywidgets as widgets

agent_foundation_model_selector = widgets.Dropdown(
    options=[
        (model['modelName'], model['modelId']) 
        for model in bedrock_management_client.list_foundation_models(
            byProvider="Amazon",
            byOutputModality="TEXT",
            byInferenceType="ON_DEMAND"
        ).get('modelSummaries', []) if "Nova" not in model['modelName'] 
    ],
    value='amazon.titan-tg1-large',
    description='FM:',
    disabled=False,
)
agent_foundation_model_selector

In [6]:
model_id = agent_foundation_model_selector.value
prompt_data = """Command: Write me a blog about making strong business decisions as a leader.

Blog:
"""

Next, we will construct the body with the `prompt_data` above and add optional parameters like `topP` and `temperature`:


<details>
<summary>Click here for the solution</summary>
    
```python
body = json.dumps({
    "inputText": prompt_data,
    "textGenerationConfig": {
        "topP": 0.95,
        "temperature": 0.2
    }
})
```
    
</details>

The Amazon Bedrock runtime client provides the `invoke_model` method, which accepts the following parameters:
- `modelId`: The ID or ARN of the foundation model to invoke
- `accept`: The MIME type of the desired response
- `contentType`: The MIME type of the request body
- `body`: A JSON string containing the prompt and the inference configuration

See the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids-arns.html) for the available text generation model IDs.

In [13]:
def invoke_bedrock_model(body, model_id):
    try:
        accept = "application/json"
        content_type = "application/json"

        response = bedrock_runtime_client.invoke_model(
            body=body,
            modelId=model_id,
            accept=accept,
            contentType=content_type
        )

        response_body = json.loads(response.get("body").read())
        output_text = response_body.get("results")[0].get("outputText")
        return output_text

    except botocore.exceptions.ClientError as error:
        if error.response['Error']['Code'] == 'AccessDeniedException':
            print(
                f"\x1b[41m{error.response['Error']['Message']}\n"
                f"To troubleshoot this issue, refer to the following resources:\n"
                f"https://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html\n"
                f"https://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n"
            )
        else:
            raise error

To invoke the Bedrock model, use the `invoke_bedrock_model` function with the appropriate parameters.

### Stability Stable Diffusion XL

In [None]:
import ipywidgets as widgets

agent_foundation_model_selector = widgets.Dropdown(
    options=[
        (model['modelName'], model['modelId']) 
        for model in bedrock_management_client.list_foundation_models(
            byProvider="Stability AI",
            byInferenceType="ON_DEMAND"
        ).get('modelSummaries', []) if "Nova" not in model['modelName'] 
    ],
    value='stability.stable-diffusion-xl-v1',
    description='FM:',
    disabled=False,
)
agent_foundation_model_selector

In [16]:
model_id = agent_foundation_model_selector.value
prompt_data = "a landscape with trees"

Next, we will construct the body using the `prompt_data` variable and include optional parameters such as `cfg_scale`, `seed`, and `steps`:

<details>
<summary>Click here for the solution</summary>
    
```python
body = json.dumps({
    "text_prompts": [{"text": prompt_data}],
    "cfg_scale": 10,
    "seed": 20,
    "steps": 50
})
```
    
</details>

The `invoke_model` API in Amazon Bedrock requires parameters such as `modelId`, `accept`, `contentType`, and a `body` JSON with the prompt and configurations. For available model IDs, refer to the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids-arns.html).

In [24]:
def invoke_bedrock_model_with_artifact(body, model_id):
    try:
        accept = "application/json"
        content_type = "application/json"

        response = bedrock_runtime_client.invoke_model(
            body=body,
            modelId=model_id,
            accept=accept,
            contentType=content_type
        )

        response_body = json.loads(response.get("body").read())
        result_text = response_body.get("result", "No result found")
        artifact_base64 = response_body.get("artifacts", [{}])[0].get("base64", "")

        print("Result:")
        print(result_text)
        print(f"Base64 Snippet: {artifact_base64[:80]}...")

        return response_body

    except botocore.exceptions.ClientError as error:
        if error.response["Error"]["Code"] == "AccessDeniedException":
            print(
                f"\x1b[41m{error.response['Error']['Message']}\n"
                f"To troubleshoot this issue, refer to the following resources:\n"
                f"https://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html\n"
                f"https://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n"
            )
        else:
            raise error

To invoke the Stable Diffusion model, use the `invoke_bedrock_model_with_artifact` function with the appropriate parameters.

**Note:** The output is a [base64 encoded](https://docs.python.org/3/library/base64.html) string of the image data. You can use any image processing library (such as [Pillow](https://pillow.readthedocs.io/en/stable/)) to decode the image as in the example below:


<details>
<summary>Click here for the solution</summary>
    
```python
import base64
import io
from PIL import Image

base_64_img_str = response_body.get("artifacts")[0].get("base64")
image = Image.open(io.BytesIO(base64.decodebytes(bytes(base_64_img_str, "utf-8"))))
```
    
</details>
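
If you only need the image on disk, the standard library is enough. The sketch below decodes a base64 string and writes the bytes to a file; `save_base64_image` is a hypothetical helper, and the payload and output path are stand-ins rather than a real model response.

```python
import base64
import tempfile
from pathlib import Path


def save_base64_image(base64_str, path):
    """Decode a base64 string and write the raw bytes to `path`."""
    data = base64.b64decode(base64_str)
    out_path = Path(path)
    out_path.write_bytes(data)
    return out_path


# Stand-in payload: a real response would carry image bytes in artifacts[0]["base64"].
fake_payload = base64.b64encode(b"not-a-real-image").decode("ascii")
target = Path(tempfile.gettempdir()) / "bedrock_example_output.bin"
saved = save_base64_image(fake_payload, target)
print(f"Wrote {saved.stat().st_size} bytes to {saved}")
```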

## Generate streaming output

Large language models can take a noticeable amount of time to generate long output sequences. Rather than waiting for the entire response to be available, latency-sensitive applications may prefer to **stream** the response to users.

In [None]:
import ipywidgets as widgets

agent_foundation_model_selector = widgets.Dropdown(
    options=[
        (model['modelName'], model['modelId']) 
        for model in bedrock_management_client.list_foundation_models(
            byProvider="Amazon",
            byOutputModality="TEXT",
            byInferenceType="ON_DEMAND"
        ).get('modelSummaries', []) if "Nova" not in model['modelName'] 
    ],
    value='amazon.titan-text-premier-v1:0',
    description='FM:',
    disabled=False,
)
agent_foundation_model_selector

In [11]:
model_id = agent_foundation_model_selector.value
prompt_data = """Command: Write me a blog about making strong business decisions as a leader.

Blog:
"""

Run the code below to see how you can achieve this with Bedrock's `invoke_model_with_response_stream()` method - returning the response body in separate chunks.

<details>
<summary>Click here for the solution</summary>
    
```python
body = json.dumps({"inputText": prompt_data})
```
    
</details>

In [50]:
from IPython.display import display_markdown, Markdown

def invoke_amazon_model_with_streaming(body, model_id):
    accept = "application/json"
    content_type = "application/json"

    try:
        response = bedrock_runtime_client.invoke_model_with_response_stream(
            body=body,
            modelId=model_id,
            accept=accept,
            contentType=content_type
        )

        stream = response.get("body")

        if stream:
            print("Generated Content:\n")
            for event in stream:
                chunk = event.get("chunk")
                if chunk:
                    chunk_data = json.loads(chunk.get("bytes").decode())
                    text = chunk_data.get("outputText", "")
                    # print() returns None, so wrapping it in Markdown() displayed nothing extra.
                    print(text, end="", flush=True)
        else:
            print("No response stream received.")

    except botocore.exceptions.ClientError as error:
        if error.response["Error"]["Code"] == "AccessDeniedException":
            print(
                f"\x1b[41m{error.response['Error']['Message']}\n"
                f"To troubleshoot this issue, refer to the following resources:\n"
                f"https://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html\n"
                f"https://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n"
            )
        else:
            raise error
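
If you want the complete text after streaming finishes, you can accumulate the chunks instead of only printing them. The sketch below assumes the chunk shape parsed in the function above (a `chunk` entry whose `bytes` hold a JSON object with `outputText`); `collect_stream_text` is a hypothetical helper and the events are hand-made so that it runs without calling Bedrock.

```python
import json


def collect_stream_text(events, text_key="outputText"):
    """Concatenate the text carried by each chunk event in a response stream."""
    parts = []
    for event in events:
        chunk = event.get("chunk")
        if not chunk:
            continue  # skip events that carry no chunk payload
        payload = json.loads(chunk["bytes"].decode("utf-8"))
        parts.append(payload.get(text_key, ""))
    return "".join(parts)


# Hand-made events mimicking the chunk shape parsed in the function above.
fake_events = [
    {"chunk": {"bytes": json.dumps({"outputText": "Strong "}).encode("utf-8")}},
    {"chunk": {"bytes": json.dumps({"outputText": "decisions"}).encode("utf-8")}},
    {"other": {}},
]
print(collect_stream_text(fake_events))
```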

### Anthropic Claude (messages API)

In [None]:
import ipywidgets as widgets

agent_foundation_model_selector = widgets.Dropdown(
    options=[
        (model['modelName'], model['modelId']) 
        for model in bedrock_management_client.list_foundation_models(
            byProvider="Anthropic",
            byOutputModality="TEXT",
            byInferenceType="ON_DEMAND"
        ).get('modelSummaries', []) if "Nova" not in model['modelName'] 
    ],
    value='anthropic.claude-3-haiku-20240307-v1:0',
    description='FM:',
    disabled=False,
)
agent_foundation_model_selector

In [57]:
model_id = agent_foundation_model_selector.value
prompt_data = """Human: Write me a 500-word paragraph about making strong business decisions as a leader.

Assistant:
"""

Run the code below to see how you can achieve this with Bedrock's `invoke_model_with_response_stream()` method - returning the response body in separate chunks.


<details>
<summary>Click here for the solution</summary>
    
```python
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": int(500 / 0.75),
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt_data
                }
            ]
        }
    ]
})
```
    
</details>
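
The `max_tokens` value in the body above is derived from the 500-word target by assuming roughly 0.75 words per token. That ratio is a rule of thumb that varies by model and language, so treat the result as an estimate; `words_to_max_tokens` is a hypothetical helper that makes the arithmetic explicit.

```python
def words_to_max_tokens(word_count, words_per_token=0.75):
    """Estimate a token budget from a target word count."""
    if words_per_token <= 0:
        raise ValueError("words_per_token must be positive")
    return int(word_count / words_per_token)


print(words_to_max_tokens(500))
```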

In [62]:
from IPython.display import display_markdown, Markdown

def invoke_anthropic_model_with_streaming(body, model_id):
    accept = "application/json"
    content_type = "application/json"

    try:
        response = bedrock_runtime_client.invoke_model_with_response_stream(
            body=body,
            modelId=model_id,
            accept=accept,
            contentType=content_type
        )

        stream = response.get("body")

        if stream:
            print("Generated Content:\n")
            for event in stream:
                chunk = event.get("chunk")
                if chunk:
                    chunk_obj = json.loads(chunk.get("bytes").decode())
                    delta_obj = chunk_obj.get("delta")
                    if delta_obj:
                        text = delta_obj.get("text")
                        if text:
                            # print() returns None, so wrapping it in Markdown() displayed nothing extra.
                            print(text, end="", flush=True)
        else:
            print("No response stream received.")

    except botocore.exceptions.ClientError as error:
        if error.response["Error"]["Code"] == "AccessDeniedException":
            print(
                f"\x1b[41m{error.response['Error']['Message']}\n"
                f"To troubleshoot this issue, refer to the following resources:\n"
                f"https://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html\n"
                f"https://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n"
            )
        else:
            raise error

### Amazon Nova (messages API)

In [None]:
import ipywidgets as widgets

agent_foundation_model_selector = widgets.Dropdown(
    options=[
        (model['modelName'], model['modelId']) 
        for model in bedrock_management_client.list_foundation_models(
            byProvider="Amazon",
            byOutputModality="TEXT",
            byInferenceType="ON_DEMAND"
        ).get('modelSummaries', []) if "Nova" in model['modelName'] 
    ],
    value='amazon.nova-pro-v1:0',
    description='FM:',
    disabled=False,
)
agent_foundation_model_selector

In [128]:
def invoke_nova_model(body, model_id, use_streaming=False):
    try:
        print("Generated Content:\n")
        if use_streaming:
            response = bedrock_runtime_client.invoke_model_with_response_stream(
                modelId=model_id, body=body
            )
            stream = response.get("body")
            chunk_count = 0
            time_to_first_token = None

            if stream:
                for event in stream:
                    chunk = event.get("chunk")
                    if chunk:
                        chunk_obj = json.loads(chunk.get("bytes").decode())
                        delta_obj = chunk_obj.get("contentBlockDelta")
                        if delta_obj:
                            text = delta_obj.get("delta", {}).get("text", "")
                            if text:
                                # print() returns None, so wrapping it in Markdown() displayed nothing extra.
                                print(text, end="", flush=True)
            return None
        else:
            response = bedrock_runtime_client.invoke_model(
                modelId=model_id, body=body
            )
            request_id = response["ResponseMetadata"]["RequestId"]
            model_response = json.loads(response["body"].read())

            content_text = model_response["output"]["message"]["content"][0]["text"]
            print(content_text)
            return model_response

    except Exception as e:
        print(f"An error occurred: {e}")
        return None

#### Example API request

```json
{
  "modelId": "string", 
  "contentType": "string", 
  "accept": "string", 
  "body": {
    "system": [
      {
        "text": "string"
      }
    ],
    "messages": [
      {
        "role": "string", 
        "content": [
          {
            "text": "string"
          }
        ]
      }
    ],
    "inferenceConfig": {
      "max_new_tokens": "integer",
      "top_p": "float",
      "top_k": "integer",
      "temperature": "float"
    }
  }
}
```

In [129]:
model_id = agent_foundation_model_selector.value
system_prompt = "You should respond to all messages in French"
user_request = "Write me a 500-word paragraph about making strong business decisions as a leader."
inf_params = {"max_new_tokens": 300, "top_p": 0.9, "top_k": 20, "temperature": 0.7}

### Create a JSON Request Body

The following example demonstrates how to construct a JSON request body for your use case. It includes placeholders for system prompts, user requests, and inference configuration parameters:

<details>
<summary>Click here for the solution</summary>
    
```python
body = json.dumps({
    "system": [
        {"text": "<your-system-prompt-here>"}
    ],
    "messages": [
        {
            "role": "user",
            "content": [
                {"text": "<your-user-request-here>"}
            ]
        }
    ],
    "inferenceConfig": inf_params
})
```
    
</details>
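
To go from placeholders to a concrete body, you could wrap the structure in a small function. This is a sketch that follows the request shape shown above; `build_nova_body` is a hypothetical helper, and the example values are repeated inline so that the block runs on its own.

```python
import json


def build_nova_body(system_prompt, user_request, inference_config):
    """Serialize a Nova request with a system prompt, one user turn, and inference settings."""
    return json.dumps({
        "system": [{"text": system_prompt}],
        "messages": [{"role": "user", "content": [{"text": user_request}]}],
        "inferenceConfig": inference_config,
    })


example_body = build_nova_body(
    "You should respond to all messages in French",
    "Write a short paragraph about making strong business decisions as a leader.",
    {"max_new_tokens": 300, "top_p": 0.9, "top_k": 20, "temperature": 0.7},
)
print(example_body)
```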

## Generate embeddings

Use text embeddings to convert text into meaningful vector representations. You input a body of text 
and the output is a (1 x n) vector, where n is the number of dimensions in the embedding space. You can use embedding vectors for a wide variety of applications. 
Bedrock currently offers Titan Embeddings for text embedding that supports text similarity (finding the 
semantic similarity between bodies of text) and text retrieval (such as search).

At the time of writing, you can use `amazon.titan-embed-text-v1` as the embedding model via the API. It accepts up to 8,192 input tokens and returns a vector of length 1,536.

To use a text embeddings model, use the InvokeModel API operation or the Python SDK.
Use InvokeModel to retrieve the vector representation of the input text from the specified model.



#### Input

```json
{
    "inputText": "<text>"
}
```

#### Output

```json
{
    "embedding": []
}
```


Let's see how to generate embeddings of some text:

In [None]:
import ipywidgets as widgets

agent_foundation_model_selector = widgets.Dropdown(
    options=[
        (model['modelName'], model['modelId']) 
        for model in bedrock_management_client.list_foundation_models(
            byOutputModality="EMBEDDING",
            byInferenceType="ON_DEMAND"
        ).get('modelSummaries', [])
    ],
    value='amazon.titan-embed-g1-text-02',
    description='FM:',
    disabled=False,
)
agent_foundation_model_selector

In [23]:
model_id = agent_foundation_model_selector.value
prompt_data = "Amazon Bedrock supports foundation models from industry-leading providers such as \
AI21 Labs, Anthropic, Stability AI, and Amazon. Choose the model that is best suited to achieving \
your unique goals."

In [132]:
def fetch_embedding(body, model_id):
    accept = "application/json"
    content_type = "application/json"

    try:
        response = bedrock_runtime_client.invoke_model(
            body=body,
            modelId=model_id,
            accept=accept,
            contentType=content_type
        )

        response_body = json.loads(response.get("body").read())
        embedding = response_body.get("embedding")

        if embedding:
            print(f"The embedding vector has {len(embedding)} values")
            print(f"Preview: {embedding[:3]} ... {embedding[-3:]}")
            return {"embedding": embedding, "message": "Embedding retrieved successfully."}
        else:
            print("No embedding found in the response.")
            return {"embedding": None, "message": "No embedding found in the response."}

    except botocore.exceptions.ClientError as error:
        if error.response['Error']['Code'] == 'AccessDeniedException':
            message = (
                f"\x1b[41m{error.response['Error']['Message']}\n"
                f"To troubleshoot this issue, refer to the following resources:\n"
                f"https://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html\n"
                f"https://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n"
            )
            print(message)
            return {"embedding": None, "message": message}
        else:
            raise error

### Create a JSON Request Body

The following example demonstrates how to construct a JSON request body for your use case. 

<details>
<summary>Click here for the solution</summary>
    
```python
body = json.dumps({"inputText": prompt_data})
```
    
</details>
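
Once you have two embedding vectors, text similarity is commonly measured with cosine similarity. The sketch below is a plain-Python illustration with short made-up vectors; real embeddings from the model are much longer, and `cosine_similarity` is a hypothetical helper rather than part of the Bedrock SDK.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; report 0 by convention
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional vectors standing in for real embeddings.
doc_a = [0.1, 0.3, 0.5]
doc_b = [0.2, 0.6, 1.0]
doc_c = [-0.5, 0.1, 0.0]
print(cosine_similarity(doc_a, doc_b))
print(cosine_similarity(doc_a, doc_c))
```

Values close to 1 indicate vectors pointing in nearly the same direction, which for embeddings is usually read as semantically similar text.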

## Getting Started with the Converse API in Amazon Bedrock

The Converse API (or ConverseStream API) in Amazon Bedrock provides a unified structured text interface for invoking Bedrock LLMs. It simplifies interactions with various model providers by using a universal syntax and message-structured prompts.

As a continuation of the previous setup, let's proceed to explore how to use the Converse API for structured interactions with Bedrock models. If you haven't already installed or updated boto3, ensure it's set up before proceeding.

In [None]:
MODEL_IDS = [
    "amazon.titan-tg1-large",
    "anthropic.claude-3-haiku-20240307-v1:0",
    "anthropic.claude-3-sonnet-20240229-v1:0",
]

We are now ready to configure the Converse API action in Amazon Bedrock. This API uses a consistent syntax for all supported models, including messages-formatted prompts and inference parameters. Additionally, the output format remains the same regardless of the model being used.

Optionally, you can include model-specific request fields that are unique to certain providers, allowing for greater flexibility. For more details, refer to the Bedrock Converse API documentation.

### One-Shot Invocations with Converse API

In [139]:
def invoke_bedrock_model(client, model_id, prompt, max_tokens=2000, temperature=0, top_p=0.9):
    """
    Invokes a Bedrock model using the Converse API with the provided parameters.

    Args:
        client: The Bedrock client instance.
        model_id: The ID of the model to invoke.
        prompt: The input prompt for the model.
        max_tokens: The maximum number of tokens to generate (default is 2000).
        temperature: Controls the randomness of responses (default is 0).
        top_p: Controls the diversity of responses (default is 0.9).

    Returns:
        str: The result text including latency and token usage, or an error message.
    """
    try:
        response = client.converse(
            modelId=model_id,
            messages=[
                {
                    "role": "user",
                    "content": [
                        {"text": prompt}
                    ]
                }
            ],
            inferenceConfig={
                "temperature": temperature,
                "maxTokens": max_tokens,
                "topP": top_p
            }
        )
    except Exception as e:
        print(f"Model invocation error: {e}")
        return "Model invocation error"

    try:
        result_text = response["output"]["message"]["content"][0]["text"]
        latency = response["metrics"]["latencyMs"]
        input_tokens = response["usage"]["inputTokens"]
        output_tokens = response["usage"]["outputTokens"]

        result = (
            f"{result_text}\n"
            f"--- Latency: {latency} ms - Input tokens: {input_tokens} - Output tokens: {output_tokens} ---\n"
        )
        return result
    except Exception as e:
        print(f"Output parsing error: {e}")
        return "Output parsing error"

Now, let's test our model invocation.

Use the same prompt to compare the responses of the models listed in `MODEL_IDS`.

<details>
<summary>Click here for the solution</summary>
    
```python
prompt = "What is the capital of Italy?"
print(f"### Prompt:\n\n{prompt}\n")

for model_id in MODEL_IDS:
    print(f"\n\n### Testing Model: {model_id}\n")
    response = invoke_bedrock_model(bedrock_runtime_client, model_id, prompt)
    print(f"Response: {response}{'-'*100}\n")
```
    
</details>
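
Because the Converse output has the same shape for every model, the fields can also be returned as structured data instead of a formatted string. The sketch below reads the same keys as the function above plus the top-level `stopReason`; `summarize_converse_response` is a hypothetical helper, and the sample response is hand-written rather than captured from a real call.

```python
def summarize_converse_response(response):
    """Extract the text, stop reason, token usage, and latency from a Converse response."""
    content = response.get("output", {}).get("message", {}).get("content", [])
    text = "".join(block.get("text", "") for block in content)
    usage = response.get("usage", {})
    return {
        "text": text,
        "stop_reason": response.get("stopReason"),
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
        "latency_ms": response.get("metrics", {}).get("latencyMs"),
    }


# Hand-written sample with the fields this notebook reads; values are invented.
sample_response = {
    "output": {"message": {"role": "assistant", "content": [{"text": "Rome."}]}},
    "stopReason": "end_turn",
    "usage": {"inputTokens": 14, "outputTokens": 3},
    "metrics": {"latencyMs": 420},
}
print(summarize_converse_response(sample_response))
```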

### ConverseStream for streaming invocations

We can also use the Converse API for streaming invocations. In this case we rely on the ConverseStream action.

In [143]:
def invoke_bedrock_model_stream(client, model_id, prompt, max_tokens=2000, temperature=0, top_p=0.9):
    """
    Invokes a Bedrock model using the ConverseStream API for streaming responses.

    Args:
        client: The Bedrock client instance.
        model_id: The ID of the model to invoke.
        prompt: The input prompt for the model.
        max_tokens: Maximum tokens to generate (default: 2000).
        temperature: Controls response randomness (default: 0).
        top_p: Controls response diversity (default: 0.9).

    Returns:
        None. Prints the streamed response to the console in real-time.
    """
    try:
        response = client.converse_stream(
            modelId=model_id,
            messages=[
                {
                    "role": "user",
                    "content": [{"text": prompt}]
                }
            ],
            inferenceConfig={
                "temperature": temperature,
                "maxTokens": max_tokens,
                "topP": top_p
            }
        )

        for event in response['stream']:
            if 'contentBlockDelta' in event:
                chunk = event['contentBlockDelta']
                sys.stdout.write(chunk['delta']['text'])
                sys.stdout.flush()

    except Exception as e:
        print(f"\nError invoking model {model_id}: {e}")

Use the `invoke_bedrock_model_stream` function to test different models.


<details>
<summary>Click here for the solution</summary>
    
```python
prompt = "What is the capital of Italy?"
print(f"### Prompt:\n\n{prompt}\n")

for model_id in MODEL_IDS:
    print(f"\n\n### Testing Model: {model_id}\n")
    invoke_bedrock_model_stream(bedrock_runtime, model_id, prompt)
    print("\n" + "-" * 100)
```
    
</details>
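
If you want to keep the streamed text instead of only printing it, the events can be accumulated into a string. The following is a minimal sketch, not part of the workshop solution: `collect_stream_text` and `sample_events` are hypothetical names, and the sample events are hand-written to mimic the shape of ConverseStream output so the cell runs without calling AWS.

```python
def collect_stream_text(stream):
    """Accumulate text deltas and final metadata from ConverseStream events."""
    parts = []
    stop_reason = None
    usage = {}
    for event in stream:
        if "contentBlockDelta" in event:
            delta = event["contentBlockDelta"]["delta"]
            # Non-text deltas (e.g. tool use) have no "text" key, so default to "".
            parts.append(delta.get("text", ""))
        elif "messageStop" in event:
            stop_reason = event["messageStop"].get("stopReason")
        elif "metadata" in event:
            usage = event["metadata"].get("usage", {})
    return "".join(parts), stop_reason, usage


# Hand-written events shaped like ConverseStream output (assumption: for a dry run only).
sample_events = [
    {"messageStart": {"role": "assistant"}},
    {"contentBlockDelta": {"delta": {"text": "Rome"}, "contentBlockIndex": 0}},
    {"contentBlockDelta": {"delta": {"text": "."}, "contentBlockIndex": 0}},
    {"messageStop": {"stopReason": "end_turn"}},
    {"metadata": {"usage": {"inputTokens": 14, "outputTokens": 2, "totalTokens": 16}}},
]

text, stop_reason, usage = collect_stream_text(sample_events)
print(text, stop_reason, usage)
```

With a real client, you would pass `response["stream"]` from `client.converse_stream(...)` in place of `sample_events`.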

### **Challenge 1: Text Generation with Amazon Titan or Nova**

**Objective:** Leverage the **Amazon Titan** or **Nova** foundation model to craft a creative short story based on a given prompt.

**Scenario:**
As a content creator, you want to generate engaging content for your blog using AI. Use the **Amazon Titan** or **Amazon Nova** model to create a captivating story from the following prompt:

> **"Write a short story about an adventurous fox who discovers a hidden treasure in an enchanted forest."**

**Tasks:**

1. Use the **Amazon Titan Large** or **Amazon Nova** model to generate the story.
2. Experiment with the following parameters to fine-tune the output:
   - **`temperature`**: Adjust the randomness and creativity of the response.
   - **`topP`**: Control the diversity of word choices to ensure a coherent yet imaginative narrative.
   - **`maxTokens`** (called `maxTokenCount` in Titan's native `InvokeModel` request body): Limit the length of the generated story to keep it concise (e.g., a maximum of 500 tokens).
3. Compare outputs by varying parameter values and analyze how they influence the tone, creativity, and engagement level of the story.
4. Document the parameter settings that resulted in the most creative and engaging version of the story.
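
One way to organize tasks 2 and 3 is a small parameter sweep over the Converse API. This is a sketch under assumptions rather than the reference solution: `sweep_parameters` is a hypothetical helper, and the values in the commented usage line are arbitrary starting points.

```python
from itertools import product


def sweep_parameters(client, model_id, prompt, temperatures, top_ps, max_tokens=500):
    """Call the Converse API once per (temperature, topP) pair and collect the outputs."""
    results = []
    for temperature, top_p in product(temperatures, top_ps):
        response = client.converse(
            modelId=model_id,
            messages=[{"role": "user", "content": [{"text": prompt}]}],
            inferenceConfig={
                "temperature": temperature,
                "topP": top_p,
                "maxTokens": max_tokens,
            },
        )
        results.append({
            "temperature": temperature,
            "top_p": top_p,
            "text": response["output"]["message"]["content"][0]["text"],
            "output_tokens": response["usage"]["outputTokens"],
        })
    return results


# Example usage (assumes the `bedrock_runtime` client created earlier in this notebook):
# story_prompt = ("Write a short story about an adventurous fox who discovers "
#                 "a hidden treasure in an enchanted forest.")
# for r in sweep_parameters(bedrock_runtime, "amazon.titan-tg1-large", story_prompt,
#                           temperatures=[0.2, 0.9], top_ps=[0.5, 0.9]):
#     print(f"temperature={r['temperature']} topP={r['top_p']}\n{r['text']}\n")
```

Keeping the settings next to each story makes it easier to document which configuration produced the most engaging version.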

**Bonus Task:**

- Use a **streaming API** (e.g., `converse_stream`, or `invoke_model_with_response_stream` with a model-native request body) to generate the story incrementally.
- Observe the real-time output and analyze how the story develops progressively.
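
For the bonus task, a rough sketch using `invoke_model_with_response_stream` with a Titan text model might look like the following. The helper names are hypothetical, and the request and chunk fields (`inputText`, `textGenerationConfig`, `outputText`) are assumed to follow the Titan text schema; Nova models expect a different native body, so this sketch does not apply to them as written.

```python
import json


def build_titan_body(prompt, max_token_count=500, temperature=0.7, top_p=0.9):
    """Build a Titan-native request body (assumed Titan text schema)."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_token_count,
            "temperature": temperature,
            "topP": top_p,
        },
    })


def stream_titan_story(client, model_id, prompt, **params):
    """Print the story as it arrives and return the full text."""
    response = client.invoke_model_with_response_stream(
        modelId=model_id,
        body=build_titan_body(prompt, **params),
        contentType="application/json",
        accept="application/json",
    )
    pieces = []
    for event in response["body"]:
        chunk = event.get("chunk")
        if not chunk:
            continue  # skip non-chunk events
        payload = json.loads(chunk["bytes"])
        piece = payload.get("outputText", "")  # Titan-specific field (assumption)
        print(piece, end="", flush=True)
        pieces.append(piece)
    return "".join(pieces)


# Example usage (assumes the `bedrock_runtime` client created earlier in this notebook):
# story = stream_titan_story(bedrock_runtime, "amazon.titan-tg1-large",
#                            "Write a short story about an adventurous fox...",
#                            temperature=0.9)
```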

**Deliverable:**

1. **Generated Stories**: Include multiple versions of the story for comparison.
2. **Analysis**: Document how parameter changes (e.g., higher/lower `temperature` or `topP`) affected the output.
3. **Best Settings**: Highlight the parameter configuration that produced the most engaging result.

### **Challenge 2: Image Generation with Amazon Nova Canvas**
**Objective:** Use **Amazon Nova Canvas** to generate an image from a text prompt and render it in your environment.

**Scenario:**
You are designing a cover image for your fantasy novel. Use **Amazon Nova Canvas** to create an image based on the following description:
> "A mystical castle surrounded by glowing blue mushrooms under a starry night sky."

**Tasks:**
1. Use the **Amazon Nova Canvas** model to generate the image by making an API call. Here’s an example request structure:
   ```json
   {
     "modelId": "amazon.nova-canvas-v1:0",
     "contentType": "application/json",
     "accept": "application/json",
     "body": "{\"textToImageParams\":{\"text\":\"A mystical castle surrounded by glowing blue mushrooms under a starry night sky\"},\"taskType\":\"TEXT_IMAGE\",\"imageGenerationConfig\":{\"cfgScale\":8,\"seed\":42,\"quality\":\"standard\",\"width\":1280,\"height\":720,\"numberOfImages\":3}}"
   }
   ```

2. Customize the following parameters in the API call:
   - `text`: Modify the prompt to adjust the scene or add new elements (e.g., "a dragon flying over the castle").
   - `cfgScale`: Adjust how closely the image adheres to the prompt (set to 8 in the example request above).
   - `width` and `height`: Configure the dimensions to suit your design.
   - `numberOfImages`: Specify how many variations you want.

3. Decode and render the image in Python using libraries like `base64` and `Pillow` or any other preferred method.

4. Save the generated image to a file and display it in the notebook for review.
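
The tasks above could be approached along these lines. This is a hedged sketch, not the official solution: `build_canvas_body` and `save_images` are hypothetical helpers, the defaults simply mirror the example request, and the response is assumed to carry base64-encoded images under an `images` key.

```python
import base64
import json
from pathlib import Path


def build_canvas_body(text, cfg_scale=8, seed=42, width=1280, height=720, number_of_images=1):
    """Build a text-to-image request body mirroring the example request above."""
    return json.dumps({
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": text},
        "imageGenerationConfig": {
            "cfgScale": cfg_scale,
            "seed": seed,
            "quality": "standard",
            "width": width,
            "height": height,
            "numberOfImages": number_of_images,
        },
    })


def save_images(response_body, out_dir="generated", prefix="castle"):
    """Decode each base64 image in the parsed response and write it to disk."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, encoded in enumerate(response_body.get("images", [])):
        path = out / f"{prefix}_{i}.png"
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths


# Example usage (assumes the `bedrock_runtime` client created earlier in this notebook):
# response = bedrock_runtime.invoke_model(
#     modelId="amazon.nova-canvas-v1:0",
#     body=build_canvas_body("A mystical castle surrounded by glowing blue mushrooms "
#                            "under a starry night sky"),
#     contentType="application/json",
#     accept="application/json",
# )
# paths = save_images(json.loads(response["body"].read()))
# from PIL import Image  # Pillow, for rendering in the notebook
# Image.open(paths[0])
```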

**Bonus:**
Experiment with changing the `text` parameter to include new narrative elements (e.g., "a dragon flying over the castle") and compare the results. Discuss how **Amazon Nova Canvas** handles variations and the fidelity of generated images.

## Next Steps

In this notebook, we demonstrated how to invoke Amazon Bedrock models using the AWS Python SDK (`boto3`). We covered both one-shot and streaming invocations, including examples of using the **Converse API** for structured and consistent interactions with various models.