Objective:
- Generate captions for images using Claude 3 Haiku on Amazon Bedrock
- Learn more about how to use this notebook [in this blog post](https://community.aws/content/2gYbjrz8gMwChbnT9Cuv7Y69joB/sending-images-to-claude-3-using-amazon-bedrock)

What are we using?
- Python
- AWS
  - Amazon Bedrock
  - IAM
  - Boto3 (AWS SDK for Python)
- Claude 3 Haiku

What do you need to install?
- `pip install python-dotenv`
  - to load our credentials from environment variables instead of hardcoding them in our code (security best practice)
- `pip install boto3`
- `base64` and `json` ship with Python's standard library, so they need no separate installation


In [1]:
import os
import dotenv
import boto3
import json
import base64

In [2]:
# load environment variables
# override=True refreshes the values if we edit the external .env file; the Jupyter extension for VS Code does not seem to reload them, even after closing and reopening the notebook
dotenv.load_dotenv(".env", override=True)

True

In [3]:
# set our credentials from the environment values loaded from the .env file
AWS_ACCESS_KEY_ID = os.environ.get('AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = os.environ.get('AWS_SECRET_ACCESS_KEY')
AWS_REGION = os.environ.get('AWS_REGION')
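
`os.environ.get` returns `None` without complaint when a variable is missing, which can make later errors harder to trace. A minimal sketch of an early check, assuming a hypothetical helper named `require_env` (it is not part of the original notebook):

```python
import os


def require_env(names, environ=None):
    """Return the requested variables as a dict, or raise if any are unset or empty.

    `environ` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# Example usage (run after dotenv.load_dotenv):
# creds = require_env(["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"])
```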

In [4]:
# instantiate a bedrock client using boto3 (AWS' official Python SDK)
bedrock_runtime_client = boto3.client(
    'bedrock-runtime',
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
    region_name=AWS_REGION
)

We now need to find the right model id that we can use to send prompts to Claude 3 Haiku. 
Bedrock is a serverless portal to many Foundation Models and the way you distinguish between them is by their unique model ids. 

You can find these in two ways:

1/ Via the Bedrock Console:
+ navigate to the AWS Console
+ navigate to Amazon Bedrock
+ find the menu where it lists the Foundation Models
+ each model's details include a sample API request from which you can copy the model id

2/ Via the AWS CLI:
+ type 

    `aws bedrock list-foundation-models`

+ scroll till you find the one you want
+ copy the model id
+ BONUS TIP: you can filter results ahead of time by using the --by-provider option. In our case, since we want to find out the model id for Anthropic's Claude 3 Haiku model, we could type the following instead to make our lives easier:

    `aws bedrock list-foundation-models --by-provider Anthropic`
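
The same lookup can also be done from Python. The sketch below assumes a boto3 client for the `bedrock` control-plane service (not `bedrock-runtime`, which only invokes models) and a hypothetical helper name, `list_model_ids`:

```python
def list_model_ids(bedrock_client, provider="Anthropic"):
    """Return the model ids that Bedrock reports for one provider.

    `bedrock_client` is expected to be boto3.client("bedrock", ...).
    """
    response = bedrock_client.list_foundation_models(byProvider=provider)
    return [summary["modelId"] for summary in response["modelSummaries"]]


# Example usage (requires valid AWS credentials and Bedrock access):
# import boto3
# bedrock_client = boto3.client("bedrock", region_name=AWS_REGION)
# for model_id in list_model_ids(bedrock_client):
#     print(model_id)
```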

In [5]:
# select the model id
model_id = "anthropic.claude-3-haiku-20240307-v1:0"

We first need to load our image and convert it to base64

In [7]:
# read our image as binary data first
with open('data/aws-serverless-api-architecture-diagram.png', 'rb') as image_file:
    encoded_image = base64.b64encode(image_file.read()).decode()
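
The `media_type` sent with the image should match the actual file format. One way to avoid a mismatch is to derive it from the file extension while encoding. This is a sketch only: `encode_image` is a hypothetical helper, and the set of accepted types reflects the formats Claude 3 accepts (JPEG, PNG, GIF, WebP).

```python
import base64
import mimetypes

# Image formats accepted by Claude 3 models
SUPPORTED_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def encode_image(path):
    """Return (media_type, base64_string) for an image file.

    The media type is guessed from the file extension, not from the file contents.
    """
    media_type, _ = mimetypes.guess_type(path)
    if media_type not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"Unsupported or unknown image type for {path!r}: {media_type}")
    with open(path, "rb") as image_file:
        data = base64.b64encode(image_file.read()).decode("utf-8")
    return media_type, data


# Example usage:
# media_type, encoded_image = encode_image("data/aws-serverless-api-architecture-diagram.png")
```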

In [20]:
payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        # must match the file format; the image loaded above is a PNG
                        "media_type": "image/png",
                        "data": encoded_image
                    }
                },
                {
                    "type": "text",
                    "text": "Explain this AWS architecture diagram."
                }
            ]
        }
    ],
    "max_tokens": 4096,  # Claude 3 Haiku generates at most 4096 output tokens
    "anthropic_version": "bedrock-2023-05-31"
}

One of the differences between Foundation Models is the way you interact with them. Each has its own input format, so you need to look up the correct payload structure for the model you're using.

For Claude 3 models, the template is the following:

```
payload = {
    "messages": [
        {
            "role": "",
            "content": []
        }
    ],
    "anthropic_version": ""
}
```

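`messages` is an array of JSON objects that must contain at least one item. Each message must strictly follow the schema and declare:
- role: either `user` or `assistant`. A system prompt is not a message role; it goes in a separate top-level `system` field.
- content: this is also an array, as you can send multiple content items (for example an image and some text) in one API request to Claude. It must contain at least one item.

Besides `messages` and `anthropic_version`, the request body must also set `max_tokens`, as the payload above does.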
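
The schema can be wrapped in a small builder so that every request has the required fields. This is a minimal sketch: `build_payload` and its default values are illustrative assumptions, not part of the original notebook.

```python
def build_payload(text, image_b64=None, media_type="image/png", max_tokens=1024):
    """Build a Claude 3 Messages API request body for Amazon Bedrock.

    Sends a single user message with an optional base64 image followed by text.
    """
    content = []
    if image_b64 is not None:
        content.append({
            "type": "image",
            "source": {"type": "base64", "media_type": media_type, "data": image_b64},
        })
    content.append({"type": "text", "text": text})
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": content}],
    }


# Example usage:
# payload = build_payload("Explain this AWS architecture diagram.", image_b64=encoded_image)
```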


In [21]:
# we're ready to invoke the model!
response = bedrock_runtime_client.invoke_model(
    modelId=model_id,
    contentType="application/json",
    body=json.dumps(payload)
)

In [22]:
# now we need to read the response. The body comes back as a stream of bytes, so to display it in one go we read the full stream first,
# then parse it as JSON into a dictionary so we can access the content field without all the metadata noise
output_binary = response["body"].read()
output_json = json.loads(output_binary)
output = output_json["content"][0]["text"]
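
Indexing `["content"][0]` assumes the reply holds exactly one text block. A slightly more defensive sketch joins every text block and also returns the `stop_reason` field, which reads `"max_tokens"` when the reply was cut off by the token limit. The helper name `read_claude_response` is hypothetical.

```python
import json


def read_claude_response(response):
    """Return (text, stop_reason) from an invoke_model response.

    Reads the streaming body once, parses it as JSON, and joins all text blocks.
    """
    body = json.loads(response["body"].read())
    text = "".join(
        block["text"] for block in body.get("content", []) if block.get("type") == "text"
    )
    return text, body.get("stop_reason")


# Example usage:
# output, stop_reason = read_claude_response(response)
# if stop_reason == "max_tokens":
#     print("Warning: the reply was truncated by max_tokens")
```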


In [23]:
print(output)

As an AWS technical consultant, I'd be happy to explain this architecture diagram in detail to help you and your team of developers understand it better.

The diagram depicts a high-level AWS architecture for a web application. Let's go through the various components and their roles:

1. Application: This is the core of your web application, likely built using a framework or programming language of your choice. It's the main logic that handles user requests and processes data.

2. API Gateway: The API Gateway is a key component that serves as the entry point for your application. It provides a secure and scalable way to expose your application's APIs to the client-side. The API Gateway handles tasks like authentication, authorization, and request/response transformation.

3. Lambda Functions: These are serverless compute services provided by AWS. The diagram shows two Lambda functions: "GET" and "DELETE". These functions encapsulate specific application logic and are triggered by event