# Generative AI Conversations Made Simple: Overview of Amazon Bedrock Converse API

In this demo we will look at the [Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html) on Amazon Bedrock and the problems it solves when performing inference on text generation models on AWS.

[Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html) is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies through a single API.

## The Challenge

As AI models evolve, developers face significant hurdles in keeping pace with changes and leveraging multiple models effectively. 

This complexity impacts four key areas:

1. **Model Versioning:** Staying current with various models and their APIs requires constant learning and adaptation.

2. **Conversation Management:** Implementing features like memory for ongoing dialogues adds another layer of complexity.

3. **Multi-Model Integration:** Coordinating conversations across different AI models presents technical challenges.

4. **API Integration:** Incorporating external data from APIs to enhance AI responses requires sophisticated orchestration.

## The Solution

The Converse API was developed to simplify this process for developers by providing the following:

1. **Unified API:** A single API for all models on AWS, allowing seamless integration as models evolve.
2. **Simplified handling of multi-turn conversations:** Built-in conversation management that makes memory features easy to enable.
3. **Effortless multi-model interactions:** With a unified API, communication between models is simplified.
4. **Streamlined tools integration:** Smooth integration with external APIs to enrich AI responses.

### Prerequisites

Before you begin, ensure all prerequisites are in place. You should have:

- An [AWS Account](https://aws.amazon.com/free)
- The [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) installed and configured with your credentials
- Python 3.7+ installed
- Requested [access to the model](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) you want to use

For this demo ensure the following models are enabled: 
- `anthropic.claude-v2:1`
- `anthropic.claude-3-haiku-20240307-v1:0`
- `meta.llama3-1-8b-instruct-v1:0`
- `meta.llama2-13b-chat-v1`

## 1. Setup

>NOTE: If you are running this on your local system, I recommend using a Python virtual environment.

In [None]:
!pip install boto3 pytz

In [None]:
# Import libraries
import boto3
import json

# Create a Bedrock Runtime client (You can specify the region_name)
bedrock_client = boto3.client("bedrock-runtime")

## 2. Invoke Model API vs. Converse API

### 2.1 Invoke Model API Example

The [InvokeModel API](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-invoke.html) sends a request body written in the target model's native format.

In [None]:
# Invoke model example using Anthropic's Claude Messages API
# https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages.html

def invoke_model_example(model_id, prompt):
    native_request = {
        "anthropic_version": "bedrock-2023-05-31", 
        "max_tokens": 512,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": prompt
                    }
                ]
            }
        ]
    }
    body = json.dumps(native_request)
    
    try:
        response = bedrock_client.invoke_model(
            modelId=model_id,
            contentType="application/json",
            accept="application/json",
            body=body
        )
        return json.loads(response["body"].read())
    except Exception as e:
        return f"ERROR: {str(e)}"

    
# Example usage
# Define the model ID
model_id = "anthropic.claude-3-haiku-20240307-v1:0"

# Define the prompt for the model.
prompt = "What is Amazon VPC?"
response = invoke_model_example(model_id, prompt)

# Print the response
response

**Explanation:**

1. The Invoke Model API requires formatting the request according to the specific model's native API structure.
2. We create a dictionary with the prompt and model parameters, then convert it to JSON.
3. We call `invoke_model` with the model ID and the formatted request body.
4. The response must then be parsed from JSON and returned.

Note: This method requires knowledge of each model's specific API structure and parameters. The request body above only works with Anthropic Claude models that support the Messages API.
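To make step 4 more concrete, here is a minimal sketch of pulling the generated text and token counts out of the parsed body. The field names (`content`, `stop_reason`, `usage`) follow the Anthropic Claude Messages response format; the helper name `summarize_claude_response` and the sample body are hypothetical and only there for illustration.

```python
def summarize_claude_response(response_body):
    """Collect text, stop reason, and token usage from a parsed Claude Messages body."""
    # Join every text block; non-text blocks are skipped
    text = "".join(
        block.get("text", "")
        for block in response_body.get("content", [])
        if block.get("type") == "text"
    )
    usage = response_body.get("usage", {})
    return {
        "text": text,
        "stop_reason": response_body.get("stop_reason"),
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
    }


# Hypothetical body, shaped like the dictionary returned after json.loads
sample_body = {
    "role": "assistant",
    "content": [{"type": "text", "text": "Amazon VPC is a virtual network."}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 14, "output_tokens": 9},
}
summary = summarize_claude_response(sample_body)
print(summary["text"])
```

A helper like this only understands the Claude Messages format, so other model families would each need their own parser.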

In [None]:
# Invoke multiple models
# https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages.html
# https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html

def invoke_multiple_models_example(model_id, prompt):
    # Different request structures for different models
    # Claude models, including Claude 2.1, accept the native Messages API format;
    # the Text Completions format is legacy
    if "anthropic.claude" in model_id:
        native_request = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 300,
            "temperature": 0.5,
            "messages": [
                {
                    "role": "user",
                    "content": [{"type": "text", "text": prompt}],
                }
            ],
        }
    # llama 2 uses [INST] {prompt} [/INST]
    elif "meta.llama2" in model_id:
        native_request = {
            "prompt": f"[INST] {prompt} [/INST]",
            "max_gen_len": 300,
            "temperature": 1,
            "top_p": 0.9
        }
    # llama 3 uses <|begin_of_text|>
    elif "meta.llama3" in model_id:
        native_request = {
            "prompt": f"""
                <|begin_of_text|>
                <|start_header_id|>user<|end_header_id|>
                {prompt}
                <|eot_id|>
                <|start_header_id|>assistant<|end_header_id|>
            """,
            "max_gen_len": 300,
            "temperature": 1
        }
    else:
        return "Unsupported model"

    body = json.dumps(native_request)
    
    try:
        response = bedrock_client.invoke_model(
            modelId=model_id,
            contentType="application/json",
            accept="application/json",
            body=body
        )
        response_body = json.loads(response["body"].read())
        
        # Different response structures for different models
        if "anthropic.claude" in model_id:
            return response_body["content"][0].get("text", "No message found")
        elif "meta.llama" in model_id:
            return response_body.get("generation", "No generation found") 
        else:
            return "Unsupported model response"
    except Exception as e:
        return f"ERROR: {str(e)}"


# Example usage
models = [
    "anthropic.claude-v2:1",
    "anthropic.claude-3-haiku-20240307-v1:0",
    "meta.llama2-13b-chat-v1",  # matches the model enabled in Prerequisites
    "meta.llama3-1-8b-instruct-v1:0"
]
prompt = "What is Amazon VPC?"

for model in models:
    response = invoke_multiple_models_example(model, prompt)
    print(f"model_id: {model}\n{response}\n", "-"*100, "\n")


**Explanation:**

1. The Invoke Model API requires different request structures for different model providers.
2. We need to format the prompt differently for each model family: a `messages` list for Claude models, `[INST] ... [/INST]` for Llama 2, and `<|begin_of_text|>` header tokens for Llama 3.
3. The parameter names differ between models (e.g., `max_tokens` vs `max_gen_len`).
4. The response structure also varies between models, requiring different parsing logic.
5. This approach requires maintaining model-specific code, making it harder to switch between or experiment with different models.

Challenges:
- Need to know and handle each model's specific API structure and parameters.
- Code becomes more complex and harder to maintain as you add support for more models.
- Switching between models requires changing multiple parts of the code.
- Inconsistent parameter names and response structures across models.

These challenges highlight why the Converse API was developed to provide a more unified and consistent interface across different models.
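One way to see the maintenance cost is to write the per-model logic as a lookup table. The sketch below is illustrative only: the names `REQUEST_BUILDERS` and `build_native_request` are hypothetical, and the request shapes are copied from the example above. Every new model family would need another entry, plus matching response parsing.

```python
import json


def _claude_request(prompt):
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 300,
        "messages": [{"role": "user", "content": [{"type": "text", "text": prompt}]}],
    }


def _llama2_request(prompt):
    return {"prompt": f"[INST] {prompt} [/INST]", "max_gen_len": 300}


# Hypothetical lookup table: model ID prefix -> function that builds the native body
REQUEST_BUILDERS = {
    "anthropic.claude": _claude_request,
    "meta.llama2": _llama2_request,
}


def build_native_request(model_id, prompt):
    """Return the JSON body for model_id, or raise if the family is unknown."""
    for prefix, builder in REQUEST_BUILDERS.items():
        if model_id.startswith(prefix):
            return json.dumps(builder(prompt))
    raise ValueError(f"No request builder registered for {model_id}")


body = build_native_request("meta.llama2-13b-chat-v1", "What is Amazon VPC?")
print(body)
```

Raising on an unknown model, rather than returning a string such as "Unsupported model", keeps a missing builder from being mistaken for model output.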


### 2.2 Converse API Example

The [Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html) was created to simplify inference across text generation models on AWS. It provides:

- One streamlined unified API for all models on AWS
- Standardized way of handling message conversations
- Clear LLM System Message handling
- Tools integration
- Document upload capability

In [None]:
# Single prompt with converse API
# Not just for conversations although they are supported
import json
import boto3


message_list = []

initial_message = {
    "role": "user",
    "content": [
        { "text": "How are you today?" } 
    ],
}

message_list.append(initial_message)

response = bedrock_client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # a model enabled in Prerequisites
    messages=message_list,
    inferenceConfig={
        "maxTokens": 512,
        "temperature": 0
    },
)

response_message = response['output']['message']
print(json.dumps(response_message, indent=4))
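Besides `output`, a Converse response also carries top-level fields such as `stopReason` and `usage`. The following is a minimal sketch of collecting them in one place; the helper name `summarize_converse_response` and the sample response are hypothetical, and the sample is trimmed to just the fields the helper reads.

```python
def summarize_converse_response(response):
    """Return the generated text, stop reason, and token usage from a Converse response."""
    message = response["output"]["message"]
    # A message can hold several content blocks; keep only the text ones
    text = "".join(block["text"] for block in message["content"] if "text" in block)
    usage = response.get("usage", {})
    return {
        "text": text,
        "stop_reason": response.get("stopReason"),
        "total_tokens": usage.get("totalTokens"),
    }


# Hypothetical response, trimmed to the fields used above
sample_response = {
    "output": {"message": {"role": "assistant", "content": [{"text": "I'm doing well."}]}},
    "stopReason": "end_turn",
    "usage": {"inputTokens": 12, "outputTokens": 6, "totalTokens": 18},
}
print(summarize_converse_response(sample_response))
```

Because the response shape is the same for every model, a helper like this should not need per-model branches.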


### 2.3 Converse API Streaming Example

If you call `ConverseStream` to stream the response from a model, the stream is returned in the `stream` response field. The stream emits the following events, in this order:

1. `messageStart` (MessageStartEvent). The start event for a message. Includes the role for the message.
2. `contentBlockStart` (ContentBlockStartEvent). A Content block start event. Tool use only.
3. `contentBlockDelta` (ContentBlockDeltaEvent). A Content block delta event. Includes the partial text that the model generates or the partial input json for tool use.
4. `contentBlockStop` (ContentBlockStopEvent). A Content block stop event.
5. `messageStop` (MessageStopEvent). The stop event for the message. Includes the reason why the model stopped generating output.
6. `metadata` (ConverseStreamMetadataEvent). Metadata for the request. The metadata includes the token usage in `usage` (TokenUsage) and metrics for the call in `metrics` (ConverseStreamMetrics).

ConverseStream streams a complete content block as a ContentBlockStartEvent event, one or more ContentBlockDeltaEvent events, and a ContentBlockStopEvent event. Use the contentBlockIndex field as an index to correlate the events that make up a content block.
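As a rough illustration of correlating events by `contentBlockIndex`, the sketch below rebuilds content blocks from a list of events. The function name `assemble_blocks` and the sample events are hypothetical; the event shapes follow the list above, and the code assumes that text blocks can arrive without a `contentBlockStart` event because that event is described as tool use only.

```python
import json


def assemble_blocks(events):
    """Rebuild content blocks from ConverseStream events, keyed by contentBlockIndex."""
    blocks = {}
    for event in events:
        if "contentBlockStart" in event:
            start = event["contentBlockStart"]
            tool = start["start"].get("toolUse", {})
            blocks[start["contentBlockIndex"]] = {"text": "", "tool": tool.get("name"), "input": ""}
        elif "contentBlockDelta" in event:
            delta_event = event["contentBlockDelta"]
            # Assumption: text blocks may have no start event, so create them on demand
            block = blocks.setdefault(
                delta_event["contentBlockIndex"], {"text": "", "tool": None, "input": ""}
            )
            delta = delta_event["delta"]
            block["text"] += delta.get("text", "")
            # Tool input arrives as fragments of a JSON string
            block["input"] += delta.get("toolUse", {}).get("input", "")
    for block in blocks.values():
        if block["tool"]:
            # Parse the accumulated JSON once the whole block has been received
            block["input"] = json.loads(block["input"] or "{}")
    return blocks


# Hypothetical events: one text block followed by one tool use block
sample_events = [
    {"messageStart": {"role": "assistant"}},
    {"contentBlockDelta": {"delta": {"text": "Checking "}, "contentBlockIndex": 0}},
    {"contentBlockDelta": {"delta": {"text": "the time."}, "contentBlockIndex": 0}},
    {"contentBlockStop": {"contentBlockIndex": 0}},
    {"contentBlockStart": {"start": {"toolUse": {"toolUseId": "tool-1", "name": "get_date_time"}}, "contentBlockIndex": 1}},
    {"contentBlockDelta": {"delta": {"toolUse": {"input": "{}"}}, "contentBlockIndex": 1}},
    {"contentBlockStop": {"contentBlockIndex": 1}},
    {"messageStop": {"stopReason": "tool_use"}},
]
blocks = assemble_blocks(sample_events)
print(blocks[0]["text"])
```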

In [None]:
# ConverseStream 
import boto3
import json

bedrock_client = boto3.client('bedrock-runtime')

def stream_conversation(model_id, messages, system_prompt="You are a helpful AI assistant."):
    try:
        response = bedrock_client.converse_stream(
            modelId=model_id,
            messages=messages,
            system=[{"text": system_prompt}],
            inferenceConfig={
                "temperature": 0.7,
                "maxTokens": 500
            }
        )
        
        full_response = ""
        # This loop processes the stream of events from the API. Each event represents a piece of the response.
        for event in response.get('stream'):
            if 'messageStart' in event:
                print("Response started:", event['messageStart']['role'])
            elif 'contentBlockDelta' in event:
                chunk = event['contentBlockDelta']['delta'].get('text', '')
                full_response += chunk
                print(chunk, end='', flush=True)  # Print chunks as they arrive
            elif 'messageStop' in event:
                print("\n\nResponse completed. Stop reason:", event['messageStop']['stopReason'])
            elif 'metadata' in event:
                print("\nMetadata:", json.dumps(event['metadata'], indent=2))
        
        return full_response

    except Exception as e:
        print(f"An error occurred: {str(e)}")
        return None

# Example usage
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
messages = [
    {
        "role": "user",
        "content": [{"text": "What is the Amazon VPC?"}]
    }
]

print("Streaming response:\n")
stream_conversation(model_id, messages)


## 4. Key Features of Converse API

### 4.1 Unified API Across Models

In [None]:
# Conversation turn 
def converse_api_example(model_id, prompt, system_prompt="You are an AI assistant that specializes in AWS."):
    messages = [
        {
            "role": "user",
            "content": [{"text": prompt}]
        }
    ]
    system_prompts = [{"text": system_prompt}]
    inference_config = {
        "temperature": 1,
        "maxTokens": 256
    }
    try:
        response = bedrock_client.converse(
            modelId=model_id,
            messages=messages,
            system=system_prompts,
            inferenceConfig=inference_config
        )
        return response["output"]["message"]
    except Exception as e:
        return f"ERROR: {str(e)}"



def unified_api_example(prompt):
    # List of models to send prompts
    models = [
        "anthropic.claude-3-haiku-20240307-v1:0",
        "meta.llama3-1-8b-instruct-v1:0"
    ]

    for model in models:
        response = converse_api_example(model, prompt)
        print(f"model_id: {model}\nprompt: {prompt}\n{response}\n", "-"*100, "\n")


# Example usage
prompt = "What is cloud computing?"
unified_api_example(prompt)

### 4.2 Multi-turn Conversations

In this example we will use the Converse API to add memory for model conversations.

A conversation is a series of messages between the user and the model. You start a conversation by sending a message as a user (user role) to the model. The model, acting as an assistant (assistant role), then generates a response that it returns in a message. If desired, you can continue the conversation by sending further user role messages to the model. To maintain the conversation context, be sure to include any assistant role messages that you receive from the model in subsequent requests. For example code, see [Converse API examples](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html#conversation-inference-examples).

>Messages must alternate between the `user` and `assistant` roles, for example: `[{"role": "user", "content": [{"text": "prompt"}]}, {"role": "assistant", "content": [{"text": "response"}]}]`.
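A small check can catch a broken history before it is sent. This is a sketch, not part of the Converse API: the helper name `validate_messages` is hypothetical, and it assumes the conversation starts with a `user` message.

```python
def validate_messages(messages):
    """Raise ValueError unless roles alternate user, assistant, user, ... starting with user."""
    expected = "user"
    for position, message in enumerate(messages):
        if message["role"] != expected:
            raise ValueError(
                f"Message {position} has role {message['role']!r}, expected {expected!r}"
            )
        # Flip the expected role for the next message
        expected = "assistant" if expected == "user" else "user"
    return True


# Hypothetical history with correctly alternating roles
history = [
    {"role": "user", "content": [{"text": "What is Amazon VPC?"}]},
    {"role": "assistant", "content": [{"text": "A virtual network in AWS."}]},
    {"role": "user", "content": [{"text": "How do I create one?"}]},
]
print(validate_messages(history))
```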

In [None]:

def multi_turn_conversation(model_id):
    messages = []
    # Include system message
    system_prompt = "You are a helpful AI assistant."

    while True:
        user_input = input("You: ")
        if user_input.lower() == 'exit':
            print("Here is the entire message history: ")
            for message in messages:
                print(message)
            break

        messages.append({"role": "user", "content": [{"text": user_input}]})

        response = bedrock_client.converse(
            modelId=model_id,
            messages=messages,
            system=[{"text": system_prompt}],
            inferenceConfig={"temperature": 0.7, "maxTokens": 256}
        )

        ai_response = response["output"]["message"]["content"][0]["text"]
        messages.append({"role": "assistant", "content": [{"text": ai_response}]})

        print(f"AI: {ai_response}")


# Example usage
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
multi_turn_conversation(model_id)

### 4.3 Call a tool with the Converse API

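Models cannot call external APIs or functions on their own. With tool use, you describe the available tools in the request, and when the model decides it needs one it responds with a `toolUse` block that names the tool and its input instead of a final answer.

This requires you to route requests to and from the model and to and from the APIs in your code.

The example below gives the model one tool, `get_date_time`, which returns the current date and time in Eastern Time.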



In [None]:
import boto3
import json
import datetime
import pytz

# Initialize the Bedrock client
bedrock_client = boto3.client('bedrock-runtime')

def get_date_time():
    # Define the Eastern Time Zone
    eastern = pytz.timezone('America/New_York')

    # Get the current time in UTC and convert it to Eastern Time
    now = datetime.datetime.now(pytz.utc).astimezone(eastern)

    return {
        "date": now.strftime("%Y-%m-%d"),
        "time": now.strftime("%H:%M:%S"),
        "timezone": "Eastern Time (ET)"
    }

def conversation(model_id, messages):
    response = bedrock_client.converse(
        modelId=model_id,
        system=[{"text": "You are an AI assistant with access to tools that can provide the current date and time. When a user asks about the current date or time, you should use these tools to provide accurate, up-to-date information. Always strive to give helpful and accurate responses when needed use your available tools."}],
        messages=messages,
        toolConfig={
            "tools": [{
                "toolSpec": {
                    "name": "get_date_time",
                    "description": "Get the current date and time.",
                    "inputSchema": {
                        "json": {
                            "type": "object",
                            "properties": {
                            "date": {"type": "string", "description": "Today's current date."},
                            "time": {"type": "string", "description": "Today's current time."}
                        },
                            "required": []
                        }
                    }
                }
            }],
            "toolChoice": {"auto": {}}
        }
    )
    return response['output']['message']

def process_response(ai_response, messages):
    for content in ai_response['content']:
        if "text" in content:
            print(f"AI: {content['text']}")
        if "toolUse" in content:
            tool_use = content['toolUse']
            print("Checking date and time.....")
            if tool_use['name'] == "get_date_time":
                date_time = get_date_time()
                # Tool results go back to the model as a user message
                tool_result = {
                    "role": "user",
                    "content": [{
                        "toolResult": {
                            "toolUseId": tool_use['toolUseId'],
                            "content": [{"json": date_time}],
                            "status": "success"
                        }
                    }]
                }
                messages.append(tool_result)
                # Call the model again so it can answer using the tool result
                result = conversation(model_id, messages)
                messages.append(result)
                # Print only the text blocks of the follow-up message
                for block in result['content']:
                    if "text" in block:
                        print(f"AI: {block['text']}")
                return messages
    return messages

# Example usage
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
messages = []

while True:
    user_input = input("You: ")
    if user_input.lower() == 'exit':
        break
    
    messages.append({
        "role": "user",
        "content": [{"text": user_input}]
    })
    
    ai_response = conversation(model_id, messages)
    messages.append(ai_response)
    messages = process_response(ai_response, messages)

    print(messages)

print("Conversation ended.")
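With more than one tool, the `if tool_use['name'] == ...` check can be replaced by a lookup table. The sketch below shows one possible shape under stated assumptions: `TOOL_REGISTRY`, `build_tool_result_message`, and the stand-in tool `get_utc_date` are hypothetical names, while the `toolUse` and `toolResult` structures mirror the example above.

```python
import datetime


def get_utc_date():
    """Stand-in tool for this sketch; returns today's date in UTC."""
    return {"date": datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%d")}


# Hypothetical registry: tool name (as declared in toolSpec) -> Python function
TOOL_REGISTRY = {"get_utc_date": get_utc_date}


def build_tool_result_message(assistant_message, registry=TOOL_REGISTRY):
    """Run every toolUse block in assistant_message and return one user message with the results."""
    results = []
    for block in assistant_message["content"]:
        if "toolUse" not in block:
            continue
        tool_use = block["toolUse"]
        tool = registry.get(tool_use["name"])
        if tool is None:
            # Report unknown tools back to the model instead of raising
            content, status = [{"text": f"Unknown tool: {tool_use['name']}"}], "error"
        else:
            content, status = [{"json": tool(**tool_use.get("input", {}))}], "success"
        results.append(
            {"toolResult": {"toolUseId": tool_use["toolUseId"], "content": content, "status": status}}
        )
    # None signals that the model did not request any tool
    return {"role": "user", "content": results} if results else None


# Hypothetical assistant message containing one tool request
sample_message = {
    "role": "assistant",
    "content": [
        {"text": "Let me check."},
        {"toolUse": {"toolUseId": "tool-1", "name": "get_utc_date", "input": {}}},
    ],
}
reply = build_tool_result_message(sample_message)
print(reply["content"][0]["toolResult"]["status"])
```

The returned message can be appended to `messages` and sent back through `conversation`, in the same way `process_response` does for the single tool.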

## Now go try it out!

**Call to action:**
1. [Sign Up for AWS](https://aws.amazon.com/free): If you have not already, create an AWS account. 
2. Access Bedrock: Request access to the models you need in the Amazon Bedrock console
3. [Check out the Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html): Visit the Amazon Bedrock Developer Guide
4. [Try the Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html): Use our sample code to make your first Converse API call
5. Share your code on social media with the hashtag #AmazonBedrock



### Resources

- [Amazon Bedrock Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
- [Use the Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html)
- [Invoke model](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-invoke.html)
- [Anthropic Claude Messages API](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages.html#api-inference-examples-claude-messages-code-examples)
- [A developer's guide to Bedrock's new Converse API](https://community.aws/content/2dtauBCeDa703x7fDS9Q30MJoBA/amazon-bedrock-converse-api-developer-guide)
