## Getting started with the Converse API in Amazon Bedrock

> *This notebook should work well with the **`Python 3`** kernel in SageMaker Studio*

In this notebook, we'll explore the basics of the Converse API in Amazon Bedrock. Converse and ConverseStream are unified, structured-text API actions that simplify invoking Bedrock LLMs: they use one common syntax and message-structured prompts for any of the supported model providers.

Let's start by installing or updating boto3. You only need to run this cell once, the first time you use the notebook.

In [None]:
%pip install --force-reinstall -q -r ./utils/requirements.txt

In [1]:
import boto3
import sys
print('Running boto3 version:', boto3.__version__)

Running boto3 version: 1.35.13


Let's define the region and models to use. We can also set up our boto3 client.

In [2]:
boto3_session = boto3.session.Session()
region = boto3_session.region_name

print('Using region: ', region)

bedrock = boto3.client(
    service_name = 'bedrock-runtime',
    region_name = region,
    )

MODEL_IDS = [
    "amazon.titan-tg1-large",
    "anthropic.claude-3-haiku-20240307-v1:0",
    "anthropic.claude-3-sonnet-20240229-v1:0",
]

Using region:  us-west-2


We're now ready to set up our Converse API action in Bedrock. Note that we use the same syntax for every model, including the message-formatted prompts and the inference parameters. We also read the output in the same way, regardless of the model used.

Optionally, we could define additional model-specific request fields that are not common across all providers. For more information, check the Bedrock Converse API documentation.
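
As a rough illustration of where such model-specific fields go, the sketch below assembles the keyword arguments for `converse` as a plain dictionary, so the request shape can be inspected without calling AWS. The helper name `build_converse_request` is hypothetical, and `top_k` is used only as an assumed example of an Anthropic-specific parameter; check the documentation for the fields each provider actually accepts.

```python
def build_converse_request(model_id, prompt, max_tokens=2000, temperature=0,
                           top_p=0.9, additional_fields=None):
    """Assemble keyword arguments for client.converse() (hypothetical helper)."""
    request = {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {
            "temperature": temperature,
            "maxTokens": max_tokens,
            "topP": top_p,
        },
    }
    # Attach provider-specific fields only when the caller supplies them,
    # so the same helper still works for models that accept none.
    if additional_fields:
        request["additionalModelRequestFields"] = additional_fields
    return request


request = build_converse_request(
    "anthropic.claude-3-haiku-20240307-v1:0",
    "What is the capital of Italy?",
    additional_fields={"top_k": 200},  # assumption: an Anthropic-specific field
)
print(sorted(request))

# With credentials and model access in place, the request could be sent as:
# response = bedrock.converse(**request)
```

Keeping the common parameters in `inferenceConfig` and the provider-specific ones in `additionalModelRequestFields` means only the second dictionary needs to change when switching models.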

### Converse for one-shot invocations

In [3]:
def invoke_bedrock_model(client, id, prompt, max_tokens=2000, temperature=0, top_p=0.9):
    # Call the Converse API; the request shape is the same for every model.
    try:
        response = client.converse(
            modelId=id,
            messages=[
                {
                    "role": "user",
                    "content": [
                        {
                            "text": prompt
                        }
                    ]
                }
            ],
            inferenceConfig={
                "temperature": temperature,
                "maxTokens": max_tokens,
                "topP": top_p
            }
            #additionalModelRequestFields={
            #}
        )
    except Exception as e:
        print(e)
        # Return early: there is no response to parse after a failed call.
        return "Model invocation error"
    # Read the generated text and the latency/token metrics from the response.
    try:
        result = response['output']['message']['content'][0]['text'] \
        + '\n--- Latency: ' + str(response['metrics']['latencyMs']) \
        + 'ms - Input tokens:' + str(response['usage']['inputTokens']) \
        + ' - Output tokens:' + str(response['usage']['outputTokens']) + ' ---\n'
        return result
    except Exception as e:
        print(e)
        return "Output parsing error"

Finally, we can test our invocation.

In this example, we run the same prompt against each of the models listed in `MODEL_IDS`.

In [4]:
prompt = ("What is the capital of Italy?")
print(f'Prompt: {prompt}\n')

for i in MODEL_IDS:
    response = invoke_bedrock_model(bedrock, i, prompt)
    print(f'Model: {i}\n{response}')

Prompt: What is the capital of Italy?

Model: amazon.titan-tg1-large
The capital of Italy is Rome. It is the fourth most populous city in the European Union.
--- Latency: 1375ms - Input tokens:10 - Output tokens:24 ---

Model: anthropic.claude-3-haiku-20240307-v1:0
The capital of Italy is Rome.
--- Latency: 224ms - Input tokens:14 - Output tokens:10 ---

Model: anthropic.claude-3-sonnet-20240229-v1:0
The capital of Italy is Rome.
--- Latency: 283ms - Input tokens:14 - Output tokens:10 ---
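
To make the response layout easier to see, here is a minimal sketch that reads the same fields as the function above from a hand-written dictionary. The dictionary is an illustrative stand-in built from the output shown, not a captured API response, and `summarize_response` is a hypothetical helper name.

```python
# Hand-written sample that mirrors the fields the notebook reads.
# It is NOT a real API response; values are copied from the output above.
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [{"text": "The capital of Italy is Rome."}],
        }
    },
    "stopReason": "end_turn",
    "usage": {"inputTokens": 14, "outputTokens": 10, "totalTokens": 24},
    "metrics": {"latencyMs": 224},
}


def summarize_response(response):
    """Return (text, stats) from a Converse-style response dict (hypothetical helper)."""
    blocks = response["output"]["message"]["content"]
    # Join every text block; blocks without a 'text' key are skipped.
    text = "".join(block["text"] for block in blocks if "text" in block)
    usage = response["usage"]
    stats = (
        f"Latency: {response['metrics']['latencyMs']}ms - "
        f"Input tokens: {usage['inputTokens']} - "
        f"Output tokens: {usage['outputTokens']}"
    )
    return text, stats


text, stats = summarize_response(sample_response)
print(text)
print(stats)
```

Joining all text blocks, rather than reading only the first one, is a design choice of this sketch; the notebook's function reads `content[0]` only.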



### ConverseStream for streaming invocations

We can also use the Converse API for streaming invocations. In this case we rely on the ConverseStream action.

In [5]:
MODEL_IDS = [
    "amazon.titan-tg1-large",
    "anthropic.claude-3-haiku-20240307-v1:0",
    "anthropic.claude-3-sonnet-20240229-v1:0",
]

In [6]:
def invoke_bedrock_model_stream(client, id, prompt, max_tokens=2000, temperature=0, top_p=0.9):
    response = client.converse_stream(
        modelId=id,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "text": prompt
                    }
                ]
            }
        ],
        inferenceConfig={
            "temperature": temperature,
            "maxTokens": max_tokens,
            "topP": top_p
        }
    )
    # Extract and print the response text in real time.
    for event in response['stream']:
        if 'contentBlockDelta' in event:
            delta = event['contentBlockDelta']['delta']
            # Only text deltas carry a 'text' key; skip any other delta type.
            if 'text' in delta:
                sys.stdout.write(delta['text'])
                sys.stdout.flush()
    return

In [7]:
prompt = ("What is the capital of Italy?")
print(f'Prompt: {prompt}\n')

for i in MODEL_IDS:
    print(f'\n\nModel: {i}')
    invoke_bedrock_model_stream(bedrock, i, prompt)

Prompt: What is the capital of Italy?



Model: amazon.titan-tg1-large
The capital of Italy is Rome. It is the fourth most populous city in the European Union.

Model: anthropic.claude-3-haiku-20240307-v1:0
The capital of Italy is Rome.

Model: anthropic.claude-3-sonnet-20240229-v1:0
The capital of Italy is Rome.
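
The stream carries more than text deltas. The sketch below walks a hand-written list of events that imitates the shape of `response['stream']` and collects the text, the stop reason, and the token usage. The event list is an assumption made for illustration (not captured API output), and `collect_stream` is a hypothetical helper name.

```python
# Hand-written events imitating the shape of response['stream'].
# They are illustrative only, not real API output.
sample_events = [
    {"messageStart": {"role": "assistant"}},
    {"contentBlockDelta": {"delta": {"text": "The capital of Italy "}, "contentBlockIndex": 0}},
    {"contentBlockDelta": {"delta": {"text": "is Rome."}, "contentBlockIndex": 0}},
    {"contentBlockStop": {"contentBlockIndex": 0}},
    {"messageStop": {"stopReason": "end_turn"}},
    {"metadata": {
        "usage": {"inputTokens": 14, "outputTokens": 10, "totalTokens": 24},
        "metrics": {"latencyMs": 283},
    }},
]


def collect_stream(events):
    """Accumulate text, stop reason and usage from stream events (hypothetical helper)."""
    chunks = []
    stop_reason = None
    usage = None
    for event in events:
        if "contentBlockDelta" in event:
            # Text arrives in small pieces; non-text deltas contribute nothing.
            chunks.append(event["contentBlockDelta"]["delta"].get("text", ""))
        elif "messageStop" in event:
            stop_reason = event["messageStop"]["stopReason"]
        elif "metadata" in event:
            # Token counts are reported once, near the end of the stream.
            usage = event["metadata"].get("usage")
    return "".join(chunks), stop_reason, usage


full_text, stop_reason, usage = collect_stream(sample_events)
print(full_text)
print(stop_reason, usage)
```

With a live client, the same function could in principle be applied to `response['stream']` from `converse_stream`, at the cost of printing the text only after the stream ends.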

As you can see, the Converse API allows us to run invocations with the same syntax across all of these models.