# File Name: simple_converse_api.ipynb
### Location: Chapter 4
### Purpose: 
#####             1. Understanding Amazon Bedrock client and Amazon Bedrock runtime client.
#####             2. Example of Amazon Titan LLM foundation model with and without parameters using the Converse API.
#####             3. Example of Anthropic LLM foundation model with and without parameters using the Converse API.
#####             4. Example of Amazon Titan LLM foundation model with the streaming API, with and without parameters, using the Converse API.
##### Dependency: Not Applicable
# <ins>-----------------------------------------------------------------------------------</ins>

# <ins>Amazon SageMaker Studio Classic</ins>
#### If you are new to Amazon SageMaker Studio Classic, see https://docs.aws.amazon.com/sagemaker/latest/dg/studio.html for details.

# <ins>Environment setup of Kernel</ins>
##### Set "Image" to "Data Science"
##### Set "Kernel" to "Python 3"
##### Set "Instance type" to "ml.t3.medium"
##### Set "Start-up script" to "No Scripts"
##### Click "Select"

###### Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-create-open.html for details.

# <ins>Mandatory installation on the kernel through pip</ins>

##### This lab works with the software versions listed below. If you use newer versions of boto3, awscli, and botocore, this code may fail, and you might need to update the corresponding API calls.

##### You may see pip dependency errors. You can safely ignore them and continue executing the remaining cells.

In [None]:
%pip install --no-build-isolation --force-reinstall -q \
    "boto3>=1.28.57" \
    "awscli>=1.29.57" \
    "botocore>=1.31.57"


# <ins>Restart the kernel</ins>

In [None]:
# Restart the kernel so that the newly installed packages are loaded
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

# <ins>Python package import</ins>

##### boto3 provides several clients for Amazon Bedrock, each covering a different set of actions.
##### botocore is the low-level interface to AWS services; boto3 is built on top of botocore and provides additional features.

In [None]:
import json
import os
import botocore
import boto3
import warnings
import time
import random
import sys

### Ignore warnings

In [None]:
warnings.filterwarnings('ignore')

# <ins>Amazon Bedrock Runtime Client</ins>

##### Purpose: used for making inference requests to models hosted in Amazon Bedrock.
##### Refer to https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations_Amazon_Bedrock_Runtime.html for details about the Amazon Bedrock runtime client.

## Define important environment variables

# <ins>Amazon Bedrock Client</ins>

##### Purpose: used for managing, training, and deploying models on Amazon Bedrock.
##### Refer to https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations_Amazon_Bedrock.html for details about the Amazon Bedrock client.

In [None]:
%%time

# Try-except block to handle potential errors
try:
    # Create a new Boto3 session to interact with AWS services
    boto3_session_name = boto3.session.Session()

    # Retrieve the current AWS region from the session
    aws_region_name = boto3_session_name.region_name
    
    # Create a Bedrock client (control plane) in the same region as the session
    boto3_bedrock_client = boto3.client('bedrock', region_name=aws_region_name)
    
    # Create a Bedrock runtime client for inference requests
    boto3_bedrock_runtime_client = boto3.client('bedrock-runtime', region_name=aws_region_name)
    
    # Generate a random suffix from 200 to 899 (randrange excludes the upper bound)
    random_suffix = random.randrange(200, 900)
    
    # Store all variables in a dictionary
    variables_store = {
        "boto3_session_name": boto3_session_name,
        "aws_region_name": aws_region_name,
        "boto3_bedrock_client": boto3_bedrock_client,
        "random_suffix": random_suffix,
        "boto3_bedrock_runtime_client": boto3_bedrock_runtime_client
    }

    # Print all variables
    for var_name, value in variables_store.items():
        print(f"{var_name}: {value}")

except Exception as e:
    print(f"An unexpected error occurred: {e}")
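
##### The Bedrock client created above handles management operations, so it can be used to check whether a model ID is offered in the current region before an inference call is made.
##### The sketch below is illustrative: `find_model_summary` is a hypothetical helper name, and it assumes the `list_foundation_models` response carries a `modelSummaries` list with a `modelId` field per entry. A listed model is not necessarily one your account has been granted access to.

```python
def find_model_summary(bedrock_client, model_id):
    """Return the summary dict for model_id, or None if the region does not list it."""
    response = bedrock_client.list_foundation_models()
    for summary in response.get("modelSummaries", []):
        if summary.get("modelId") == model_id:
            return summary
    return None


# Example call inside this notebook (requires AWS credentials):
# find_model_summary(boto3_bedrock_client, "amazon.titan-text-express-v1")
```

##### The helper takes the client as an argument, so it works with any object that exposes `list_foundation_models`.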



# <ins>Example of Amazon Titan LLM foundation model</ins>

##### This example is based on the Titan Text G1 - Express v1 foundation model.
##### Model ID: amazon.titan-text-express-v1
##### Refer to https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-titan-text.html

##### API: converse
##### Refer to https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html

# <ins>Disclaimer</ins>

##### Make sure that amazon.titan-text-express-v1 is allowlisted under Amazon Bedrock model access. Refer to Section 3.3 of Chapter 3 of the book.

In [None]:
## Define model_id, prompt, and other variables
## You can try a different model ID or your own prompt.
model_id = "amazon.titan-text-express-v1"
prompt = """User: Generate a story for a kid about beauty of a rainbow within 100 words
bot:
"""

##### <ins>Example with default inference parameters</ins>

In [None]:
%%time

try:
    # Constructing the message payload for the converse API request
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "text": prompt
                }
            ]
        }
    ]

    # Sending the request to the Amazon Bedrock service using the converse API
    response = boto3_bedrock_runtime_client.converse(
        modelId=model_id, messages=messages
    )

    # Extracting relevant fields from the response
    output_text = response['output']['message']['content'][0]['text']
    latency = response['metrics']['latencyMs']
    input_tokens = response['usage']['inputTokens']
    output_tokens = response['usage']['outputTokens']

    # Printing extracted values
    print(f"Output Text: {output_text}")
    print(f"Latency (ms): {latency}")
    print(f"Input Tokens: {input_tokens}")
    print(f"Output Tokens: {output_tokens}")

except botocore.exceptions.ClientError as error:
    # Retrieving the error code from the exception response
    error_code = error.response['Error'].get('Code', 'Unknown')
    
    # Handling specific error codes for more tailored responses
    if error_code == 'AccessDeniedException':
        print(f"\x1b[41mAccess Denied: {error.response['Error'].get('Message', 'No message available')}\x1b[0m")
    else:
        print(f"An error occurred: {error}")
        # Raising the exception again to ensure it is visible in the outer scope if necessary
        raise
except KeyError as key_error:
    # Handles missing keys in the response structure
    print(f"Key error: {key_error}. The response format may have changed.")
    raise
except Exception as general_error:
    # Catch-all for any other exceptions
    print(f"An unexpected error occurred: {general_error}")
    raise
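
##### The field extraction in the cell above can be gathered into one helper so that later cells do not repeat it.
##### This is a minimal sketch: `summarize_converse_response` is a hypothetical name, and the code assumes the response layout used above (`output`, `metrics`, `usage`) plus a top-level `stopReason` field.

```python
def summarize_converse_response(response):
    """Collect the reply text and usage figures from a Converse response dict."""
    content_blocks = response["output"]["message"]["content"]
    # A reply may hold several content blocks; keep only those that carry text.
    text = "".join(block["text"] for block in content_blocks if "text" in block)
    return {
        "text": text,
        "stop_reason": response.get("stopReason"),
        "latency_ms": response["metrics"]["latencyMs"],
        "input_tokens": response["usage"]["inputTokens"],
        "output_tokens": response["usage"]["outputTokens"],
    }
```

##### Joining every text block, instead of reading only index 0, keeps the helper usable if a reply contains more than one block.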


##### <ins>Example with inference parameter configuration</ins>

In [None]:
%%time

try:
    # Constructing the message payload and inference configuration for the converse API request
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "text": prompt
                }
            ]
        }
    ]
    
    inferenceConfig = {
        "temperature": 1.0,
        "maxTokens": 2000,
        "topP": 0.9
    }

    # Sending the request to the Amazon Bedrock service using the converse API
    response = boto3_bedrock_runtime_client.converse(
        modelId=model_id, messages=messages, inferenceConfig=inferenceConfig
    )

    # Extracting and printing relevant fields from the response
    output_text = response['output']['message']['content'][0]['text']
    latency = response['metrics']['latencyMs']
    input_tokens = response['usage']['inputTokens']
    output_tokens = response['usage']['outputTokens']

    print(f"Output Text: {output_text}")
    print(f"Latency (ms): {latency}")
    print(f"Input Tokens: {input_tokens}")
    print(f"Output Tokens: {output_tokens}")

except botocore.exceptions.ClientError as error:
    # Retrieve the error code from the exception response
    error_code = error.response['Error'].get('Code', 'Unknown')
    
    # Specific handling for AccessDeniedException
    if error_code == 'AccessDeniedException':
        print(f"\x1b[41mAccess Denied: {error.response['Error'].get('Message', 'No message available')}\x1b[0m")
    else:
        # General error message with code and message details
        print(f"An error occurred: {error_code} - {error.response['Error'].get('Message', 'No message available')}")
        raise
except KeyError as key_error:
    # Handles missing keys in the response structure
    print(f"Key error: {key_error}. Check if the response structure has changed.")
    raise
except Exception as general_error:
    # Catch-all for any other exceptions
    print(f"An unexpected error occurred: {general_error}")
    raise
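
##### The `inferenceConfig` dictionary above can also be built by a small function that includes only the parameters you set and rejects obviously invalid values.
##### This is a sketch under stated assumptions: `build_inference_config` is a hypothetical helper, and the ranges it checks (0 to 1 for `temperature` and `topP`, at least 1 for `maxTokens`) are assumed from the Converse API reference. Individual models may apply narrower limits.

```python
def build_inference_config(temperature=None, max_tokens=None, top_p=None, stop_sequences=None):
    """Return an inferenceConfig dict containing only the parameters that were provided."""
    config = {}
    if temperature is not None:
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature is expected to be between 0 and 1")
        config["temperature"] = temperature
    if max_tokens is not None:
        if max_tokens < 1:
            raise ValueError("max_tokens is expected to be at least 1")
        config["maxTokens"] = max_tokens
    if top_p is not None:
        if not 0.0 <= top_p <= 1.0:
            raise ValueError("top_p is expected to be between 0 and 1")
        config["topP"] = top_p
    if stop_sequences:
        config["stopSequences"] = list(stop_sequences)
    return config


# Reproduces the configuration used in the cell above.
inference_config = build_inference_config(temperature=1.0, max_tokens=2000, top_p=0.9)
```

##### Parameters left out of the dictionary fall back to the model's defaults, which is what the first example relies on.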


# <ins>Example of Anthropic LLM foundation model</ins>

##### This example is based on the Claude 3 Haiku foundation model.
##### Model ID: anthropic.claude-3-haiku-20240307-v1:0
##### Refer to https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages.html

##### API: converse
##### Refer to https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html

# <ins>Disclaimer</ins>

##### Make sure that anthropic.claude-3-haiku-20240307-v1:0 is allowlisted under Amazon Bedrock model access. Refer to Section 3.3 of Chapter 3 of the book.

In [None]:
## Define model_id, prompt, and other variables
## You can try a different model ID or your own prompt.
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
prompt = """Human: Generate a story for a kid about beauty of a rainbow within 100 words

A:
"""

##### <ins>Example with default inference parameters</ins>

In [None]:
%%time

try:
    # Constructing the message payload for the converse API request
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "text": prompt
                }
            ]
        }
    ]

    # Sending the request to the Amazon Bedrock service using the converse API
    response = boto3_bedrock_runtime_client.converse(
        modelId=model_id, messages=messages
    )

    # Extracting and printing relevant fields from the response
    output_text = response['output']['message']['content'][0]['text']
    latency = response['metrics']['latencyMs']
    input_tokens = response['usage']['inputTokens']
    output_tokens = response['usage']['outputTokens']

    print(f"Output Text: {output_text}")
    print(f"Latency (ms): {latency}")
    print(f"Input Tokens: {input_tokens}")
    print(f"Output Tokens: {output_tokens}")

except botocore.exceptions.ClientError as error:
    # Retrieve the error code from the exception response
    error_code = error.response['Error'].get('Code', 'Unknown')
    
    # Handle specific error cases
    if error_code == 'AccessDeniedException':
        print(f"\x1b[41mAccess Denied: {error.response['Error'].get('Message', 'No message available')}\x1b[0m")
    else:
        # Print general error message with code and details
        print(f"An error occurred: {error_code} - {error.response['Error'].get('Message', 'No message available')}")
        raise

except KeyError as key_error:
    # Handle missing keys in the response structure
    print(f"Key error: {key_error}. Check if the response structure has changed.")
    raise

except Exception as general_error:
    # Catch-all for any other exceptions
    print(f"An unexpected error occurred: {general_error}")
    raise
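
##### The Converse API carries the speaker in each message's `role` field, so a conversation with several turns can be expressed as a list of messages, optionally with a system prompt.
##### The sketch below is illustrative: `make_message` and `build_converse_request` are hypothetical helpers, and the `system` field is assumed to take a list of text blocks. Support for system prompts varies by model.

```python
def make_message(role, text):
    """Build one Converse message with a single text content block."""
    if role not in ("user", "assistant"):
        raise ValueError("role must be 'user' or 'assistant'")
    return {"role": role, "content": [{"text": text}]}


def build_converse_request(model_id, turns, system_text=None):
    """Assemble keyword arguments for converse() from (role, text) pairs."""
    request = {
        "modelId": model_id,
        "messages": [make_message(role, text) for role, text in turns],
    }
    if system_text:
        request["system"] = [{"text": system_text}]
    return request


request = build_converse_request(
    "anthropic.claude-3-haiku-20240307-v1:0",
    [("user", "Generate a story for a kid about the beauty of a rainbow within 100 words")],
    system_text="You are a friendly storyteller for young children.",
)
# In this notebook the request could then be sent with:
# response = boto3_bedrock_runtime_client.converse(**request)
```

##### To continue a conversation, append the model's reply as an "assistant" turn followed by the next "user" turn, then send the full list again.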


##### <ins>Example with inference parameter configuration</ins>

In [None]:
%%time

try:
    # Constructing the message payload and inference configuration for the converse API request
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "text": prompt
                }
            ]
        }
    ]
    
    inferenceConfig = {
        "temperature": 1.0,
        "maxTokens": 2000,
        "topP": 0.9
    }

    # Sending the request to the Amazon Bedrock service using the converse API
    response = boto3_bedrock_runtime_client.converse(
        modelId=model_id, messages=messages, inferenceConfig=inferenceConfig
    )

    # Extracting and printing relevant fields from the response
    output_text = response['output']['message']['content'][0]['text']
    latency = response['metrics']['latencyMs']
    input_tokens = response['usage']['inputTokens']
    output_tokens = response['usage']['outputTokens']

    print(f"Output Text: {output_text}")
    print(f"Latency (ms): {latency}")
    print(f"Input Tokens: {input_tokens}")
    print(f"Output Tokens: {output_tokens}")

except botocore.exceptions.ClientError as error:
    # Retrieve the error code from the exception response
    error_code = error.response['Error'].get('Code', 'Unknown')
    
    # Handle specific error cases
    if error_code == 'AccessDeniedException':
        print(f"\x1b[41mAccess Denied: {error.response['Error'].get('Message', 'No message available')}\x1b[0m")
    else:
        # Print general error message with code and details
        print(f"An error occurred: {error_code} - {error.response['Error'].get('Message', 'No message available')}")
        raise

except KeyError as key_error:
    # Handle missing keys in the response structure
    print(f"Key error: {key_error}. The response structure may have changed.")
    raise

except Exception as general_error:
    # Catch-all for any other exceptions
    print(f"An unexpected error occurred: {general_error}")
    raise
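
##### Every cell in this notebook repeats the same lookup of the error code and message. One way to reduce the repetition is a helper that turns an exception into a code and a readable line.
##### This is a minimal sketch: `explain_client_error` and the hint texts are hypothetical and not part of boto3. The helper only assumes that the exception exposes a `response` dictionary shaped like the one read in the cells above.

```python
# Suggested next steps per error code (illustrative wording, not official guidance).
ERROR_HINTS = {
    "AccessDeniedException": "check model access and IAM permissions",
    "ThrottlingException": "wait briefly and retry",
    "ValidationException": "check the model ID and request parameters",
}


def explain_client_error(error):
    """Return (error_code, readable_text) for an exception with a botocore-style response."""
    details = getattr(error, "response", {}).get("Error", {})
    code = details.get("Code", "Unknown")
    message = details.get("Message", "No message available")
    text = f"{code}: {message}"
    hint = ERROR_HINTS.get(code)
    if hint:
        text += f" (hint: {hint})"
    return code, text
```

##### Inside an `except botocore.exceptions.ClientError as error:` block, the returned text can be printed and the code used to decide whether to re-raise.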


# <ins>Example of Amazon Titan LLM foundation model with streaming API</ins>


##### This example is based on the Titan Text G1 - Express v1 foundation model.
##### Model ID: amazon.titan-text-express-v1
##### Refer to https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-titan-text.html

##### API: converse_stream
##### Refer to https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html

# <ins>Disclaimer</ins>

##### Make sure that amazon.titan-text-express-v1 is allowlisted under Amazon Bedrock model access. Refer to Section 3.3 of Chapter 3 of the book.

In [None]:
## Define model_id, prompt, and other variables
## You can try a different model ID or your own prompt.
model_id = "amazon.titan-text-express-v1"
prompt = """User: Generate a story for a kid about beauty of a rainbow within 1000 words
bot:
"""

##### <ins>Example with default inference parameters</ins>

In [None]:
%%time

import sys

try:
    # Constructing the message payload for the streaming API request
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "text": prompt
                }
            ]
        }
    ]

    # Sending the request to the Amazon Bedrock service using the converse streaming API
    response = boto3_bedrock_runtime_client.converse_stream(
        modelId=model_id, messages=messages
    )

    # Processing and streaming the response in chunks
    for event in response['stream']:
        if 'contentBlockDelta' in event:
            # Extract and print each chunk of text as it streams in
            chunk = event['contentBlockDelta']
            sys.stdout.write(chunk['delta']['text'])
            sys.stdout.flush()

except botocore.exceptions.ClientError as error:
    # Retrieve the error code from the exception response
    error_code = error.response['Error'].get('Code', 'Unknown')
    
    # Specific handling for AccessDeniedException
    if error_code == 'AccessDeniedException':
        print(f"\x1b[41mAccess Denied: {error.response['Error'].get('Message', 'No message available')}\x1b[0m")
    else:
        # Print general error message with code and details
        print(f"An error occurred: {error_code} - {error.response['Error'].get('Message', 'No message available')}")
        raise

except KeyError as key_error:
    # Handle missing keys in the streaming response structure
    print(f"Key error: {key_error}. The response structure may have changed.")
    raise

except Exception as general_error:
    # Catch-all for any other exceptions
    print(f"An unexpected error occurred: {general_error}")
    raise


##### <ins>Example with inference parameter configuration</ins>

In [None]:
%%time

import sys

try:
    # Construct the message payload and inference configuration for the streaming API request
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "text": prompt
                }
            ]
        }
    ]
    
    inferenceConfig = {
        "temperature": 1.0,
        "maxTokens": 2000,
        "topP": 0.9
    }

    # Sending the request to the Amazon Bedrock service using the converse streaming API
    response = boto3_bedrock_runtime_client.converse_stream(
        modelId=model_id, messages=messages, inferenceConfig=inferenceConfig
    )

    # Processing and streaming the response in chunks as it arrives
    for event in response['stream']:
        if 'contentBlockDelta' in event:
            # Extract and print each chunk of text in real-time
            chunk = event['contentBlockDelta']
            sys.stdout.write(chunk['delta']['text'])
            sys.stdout.flush()

except botocore.exceptions.ClientError as error:
    # Retrieve the error code from the exception response
    error_code = error.response['Error'].get('Code', 'Unknown')
    
    # Handle AccessDeniedException specifically
    if error_code == 'AccessDeniedException':
        print(f"\x1b[41mAccess Denied: {error.response['Error'].get('Message', 'No message available')}\x1b[0m")
    else:
        # Print a general error message with code and details
        print(f"An error occurred: {error_code} - {error.response['Error'].get('Message', 'No message available')}")
        raise

except KeyError as key_error:
    # Handle missing keys in the streaming response structure
    print(f"Key error: {key_error}. The response structure may have changed.")
    raise

except Exception as general_error:
    # Catch-all for any other exceptions
    print(f"An unexpected error occurred: {general_error}")
    raise
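
##### The streaming loops above print each chunk and then discard it. To keep the full text, the loop can accumulate the chunks and also read the events that close the stream.
##### This is a sketch under assumptions: `collect_stream` is a hypothetical helper, and it assumes the stream ends with a `messageStop` event carrying `stopReason` and a `metadata` event carrying `usage`, as described in the ConverseStream API reference.

```python
def collect_stream(events):
    """Accumulate streamed text and return (text, stop_reason, usage)."""
    text_parts = []
    stop_reason = None
    usage = {}
    for event in events:
        if "contentBlockDelta" in event:
            delta = event["contentBlockDelta"]["delta"]
            # Only text deltas are collected; other delta types are skipped.
            if "text" in delta:
                text_parts.append(delta["text"])
        elif "messageStop" in event:
            stop_reason = event["messageStop"].get("stopReason")
        elif "metadata" in event:
            usage = event["metadata"].get("usage", {})
    return "".join(text_parts), stop_reason, usage


# In this notebook the helper could be called as:
# text, stop_reason, usage = collect_stream(response['stream'])
```

##### A stop reason of "max_tokens" would suggest that the story was cut off by the `maxTokens` limit and not finished by the model.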


# End of Notebook

## Shut down the kernel after using this notebook to avoid unnecessary charges to your account.

## Process: Open the "Kernel" menu at the top and choose "Shut Down Kernel".
##### Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/studio-ui.html