# Module 4

## Prerequisites

In [1]:
import boto3
import json
import time
from datetime import datetime
import os

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

## Inference profile

In [2]:
prompt = """
What design patterns are used in this code?

You are an expert Python developer. Here's a codebase context:

class DataProcessor:
    def __init__(self, config):
        self.config = config
        self.data = []
    
    def load_data(self, source):
        # Complex data loading logic
        pass
    
    def transform_data(self, transformations):
        # Data transformation pipeline
        pass
    
    def validate_data(self, rules):
        # Data validation logic
        pass

This class is part of a larger system with 50+ similar classes.
Always reference this context when answering questions.
"""

### Invoking cross-region inference profile

In [3]:
response = bedrock.converse(
    modelId='us.amazon.nova-lite-v1:0',
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": prompt
                }
            ]
        }
    ]
)

print(json.dumps(response, indent=2))

{
  "ResponseMetadata": {
    "RequestId": "1e1185b1-1e0d-45dc-afed-ed45cb8bf80c",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Tue, 30 Sep 2025 20:11:50 GMT",
      "content-type": "application/json",
      "content-length": "3892",
      "connection": "keep-alive",
      "x-amzn-requestid": "1e1185b1-1e0d-45dc-afed-ed45cb8bf80c"
    },
    "RetryAttempts": 0
  },
  "output": {
    "message": {
      "role": "assistant",
      "content": [
        {
          "text": "Given the provided context, let's analyze the design patterns that might be used in the `DataProcessor` class and the larger system with 50+ similar classes.\n\n### Design Patterns in the `DataProcessor` Class\n\n1. **Abstract Factory Pattern**:\n   - If the system creates families of related objects without specifying their concrete classes, an Abstract Factory Pattern might be used. However, this pattern is not directly evident from the provided snippet.\n\n2. **Builder Pattern**:\n   - If the `DataPr

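Dumping the whole response is handy for inspection, but most callers only need the generated text. The following is a minimal sketch, assuming the response has the `output.message.content` shape shown above. `extract_text` and `sample_response` are hypothetical names used for illustration, not part of boto3.

```python
def extract_text(response):
    """Join every text block of a Converse-style response into one string."""
    content = response.get("output", {}).get("message", {}).get("content", [])
    # Content blocks without a "text" key are skipped.
    return "".join(block["text"] for block in content if "text" in block)


# Hand-written stand-in for a real Converse response (illustrative only).
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [{"text": "Hello "}, {"text": "world"}],
        }
    }
}
print(extract_text(sample_response))
```

With the real client, the same helper would be called as `extract_text(response)` on the value returned by `bedrock.converse(...)`.
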
### Use your own inference profile

In [4]:
# create an inference profile
bedrock_control = boto3.client('bedrock', region_name='us-east-1')

response = bedrock_control.create_inference_profile(
    inferenceProfileName='my-nova-lite-profile',
    description='Custom inference profile for Nova Lite',
    modelSource={
        'copyFrom': 'arn:aws:bedrock:us-east-1:058264218236:inference-profile/us.amazon.nova-lite-v1:0'
    }
)
inference_profile_arn = response['inferenceProfileArn']

print(json.dumps(response, indent=2))

{
  "ResponseMetadata": {
    "RequestId": "56d89816-d4d4-4a2c-bac3-a056bdf15d10",
    "HTTPStatusCode": 201,
    "HTTPHeaders": {
      "date": "Tue, 30 Sep 2025 20:12:33 GMT",
      "content-type": "application/json",
      "content-length": "125",
      "connection": "keep-alive",
      "x-amzn-requestid": "56d89816-d4d4-4a2c-bac3-a056bdf15d10"
    },
    "RetryAttempts": 0
  },
  "inferenceProfileArn": "arn:aws:bedrock:us-east-1:058264218236:application-inference-profile/b3x58b3qiuhe",
  "status": "ACTIVE"
}


In [5]:
response = bedrock.converse(
    modelId=inference_profile_arn,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": prompt
                }
            ]
        }
    ]
)

print(json.dumps(response, indent=2))

{
  "ResponseMetadata": {
    "RequestId": "3bdf24df-a88e-4bdc-833f-8ff3821510e7",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Tue, 30 Sep 2025 20:13:10 GMT",
      "content-type": "application/json",
      "content-length": "4466",
      "connection": "keep-alive",
      "x-amzn-requestid": "3bdf24df-a88e-4bdc-833f-8ff3821510e7"
    },
    "RetryAttempts": 0
  },
  "output": {
    "message": {
      "role": "assistant",
      "content": [
        {
          "text": "Given the provided code snippet and the context that this class is part of a larger system with 50+ similar classes, let's analyze the design patterns that might be used in this codebase.\n\n### Design Patterns in the Provided Code\n\n1. **Factory Pattern**:\n   - **Context**: The `DataProcessor` class is initialized with a `config` object. This suggests that there might be a factory method or class responsible for creating instances of `DataProcessor` and possibly other similar classes.\n   - **Rationa

In [6]:
response = bedrock_control.delete_inference_profile(
    inferenceProfileIdentifier=inference_profile_arn
)
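
If a cell between creation and deletion fails, the application inference profile is left behind. One way to guard against that is a context manager that always deletes the profile on exit. This is a sketch under stated assumptions: `temporary_inference_profile` is a hypothetical helper, and `control_client` is expected to be a Bedrock control-plane client such as `bedrock_control` above. It uses only the two calls already shown in this notebook.

```python
from contextlib import contextmanager


@contextmanager
def temporary_inference_profile(control_client, name, source_arn):
    """Create an application inference profile and delete it on exit."""
    response = control_client.create_inference_profile(
        inferenceProfileName=name,
        modelSource={"copyFrom": source_arn},
    )
    profile_arn = response["inferenceProfileArn"]
    try:
        yield profile_arn
    finally:
        # Runs even if the body raises, so the profile is not left behind.
        control_client.delete_inference_profile(
            inferenceProfileIdentifier=profile_arn
        )
```

Inside a `with temporary_inference_profile(bedrock_control, 'my-nova-lite-profile', source_arn) as arn:` block, `arn` can be passed as `modelId` to `bedrock.converse`, where `source_arn` stands for the system-defined profile ARN used earlier.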

## Batch Inference

Upload the JSONL input file to S3

In [7]:
bucket_name = "batch-data-inference"

s3 = boto3.client('s3')

# Upload manifest to S3
input_key = 'input/city-reviews-manifest.jsonl'
s3.upload_file('input/city-reviews-manifest.jsonl', bucket_name, input_key)
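
The contents of the manifest are not shown in this notebook. As a rough sketch, assuming each line of a Bedrock batch input file is a JSON object with a `recordId` and a `modelInput` that holds the model's request body, a similar file could be generated as follows. The review texts, the prompt wording, the ID scheme, and the `build_batch_records` helper are all made up for illustration, and the exact `modelInput` schema for Nova should be checked against the Bedrock documentation.

```python
import json


def build_batch_records(reviews):
    """Return one JSONL line per review (assumed recordId/modelInput layout)."""
    lines = []
    for index, review in enumerate(reviews, start=1):
        record = {
            "recordId": f"REV{index:08d}",  # hypothetical 11-character ID scheme
            "modelInput": {
                "messages": [
                    {
                        "role": "user",
                        "content": [
                            {"text": f"Summarize this city review: {review}"}
                        ],
                    }
                ]
            },
        }
        lines.append(json.dumps(record))
    return lines


# Made-up sample data; the real manifest would hold the actual reviews.
sample_reviews = ["Great food, cold winters.", "Lovely parks and museums."]
jsonl_lines = build_batch_records(sample_reviews)
print("\n".join(jsonl_lines))
```

Writing `"\n".join(jsonl_lines)` to a `.jsonl` file would produce an input file in the assumed layout, one record per line.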

Call the `CreateModelInvocationJob` API to start the batch job

In [8]:
accountnumber = "058264218236"

# Create the bedrock control plane client
bedrock_control = boto3.client('bedrock', region_name='us-east-1')

# Create batch inference job
job_name = f"batch-job-{datetime.now().strftime('%Y%m%d-%H%M%S')}"

response = bedrock_control.create_model_invocation_job(
    jobName=job_name,
    roleArn=f"arn:aws:iam::{accountnumber}:role/BedrockBatchRole",
    modelId="us.amazon.nova-lite-v1:0",
    inputDataConfig={
        's3InputDataConfig': {
            's3Uri': f's3://{bucket_name}/input/'
        }
    },
    outputDataConfig={
        's3OutputDataConfig': {
            's3Uri': f's3://{bucket_name}/output/'
        }
    }
)

job_arn = response['jobArn']

Get the job status

In [9]:
status_response = bedrock_control.get_model_invocation_job(jobIdentifier=job_arn)
status = status_response['status']
print(f"Job status: {status}")

Job status: Validating


Instead of waiting, use the ARN of a job that has already finished

In [10]:
completed_job_arn = "arn:aws:bedrock:us-east-1:058264218236:model-invocation-job/p2t64s4pml6t"

Wait for the job to finish

In [11]:
# Wait for job completion
while True:
    status_response = bedrock_control.get_model_invocation_job(jobIdentifier=completed_job_arn)
    status = status_response['status']
    print(f"Job status: {status}")
    
    if status in ['Completed', 'PartiallyCompleted', 'Failed', 'Stopped', 'Expired']:
        break
    
    time.sleep(30)



Job status: Completed
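
The loop above polls forever if the job never reaches a terminal status. A bounded variant might look like the sketch below, which reuses the same terminal statuses and the same `get_model_invocation_job` call. `wait_for_job`, `poll_seconds`, and `timeout_seconds` are hypothetical names, and the default timeout is an arbitrary choice.

```python
import time

TERMINAL_STATUSES = {"Completed", "PartiallyCompleted", "Failed", "Stopped", "Expired"}


def wait_for_job(control_client, job_arn, poll_seconds=30, timeout_seconds=3600):
    """Poll a batch job until it reaches a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = control_client.get_model_invocation_job(jobIdentifier=job_arn)
        status = response["status"]
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Job {job_arn} is still {status} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
```

With the clients defined earlier, this would be called as `wait_for_job(bedrock_control, completed_job_arn)`.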


Download the results

In [12]:
if status in ["Completed", "PartiallyCompleted"]:
    # Download results
    output_objects = s3.list_objects_v2(Bucket=bucket_name, Prefix='output/')
    
    os.makedirs('batch-output', exist_ok=True)
    
    for obj in output_objects.get('Contents', []):
        if obj['Key'].endswith('/'):
            continue
            
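        # Only the base name is kept, so files with the same name in different
        # job folders under output/ are written to the same local path.
        local_file = os.path.join('batch-output', obj['Key'].split('/')[-1])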
        s3.download_file(bucket_name, obj['Key'], local_file)
        print(f"Downloaded: {local_file}")


Downloaded: batch-output/city-reviews-manifest.jsonl.out
Downloaded: batch-output/manifest.json.out
Downloaded: batch-output/city-reviews-manifest.jsonl.out
Downloaded: batch-output/manifest.json.out
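
The downloaded `.jsonl.out` file can then be read line by line. The sketch below assumes each output line is a JSON object carrying the original `recordId` plus either a `modelOutput` or an `error` field. The notebook does not show the file contents, so this layout is an assumption. `parse_batch_output` and the sample lines are hypothetical.

```python
import json


def parse_batch_output(lines):
    """Split batch output lines into successes and failures, keyed by recordId."""
    succeeded, failed = {}, {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        if "error" in record:
            failed[record["recordId"]] = record["error"]
        else:
            succeeded[record["recordId"]] = record.get("modelOutput")
    return succeeded, failed


# Hand-written sample lines; real ones come from the downloaded output file.
sample_lines = [
    '{"recordId": "REV00000001", "modelOutput": {"output": {"message": {"content": [{"text": "ok"}]}}}}',
    '{"recordId": "REV00000002", "error": {"errorCode": 400, "errorMessage": "bad input"}}',
]
succeeded, failed = parse_batch_output(sample_lines)
print(len(succeeded), "succeeded,", len(failed), "failed")
```

For a real run, the lines would come from opening `batch-output/city-reviews-manifest.jsonl.out` and iterating over the file handle.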


Stop the job we created to save on cost

In [None]:
response = bedrock_control.stop_model_invocation_job(
    jobIdentifier=job_arn
)

## Prompt Caching with Nova Lite

In [16]:
# Large context that we want to cache
cached_context = """
You are an expert Python developer. Here's a large codebase context:

class DataProcessor:
    def __init__(self, config):
        self.config = config
        self.data = []
    
    def load_data(self, source):
        # Complex data loading logic
        pass
    
    def transform_data(self, transformations):
        # Data transformation pipeline
        pass
    
    def validate_data(self, rules):
        # Data validation logic
        pass

This class is part of a larger system with 50+ similar classes.
Always reference this context when answering questions.
"""

First request that establishes the cache

In [17]:
firstRequest = bedrock.converse(
    modelId='amazon.nova-lite-v1:0',
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": cached_context
                },
                {
                    "cachePoint": {
                        "type": "default"
                    }
                },
                {
                    "text": "What design patterns are used in this code?"
                }
            ]
        }
    ]
)

print(json.dumps(firstRequest, indent=2))

{
  "ResponseMetadata": {
    "RequestId": "8e7465ae-fa24-4a5d-9b98-81d13ce554b9",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Tue, 30 Sep 2025 20:26:06 GMT",
      "content-type": "application/json",
      "content-length": "3682",
      "connection": "keep-alive",
      "x-amzn-requestid": "8e7465ae-fa24-4a5d-9b98-81d13ce554b9"
    },
    "RetryAttempts": 0
  },
  "output": {
    "message": {
      "role": "assistant",
      "content": [
        {
          "text": "Given the provided context, the `DataProcessor` class exhibits several design patterns. Here are some of the most notable ones:\n\n### 1. **Factory Pattern**\nWhile not explicitly shown in the provided code, it's common to use a Factory Pattern to create instances of `DataProcessor` and similar classes. This pattern can help manage the creation of objects and ensure that the correct configuration is applied.\n\n### 2. **Builder Pattern**\nThe `__init__` method of `DataProcessor` takes a `config` paramet

Second request that uses cached context

In [18]:
secondRequest = bedrock.converse(
    modelId='amazon.nova-lite-v1:0',
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": cached_context
                },
                {
                    "cachePoint": {
                        "type": "default"
                    }
                },
                {
                    "text": "What else can you tell me about this code?"
                }
            ]
        }
    ]
)

print(json.dumps(secondRequest, indent=2))

{
  "ResponseMetadata": {
    "RequestId": "fe7ca182-413c-4bb9-8ffa-27212baa9eb8",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Tue, 30 Sep 2025 20:27:31 GMT",
      "content-type": "application/json",
      "content-length": "6236",
      "connection": "keep-alive",
      "x-amzn-requestid": "fe7ca182-413c-4bb9-8ffa-27212baa9eb8"
    },
    "RetryAttempts": 0
  },
  "output": {
    "message": {
      "role": "assistant",
      "content": [
        {
          "text": "Certainly! Let's break down the provided code and discuss some aspects of it, considering the context of a larger system with 50+ similar classes.\n\n### Class Overview\n\nThe `DataProcessor` class is designed to handle data processing tasks, including loading, transforming, and validating data. Here's a more detailed look at each component:\n\n1. **Initialization (`__init__`)**:\n    ```python\n    def __init__(self, config):\n        self.config = config\n        self.data = []\n    ```\n    - **`conf

Compare the usage

In [19]:
print("### First request ###")
print("Usage:", json.dumps(firstRequest['usage'], indent=2))
print("Metrics:", json.dumps(firstRequest['metrics'], indent=2))

print("\n\n### Second request ###")
print("Usage:", json.dumps(secondRequest['usage'], indent=2))
print("Metrics:", json.dumps(secondRequest['metrics'], indent=2))

### First request ###
Usage: {
  "inputTokens": 9,
  "outputTokens": 649,
  "totalTokens": 782,
  "cacheReadInputTokens": 0,
  "cacheWriteInputTokens": 124
}
Metrics: {
  "latencyMs": 2913
}


### Second request ###
Usage: {
  "inputTokens": 10,
  "outputTokens": 1200,
  "totalTokens": 1334,
  "cacheReadInputTokens": 124,
  "cacheWriteInputTokens": 0
}
Metrics: {
  "latencyMs": 4491
}
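
In the output above, the first request writes 124 tokens to the cache and reads none, while the second reads those 124 tokens back and only processes 10 new input tokens. In both cases `totalTokens` equals `inputTokens` plus the cache tokens plus `outputTokens`. The second request's higher latency coincides with nearly twice as many output tokens, so latency alone does not isolate the effect of caching here. A small sketch that summarizes this from the `usage` dictionaries follows. `cache_summary` is a hypothetical helper, and the two dictionaries are copied from the output above.

```python
def cache_summary(usage):
    """Return total prompt tokens and the share of them served from the cache."""
    read = usage.get("cacheReadInputTokens", 0)
    written = usage.get("cacheWriteInputTokens", 0)
    prompt_tokens = usage["inputTokens"] + read + written
    hit_ratio = read / prompt_tokens if prompt_tokens else 0.0
    return {"promptTokens": prompt_tokens, "cacheHitRatio": round(hit_ratio, 3)}


first_usage = {
    "inputTokens": 9, "outputTokens": 649, "totalTokens": 782,
    "cacheReadInputTokens": 0, "cacheWriteInputTokens": 124,
}
second_usage = {
    "inputTokens": 10, "outputTokens": 1200, "totalTokens": 1334,
    "cacheReadInputTokens": 124, "cacheWriteInputTokens": 0,
}

print(cache_summary(first_usage))   # {'promptTokens': 133, 'cacheHitRatio': 0.0}
print(cache_summary(second_usage))  # {'promptTokens': 134, 'cacheHitRatio': 0.925}
```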


## Converse on documents
### Using `bytes` as a source

In [None]:
with open('input/AnyCompany_financial_10K.pdf', "rb") as file:
    doc_bytes = file.read()

In [None]:
response = bedrock.converse(
    modelId='us.amazon.nova-lite-v1:0',
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "document": {
                        "format": "pdf",
                        "name": "AnyCompany_financial",
                        "source": {
                            "bytes": doc_bytes
                        }
                    }
                },
                {
                    "text": "What investments have AnyCompany made?"
                }
            ]
        }
    ]
)

print(json.dumps(response, indent=2))
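
When several documents are sent this way, building the `document` block by hand gets repetitive. The sketch below derives `format` from the file extension and cleans up `name`. It assumes the restrictions described in the Bedrock API reference for document blocks: a fixed set of formats, and names limited to alphanumerics, single spaces, hyphens, parentheses, and square brackets. The helper conservatively replaces every other character, including underscores, with a space. `document_block` and `SUPPORTED_FORMATS` are hypothetical names.

```python
import os
import re

# Assumed list of document formats accepted by the Converse API.
SUPPORTED_FORMATS = {"pdf", "csv", "doc", "docx", "xls", "xlsx", "html", "txt", "md"}


def document_block(filename, data):
    """Build a Converse document content block from a file name and its bytes."""
    stem, extension = os.path.splitext(os.path.basename(filename))
    doc_format = extension.lower().lstrip(".")
    if doc_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported document format: {doc_format!r}")
    # Replace characters outside the assumed allowed set with a space,
    # then collapse runs of whitespace into a single space.
    name = re.sub(r"[^A-Za-z0-9\s\-()\[\]]", " ", stem)
    name = re.sub(r"\s+", " ", name).strip()
    return {
        "document": {
            "format": doc_format,
            "name": name,
            "source": {"bytes": data},
        }
    }


block = document_block("input/AnyCompany_financial_10K.pdf", b"%PDF-sample")
print(block["document"]["format"], "|", block["document"]["name"])
```

In the cell above, the first content block could then be written as `document_block('input/AnyCompany_financial_10K.pdf', doc_bytes)`.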

### Using `s3Location` as a source

In [None]:
bucket_name = "batch-data-inference"

s3 = boto3.client('s3')

input_key = 'AnyCompany_financial_10K.pdf'
s3.upload_file('input/AnyCompany_financial_10K.pdf', bucket_name, input_key)

In [None]:
response = bedrock.converse(
    modelId='us.amazon.nova-lite-v1:0',
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "document": {
                        "format": "pdf",
                        "name": "AnyCompany_financial",
                        "source": {
                            "s3Location": {
                                "uri": f"s3://{bucket_name}/{input_key}"
                            }
                        }
                    }
                },
                {
                    "text": "What property does AnyCompany own?"
                }
            ]
        }
    ]
)

print(json.dumps(response, indent=2))

## Conversation history

### Create the DynamoDB table and set the time to live

In [None]:
table_name = "conversation-history"

dynamodb = boto3.client('dynamodb', region_name="us-east-1")
response = dynamodb.create_table(
    TableName=table_name,
    BillingMode='PAY_PER_REQUEST',
    AttributeDefinitions=[
        {
            'AttributeName': 'userId',
            'AttributeType': 'S'
        },
        {
            'AttributeName': 'timestamp',
            'AttributeType': 'N'
        }
    ],
    KeySchema=[
        {
            'AttributeName': 'userId',
            'KeyType': 'HASH'
        },
        {
            'AttributeName': 'timestamp',
            'KeyType': 'RANGE'
        }
    ]
)

# Wait for table to be active
waiter = dynamodb.get_waiter('table_exists')
waiter.wait(TableName=table_name)

ttl_response = dynamodb.update_time_to_live(
    TableName=table_name,
    TimeToLiveSpecification={
        'AttributeName': 'ttl',
        'Enabled': True
    }
)

### Create the functions to get the conversation history and store new messages

In [20]:
table_name = "conversation-history"
ddbclient = boto3.client('dynamodb', region_name = "us-east-1")

def get_conversation_history(user_id):
    # A single Query call returns at most 1 MB of data, so paginate to read longer conversations.
    paginator = ddbclient.get_paginator('query')
    pages = paginator.paginate(
        TableName=table_name,
        KeyConditionExpression='userId = :val',
        ExpressionAttributeValues={':val': {'S': user_id}}
    )
    messages = []
    for page in pages:
        for item in page.get('Items', []):
            messages.append({
                "role": item["role"]["S"],
                "content": [{ "text": item["message"]["S"] }]
            })
    return messages
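
Because old items expire through TTL and conversations can grow without bound, the stored history may become long or may begin with an assistant turn after the matching user turn has expired. The sketch below trims the history before it is sent, assuming the Converse API expects the message list to start with a `user` message. `trim_history` and `max_messages` are hypothetical names, and the default of 20 is arbitrary.

```python
def trim_history(messages, max_messages=20):
    """Keep the most recent messages, starting from a user turn."""
    recent = messages[-max_messages:] if max_messages > 0 else []
    # Drop leading assistant turns so the list starts with a user message.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


# Hand-written history in the same shape get_conversation_history returns.
history = [
    {"role": "user", "content": [{"text": "Hi"}]},
    {"role": "assistant", "content": [{"text": "Hello!"}]},
    {"role": "user", "content": [{"text": "Where do I live?"}]},
    {"role": "assistant", "content": [{"text": "Montreal."}]},
    {"role": "user", "content": [{"text": "Thanks"}]},
]
trimmed = trim_history(history, max_messages=4)
print([message["role"] for message in trimmed])
```

The result of `get_conversation_history(user_id)` could be passed through `trim_history` before the new user message is appended.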

In [21]:
dynamodbResource = boto3.resource('dynamodb', region_name = "us-east-1")
conversation_table = dynamodbResource.Table(table_name)

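def store_message(user_id, message, role):
    now = time.time()
    # Use milliseconds for the sort key so that two messages stored within
    # the same second do not overwrite each other.
    timestamp_ms = int(now * 1000)
    # DynamoDB TTL expects an epoch time in seconds.
    expire_ttl = int(now) + (1 * 24 * 60 * 60)  # 1 day

    conversation_table.put_item(
        Item={
            'userId': user_id,
            'timestamp': timestamp_ms,
            'message': message,
            'role': role,
            'ttl': expire_ttl
        }
    )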


### Create the function to send messages to the LLM

In [22]:
def send_message(user_id, message):
    
    # Load the conversation history
    conversation_messages = get_conversation_history(user_id)

    # Store the new message
    store_message(user_id, message, "user")
    
    # Add the new message to the conversation
    conversation_messages.append({
        "role": "user",
        "content": [{ "text": message }]
    })

    # Send the conversation to the LLM
    model_response = bedrock.converse(
        modelId="amazon.nova-lite-v1:0",
        messages=conversation_messages,
        system=[{
            "text": "Please provide a helpful, conversational response based on the available information and conversation history."
        }],
        inferenceConfig={
            "maxTokens": 300,
            "temperature": 0.7,
            "topP": 0.9
        }
    )

    # Parse the message from the LLM, store it and return it
    assistant_msg = model_response['output']['message']['content'][0]['text']
    store_message(user_id, assistant_msg, "assistant")

    return conversation_messages, assistant_msg

### Sending messages

In [24]:
user_id = "johny"
convo, response = send_message(user_id, "My name is Johny and I live in Montreal. I like skiing and playing pickle ball. What other sport do you recommend I do?")
print(response)


Hi Johny! It's great to hear that you enjoy skiing and pickleball. Given your interests, I think you might enjoy trying out some other sports that combine physical activity, teamwork, and a bit of strategy. Here are a few suggestions:

1. **Ice Hockey**: Since you live in Montreal, ice hockey is a popular sport and a great way to stay active during the winter months. It's a fast-paced game that requires good coordination and teamwork.

2. **Soccer**: Soccer is a fantastic sport for improving your agility, endurance, and overall fitness. Montreal has a vibrant soccer community, and there are plenty of leagues and clubs you can join.

3. **Basketball**: If you enjoy a sport that involves quick movements and strategic plays, basketball could be a great fit. It's a high-energy game that can be played both indoors and outdoors.

4. **Tennis**: Given your interest in pickleball, tennis might be another sport you'd enjoy. It offers a great cardiovascular workout and helps improve hand-eye coo

In [25]:
convo, response = send_message(user_id, "What city do I live in?")
print(response)

You mentioned earlier that you live in Montreal. If you're referring to a different city, please let me know, and I can provide more tailored recommendations based on your location. Enjoy exploring new sports and activities!


In [None]:
print(json.dumps(convo, indent=2))