In [None]:
!pip install boto3
!pip install langchain
!pip install langchain_aws

# Module 1, Activity 1: Running an AWS-Hosted LLM with LangChain

This notebook shows how to use the boto3 Python SDK together with the abstractions in the LangChain package to work with Amazon Bedrock foundation models.

In [1]:
import boto3
from langchain_aws import BedrockLLM
from langchain_aws import ChatBedrock

In [2]:
session = boto3.session.Session()
region = session.region_name
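
Note that `session.region_name` is `None` when no default region is configured, which would make the client calls later in the notebook fail. The sketch below shows one way to guard against that. The helper name `pick_region` and the `"us-east-1"` default are assumptions for illustration, not part of the workshop setup.

```python
import os


def pick_region(session_region, environ=os.environ, default="us-east-1"):
    """Return the first region that is set: the session's, AWS_REGION, then a default.

    `default` is an assumed fallback; replace it with the region your
    workshop account actually uses.
    """
    return session_region or environ.get("AWS_REGION") or default


# Stand-in values; in the notebook you would pass session.region_name.
print(pick_region(None, environ={}))
print(pick_region("eu-west-1", environ={}))
```

In the notebook this would be used as `region = pick_region(session.region_name)`.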

## Create Bedrock management connection

The `bedrock` client is the control-plane client for Bedrock.  It can list models, check availability, and manage configurations, but it does not invoke models.  Invocation is handled by the separate `bedrock-runtime` client created later in this notebook.

In [6]:
bedrock = boto3.client(
    service_name='bedrock',
    region_name = region,
)

## Listing available models

Here we can see which foundation models Bedrock has access to.  Remember, however, that not all of them are enabled for this workshop.

In [7]:
[model['modelId'] for model in bedrock.list_foundation_models()['modelSummaries']]

['amazon.titan-tg1-large',
 'amazon.titan-embed-g1-text-02',
 'amazon.titan-text-lite-v1:0:4k',
 'amazon.titan-text-lite-v1',
 'amazon.titan-text-express-v1:0:8k',
 'amazon.titan-text-express-v1',
 'amazon.nova-pro-v1:0',
 'amazon.nova-lite-v1:0',
 'amazon.nova-micro-v1:0',
 'amazon.titan-embed-text-v1:2:8k',
 'amazon.titan-embed-text-v1',
 'amazon.titan-embed-text-v2:0',
 'amazon.titan-embed-image-v1:0',
 'amazon.titan-embed-image-v1',
 'amazon.titan-image-generator-v1:0',
 'amazon.titan-image-generator-v1',
 'amazon.titan-image-generator-v2:0',
 'amazon.rerank-v1:0',
 'stability.stable-diffusion-xl-v1:0',
 'stability.stable-diffusion-xl-v1',
 'stability.sd3-large-v1:0',
 'stability.sd3-5-large-v1:0',
 'stability.stable-image-core-v1:0',
 'stability.stable-image-core-v1:1',
 'stability.stable-image-ultra-v1:0',
 'stability.stable-image-ultra-v1:1',
 'anthropic.claude-3-5-sonnet-20241022-v2:0:18k',
 'anthropic.claude-3-5-sonnet-20241022-v2:0:51k',
 'anthropic.claude-3-5-sonnet-20241022
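
The list above is long, so it can help to group the IDs by provider. This is a minimal sketch that assumes every model ID follows the `provider.model-name` pattern seen in the output. The `group_by_provider` helper is hypothetical, and the sample IDs are copied from the listing rather than fetched live.

```python
from collections import defaultdict

# A few IDs copied from the output above; in the notebook you would pass
# the full list returned by bedrock.list_foundation_models().
model_ids = [
    "amazon.titan-text-express-v1",
    "amazon.nova-pro-v1:0",
    "stability.sd3-large-v1:0",
    "anthropic.claude-3-5-sonnet-20241022-v2:0:18k",
]


def group_by_provider(ids):
    """Group model IDs by the text before the first dot (the provider)."""
    groups = defaultdict(list)
    for model_id in ids:
        provider, _, _ = model_id.partition(".")
        groups[provider].append(model_id)
    return dict(groups)


by_provider = group_by_provider(model_ids)
for provider in sorted(by_provider):
    print(provider, len(by_provider[provider]))
```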

## Creating Bedrock runtime client

In this next code block, a dedicated boto3 client for the bedrock-runtime service is created.  This client is responsible for executing runtime operations, such as invoking a model with a given prompt.

In [3]:
bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    region_name=region,
)

## Initializing BedrockLLM and invoking a model

Here, the BedrockLLM class from the langchain_aws package is instantiated.  This class is a high-level wrapper for AWS-hosted LLMs.
The initialization parameters used here are the model ID (in this case, "amazon.titan-tg1-large") and the region.  AWS credentials are not passed explicitly; they are resolved from the environment through boto3's default credential chain.  Once the instance is created, the invoke method sends a prompt ("What is the recipe of mayonnaise?") to the model.  This is the basic workflow: set up the model wrapper, then make an invocation call and inspect the model’s response.

Try several different prompts here to see what different types of answers you can get!

In [8]:
llm = BedrockLLM(
    model_id="amazon.titan-tg1-large",
    region_name=region,
)

llm.invoke(input='What is the recipe of mayonnaise?')

'\nHere is the recipe of mayonnaise:\n\nIngredients:\n- 2 egg yolks\n- 1 tablespoon Dijon mustard\n- 1 tablespoon lemon juice\n- 1 teaspoon salt\n- 1 cup vegetable oil\n\nInstructions:\n1. In a blender or food processor, combine the egg yolks, mustard, lemon juice, and salt. Blend until smooth.\n2. With the blender running, slowly drizzle in the vegetable oil until the mixture thickens and forms a thick mayonnaise.'

## Introducing ChatBedrock

BedrockLLM is designed for single-turn, prompt-based interactions: you provide a prompt ("What is the recipe of mayonnaise?") and the model generates an output in one go.  That is fine for simple tasks, but more sophisticated interactions need chat-like exchanges in which context is carried across several turns of dialogue.  In addition, BedrockLLM does not support every available model, including newer ones such as Anthropic's Claude 3 Sonnet.  ChatBedrock covers both cases and is used below.

Also note that ChatBedrock does not return a plain string.  It returns an AIMessage object that carries metadata, such as token usage and the stop reason, alongside the generated text.

In [9]:
chat_llm = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    region_name=region,
)

chat_llm.invoke(input="What is the recipe of mayonnaise?")

AIMessage(content='Here is a basic recipe for homemade mayonnaise:\n\nIngredients:\n- 1 egg yolk\n- 1 tablespoon lemon juice or white wine vinegar\n- 1 teaspoon Dijon mustard (optional)\n- 1/4 teaspoon salt\n- 3/4 cup vegetable oil or mild olive oil\n\nInstructions:\n\n1. In a medium bowl, whisk together the egg yolk, lemon juice/vinegar, mustard (if using), and salt.\n\n2. Very slowly, while whisking constantly, start adding the oil a few drops at a time. Once an emulsion starts to form, you can add the oil in a very thin steady stream, whisking vigorously the whole time.\n\n3. Continue whisking and adding oil until all the oil is incorporated and the mayonnaise has thickened to your desired consistency. \n\n4. Taste and adjust seasoning if needed by adding more lemon juice, vinegar or salt.\n\n5. Transfer to an airtight container and refrigerate for up to 1 week.\n\nThe key is adding the oil very slowly while whisking constantly to allow a stable emulsion to form between the egg yolk
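
The multi-turn exchanges mentioned above can be sketched as follows. LangChain chat models accept a list of `(role, content)` pairs, so a conversation can be kept as a plain Python list that grows with each turn. The `ask` helper is hypothetical, and the sketch assumes the reply's `content` is a plain string, as it is in the output above.

```python
def ask(chat_model, history, question):
    """Send the running conversation plus a new question.

    Returns a new history list that includes the question and the reply,
    so the next call can pass the full context back to the model.
    """
    messages = history + [("human", question)]
    reply = chat_model.invoke(messages)
    return messages + [("ai", reply.content)]


# Usage in the notebook (requires the chat_llm object created above):
# history = ask(chat_llm, [], "What is the recipe of mayonnaise?")
# history = ask(chat_llm, history, "Can I make it without eggs?")
```

The model itself keeps no state between calls. The context persists only because the whole history is sent again on every turn.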

## Temperature

Temperature controls the randomness of the model's responses, which is where the apparent "creativity" of a model comes from.  Setting it to 0.0 (the minimum) typically gives more deterministic and consistent output, while setting it to 1.0 (the maximum for this model) gives more varied, creative responses.  Experiment with the temperature setting in the following cell to see how the output changes as a result.

In [10]:
chat_llm = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    region_name=region,
    temperature=0.0
)

chat_llm.invoke(input="What is the recipe of mayonnaise?")

AIMessage(content='Here is a basic recipe for homemade mayonnaise:\n\nIngredients:\n- 1 egg yolk\n- 1 tablespoon lemon juice or white wine vinegar\n- 1/2 teaspoon Dijon mustard (optional)\n- 1/4 teaspoon salt\n- 3/4 cup vegetable oil or mild olive oil\n\nInstructions:\n\n1. In a medium bowl, whisk together the egg yolk, lemon juice/vinegar, mustard (if using), and salt.\n\n2. Very slowly, while whisking constantly, drizzle in the oil just a few drops at a time at first. This allows the mixture to emulsify properly.\n\n3. Once about 1/4 cup of oil has been incorporated, you can start adding the oil in a thin steady stream while whisking vigorously. \n\n4. Continue whisking and adding oil until all the oil is incorporated and the mayonnaise has thickened to your desired consistency.\n\n5. Taste and adjust seasoning if needed by adding more lemon juice, salt, etc.\n\n6. Transfer to an airtight container and refrigerate for up to 1 week.\n\nThe key is adding the oil very slowly at first wh
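
To build intuition for what temperature does, the sketch below applies the usual textbook formulation: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The logits are made-up numbers for three candidate tokens, and the function is a toy illustration, not a description of how Bedrock implements sampling.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into sampling probabilities at a given temperature."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability on the highest score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.0, 0.5, 1.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures spread it across the alternatives, which is why the output becomes more varied.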

## Limiting the number of tokens returned

The cost of using an LLM depends on how many tokens are sent to the model (input) and generated by it (output).  The `max_tokens` parameter caps the number of tokens the model generates in a response; it does not limit the size of the prompt.  When the cap is reached, generation stops mid-response and the stop reason is reported as `max_tokens`.  Limiting the token count is useful when responses must stay concise or fit within strict output size constraints.  Experiment with a few different values to see how the output changes.

In [11]:
chat_llm = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    region_name=region,
    temperature=0.0,
    max_tokens=10
)

chat_llm.invoke(input="What is the recipe of mayonnaise?")

AIMessage(content='Here is a basic recipe for homemade may', additional_kwargs={'usage': {'prompt_tokens': 16, 'completion_tokens': 10, 'total_tokens': 26}, 'stop_reason': 'max_tokens', 'model_id': 'anthropic.claude-3-sonnet-20240229-v1:0'}, response_metadata={'usage': {'prompt_tokens': 16, 'completion_tokens': 10, 'total_tokens': 26}, 'stop_reason': 'max_tokens', 'model_id': 'anthropic.claude-3-sonnet-20240229-v1:0'}, id='run-1f4ae27f-263d-4c97-bd83-e357a559802e-0', usage_metadata={'input_tokens': 16, 'output_tokens': 10, 'total_tokens': 26})
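
The `usage_metadata` in the output above can be turned into a rough cost estimate. In this sketch the `estimate_cost` helper is hypothetical and the per-1,000-token prices are placeholder values, not actual Bedrock pricing; check the current price list for the model you use.

```python
def estimate_cost(usage, input_price_per_1k, output_price_per_1k):
    """Estimate the cost of one call from its input and output token counts."""
    input_cost = usage["input_tokens"] / 1000 * input_price_per_1k
    output_cost = usage["output_tokens"] / 1000 * output_price_per_1k
    return input_cost + output_cost


# Token counts copied from the usage_metadata shown above.
usage = {"input_tokens": 16, "output_tokens": 10, "total_tokens": 26}

# Placeholder prices in USD per 1,000 tokens (assumed for illustration only).
cost = estimate_cost(usage, input_price_per_1k=0.003, output_price_per_1k=0.015)
print(f"Estimated cost: ${cost:.6f}")
```

In the notebook, the same dictionary is available as the `usage_metadata` attribute of the message returned by `chat_llm.invoke(...)`.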