# Amazon Bedrock boto3 Setup

> *This notebook should work well with the **`Data Science 3.0`** kernel in SageMaker Studio*

---

In this demo notebook, we demonstrate how to use the [`boto3` Python SDK](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) to work with [Amazon Bedrock](https://aws.amazon.com/bedrock/) Foundation Models.

---

## Prerequisites

Run the cells in this section to install the packages needed by the notebooks in this workshop. ⚠️ You will see pip dependency errors; you can safely ignore them. ⚠️

IGNORE ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.

In [2]:
%pip install --no-build-isolation --force-reinstall \
    "boto3>=1.28.57" \
    "awscli>=1.29.57" \
    "botocore>=1.31.57"


Collecting boto3>=1.28.57
  Obtaining dependency information for boto3>=1.28.57 from https://files.pythonhosted.org/packages/e4/76/d98acdf42e6acb2c17cd496005bbc2285153819befe8528673b312bd46de/boto3-1.33.8-py3-none-any.whl.metadata
  Using cached boto3-1.33.8-py3-none-any.whl.metadata (6.7 kB)
Collecting awscli>=1.29.57
  Obtaining dependency information for awscli>=1.29.57 from https://files.pythonhosted.org/packages/77/91/c820310657eebce63fd00de2e20f2986017802add67e8e757b0aa0095d56/awscli-1.31.8-py3-none-any.whl.metadata
  Using cached awscli-1.31.8-py3-none-any.whl.metadata (11 kB)
Collecting botocore>=1.31.57
  Obtaining dependency information for botocore>=1.31.57 from https://files.pythonhosted.org/packages/9c/ef/c8e456190ddc75c842a7c5e0edb7875b75759980b2a985b7fdff038b9059/botocore-1.33.8-py3-none-any.whl.metadata
  Using cached botocore-1.33.8-py3-none-any.whl.metadata (6.1 kB)
Collecting jmespath<2.0.0,>=0.7.1 (from boto3>=1.28.57)
  Using cached jmespath-1.0.1-py3-none-any.whl 

### text

In [3]:
%pip install --quiet \
    langchain==0.0.309 \
    matplotlib


[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.


In [4]:
# restart kernel
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

---

## Create the boto3 client

Interaction with the Bedrock API is done via the AWS SDK for Python: [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html).

Depending on your environment, you might need to customize the setup when creating your Bedrock service client. To help with this, we've provided a `get_bedrock_client()` utility function that accepts several options. You can find the implementation in [../utils/bedrock.py](../utils/bedrock.py).

#### Use different clients
boto3 provides separate clients for Amazon Bedrock, each covering a different set of actions. [`InvokeModel`](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) and [`InvokeModelWithResponseStream`](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) are supported by the Amazon Bedrock Runtime client, whereas other operations, such as [ListFoundationModels](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListFoundationModels.html), are handled by the [Amazon Bedrock client](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations_Amazon_Bedrock.html).

The `get_bedrock_client()` function accepts a `runtime` parameter (default `True`). With `runtime=True` it returns the `bedrock-runtime` client; with `runtime=False` it returns the `bedrock` client.
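As a rough illustration of the split described above, the sketch below maps a `runtime` flag to the boto3 service names `bedrock-runtime` and `bedrock`. The helper names `bedrock_service_name` and `make_bedrock_client` are hypothetical and are not the workshop's `get_bedrock_client()` implementation.

```python
from typing import Optional


def bedrock_service_name(runtime: bool = True) -> str:
    """Return the boto3 service name for the runtime or the management client."""
    # "bedrock-runtime" serves InvokeModel / InvokeModelWithResponseStream;
    # "bedrock" serves management calls such as ListFoundationModels.
    return "bedrock-runtime" if runtime else "bedrock"


def make_bedrock_client(runtime: bool = True, region: Optional[str] = None):
    """Hypothetical helper: create the matching client with default credentials."""
    import boto3  # imported lazily so the name mapping works without boto3

    return boto3.client(
        service_name=bedrock_service_name(runtime),
        region_name=region,
    )
```

Calling `make_bedrock_client(runtime=False)` would then correspond to the management client used for `list_foundation_models()`.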

#### Use the default credential chain

If you are running this notebook from [Amazon SageMaker Studio](https://aws.amazon.com/sagemaker/studio/) and your SageMaker Studio [execution role](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) has permissions to access Bedrock, you can run the cells below as-is. The same applies if you are running these notebooks from a computer whose default AWS credentials have access to Bedrock.

#### Use a different AWS Region

If you're running this notebook from your own computer or a SageMaker notebook in a different AWS Region from where Bedrock is set up, you can un-comment the `os.environ['AWS_DEFAULT_REGION']` line below and specify the region to use.

#### Use a specific profile

If you're running this notebook from your own computer where you have set up the AWS CLI with multiple profiles, and the profile that has access to Bedrock is not the default one, you can un-comment the `os.environ['AWS_PROFILE']` line below and specify the profile to use.

#### Use a different role

If you or your company has set up a specific, separate [IAM Role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html) to access Bedrock, you can specify it by un-commenting the `os.environ['BEDROCK_ASSUME_ROLE']` line below. Ensure that your current user or role has permission to [assume](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html) that role.
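For readers curious what assuming a role involves, here is a minimal sketch using the STS `assume_role` call and passing the temporary credentials to a new client. The helper names and the session name `bedrock-notebook` are assumptions made for illustration; the workshop's own logic lives in `../utils/bedrock.py` and may differ.

```python
from typing import Dict, Optional


def credentials_to_client_kwargs(credentials: Dict[str, str]) -> Dict[str, str]:
    """Map the Credentials block of an AssumeRole response to boto3 client kwargs."""
    return {
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretAccessKey"],
        "aws_session_token": credentials["SessionToken"],
    }


def client_with_assumed_role(role_arn: str, region: Optional[str] = None):
    """Hypothetical helper: assume role_arn, then build a bedrock-runtime client."""
    import boto3  # imported lazily so the mapping above works without boto3

    sts = boto3.client("sts", region_name=region)
    response = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName="bedrock-notebook",  # arbitrary label (assumption)
    )
    return boto3.client(
        "bedrock-runtime",
        region_name=region,
        **credentials_to_client_kwargs(response["Credentials"]),
    )
```

The temporary credentials expire, so a long-running notebook may need to create the client again.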

#### A note about `langchain`

The Bedrock classes provided by `langchain` create a Bedrock boto3 client by default. To customize your Bedrock configuration, we recommend explicitly creating the Bedrock client using the method below and passing it to the [`langchain.Bedrock`](https://python.langchain.com/docs/integrations/llms/bedrock) class constructor using `client=boto3_bedrock`.

In [7]:
import json
import os
import sys

import boto3
import botocore

import bedrock


# ---- ⚠️ Un-comment and edit the below lines as needed for your AWS setup ⚠️ ----

# os.environ["AWS_DEFAULT_REGION"] = "<REGION_NAME>"  # E.g. "us-east-1"
# os.environ["AWS_PROFILE"] = "<YOUR_PROFILE>"
# os.environ["BEDROCK_ASSUME_ROLE"] = "<YOUR_ROLE_ARN>"  # E.g. "arn:aws:..."


boto3_bedrock = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", None),
    runtime=False
)


Create new client
  Using region: us-east-1
boto3 Bedrock client successfully created!
bedrock(https://bedrock.us-east-1.amazonaws.com)


#### Validate the connection

We can check that the client works by calling the `list_foundation_models()` method, which lists all the models available for us to use.

In [8]:
boto3_bedrock.list_foundation_models()


{'ResponseMetadata': {'RequestId': 'f907a0e4-e3d3-42e2-97c7-41adb23788ed',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Wed, 06 Dec 2023 07:28:23 GMT',
   'content-type': 'application/json',
   'content-length': '17836',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'f907a0e4-e3d3-42e2-97c7-41adb23788ed'},
  'RetryAttempts': 0},
 'modelSummaries': [{'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-tg1-large',
   'modelId': 'amazon.titan-tg1-large',
   'modelName': 'Titan Text Large',
   'providerName': 'Amazon',
   'inputModalities': ['TEXT'],
   'outputModalities': ['TEXT'],
   'responseStreamingSupported': True,
   'customizationsSupported': [],
   'inferenceTypesSupported': ['ON_DEMAND'],
   'modelLifecycle': {'status': 'ACTIVE'}},
  {'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-image-generator-v1:0',
   'modelId': 'amazon.titan-image-generator-v1:0',
   'modelName': 'Titan Image Generator G1',
   'providerName': 'Amazon',
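The full response is long, so it can help to pull out only the model IDs. The sketch below relies on the `modelSummaries`, `modelId`, and `providerName` keys visible in the output above; the helper name `list_model_ids` and the trimmed sample data are illustrative only.

```python
from typing import Any, Dict, List, Optional


def list_model_ids(response: Dict[str, Any], provider: Optional[str] = None) -> List[str]:
    """Return the modelId of each summary, optionally filtered by providerName."""
    return [
        summary["modelId"]
        for summary in response.get("modelSummaries", [])
        if provider is None or summary.get("providerName") == provider
    ]


# Trimmed-down sample shaped like the response shown above.
sample_response = {
    "modelSummaries": [
        {"modelId": "amazon.titan-tg1-large", "providerName": "Amazon"},
        {"modelId": "amazon.titan-image-generator-v1:0", "providerName": "Amazon"},
    ]
}
print(list_model_ids(sample_response, provider="Amazon"))
```

With a live client, the same helper could be applied as `list_model_ids(boto3_bedrock.list_foundation_models())`.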

---

## Common inference parameter definitions

### Randomness and Diversity

Foundation models support the following parameters to control randomness and diversity in the 
response.

**Temperature** – Large language models use probability to construct the words in a sequence. For any 
given next word, there is a probability distribution of options for the next word in the sequence. When 
you set the temperature closer to zero, the model tends to select the higher-probability words. When 
you set the temperature further away from zero, the model may select a lower-probability word.

In technical terms, the temperature modulates the probability density function for the next tokens, 
implementing the temperature sampling technique. This parameter can deepen or flatten the density 
function curve. A lower value results in a steeper curve with more deterministic responses, and a higher 
value results in a flatter curve with more random responses.
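The effect of temperature can be sketched with a temperature-scaled softmax over a few made-up scores. This is a generic illustration, not the internals of any particular Bedrock model; the function name and the logits are assumptions. The sketch rejects a temperature of exactly 0, a case that is typically handled as greedy selection instead.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing them by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
print([round(p, 3) for p in sharp])  # steeper: mass concentrates on the top token
print([round(p, 3) for p in flat])   # flatter: mass spreads across tokens
```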

**Top K** – Temperature defines the probability distribution of potential words, and Top K defines the cut 
off where the model no longer selects the words. For example, if K=50, the model selects from 50 of the 
most probable words that could be next in a given sequence. This reduces the probability that an unusual 
word gets selected next in a sequence.

In technical terms, Top K is the number of highest-probability vocabulary tokens to keep for Top-K
filtering. This limits the distribution of probable tokens, so the model chooses one of the
highest-probability tokens.
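A minimal sketch of Top-K filtering over a toy distribution follows. The token probabilities are invented, and renormalizing the surviving tokens is a common convention assumed here rather than something stated in the text.

```python
from typing import Dict


def top_k_filter(probs: Dict[str, float], k: int) -> Dict[str, float]:
    """Keep the k most probable tokens and renormalize their probabilities."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    kept = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Made-up next-token distribution for illustration.
next_token_probs = {"horses": 0.6, "zebras": 0.3, "unicorns": 0.1}
top_two = top_k_filter(next_token_probs, k=2)
print(top_two)  # "unicorns" is no longer a candidate
```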

**Top P** – Top P defines a cut off based on the sum of probabilities of the potential choices. If you set Top 
P below 1.0, the model considers the most probable options and ignores less probable ones. Top P is 
similar to Top K, but instead of capping the number of choices, it caps choices based on the sum of their 
probabilities.

For the example prompt "I hear the hoof beats of ," you may want the model to provide "horses,"
"zebras" or "unicorns" as the next word. If you set the temperature to its maximum, without capping
Top K or Top P, you increase the probability of getting unusual results such as "unicorns." If you set the
temperature to 0, you increase the probability of "horses." If you set a high temperature but cap Top K or
Top P at a low value, you increase the probability of "horses" or "zebras," and decrease the probability
of "unicorns."
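Top-P (nucleus) filtering can be sketched the same way: sort candidates by probability and keep the smallest set whose cumulative probability reaches the threshold. The probabilities below are invented for the hoof beats example, and the renormalization step is an assumed convention.

```python
from typing import Dict


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most probable tokens until their cumulative mass reaches top_p."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in the interval (0, 1]")
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete; drop everything less probable
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


# Made-up next-token distribution for illustration.
next_token_probs = {"horses": 0.6, "zebras": 0.3, "unicorns": 0.1}
nucleus = top_p_filter(next_token_probs, top_p=0.8)
print(nucleus)  # only "horses" and "zebras" remain
```

Unlike Top K, the number of surviving tokens here depends on how the probability mass is distributed.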

### Length

The following parameters control the length of the generated response.

**Response length** – Configures the minimum and maximum number of tokens to use in the generated 
response.

**Length penalty** – Length penalty optimizes the model to be more concise in its output by penalizing 
longer responses. Length penalty differs from response length as the response length is a hard cut off for 
the minimum or maximum response length.

In technical terms, the length penalty penalizes the model exponentially for lengthy responses. 0.0 
means no penalty. Set a value less than 0.0 for the model to generate longer sequences, or set a value 
greater than 0.0 for the model to produce shorter sequences.

### Repetitions

The following parameters help control repetition in the generated response.

**Repetition penalty (presence penalty)** – Prevents repetitions of the same words (tokens) in responses. 
1.0 means no penalty. Greater than 1.0 decreases repetition.

In [9]:
bedrock_runtime = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", None)
)


Create new client
  Using region: us-east-1
boto3 Bedrock client successfully created!
bedrock-runtime(https://bedrock-runtime.us-east-1.amazonaws.com)


In [10]:
from urllib.request import urlopen
from bs4 import BeautifulSoup

url = "https://docs.aws.amazon.com/sagemaker/latest/dg/sm-domain.html"

html = urlopen(url).read()
soup = BeautifulSoup(html, features="html.parser")

# kill all script and style elements
for script in soup(["script", "style"]):
    script.extract()    # rip it out

# get text
text = soup.get_text()

# # break into lines and remove leading and trailing space on each
# lines = (line.strip() for line in text.splitlines())
# # break multi-headlines into a line each
# chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
# # drop blank lines
# text = '\n'.join(chunk for chunk in chunks if chunk)

print(text)


Amazon SageMaker Domain - Amazon SageMakerAmazon SageMaker Domain - Amazon SageMakerAWSDocumentationAmazon SageMakerDeveloper GuideMaintenance of applicationsAmazon SageMaker DomainAmazon SageMaker Domain supports SageMaker machine learning (ML) environments. A SageMaker Domain is
        composed of the following entities. For onboarding steps to create a Domain, see Amazon SageMaker Domain overview.


Domain: An Amazon SageMaker Domain consists of an associated
                Amazon Elastic File System (Amazon EFS) volume; a list of authorized users; and a variety of security,
                application, policy, and Amazon Virtual Private Cloud (Amazon VPC) configurations. Users within a Domain
                can share notebook files and other artifacts with each other. An account can have
                multiple Domains. For more information about multiple Domains, see Multiple Domains Overview.


UserProfile: A user profile represents a single user within a
                Dom
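The scraped text above contains long runs of indentation and blank lines. The commented-out lines in the cell above describe a cleanup step; the sketch below wraps that same logic in a function. The name `normalize_whitespace` and the sample string are illustrative.

```python
def normalize_whitespace(raw_text: str) -> str:
    """Strip each line, split on double spaces, and drop empty fragments."""
    # Remove leading and trailing whitespace on every line.
    lines = (line.strip() for line in raw_text.splitlines())
    # Split lines that hold several phrases separated by two spaces.
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    # Drop blank fragments and join the rest, one per line.
    return "\n".join(chunk for chunk in chunks if chunk)


sample = (
    "Domain: An Amazon SageMaker Domain consists of an associated\n"
    "                Amazon Elastic File System (Amazon EFS) volume;\n"
    "\n"
    "\n"
    "UserProfile"
)
cleaned = normalize_whitespace(sample)
print(cleaned)
```

Applying something like this to `text` before building the prompt would likely shorten the input without changing its wording.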

In [11]:
# prompt data
# text = "Amazon SageMaker Domain supports SageMaker machine learning (ML) environments. A SageMaker Domain is composed of the following entities. For onboarding steps to create a Domain, see Onboard to Amazon SageMaker Domain. Domain: An Amazon SageMaker Domain consists of an associated Amazon Elastic File System (Amazon EFS) volume; a list of authorized users; and a variety of security, application, policy, and Amazon Virtual Private Cloud (Amazon VPC) configurations. Users within a Domain can share notebook files and other artifacts with each other. An account can have multiple Domains. For more information about multiple Domains, see Multiple Domains Overview. UserProfile: A user profile represents a single user within a Domain. It is the main way to reference a user for the purposes of sharing, reporting, and other user-oriented features. This entity is created when a user onboards to the Amazon SageMaker Domain. For more information about user profiles, see Domain User Profiles. shared space: A shared space consists of a shared JupyterServer application and shared directory. All users within the Domain have access to the shared space. All user profiles in a Domain have access to all shared spaces in the Domain. For more information about shared spaces, see Collaborate with shared spaces. App: An app represents an application that supports the reading and execution experience of the user’s notebooks, terminals, and consoles. The type of app can be JupyterServer, KernelGateway, RStudioServerPro, or RSession. A user may have multiple apps active simultaneously. The following tables describe the status values for the Domain, UserProfile, shared space, and App entities. Where applicable, they also give troubleshooting steps."
kind = "diagram"  # "diagram" or "flowchart"
direction = "LR"   # or "TD"

In [None]:
# # If you'd like to try your own prompt, edit this parameter!
# prompt = f"""

# Here are some documents for you to reference for your task:

# <text>
# {text}
# </text>

# Human: 
# Create a {kind} summarizing the content of the previous text, using Mermaid notation. 
# The {kind} should capture the content and relationships in the given text, and result into a rich visual representation.
# Someone who would only view this Mermaid {kind}, should understand the gist of the given text. 
# The Mermaid {kind} should follow all the correct notation rules and should compile without any errors.
# Use the following specifications for the generated Mermaid {kind}:

# <specifications>
# 1. Use different colors, shapes or groups to represent different concepts in the given text.
# 2. Include the Mermaid {kind} inside <mermaid> tags.
# 3. NEVER write anything before the <mermaid> block.
# 4. Use only information from within the given text. Don't make up new information.
# </specifications>

# After you are done generating the Mermaid {kind}, and after the </mermaid> tag, check your work carefully to make sure there are no mistakes, errors, or inconsistencies. 
# If there are errors, list those errors in <error> tags, then generate a new version with those errors fixed. 
# If there are no errors, write "CHECKED: NO ERRORS" in <error> tags.


# Assistant:
# """

# print(prompt)

In [12]:
# # If you'd like to try your own prompt, edit this parameter!
# prompt = f"""

# Here are some documents for you to reference for your task:

# <text>
# {text}
# </text>

# Human: 
# Create a {kind} summarizing the content of the previous text, using Mermaid notation. 
# The {kind} should capture the main gist of the given text, without too many low-level details. 
# Someone who would only view this Mermaid {kind}, should understand the gist of the given text. 
# The Mermaid {kind} should follow all the correct notation rules and should compile without any errors.
# Use the following specifications for the generated Mermaid {kind}:

# <specifications>
# 1. Use different colors, shapes or groups to represent different concepts in the given text.
# 2. Include the Mermaid {kind} inside <mermaid> tags.
# 3. NEVER write anything before the <mermaid> block.
# 4. Use only information from within the given text. Don't make up new information.
# </specifications>

# Mermaid {kind}:

# Assistant:
# """

# print(prompt)



Here are some documents for you to reference for your task:

<text>

Amazon SageMaker Domain - Amazon SageMakerAmazon SageMaker Domain - Amazon SageMakerAWSDocumentationAmazon SageMakerDeveloper GuideMaintenance of applicationsAmazon SageMaker DomainAmazon SageMaker Domain supports SageMaker machine learning (ML) environments. A SageMaker Domain is
        composed of the following entities. For onboarding steps to create a Domain, see Amazon SageMaker Domain overview.


Domain: An Amazon SageMaker Domain consists of an associated
                Amazon Elastic File System (Amazon EFS) volume; a list of authorized users; and a variety of security,
                application, policy, and Amazon Virtual Private Cloud (Amazon VPC) configurations. Users within a Domain
                can share notebook files and other artifacts with each other. An account can have
                multiple Domains. For more information about multiple Domains, see Multiple Domains Overview.


UserProfile

In [15]:
# If you'd like to try your own prompt, edit this parameter!
prompt = f"""

Here are some documents for you to reference for your task:

<text>
{text}
</text>

Human: 
Summarize the given text. Then convert the summary to a {kind} using Mermaid notation. 
The {kind} should capture the main gist of the summary, without too many low-level details. 
Someone who would only view this Mermaid {kind}, should understand the gist of the summary. 
The Mermaid {kind} should follow all the correct notation rules and should compile without any errors.
Use the following specifications for the generated Mermaid {kind}:

<specifications>
1. Use different colors, shapes or groups to represent different concepts in the given text.
2. Include the Mermaid {kind} inside <mermaid> tags.
3. NEVER write anything before the <mermaid> block.
4. Use only information from within the given text. Don't make up new information.
</specifications>

Mermaid {kind}:

Assistant:
"""

print(prompt)



Here are some documents for you to reference for your task:

<text>

Amazon SageMaker Domain - Amazon SageMakerAmazon SageMaker Domain - Amazon SageMakerAWSDocumentationAmazon SageMakerDeveloper GuideMaintenance of applicationsAmazon SageMaker DomainAmazon SageMaker Domain supports SageMaker machine learning (ML) environments. A SageMaker Domain is
        composed of the following entities. For onboarding steps to create a Domain, see Amazon SageMaker Domain overview.


Domain: An Amazon SageMaker Domain consists of an associated
                Amazon Elastic File System (Amazon EFS) volume; a list of authorized users; and a variety of security,
                application, policy, and Amazon Virtual Private Cloud (Amazon VPC) configurations. Users within a Domain
                can share notebook files and other artifacts with each other. An account can have
                multiple Domains. For more information about multiple Domains, see Multiple Domains Overview.


UserProfile

In [16]:
body = json.dumps(
    {
        "prompt": prompt, 
        "max_tokens_to_sample": 500,
        "temperature": 0.9,
        "top_k": 250,
        "top_p": 1,
        "stop_sequences": ["\n\nHuman:"]
    }
)
modelId = "anthropic.claude-v2:1"  # change this to use a different version from the model provider
accept = "application/json"
contentType = "application/json"

try:

    response = bedrock_runtime.invoke_model(
        body=body, 
        modelId=modelId, 
        accept=accept, 
        contentType=contentType
    )
    response_body = json.loads(response.get("body").read())

    print(response_body.get("completion"))

except botocore.exceptions.ClientError as error:

    if error.response['Error']['Code'] == 'AccessDeniedException':
        # Print the service message highlighted in red, followed by troubleshooting links
        print(
            f"\x1b[41m{error.response['Error']['Message']}"
            "\nTo troubleshoot this issue please refer to the following resources."
            "\nhttps://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html"
            "\nhttps://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n"
        )

    else:
        raise error


 Here is a summary of the key points from the text, represented as a Mermaid diagram:

<mermaid>
graph TD;
    A[Domain]---|consists of|B(Amazon EFS volume);
    A---|contains|C[UserProfiles];
    A---|contains|D[Shared spaces];
    A---|contains|E[Apps];

    B---|used for|F(Sharing files);

    C---|represents|G[Single user];
    C---|has access to|D;

    D---|contains|H[JupyterServer app];
    D---|contains|F;
    
    E---|can be|I[JupyterServer];
    E---|can be|J[KernelGateway];
    E---|can be|K[RStudioServerPro];
    E---|can be|L[RSession];
    
    M["SageMaker updates <br/>apps every <br/> 90 days"]---|causes|N[App status <br/> to Pending];
    N---|changes back to|O[InService];
    
</mermaid>

This Mermaid diagram summarizes the key concepts from the text - Domain, UserProfiles, Shared spaces, and Apps. It shows their relationships and some of their key properties. The diagram uses different shapes and colors to distinguish between the different concepts. It also captures
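The request cell above hard-codes its inference parameters. To make them easier to vary, one option is a small builder for the JSON body. The function name `build_claude_body` is hypothetical; the keys and default values are copied from that cell.

```python
import json
from typing import List, Optional


def build_claude_body(
    prompt: str,
    max_tokens_to_sample: int = 500,
    temperature: float = 0.9,
    top_k: int = 250,
    top_p: float = 1.0,
    stop_sequences: Optional[List[str]] = None,
) -> str:
    """Serialize the request body using the same keys as the cell above."""
    if max_tokens_to_sample <= 0:
        raise ValueError("max_tokens_to_sample must be positive")
    return json.dumps(
        {
            "prompt": prompt,
            "max_tokens_to_sample": max_tokens_to_sample,
            "temperature": temperature,
            "top_k": top_k,
            "top_p": top_p,
            # Default to the stop sequence used in the cell above.
            "stop_sequences": stop_sequences or ["\n\nHuman:"],
        }
    )


example_body = build_claude_body("\n\nHuman: Hello\n\nAssistant:", temperature=0.2)
print(example_body)
```

A lower temperature, as in the example call, may give more consistent Mermaid syntax across runs, though that is worth checking empirically.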

In [17]:
import base64
from IPython.display import Image, display
import matplotlib.pyplot as plt

# Parse the completion: return the text between two markers, or "" if either is missing
def find_between(s, first, last):
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

# display graph
def mm(graph):
    graphbytes = graph.encode("utf8")
    base64_bytes = base64.b64encode(graphbytes)
    base64_string = base64_bytes.decode("ascii")
    display(Image(url="https://mermaid.ink/img/" + base64_string))

    
def display_graph(llm_completion):
    str_mermaid_graph = find_between(llm_completion, "<mermaid>", "</mermaid>")
    print(str_mermaid_graph)
    mm(str_mermaid_graph)

    
display_graph(response_body.get("completion"))


graph TD;
    A[Domain]---|consists of|B(Amazon EFS volume);
    A---|contains|C[UserProfiles];
    A---|contains|D[Shared spaces];
    A---|contains|E[Apps];

    B---|used for|F(Sharing files);

    C---|represents|G[Single user];
    C---|has access to|D;

    D---|contains|H[JupyterServer app];
    D---|contains|F;
    
    E---|can be|I[JupyterServer];
    E---|can be|J[KernelGateway];
    E---|can be|K[RStudioServerPro];
    E---|can be|L[RSession];
    
    M["SageMaker updates <br/>apps every <br/> 90 days"]---|causes|N[App status <br/> to Pending];
    N---|changes back to|O[InService];
    

