# Amazon Bedrock boto3 Setup

> *This notebook should work well with the **`Data Science 3.0`** kernel in SageMaker Studio*

---

In this demo notebook, we demonstrate how to use the [`boto3` Python SDK](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) to work with [Amazon Bedrock](https://aws.amazon.com/bedrock/) Foundation Models.

---

## Prerequisites

Run the cells in this section to install the packages needed by the notebooks in this workshop. ⚠️ You will see pip dependency errors; you can safely ignore them. ⚠️

IGNORE ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.

In [2]:
%pip install --no-build-isolation --force-reinstall \
    "boto3>=1.28.57" \
    "awscli>=1.29.57" \
    "botocore>=1.31.57"


Collecting boto3>=1.28.57
  Downloading boto3-1.34.4-py3-none-any.whl.metadata (6.6 kB)
Collecting awscli>=1.29.57
  Downloading awscli-1.32.4-py3-none-any.whl.metadata (11 kB)
Collecting botocore>=1.31.57
  Downloading botocore-1.34.4-py3-none-any.whl.metadata (5.6 kB)
Collecting jmespath<2.0.0,>=0.7.1 (from boto3>=1.28.57)
  Downloading jmespath-1.0.1-py3-none-any.whl (20 kB)
Collecting s3transfer<0.10.0,>=0.9.0 (from boto3>=1.28.57)
  Downloading s3transfer-0.9.0-py3-none-any.whl.metadata (1.7 kB)
Collecting docutils<0.17,>=0.10 (from awscli>=1.29.57)
  Downloading docutils-0.16-py2.py3-none-any.whl (548 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 548.2/548.2 kB 4.9 MB/s eta 0:00:00
Collecting PyYAML<6.1,>=3.10 (from awscli>=1.29.57)
  Downloading PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting colorama<0.4.5,>=0.2.5 (from awscli>=1.29.57)
  Downloading colorama-0.4.4-py2

In [None]:
%pip install --quiet \
    langchain==0.0.309 \
    matplotlib


In [3]:
# restart kernel
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

---

## Create the boto3 client

Interaction with the Bedrock API is done via the AWS SDK for Python: [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html).

Depending on your environment, you might need to customize the setup when creating your Bedrock service client. To help with this, we've provided a `get_bedrock_client()` utility method that accepts several options. You can find the implementation in [../utils/bedrock.py](../utils/bedrock.py).

#### Use different clients
boto3 provides separate clients for Amazon Bedrock, each covering a different set of actions. [`InvokeModel`](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) and [`InvokeModelWithResponseStream`](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) are supported by the Amazon Bedrock Runtime client, whereas other operations, such as [`ListFoundationModels`](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListFoundationModels.html), are handled by the [Amazon Bedrock client](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations_Amazon_Bedrock.html).

The `get_bedrock_client()` method accepts a `runtime` parameter (default `True`) that selects which client to return: `bedrock-runtime` when `True`, `bedrock` when `False`.
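
As a rough illustration of the split between the two clients, the sketch below maps a `runtime` flag to the boto3 service name. `bedrock_service_name` and `make_client` are hypothetical helpers written for this example, not the implementation in `utils/bedrock.py`; only the service names `bedrock` and `bedrock-runtime` come from boto3 itself.

```python
def bedrock_service_name(runtime: bool = True) -> str:
    """Return the boto3 service name for the runtime or management client."""
    # "bedrock-runtime" serves InvokeModel and InvokeModelWithResponseStream;
    # "bedrock" serves management calls such as ListFoundationModels.
    return "bedrock-runtime" if runtime else "bedrock"


def make_client(runtime: bool = True, region=None):
    """Hypothetical helper: create the matching boto3 client."""
    import boto3  # imported lazily so the name mapping works without boto3

    return boto3.client(bedrock_service_name(runtime), region_name=region)


print(bedrock_service_name(runtime=True))
print(bedrock_service_name(runtime=False))
```

Keeping the name mapping separate from client creation makes it easy to check which endpoint a call will target before any credentials are involved.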

#### Use the default credential chain

If you are running this notebook from [Amazon SageMaker Studio](https://aws.amazon.com/sagemaker/studio/) and your SageMaker Studio [execution role](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) has permissions to access Bedrock, you can run the cells below as-is. This is also the case if you are running these notebooks from a computer whose default AWS credentials have access to Bedrock.

#### Use a different AWS Region

If you're running this notebook from your own computer or a SageMaker notebook in a different AWS Region from where Bedrock is set up, you can un-comment the `os.environ['AWS_DEFAULT_REGION']` line below and specify the region to use.

#### Use a specific profile

If you're running this notebook from your own computer, where you have set up the AWS CLI with multiple profiles, and the profile that has access to Bedrock is not the default one, you can un-comment the `os.environ['AWS_PROFILE']` line below and specify the profile to use.

#### Use a different role

If you or your company has set up a specific, separate [IAM Role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html) to access Bedrock, you can specify it by un-commenting the `os.environ['BEDROCK_ASSUME_ROLE']` line below. Ensure that your current user or role has permission to [assume](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html) that role.
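
The three optional overrides described above can be collected in one place. The following is a minimal sketch, assuming a hypothetical helper named `resolve_bedrock_settings`; it is not the code in `utils/bedrock.py`. `region_name` and `profile_name` are keyword arguments of `boto3.Session`, while `BEDROCK_ASSUME_ROLE` is this workshop's own variable and the role ARN shown is a made-up placeholder.

```python
import os


def resolve_bedrock_settings(env=None):
    """Collect the optional region, profile, and role overrides from a mapping."""
    env = os.environ if env is None else env
    session_kwargs = {}
    # region_name and profile_name are keyword arguments of boto3.Session.
    if env.get("AWS_DEFAULT_REGION"):
        session_kwargs["region_name"] = env["AWS_DEFAULT_REGION"]
    if env.get("AWS_PROFILE"):
        session_kwargs["profile_name"] = env["AWS_PROFILE"]
    # BEDROCK_ASSUME_ROLE is specific to this workshop, not a standard AWS variable.
    return {
        "session_kwargs": session_kwargs,
        "assumed_role": env.get("BEDROCK_ASSUME_ROLE"),
    }


# Placeholder values for illustration only.
settings = resolve_bedrock_settings(
    {
        "AWS_DEFAULT_REGION": "us-east-1",
        "BEDROCK_ASSUME_ROLE": "arn:aws:iam::123456789012:role/example",
    }
)
print(settings)
```

Passing a plain dictionary instead of `os.environ` makes it possible to try different combinations without changing the environment of the running kernel.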

#### A note about `langchain`

The Bedrock classes provided by `langchain` create a Bedrock boto3 client by default. To customize your Bedrock configuration, we recommend explicitly creating the Bedrock client using the method below and passing it to the [`langchain.Bedrock`](https://python.langchain.com/docs/integrations/llms/bedrock) class constructor using `client=boto3_bedrock`.

In [4]:
import json
import os
import sys

import boto3
import botocore

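# Make ../utils importable so that `import bedrock` resolves to ../utils/bedrock.py
sys.path.append(os.path.abspath("../utils"))
import bedrock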


# ---- ⚠️ Un-comment and edit the below lines as needed for your AWS setup ⚠️ ----

# os.environ["AWS_DEFAULT_REGION"] = "<REGION_NAME>"  # E.g. "us-east-1"
# os.environ["AWS_PROFILE"] = "<YOUR_PROFILE>"
# os.environ["BEDROCK_ASSUME_ROLE"] = "<YOUR_ROLE_ARN>"  # E.g. "arn:aws:..."

bedrock_runtime = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", None),
    runtime=True
)



Create new client
  Using region: us-east-1
boto3 Bedrock client successfully created!
bedrock-runtime(https://bedrock-runtime.us-east-1.amazonaws.com)


---

## Common inference parameter definitions

### Randomness and Diversity

Foundation models support the following parameters to control randomness and diversity in the 
response.

**Temperature** – Large language models use probability to construct the words in a sequence. For any 
given next word, there is a probability distribution of options for the next word in the sequence. When 
you set the temperature closer to zero, the model tends to select the higher-probability words. When 
you set the temperature further away from zero, the model may select a lower-probability word.

In technical terms, the temperature modulates the probability density function for the next tokens, 
implementing the temperature sampling technique. This parameter can deepen or flatten the density 
function curve. A lower value results in a steeper curve with more deterministic responses, and a higher 
value results in a flatter curve with more random responses.
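
To make the effect of temperature concrete, here is a small sketch of temperature-scaled softmax in plain Python. The scores are made-up values for three candidate tokens, and `softmax_with_temperature` is an illustrative helper; how a particular Bedrock model applies temperature internally is not specified here.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
sharp = softmax_with_temperature(logits, 0.2)  # low temperature: steeper curve
flat = softmax_with_temperature(logits, 2.0)   # high temperature: flatter curve
print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
```

With the low temperature, almost all of the probability mass goes to the highest-scoring token; with the high temperature, the three probabilities move closer together.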

**Top K** – Temperature defines the probability distribution of potential words, and Top K defines the
cutoff beyond which the model no longer selects words. For example, if K=50, the model selects from the 50
most probable words that could come next in a given sequence. This reduces the probability that an unusual
word gets selected next in a sequence.

In technical terms, Top K is the number of highest-probability vocabulary tokens to keep for
Top-K filtering. This limits the distribution of probable tokens, so the model chooses one of the
highest-probability tokens.
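
A minimal sketch of Top-K filtering, assuming a toy distribution over three candidate words with made-up probabilities. `top_k_filter` is an illustrative helper, not a Bedrock API.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}


# Made-up probabilities for illustration.
probs = {"horses": 0.6, "zebras": 0.3, "unicorns": 0.1}
filtered = top_k_filter(probs, k=2)
print(filtered)
```

After filtering with `k=2`, the least likely word can no longer be sampled, and the remaining probabilities are rescaled so that they sum to 1.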

**Top P** – Top P defines a cutoff based on the sum of probabilities of the potential choices. If you set Top
P below 1.0, the model considers the most probable options and ignores less probable ones. Top P is
similar to Top K, but instead of capping the number of choices, it caps choices based on the sum of their
probabilities.

For the example prompt "I hear the hoof beats of," you may want the model to provide "horses,"
"zebras," or "unicorns" as the next word. If you set the temperature to its maximum, without capping
Top K or Top P, you increase the probability of getting unusual results such as "unicorns." If you set the
temperature to 0, you increase the probability of "horses." If you set a high temperature but cap Top K or
Top P at a low value, you increase the probability of "horses" or "zebras," and decrease the probability
of "unicorns."
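
The cumulative cutoff used by Top P can be sketched in a few lines. The probabilities below are made up for the hoof-beats example, and `top_p_filter` is an illustrative helper rather than part of any SDK.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break  # enough probability mass collected, drop the rest
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


# Made-up probabilities for illustration.
probs = {"horses": 0.6, "zebras": 0.3, "unicorns": 0.1}
print(top_p_filter(probs, 0.8))
print(top_p_filter(probs, 1.0))
```

With `p=0.8`, only the two most likely words survive; with `p=1.0`, nothing is cut and all three remain candidates.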

### Length

The following parameters control the length of the generated response.

**Response length** – Configures the minimum and maximum number of tokens to use in the generated 
response.

**Length penalty** – Length penalty optimizes the model to be more concise in its output by penalizing
longer responses. Length penalty differs from response length in that response length is a hard cutoff for
the minimum or maximum response length.

In technical terms, the length penalty penalizes the model exponentially for lengthy responses. 0.0 
means no penalty. Set a value less than 0.0 for the model to generate longer sequences, or set a value 
greater than 0.0 for the model to produce shorter sequences.

### Repetitions

The following parameters help control repetition in the generated response.

**Repetition penalty (presence penalty)** – Prevents repetitions of the same words (tokens) in responses. 
1.0 means no penalty. Greater than 1.0 decreases repetition.
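
One common way to implement a repetition penalty is to lower the scores of tokens that have already been generated. The sketch below follows that formulation; whether a given Bedrock model uses exactly this rule is an assumption, and `apply_repetition_penalty` and the scores are made up for illustration.

```python
def apply_repetition_penalty(logits, generated_tokens, penalty):
    """Lower the scores of tokens that already appeared in the output."""
    adjusted = dict(logits)
    for token in set(generated_tokens):
        if token not in adjusted:
            continue
        score = adjusted[token]
        # Dividing positive scores and multiplying negative ones both push
        # the score down when penalty > 1.0; penalty == 1.0 changes nothing.
        adjusted[token] = score / penalty if score > 0 else score * penalty
    return adjusted


logits = {"the": 2.0, "cat": 1.0, "sat": -0.5}  # made-up scores
penalized = apply_repetition_penalty(logits, ["the", "sat"], penalty=1.5)
print(penalized)
```

Tokens that have not appeared yet keep their original score, so they become relatively more likely than the repeated ones.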

In [5]:
from urllib.request import urlopen
from bs4 import BeautifulSoup

def get_html_text(url, postprocess=False, print_text=False):
    # return the text from an html page

    html = urlopen(url).read()
    soup = BeautifulSoup(html, features="html.parser")

    # kill all script and style elements
    for script in soup(["script", "style"]):
        script.extract()    # rip it out

    # get text
    text = soup.get_text()

    if postprocess is True:
        # break into lines and remove leading and trailing space on each
        lines = (line.strip() for line in text.splitlines())
        # break multi-headlines into a line each
        chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
        # drop blank lines
        text = '\n'.join(chunk for chunk in chunks if chunk)

    if print_text is True:
        print(text)
    
    return text
    

In [8]:
# import Mermaid notation as a context

context_mermaid_notation = get_html_text(
    url="https://mermaid.js.org/syntax/flowchart.html", 
    postprocess=False, 
    print_text=False
)


In [13]:

html_text = get_html_text(
    url="https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-vpc.html", 
    postprocess=False, 
    print_text=False
)


In [14]:
# prompt data
kind = "flowchart"  # "mindmap" or "flowchart"
orientation = "LR"   # "LR" or "TD"

In [30]:
prompt = f"""\n\nHuman: 

Here is a text for you to reference for the following task:
<text>
{html_text}
</text>

Task: Summarize the given text and provide the summary inside <summary> tags. 
Then convert the summary to a {kind} using Mermaid notation. 

<mermaid_notation>
{context_mermaid_notation}
</mermaid_notation>

The {kind} should capture the main gist of the summary, without too many low-level details. 
Someone who would only view the Mermaid {kind}, should understand the gist of the summary. 
The Mermaid {kind} should follow all the correct notation rules and should compile without any errors.
Use the following specifications for the generated Mermaid {kind}:

<specifications>
1. Use different colors, shapes or groups to represent different concepts in the given text.
2. The orientation of the Mermaid {kind} should be {orientation}.
3. Any text inside parentheses should be inside quotes "".
4. Include the Mermaid {kind} inside <mermaid> tags.
5. Do not write anything after the </mermaid> tag.
6. Use only information from within the given text. Don't make up new information.
</specifications>

\n\nAssistant:
"""

print(prompt)



Human: 

Here is a text for you to reference for the following task:
<text>

Choose an Amazon VPC - Amazon SageMakerChoose an Amazon VPC - Amazon SageMakerAWSDocumentationAmazon SageMakerDeveloper GuideChoose an Amazon VPCThis topic provides detailed information about choosing an Amazon Virtual Private Cloud (Amazon VPC) when you
      onboard to Amazon SageMaker Domain. For more information about onboarding to SageMaker Domain, see Amazon SageMaker Domain overview.By default, SageMaker Domain uses two Amazon VPCs. One Amazon VPC is managed by Amazon SageMaker and provides
      direct internet access. You specify the other Amazon VPC, which provides encrypted traffic between
      the Domain and your Amazon Elastic File System (Amazon EFS) volume.You can change this behavior so that SageMaker sends all traffic over your specified Amazon VPC.
      When you choose this option, you must provide the subnets, security groups, and interface
      endpoints that are necessary to communica

In [31]:
body = json.dumps(
    {
        "prompt": prompt, 
        "max_tokens_to_sample": 500,
        "temperature": 0.9,
        "top_k": 250,
        "top_p": 1,
        "stop_sequences": ["\n\nHuman:"]
    }
)
modelId = "anthropic.claude-v2:1"  # change this to use a different version from the model provider
accept = "application/json"
contentType = "application/json"

try:

    response = bedrock_runtime.invoke_model(
        body=body, 
        modelId=modelId, 
        accept=accept, 
        contentType=contentType
    )
    response_body = json.loads(response.get("body").read())

    print(response_body.get("completion"))

except botocore.exceptions.ClientError as error:

    if error.response['Error']['Code'] == 'AccessDeniedException':
        print(
            f"\x1b[41m{error.response['Error']['Message']}\n"
            "To troubleshoot this issue please refer to the following resources.\n"
            "https://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html\n"
            "https://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n"
        )

    else:
        raise error


 <summary>
The text provides information about choosing an Amazon Virtual Private Cloud (VPC) when onboarding to Amazon SageMaker Domain. By default, SageMaker Domain uses two VPCs - one managed by SageMaker for internet access, and another specified by the user for encrypted traffic between Domain and Amazon EFS. 

Users can configure SageMaker to send all traffic through their specified VPC by setting the network access type to "VPC only". This requires providing subnets, security groups, and interface endpoints to communicate with various AWS services.

The onboarding process involves:

1. Selecting the network access type 
2. Choosing the VPC
3. Choosing subnets 
4. Choosing security groups

There are different options presented based on the number of VPC entities the user has in the region.
</summary>

<mermaid>
graph LR
    A["Start"] --> B["Select network access type:"]
    B --> C["Public internet only"]
    B --> D["VPC only"]
    D --> E["Choose VPC"]
    D --> F["Choose subn
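
A response can stop before the closing tag when it reaches `max_tokens_to_sample`. The following is a minimal sketch of an extractor that tolerates a missing closing tag, assuming the tag names requested in the prompt; `extract_tag` is a hypothetical helper and not part of the notebook.

```python
import re


def extract_tag(completion, tag):
    """Return (text, complete) for the first <tag> section in the completion.

    `complete` is False when the closing tag is missing, for example because
    the response was cut off.
    """
    match = re.search(rf"<{tag}>(.*?)(</{tag}>|\Z)", completion, flags=re.DOTALL)
    if match is None:
        return "", False
    return match.group(1).strip(), match.group(2) != ""


# Made-up completion whose <mermaid> section is cut off.
sample = "<summary>Two VPCs by default.</summary>\n<mermaid>\ngraph LR\n    A --> B"
print(extract_tag(sample, "summary"))
print(extract_tag(sample, "mermaid"))
```

Returning the `complete` flag alongside the text lets the caller decide whether to render a partial diagram or to retry with a larger token limit.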

In [32]:
import base64
from IPython.display import Image, display
import matplotlib.pyplot as plt

# parsing completion
def find_between(s, first, last):
    """Return the text between `first` and the next `last`, or "" if either is missing."""
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

# display graph
def mm(graph):
    graphbytes = graph.encode("utf8")
    # URL-safe Base64 avoids "+" and "/" characters, which are not safe in a URL path
    base64_bytes = base64.urlsafe_b64encode(graphbytes)
    base64_string = base64_bytes.decode("ascii")
    display(Image(url="https://mermaid.ink/img/" + base64_string))

    
def display_graph(llm_completion):
    str_mermaid_graph = find_between(llm_completion, "<mermaid>", "</mermaid>")
    print(str_mermaid_graph)
    mm(str_mermaid_graph)

    
display_graph(response_body.get("completion"))


graph LR
    A["Start"] --> B["Select network access type:"]
    B --> C["Public internet only"]
    B --> D["VPC only"]
    D --> E["Choose VPC"]
    D --> F["Choose subnets"]
    D --> G["Choose security groups"]
    E --> H["Onboarding process complete"]
    F --> H
    G --> H
    H --> I["End"]
    
    classDef grey fill:#dddddd,stroke:#ffffff,stroke-width:2px,color:#000000
    class A,I grey
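
Before rendering, a cheap local check can confirm that the generated diagram at least declares the requested orientation. This is only a sketch of a sanity check on the first line, not a Mermaid parser; `check_flowchart_header` is a hypothetical helper.

```python
def check_flowchart_header(graph, orientation="LR"):
    """Return True if the first non-empty line declares a flowchart with the given orientation."""
    lines = [line.strip() for line in graph.splitlines() if line.strip()]
    if not lines:
        return False
    parts = lines[0].split()
    # Mermaid accepts both "graph" and "flowchart" as the flowchart keyword.
    return (
        len(parts) == 2
        and parts[0] in ("graph", "flowchart")
        and parts[1] == orientation
    )


sample_graph = '\ngraph LR\n    A["Start"] --> B["End"]\n'
print(check_flowchart_header(sample_graph, orientation="LR"))
print(check_flowchart_header(sample_graph, orientation="TD"))
```

A check like this catches an empty or mislabeled diagram early, but it says nothing about whether the rest of the notation compiles.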



In [36]:
def generate_diagram(
    url,
    kind="flowchart", # "mindmap" or "flowchart"
    orientation="LR", # "LR" or "TD"
    mermaid_context=True,
    max_tokens_to_sample=500,
    temperature=0.9,
    top_k=250,
    top_p=1
):
    
    html_text = get_html_text(
        url=url, 
        postprocess=False, 
        print_text=False
    )
    
    if mermaid_context is True:
        context_mermaid_notation = get_html_text(
            url="https://mermaid.js.org/syntax/flowchart.html", 
            postprocess=False, 
            print_text=False
        )
    else:
        context_mermaid_notation = ""
    
    
    prompt = f"""\n\nHuman: 
    Here is a text for you to reference for the following task:
    <text>
    {html_text}
    </text>

    Task: Summarize the given text and provide the summary inside <summary> tags. 
    Then convert the summary to a {kind} using Mermaid notation. 

    <mermaid_notation>
    {context_mermaid_notation}
    </mermaid_notation>

    The {kind} should capture the main gist of the summary, without too many low-level details. 
    Someone who would only view the Mermaid {kind}, should understand the gist of the summary. 
    The Mermaid {kind} should follow all the correct notation rules and should compile without any errors.
    Use the following specifications for the generated Mermaid {kind}:

    <specifications>
    1. Use different colors, shapes or groups to represent different concepts in the given text.
    2. The orientation of the Mermaid {kind} should be {orientation}.
    3. Any text inside parentheses should be inside quotes "".
    4. Include the Mermaid {kind} inside <mermaid> tags.
    5. Do not write anything after the </mermaid> tag.
    6. Use only information from within the given text. Don't make up new information.
    </specifications>

    \n\nAssistant:
    """
        
    body = json.dumps(
        {
            "prompt": prompt, 
            "max_tokens_to_sample": max_tokens_to_sample,
            "temperature": temperature,
            "top_k": top_k,
            "top_p": top_p,
            "stop_sequences": ["\n\nHuman:"]
        }
    )
    modelId = "anthropic.claude-v2:1"  # change this to use a different version from the model provider
    accept = "application/json"
    contentType = "application/json"

    response = bedrock_runtime.invoke_model(
        body=body,
        modelId=modelId,
        accept=accept,
        contentType=contentType
    )
    response_body = json.loads(response.get("body").read())

    display_graph(response_body.get("completion"))

In [None]:
generate_diagram(
    url="https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-vpc.html",
    kind="flowchart", # "mindmap" or "flowchart"
    orientation="LR", # "LR" or "TD"
    mermaid_context=True,
    max_tokens_to_sample=500,
    temperature=0.9,
    top_k=250,
    top_p=1
)

---

## Second stage: check and correct the generated flowchart

In [26]:
str_mermaid_graph = find_between(response_body.get("completion"), "<mermaid>", "</mermaid>")

In [28]:

prompt2 = f"""\n\nHuman: 

Here is a text for you to reference for the following task:
<text>
{html_text}
</text>

Here is a Mermaid flowchart summarizing the contents of the given text.
<mermaid_flowchart>
{str_mermaid_graph}
</mermaid_flowchart>

Task: Check whether there are any errors on the Mermaid flowchart and correct them.
Include the updated Mermaid flowchart inside <mermaid> tags.

\n\nAssistant:
"""

print(prompt2)



Human: 

Here is a text for you to reference for the following task:
<text>

Choose an Amazon VPC - Amazon SageMakerChoose an Amazon VPC - Amazon SageMakerAWSDocumentationAmazon SageMakerDeveloper GuideChoose an Amazon VPCThis topic provides detailed information about choosing an Amazon Virtual Private Cloud (Amazon VPC) when you
      onboard to Amazon SageMaker Domain. For more information about onboarding to SageMaker Domain, see Amazon SageMaker Domain overview.By default, SageMaker Domain uses two Amazon VPCs. One Amazon VPC is managed by Amazon SageMaker and provides
      direct internet access. You specify the other Amazon VPC, which provides encrypted traffic between
      the Domain and your Amazon Elastic File System (Amazon EFS) volume.You can change this behavior so that SageMaker sends all traffic over your specified Amazon VPC.
      When you choose this option, you must provide the subnets, security groups, and interface
      endpoints that are necessary to communica

In [29]:
body = json.dumps(
    {
        "prompt": prompt2,  # send the second-stage prompt, not the original one
        "max_tokens_to_sample": 500,
        "temperature": 0.9,
        "top_k": 250,
        "top_p": 1,
        "stop_sequences": ["\n\nHuman:"]
    }
)
modelId = "anthropic.claude-v2:1"  # change this to use a different version from the model provider
accept = "application/json"
contentType = "application/json"

try:

    response = bedrock_runtime.invoke_model(
        body=body, 
        modelId=modelId, 
        accept=accept, 
        contentType=contentType
    )
    response_body = json.loads(response.get("body").read())

    print(response_body.get("completion"))

except botocore.exceptions.ClientError as error:

    if error.response['Error']['Code'] == 'AccessDeniedException':
        print(
            f"\x1b[41m{error.response['Error']['Message']}\n"
            "To troubleshoot this issue please refer to the following resources.\n"
            "https://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_access-denied.html\n"
            "https://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html\x1b[0m\n"
        )

    else:
        raise error


 <summary>
The text provides information about choosing an Amazon Virtual Private Cloud (VPC) when onboarding to Amazon SageMaker Domain. By default, SageMaker Domain uses two VPCs - one managed by SageMaker for internet access, and another specified by the user for encrypted traffic between Domain and Amazon EFS. 

Users can configure SageMaker to send all traffic through their specified VPC by setting the network access type to "VPC only". This requires providing subnets, security groups, and interface endpoints to communicate with various AWS services.

When specifying the VPC entities, users are presented with options based on number of existing entities:
- Use existing entity if there is 1
- Choose from list if there are multiple
- Create new entities if there are none

Users also need to choose the VPC, subnets, and security groups. Subnets spanning multiple availability zones are recommended.
</summary>

<mermaid>
graph LR
    A(Onboard to SageMaker<br/>Domain) --> B{Choose netw

In [None]:
display_graph(response_body.get("completion"))