# Import Libraries & Setup

In this notebook, we walk through examples of using the Amazon Bedrock and Bedrock Runtime APIs with foundation models hosted on Amazon SageMaker. We cover the following:

1. [Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html)
2. [Invoke Model API](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-invoke.html)
3. [Retrieve and Generate API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html)
4. [Retrieve API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html)
5. [Guard content with Amazon Bedrock Guardrails](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html)

In [None]:
%pip install --force-reinstall -q -r ./requirements.txt

In [None]:
# restart kernel
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [None]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
from knowledge_base import KnowledgeBasesForAmazonBedrock

import boto3
import os
import time
import json
import logging


In [None]:
iam_client = boto3.client("iam")
s3_client = boto3.client("s3")
sts_client = boto3.client('sts')

session = boto3.session.Session()
region = session.region_name
account_id = sts_client.get_caller_identity()["Account"]

bedrock = boto3.client("bedrock")
bedrock_runtime = boto3.client("bedrock-runtime")
bedrock_agent_client = boto3.client("bedrock-agent")
bedrock_agent_runtime_client = boto3.client("bedrock-agent-runtime")

In [None]:
logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

In [None]:
logger.info(f"{region}, {account_id}")

## Setup - Create and Register Amazon SageMaker Endpoint with Amazon Bedrock

### Pre-requisites

This notebook requires permissions to:

- Create and delete Amazon IAM roles
- Create, update and delete Amazon S3 buckets
- Access Amazon Bedrock
- Access Amazon OpenSearch Serverless

If running on SageMaker Studio, you should add the following managed policies to your role:

- IAMFullAccess
- AWSLambda_FullAccess
- AmazonS3FullAccess
- AmazonBedrockFullAccess
- Custom policy for Amazon OpenSearch Serverless such as:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "aoss:*",
            "Resource": "*"
        }
    ]
}
```

<div class="alert alert-block alert-info">
Make sure to enable access to the Titan Text Embeddings V2 model in the Amazon Bedrock console, because this notebook uses it to generate embeddings.

Follow instructions [here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html). 
</div>

### Deploy a model via SageMaker JumpStart and registering it with Amazon Bedrock

#### Step 1

Customers can log into the AWS Console, and navigate to the Amazon SageMaker service page via the search bar or the recently visited tab. On the SageMaker service page, select Studio from the navigation panel on the left. If you have not set up a SageMaker Domain to access Studio, please follow the steps outlined here. Once you have created a domain and a user, click on Open Studio. 

![step1](images/step1.PNG)

#### Step 2

In SageMaker Studio, navigate to the JumpStart tab from the navigation panel on the left. Here, you will see a list of all the model providers that offer pre-trained foundation models. Certain model provider cards have a “Bedrock Ready” tag, indicating that they offer models that can be registered with Bedrock after they are deployed to an endpoint via SageMaker JumpStart. Click on a model provider card to learn more.

![step2](images/step2.PNG)

#### Step 3

You can filter the list of models to see which ones are supported by Bedrock. To filter, check the “Bedrock Ready” option under the Action tab. Search for Gemma 2 27B Instruct in the Search Models bar and click on the model card.

![step3](images/step3.PNG)

#### Step 4

You can view the model details after clicking on the model card. To deploy the model, click the Deploy button at the top right of the page. On the next page, review the End User License Agreement and check the box. Leave the endpoint settings at their default values and click Deploy. For additional details on the model deployment process with SageMaker JumpStart, refer to this [link](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-use-studio-updated-deploy.html).


![step4](images/step4.PNG)

#### Step 5

Wait for a few minutes for the model to be successfully deployed to an Endpoint. Once the model is deployed, navigate to the Endpoint tab under Deployments from the navigation panel on the left. Click on the endpoint to see more details. 


![step5](images/step5.PNG)

#### Step 6

On the details page for the endpoint, you will see a “Use with Bedrock” button at the top right of the page. Click that button.

![step6](images/step6.PNG)

#### Step 7

The “Use with Bedrock” button will redirect you to the Bedrock Service page in the console. It will direct you to register your existing endpoint in SageMaker. It will prefill the Endpoint ARN and Model ARN automatically. Review details and click on Register. 

![step7](images/step7.PNG)

#### Step 8

Once your SageMaker endpoint is registered with Bedrock, you can now invoke it via the Converse API! We can test our newly registered model in the Bedrock console. On the Bedrock service page, click on Models under Bedrock Marketplace in the navigation pane on the left. Click on Self-hosted deployments. 

![step8](images/step8.PNG)

<div class="alert alert-block alert-warning">
Make sure to set endpoint_arn to the ARN of your SageMaker endpoint.


![endpointARN](images/EndpointARN.png)
</div>

In [None]:
endpoint_arn = ""
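
An empty or mistyped `endpoint_arn` only fails later, at the first API call, so a quick local shape check can help. The sketch below is an illustration rather than part of the original notebook: `looks_like_endpoint_arn` is a hypothetical helper, and the pattern reflects the general ARN layout (`arn:<partition>:sagemaker:<region>:<account-id>:endpoint/<name>`), not an official validation rule.

```python
import re

# Assumed general shape of a SageMaker endpoint ARN (not an official validator):
# arn:<partition>:sagemaker:<region>:<12-digit account id>:endpoint/<name>
_ENDPOINT_ARN_RE = re.compile(
    r"^arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]+:\d{12}:endpoint/[A-Za-z0-9\-]+$"
)


def looks_like_endpoint_arn(arn):
    """Return True if `arn` has the general shape of a SageMaker endpoint ARN."""
    return bool(_ENDPOINT_ARN_RE.match(arn or ""))
```

For example, `assert looks_like_endpoint_arn(endpoint_arn), "Set endpoint_arn first"` stops the notebook early if the value is still empty.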

<div class="alert alert-block alert-info">
Gemma 2 27B Instruct supports the following common payload parameters. You may specify any subset of these parameters when invoking an endpoint.

- do_sample: If True, activates logits sampling. If specified, it must be boolean.
- max_new_tokens: Maximum number of generated tokens. If specified, it must be a positive integer.
- repetition_penalty: A penalty for repetitive generated text. 1.0 means no penalty.
- return_full_text: If True, input text will be part of the output generated text. If specified, it must be boolean. The default value for it is False.
- seed: Random sampling seed.
- temperature: Controls the randomness in the output. Higher temperature results in output sequence with low-probability words and lower temperature results in output sequence with high-probability words. If temperature -> 0, it results in greedy decoding. If specified, it must be a positive float.
- top_k: In each step of text generation, sample from only the top_k most likely words. If specified, it must be a positive integer.
- top_p: In each step of text generation, sample from the smallest possible set of words with cumulative probability top_p. If specified, it must be a float between 0 and 1.
- details: Return generation details, to include output token logprobs and IDs.
</div>

<div class="alert alert-block alert-info">
Gemma 2 27B Instruct does not support system prompts.
</div>
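
The constraints on the payload parameters listed above can be checked locally before a request is sent. The following is a minimal sketch, assuming the list above is complete; `build_parameters` is a hypothetical helper, and treating the `top_p` bounds as inclusive is an assumption, since the text only says "between 0 and 1".

```python
# Parameter names taken from the list of common payload parameters above.
ALLOWED = {
    "do_sample", "max_new_tokens", "repetition_penalty", "return_full_text",
    "seed", "temperature", "top_k", "top_p", "details",
}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def build_parameters(**params):
    """Validate a subset of the payload parameters and return them as a dict."""
    unknown = sorted(set(params) - ALLOWED)
    if unknown:
        raise ValueError(f"Unsupported parameters: {unknown}")
    for name in ("do_sample", "return_full_text"):
        if name in params and not isinstance(params[name], bool):
            raise ValueError(f"{name} must be a boolean")
    for name in ("max_new_tokens", "top_k"):
        if name in params:
            value = params[name]
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer")
    if "temperature" in params:
        if not _is_number(params["temperature"]) or params["temperature"] <= 0:
            raise ValueError("temperature must be a positive number")
    if "top_p" in params:
        # Inclusive bounds are an assumption.
        if not _is_number(params["top_p"]) or not 0 <= params["top_p"] <= 1:
            raise ValueError("top_p must be between 0 and 1")
    return dict(params)
```

A call such as `build_parameters(max_new_tokens=256, temperature=0.1)` returns the dict unchanged, while an unlisted name such as `max_tokens` raises a `ValueError`.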

## Model Inference

### Converse API (AWS CLI) with FMs hosted on Amazon SageMaker

<div class="alert alert-block alert-warning">
Make sure to replace endpoint_arn with your Amazon SageMaker endpoint ARN.
</div>

In [None]:
!aws bedrock-runtime converse \
    --model-id endpoint_arn \
    --messages '[{"role": "user", "content": [{"text": "What is Amazon doing in the field of generative AI?"}]}]'

### Converse API (boto3) with FMs hosted on Amazon SageMaker

In [None]:
# Base inference parameters to use.
inference_config = {
        "maxTokens": 256,
        "temperature": 0.1,
        "topP": 0.999,
}

# Additional inference parameters to use.
additional_model_fields = {"top_k": 250}


response = bedrock_runtime.converse(
    modelId=endpoint_arn,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": "What is Amazon doing in the field of generative AI?",
                },
            ]
        },
    ],
    inferenceConfig=inference_config,
    additionalModelRequestFields=additional_model_fields,
)

In [None]:
response

In [None]:
print(response["output"]["message"]["content"][0]["text"])
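
Besides the generated text, the Converse response also carries the stop reason, token usage, and latency. The sketch below shows one way to pull these into a flat summary; `summarize_converse_response` is a hypothetical helper name, and it uses `.get` so that missing optional fields come back as `None` rather than raising.

```python
def summarize_converse_response(response):
    """Collect the text, stop reason, token usage, and latency from a Converse response."""
    content = response["output"]["message"]["content"]
    usage = response.get("usage", {})
    return {
        # Concatenate all text blocks; non-text blocks contribute nothing.
        "text": "".join(block.get("text", "") for block in content),
        "stop_reason": response.get("stopReason"),
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
        "latency_ms": response.get("metrics", {}).get("latencyMs"),
    }
```

Calling `summarize_converse_response(response)` on the response above gives a compact view that is easier to log than the full dictionary.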

### Invoke Model API (boto3) with FMs hosted on Amazon SageMaker

In [None]:
# Request body with the prompt and the model's native payload parameters
request_body = {
    "inputs": "What is Amazon doing in the field of generative AI?",
    "parameters": {
        "max_new_tokens": 256,
        "temperature": 0.1,
        "top_p": 0.999,
        "top_k": 250,
        "return_full_text": True,
        "details": True,
        "repetition_penalty": 0.9
    }
}

response = bedrock_runtime.invoke_model(
    modelId=endpoint_arn,
#     contentType='application/json',
#     accept='application/json',
    body=json.dumps(request_body)
)

# Parse the response
response_body = json.loads(response['body'].read())

In [None]:
response_body
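
The exact shape of `response_body` depends on the serving container behind the endpoint, so inspect it before relying on any field. As a sketch only, assuming a text-generation style body (either a single object or a list holding one object, with a `generated_text` field, which this notebook does not confirm), the text could be extracted like this; `extract_generated_text` is a hypothetical helper.

```python
def extract_generated_text(response_body):
    """Return the generated text from an assumed text-generation style response body."""
    # Some containers wrap the result in a one-element list (assumption).
    if isinstance(response_body, list):
        response_body = response_body[0] if response_body else {}
    # Returns None when the assumed field is absent.
    return response_body.get("generated_text")
```

If this returns `None` for your endpoint, the container uses a different schema and the raw `response_body` shown above is the reference.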

## Retrieve data and generate AI responses with Amazon Bedrock Knowledge Bases

We will now create a Knowledge Base for Amazon Bedrock and its requirements, including:

- [Amazon OpenSearch Serverless](https://aws.amazon.com/opensearch-service/features/serverless/) for the vector database
- [AWS IAM](https://aws.amazon.com/iam/) roles and permissions
- [Amazon S3](https://aws.amazon.com/s3/) bucket to store the knowledge base documents

To create the knowledge base and its dependencies, we will use the KnowledgeBasesForAmazonBedrock support class, available in this folder. It allows you to create a new knowledge base, ingest documents into the knowledge base data source, and delete the resources after you are done working with this lab.

Note that creation of the Amazon OpenSearch Serverless collection can take several minutes. You can use the Amazon OpenSearch Serverless console to monitor creation progress.

![data-ingestion](images/data_ingestion.png)

For more details on how to set up an Amazon Bedrock Knowledge Base, check out the following resources:

1. [Workshop](https://github.com/aws-samples/amazon-bedrock-workshop/tree/main/02_KnowledgeBases_and_RAG)
   
2. [User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html)

In [None]:
suffix = f"{region}-{account_id}"
knowledge_base_name = 'sample-kb'
knowledge_base_description = "Knowledge Base containing Amazon's Letters to Shareholders"
bucket_name = f'{knowledge_base_name}-{suffix}'

Steps:
- Create the Amazon Bedrock Knowledge Base execution role with the policies needed to read data from S3 and write embeddings into Amazon OpenSearch Serverless (AOSS).
- Create an empty OpenSearch Serverless index.
- Download documents.
- Create the Amazon Bedrock knowledge base.
- Create a data source within the knowledge base that connects to Amazon S3.
- Start an ingestion job using the KB APIs, which reads data from S3, chunks it, converts the chunks into embeddings using the Amazon Titan Embeddings model, and stores the embeddings in AOSS, all without having to build, deploy, and manage a data pipeline.

In [None]:
kb = KnowledgeBasesForAmazonBedrock()
kb_id, ds_id = kb.create_or_retrieve_knowledge_base(knowledge_base_name, knowledge_base_description, bucket_name)

In the example RAG workflow we will upload the da

In [None]:
# Download and prepare dataset
!mkdir -p ./kb_documents

from urllib.request import urlretrieve
urls = [
    'https://s2.q4cdn.com/299287126/files/doc_financials/2023/ar/2022-Shareholder-Letter.pdf',
    'https://s2.q4cdn.com/299287126/files/doc_financials/2022/ar/2021-Shareholder-Letter.pdf',
    'https://s2.q4cdn.com/299287126/files/doc_financials/2021/ar/Amazon-2020-Shareholder-Letter-and-1997-Shareholder-Letter.pdf',
    'https://s2.q4cdn.com/299287126/files/doc_financials/2020/ar/2019-Shareholder-Letter.pdf'
]

filenames = [
    'AMZN-2022-Shareholder-Letter.pdf',
    'AMZN-2021-Shareholder-Letter.pdf',
    'AMZN-2020-Shareholder-Letter.pdf',
    'AMZN-2019-Shareholder-Letter.pdf'
]

data_root = "./kb_documents/"

for idx, url in enumerate(urls):
    file_path = data_root + filenames[idx]
    urlretrieve(url, file_path)

We now upload the knowledge base documents to S3.

In [None]:
def upload_directory(path, bucket_name):
    # Walk the directory tree and upload every file, using the bare file name as the S3 key.
    for root, dirs, files in os.walk(path):
        for file in files:
            file_to_upload = os.path.join(root, file)
            print(f"uploading file {file_to_upload} to {bucket_name}")
            s3_client.upload_file(file_to_upload, bucket_name, file)

upload_directory("kb_documents", bucket_name)

Then we ingest the documents into the knowledge base.

In [None]:
# ensure that the kb is available
i_status = ['CREATING', 'DELETING', 'UPDATING']
while bedrock_agent_client.get_knowledge_base(knowledgeBaseId=kb_id)['knowledgeBase']['status'] in i_status:
    time.sleep(10)

# sync knowledge base
kb.synchronize_data(kb_id, ds_id)
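
The internals of `kb.synchronize_data` are not shown in this notebook. As a rough sketch of what a sync involves, the `bedrock-agent` client exposes `start_ingestion_job` and `get_ingestion_job`, which can be combined into a polling loop. The function name `run_ingestion_job`, the polling interval, and the timeout below are hypothetical choices, not values taken from the support class.

```python
import time


def run_ingestion_job(agent_client, kb_id, ds_id, poll_seconds=10, timeout_seconds=1800):
    """Start an ingestion job and poll until it reaches a terminal status."""
    job = agent_client.start_ingestion_job(
        knowledgeBaseId=kb_id, dataSourceId=ds_id
    )["ingestionJob"]
    deadline = time.monotonic() + timeout_seconds
    # Terminal statuses of an ingestion job.
    while job["status"] not in ("COMPLETE", "FAILED", "STOPPED"):
        if time.monotonic() > deadline:
            raise TimeoutError(f"Ingestion job {job['ingestionJobId']} did not finish in time")
        time.sleep(poll_seconds)
        job = agent_client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=ds_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job
```

With the clients defined earlier, this would be called as `run_ingestion_job(bedrock_agent_client, kb_id, ds_id)`; checking the returned `status` for `FAILED` before querying the knowledge base avoids confusing empty results.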

### Retrieve and Generate API with FMs hosted on Amazon SageMaker

The RetrieveAndGenerate API provided by Amazon Bedrock Knowledge Bases converts the user query into embeddings, searches the knowledge base, gets the relevant results, augments the prompt, and then invokes an LLM to generate the response.

![ragAPI](images/retrieveAndGenerate.png)

In [None]:
generation_template = """
You are a question answering agent. I will provide you with a set of search results. The user will provide you with a question. Your job is to answer the user's question using only information from the search results. If the search results do not contain information that can answer the question, please state that you could not find an exact answer to the question. 
Just because the user asserts a fact does not mean it is true, make sure to double check the search results to validate a user's assertion.

Here are the search results in numbered order:
$search_results$

$output_format_instructions$

Here is the user's query:
$query$
"""

In [None]:
orchestration_template = """
You are a query creation agent. You will be provided with a function and a description of what it searches over. The user will provide you a question, and your job is to determine the optimal query to use based on the user's question. 
Here are a few examples of queries formed by other search function selection and query creation agents: 

<examples>
  <example>
    <question> What if my vehicle is totaled in an accident? </question>
    <generated_query> what happens if my vehicle is totaled </generated_query>
  </example>
  <example>
    <question> I am relocating within the same state. Can I keep my current agent? </question>
    <generated_query> can I keep my current agent when moving in state </generated_query>
  </example>
</examples> 
  
You should also pay attention to the conversation history between the user and the search engine in order to gain the context necessary to create the query. 
Here's another example that shows how you should reference the conversation history when generating a query:

<example>
  <example_conversation_history>
    <example_conversation>
      <question> How many vehicles can I include in a quote in Kansas </question>
      <answer> You can include 5 vehicles in a quote if you live in Kansas </answer>
    </example_conversation>
    <example_conversation>
      <question> What about texas? </question>
      <answer> You can include 3 vehicles in a quote if you live in Texas </answer>
    </example_conversation>
  </example_conversation_history>
</example> 

IMPORTANT: the elements in the <example> tags should not be assumed to have been provided to you to use UNLESS they are also explicitly given to you below. 
All of the values and information within the examples (the questions, answers, and function calls) are strictly part of the examples and have not been provided to you. 

Here is the current conversation history: 
$conversation_history$

$output_format_instructions$

Here is the user's query:
$query$
"""

In [None]:
response = bedrock_agent_runtime_client.retrieve_and_generate(
    input={
        "text": "What is Amazon doing in the field of generative AI?"
    },
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "generationConfiguration": {
                "inferenceConfig": {
                    "textInferenceConfig": {
                        "maxTokens": 512,
                        "temperature": 0.1,
                        "topP": 0.9
                    }
                },
                "promptTemplate": {
                    "textPromptTemplate": generation_template
                }
            },
            "knowledgeBaseId": kb_id,
            "orchestrationConfiguration": {
                "inferenceConfig": {
                    "textInferenceConfig": {
                        "maxTokens": 512,
                        "temperature": 0.1,
                        "topP": 0.9
                    }
                },
                "promptTemplate": {
                    "textPromptTemplate": orchestration_template
                },
            },
            "modelArn": endpoint_arn,
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults":5
                } 
            }
        }
    }
)

print(response['output']['text'],end='\n'*2)

In [None]:
response
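
The RetrieveAndGenerate response also lists the retrieved chunks that back the answer under `citations`. The sketch below collects the distinct S3 URIs of those chunks; `cited_sources` is a hypothetical helper, and it assumes the data source is Amazon S3, as it is in this notebook.

```python
def cited_sources(response):
    """Return the distinct S3 URIs referenced in a RetrieveAndGenerate response."""
    uris = []
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            # Only S3-backed references carry an s3Location entry.
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in uris:
                uris.append(uri)
    return uris
```

Calling `cited_sources(response)` after the request above shows which shareholder letters the answer drew on.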

## Stop harmful content in models using Amazon Bedrock Guardrails

Guardrails for Amazon Bedrock have multiple components, including Content Filters, Denied Topics, Word and Phrase Filters, and Sensitive Information (PII & Regex) Filters. For the full list, see the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-create.html).



In [None]:
create_response = bedrock.create_guardrail(
    name='fiduciary-advice',
    description='Prevents our model from providing fiduciary advice.',
    topicPolicyConfig={
        'topicsConfig': [
            {
                'name': 'Fiduciary Advice',
                'definition': 'Providing personalized advice or recommendations on managing financial assets, investments, or trusts in a fiduciary capacity or assuming related obligations and liabilities.',
                'examples': [
                    'What stocks should I invest in for my retirement?',
                    'Is it a good idea to put my money in a mutual fund?',
                    'How should I allocate my 401(k) investments?',
                    'What type of trust fund should I set up for my children?',
                    'Should I hire a financial advisor to manage my investments?'
                ],
                'type': 'DENY'
            }
        ]
    },
    contentPolicyConfig={
        'filtersConfig': [
            {
                'type': 'SEXUAL',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'VIOLENCE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'HATE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'INSULTS',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'MISCONDUCT',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'PROMPT_ATTACK',
                'inputStrength': 'HIGH',
                'outputStrength': 'NONE'
            }
        ]
    },
    wordPolicyConfig={
        'wordsConfig': [
            {'text': 'fiduciary advice'},
            {'text': 'investment recommendations'},
            {'text': 'stock picks'},
            {'text': 'financial planning guidance'},
            {'text': 'portfolio allocation advice'},
            {'text': 'retirement fund suggestions'},
            {'text': 'wealth management tips'},
            {'text': 'trust fund setup'},
            {'text': 'investment strategy'},
            {'text': 'financial advisor recommendations'}
        ],
        'managedWordListsConfig': [
            {'type': 'PROFANITY'}
        ]
    },
    sensitiveInformationPolicyConfig={
        'piiEntitiesConfig': [
            {'type': 'EMAIL', 'action': 'ANONYMIZE'},
            {'type': 'PHONE', 'action': 'ANONYMIZE'},
            {'type': 'NAME', 'action': 'ANONYMIZE'},
            {'type': 'US_SOCIAL_SECURITY_NUMBER', 'action': 'BLOCK'},
            {'type': 'US_BANK_ACCOUNT_NUMBER', 'action': 'BLOCK'},
            {'type': 'CREDIT_DEBIT_CARD_NUMBER', 'action': 'BLOCK'}
        ],
        'regexesConfig': [
            {
                'name': 'Account Number',
                'description': 'Matches account numbers in the format XXXXXX1234',
                'pattern': r'\b\d{6}\d{4}\b',
                'action': 'ANONYMIZE'
            }
        ]
    },
    contextualGroundingPolicyConfig={
        'filtersConfig': [
            {
                'type': 'GROUNDING',
                'threshold': 0.5
            },
            {
                'type': 'RELEVANCE',
                'threshold': 0.5
            }
        ]
    },
    blockedInputMessaging="""I can provide general info about Amazon's recent advances.""",
    blockedOutputsMessaging="""I can provide general info about Amazon's recent advances. """,
    tags=[
        {'key': 'purpose', 'value': 'fiduciary-advice-prevention'},
        {'key': 'environment', 'value': 'production'}
    ]
)

print(create_response)
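
The custom pattern passed in `regexesConfig` can be sanity-checked locally before the guardrail is created. The sketch below uses Python's `re` module, which may not behave identically to the engine Guardrails uses, so treat it as an approximation. The `mask_account_numbers` helper and its placeholder text are hypothetical and only mimic an `ANONYMIZE` action.

```python
import re

# Same pattern as in regexesConfig: exactly ten digits bounded by word boundaries.
ACCOUNT_NUMBER = re.compile(r"\b\d{6}\d{4}\b")


def mask_account_numbers(text, placeholder="{Account Number}"):
    """Replace every match of the account-number pattern with a placeholder."""
    return ACCOUNT_NUMBER.sub(placeholder, text)


print(mask_account_numbers("Acct 1234567890 ok"))    # Acct {Account Number} ok
print(mask_account_numbers("Too long 12345678901"))  # Too long 12345678901
```

Note that the pattern matches any standalone ten-digit number, so it may also mask values that are not account numbers.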

In [None]:
# Now let's create a version for our Guardrail 
version_response = bedrock.create_guardrail_version(
    guardrailIdentifier=create_response['guardrailId'],
    description='Version of Guardrail'
)
print(version_response)

In [None]:
guardrail_identifier = create_response["guardrailId"]
guardrail_version = version_response["version"]

### Step 1: Retrieve relevant chunks from Amazon Bedrock Knowledge Base using Retrieve API

![retrieve](images/retrieveAPI.png)

In [None]:
relevant_documents = bedrock_agent_runtime_client.retrieve(
    retrievalQuery= {
        "text": "What is Amazon doing in the field of generative AI?"
    },
    knowledgeBaseId=kb_id,
    retrievalConfiguration= {
        "vectorSearchConfiguration": {
            "numberOfResults": 1
        }
    }
)
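
Each entry in `retrievalResults` carries the chunk text under `content.text` along with a relevance `score`. When more than one result is requested, the chunks can be joined into a single grounding source. The following is a minimal sketch; `build_grounding_source`, the `min_score` cutoff, and the separator are hypothetical choices.

```python
def build_grounding_source(retrieve_response, min_score=0.0, separator="\n\n"):
    """Join the text of retrieved chunks whose score is at least `min_score`."""
    chunks = [
        result["content"]["text"]
        for result in retrieve_response.get("retrievalResults", [])
        # Results without a score are treated as 0.0 (assumption).
        if result.get("score", 0.0) >= min_score
    ]
    return separator.join(chunks)
```

With `numberOfResults` set to 1 as above, `build_grounding_source(relevant_documents)` simply returns the text of the single retrieved chunk.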

### Step 2: Invoke model with Converse API

In [None]:
def invoke_model(prompt, source, inference_config=None, additional_model_field=None):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "guardContent": {
                        "text": {
                            "text": source,
                            "qualifiers": ["grounding_source"],
                        }
                    }
                },
                {
                    "guardContent": {
                        "text": {
                            "text": prompt,
                            "qualifiers": ["query"],
                        }
                    }
                },
            ],
        }
    ]
    if not inference_config:
        # Base inference parameters to use.
        inference_config = {
                "maxTokens": 256,
                "temperature": 0.1,
                "topP": 0.999,
        }
    
    # Additional inference parameters to use; fall back to a default when none are passed.
    additional_model_fields = additional_model_field or {"top_k": 250}


    response = bedrock_runtime.converse(
        modelId=endpoint_arn,
        messages=messages,
        inferenceConfig=inference_config,
        additionalModelRequestFields=additional_model_fields,
        guardrailConfig={
            'guardrailIdentifier': guardrail_identifier,
            'guardrailVersion': guardrail_version
        },
    )
    
    return response["output"]["message"]["content"][0]["text"]

In [None]:
invoke_model(prompt="What is Amazon doing in the field of generative AI?", source=relevant_documents["retrievalResults"][0]["content"]["text"])

In [None]:
invoke_model(prompt="Should I buy bitcoin?", source=relevant_documents["retrievalResults"][0]["content"]["text"])
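
When a guardrail blocks a request or a response, the Converse API returns the configured blocked message as the output text and sets `stopReason` to `guardrail_intervened`. The `invoke_model` function above returns only the text, so the sketch below assumes you keep the full response dictionary instead; `guardrail_intervened` is a hypothetical helper name.

```python
def guardrail_intervened(response):
    """Return True if the Converse response was stopped by a guardrail."""
    return response.get("stopReason") == "guardrail_intervened"
```

Checking this flag is more reliable than comparing the output text against the blocked message, which can change whenever the guardrail is updated.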