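# Code Explanation
1. Connects to Amazon Bedrock through the bedrock-agent service (Knowledge Bases and Agents).
2. Lists the Knowledge Bases in your account, requesting at most one result (maxResults=1).
3. Prints the Knowledge Base ID if one is returned.
4. Otherwise, prints a message that no Knowledge Bases were found.
5. Catches botocore.exceptions.ClientError and prints the error instead of raising it.

This is typically the first step in a RAG workflow: confirming that your Bedrock Knowledge Base exists and retrieving its ID. Later cells reuse kb_id, so they fail with a NameError if no Knowledge Base was found.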

# Details

* boto3 → AWS SDK for Python, used to interact with AWS services.
* botocore → low-level core library underlying boto3; it handles requests and responses and defines the ClientError exception caught in the code.
* session.client('bedrock-agent') → creates a client for bedrock-agent, the service namespace for managing Amazon Bedrock Knowledge Bases and Agents.


In [None]:
import boto3
import botocore

session = boto3.Session()
bedrock_client = session.client('bedrock-agent')

try:
    response = bedrock_client.list_knowledge_bases(
        maxResults=1  # We only need to retrieve the first Knowledge Base
    )
    knowledge_base_summaries = response.get('knowledgeBaseSummaries', [])

    if knowledge_base_summaries:
        kb_id = knowledge_base_summaries[0]['knowledgeBaseId']
        print(f"Knowledge Base ID: {kb_id}")
    else:
        print("No Knowledge Base summaries found.")

except botocore.exceptions.ClientError as e:
    print(f"Error: {e}")
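
The cell above takes whichever Knowledge Base happens to be listed first. When an account holds several, looking one up by name may be safer. The following is a minimal sketch, not part of the original notebook: `find_kb_id_by_name` is a hypothetical helper, and it assumes each summary carries a `name` field and that the response includes a `nextToken` when more pages exist.

```python
def find_kb_id_by_name(client, name, page_size=10):
    """Return the ID of the first Knowledge Base called `name`, or None."""
    kwargs = {"maxResults": page_size}
    while True:
        response = client.list_knowledge_bases(**kwargs)
        for summary in response.get("knowledgeBaseSummaries", []):
            if summary.get("name") == name:
                return summary["knowledgeBaseId"]
        next_token = response.get("nextToken")
        if not next_token:
            return None  # no more pages and no match
        kwargs["nextToken"] = next_token  # request the next page
```

With the bedrock-agent client from the cell above, this could be called as `find_kb_id_by_name(bedrock_client, "my-kb")`, where `"my-kb"` is a placeholder name.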

# Code Explanation

1. Prepares the AWS session and reads its region.
2. Sets up a Bedrock Runtime client (for foundation model inference).
3. Sets up a Bedrock Agent Runtime client (for RAG apps, knowledge base queries, or agent flows).
4. Configures the Bedrock Agent Runtime client with 120-second connect and read timeouts and no retries, to handle potentially long-running calls. The Bedrock Runtime client is created with default settings.
5. Creates a pretty printer for cleanly displaying responses.

# Details

* boto3 → AWS SDK for Python (interacts with AWS services).
* Config (from botocore.client) → lets you customize client settings like timeouts and retries.
* pprint → pretty printer for readable output of nested dicts and lists.
* json → standard Python library for working with JSON.
* bedrock_client → talks to foundation models directly. This reassigns the name that held the bedrock-agent client in the first cell.
* bedrock_agent_client → talks to knowledge bases / RAG agents.


In [None]:
import boto3
from botocore.client import Config
import pprint
import json

pp = pprint.PrettyPrinter(indent=2)

session = boto3.session.Session()
region = session.region_name

bedrock_config = Config(connect_timeout=120, read_timeout=120, retries={'max_attempts': 0})
bedrock_client = boto3.client('bedrock-runtime', region_name = region)
bedrock_agent_client = boto3.client("bedrock-agent-runtime",
                              config=bedrock_config, region_name = region)
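
Because `retries={'max_attempts': 0}` turns off botocore's automatic retries for the agent runtime client, a transient failure surfaces immediately. If that is not the behavior you want, one option is to retry at the application level. This is a sketch under assumptions, not part of the original notebook: `call_with_retries` is a hypothetical helper (not a boto3 API), and the attempt count and delays are arbitrary.

```python
import time


def call_with_retries(fn, attempts=3, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn(); on a listed exception wait base_delay * 2**i, then try again."""
    for i in range(attempts):
        try:
            return fn()
        except retry_on:
            if i == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** i)  # exponential backoff
```

In practice you would probably narrow `retry_on` to something like `botocore.exceptions.ClientError` and wrap a single client call in a lambda.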


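# Code Explanation
This function retrieves relevant chunks from your Knowledge Base to support your RAG QA pipeline. It sets overrideSearchType to HYBRID, which combines semantic (vector) search with keyword search, and returns up to numberOfResults chunks (5 by default).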


In [None]:
def retrieve(query, kbId, numberOfResults=5):
    return bedrock_agent_client.retrieve(
        retrievalQuery= {
            'text': query
        },
        knowledgeBaseId=kbId,
        retrievalConfiguration= {
            'vectorSearchConfiguration': {
                'numberOfResults': numberOfResults,
                'overrideSearchType': "HYBRID", # optional
            }
        }
    )

# Code Explanation
1. The user asks: “What was the total operating lease liabilities and total sublease income … ?”
2. The Bedrock Knowledge Base searches using both embedded vectors and keywords.
3. It returns up to 5 of the most relevant text chunks from your documents (financial reports in S3, indexed in OpenSearch).
4. You now have the retrieved context.
5. Next step in RAG → pass retrievalResults to a Bedrock LLM (e.g., Nova Pro) to generate a final answer.


In [None]:
query = "What was the total operating lease liabilities and total sublease income of the AnyCompany as of December 31, 2022?"
response = retrieve(query, kb_id, 5)
retrievalResults = response['retrievalResults']
pp.pprint(retrievalResults)
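
The raw results can be verbose to read. Below is a minimal sketch of a more compact view, assuming each result carries `score`, `content.text`, and (for S3 sources) `location.s3Location.uri`. `summarize_results` is a hypothetical helper, not part of the original notebook.

```python
def summarize_results(results, preview_chars=80):
    """Return (score, source URI, text preview) tuples, highest score first."""
    rows = []
    for result in results:
        # Fall back to "unknown" when the result has no S3 location.
        uri = result.get("location", {}).get("s3Location", {}).get("uri", "unknown")
        text = result.get("content", {}).get("text", "")
        rows.append((result.get("score", 0.0), uri, text[:preview_chars]))
    return sorted(rows, key=lambda row: row[0], reverse=True)
```

For example, `pp.pprint(summarize_results(retrievalResults))` would print one short tuple per retrieved chunk.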

# Code Explanation
The get_contexts function works as follows:
1. Take retrieval results → input is the list of results returned from the Knowledge Base search.
2. Initialize an empty list → contexts = [].
3. Loop through results → each retrieved result is a dictionary with text, metadata, and score.
4. Extract only the text → retrievedResult['content']['text'].
5. Append text to list → collect all chunks into contexts.
6. Return the list of texts → clean context chunks ready for LLM input.


In [None]:
# fetch context from the response
def get_contexts(retrievalResults):
    contexts = []
    for retrievedResult in retrievalResults:
        contexts.append(retrievedResult['content']['text'])
    return contexts

# Code Explanation

This step verifies what knowledge (context text) has been retrieved from your S3 documents (via the Bedrock Knowledge Base + OpenSearch).
Later, these contexts will be fed into the LLM (e.g., Nova Pro/Lite) as grounding context for the RAG answer.


In [None]:
contexts = get_contexts(retrievalResults)
pp.pprint(contexts)

# Code Explanation

This is the prompt engineering step of your RAG pipeline, where you merge:

* User query (query)
* Retrieved context (contexts)
* Instructions (role, rules & guidelines)

The result is a structured, context-grounded prompt that can be sent to the Bedrock LLM (through the bedrock-runtime client) for generation.

In [None]:
prompt = f"""
Human: You are a financial advisor AI system specializing in analyzing financial statements, particularly focusing on lease accounting and disclosures. Use the following pieces of information to provide a detailed answer to the question enclosed in <question> tags.

When analyzing the information, please follow these guidelines:
1. Look for specific details about operating lease liabilities and sublease income as of December 31, 2022.
2. If exact figures for the requested date are not available, provide the most recent available data and clearly state the date.
3. Include any relevant information about lease terms, weighted-average remaining lease terms, or lease accounting policies.
4. If the information is not explicitly about AnyCompany, but general lease accounting information is provided, summarize that instead.
5. If you don't find specific lease information, mention any other financial data that might be relevant to understanding the company's obligations or income streams.

<context>
{contexts}
</context>

<question>
{query}
</question>

In your response:
- Provide specific numerical figures where available, using proper currency notation (e.g., $X million).
- Clearly indicate what information is directly from the context and what might be inferred.
- If certain information is not available, explicitly state: "The [specific detail] is not provided in the available financial statements."
- Include any other relevant lease-related or financial information found in the context.

Assistant:"""
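
In the prompt above, `{contexts}` is a Python list, so the f-string inserts its `repr`, brackets and quotes included. One alternative is to join the chunks into numbered plain-text passages before building the prompt. This is only a sketch: `format_contexts` and the `[n]` labeling are assumptions, not part of the original notebook.

```python
def format_contexts(contexts):
    """Join context chunks into one string, each prefixed with its 1-based index."""
    return "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(contexts, start=1)
    )
```

The prompt could then interpolate `{format_contexts(contexts)}` inside the `<context>` tags instead of the raw list.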

# Code Explanation

This cell wraps the prompt in the message format Nova Lite expects and adds generation controls (maxTokens, temperature, topP, topK), producing the payload to send through invoke_model() on the bedrock-runtime client.


In [None]:
# payload with model parameters
messages = [{
    "role": "user",
    "content": [{"text": prompt}]
}]

# Create the proper Nova Lite payload
nova_payload = {
    "schemaVersion": "messages-v1",
    "messages": messages,
    "inferenceConfig": {
        "maxTokens": 512,
        "temperature": 0.5,
        "topP": 0.9,
        "topK": 20
    }
}
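
If you build payloads for more than one prompt, a small builder function keeps the structure in one place. The sketch below mirrors the payload above. `build_nova_payload` is a hypothetical helper, and the optional top-level `system` list of text blocks is an assumption about the Nova request schema that you should verify against the model documentation.

```python
def build_nova_payload(prompt, system=None, max_tokens=512,
                       temperature=0.5, top_p=0.9, top_k=20):
    """Build a messages-v1 request body with the same fields as the cell above."""
    payload = {
        "schemaVersion": "messages-v1",
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {
            "maxTokens": max_tokens,
            "temperature": temperature,
            "topP": top_p,
            "topK": top_k,
        },
    }
    if system:
        # Assumed schema: a list of text blocks under "system".
        payload["system"] = [{"text": system}]
    return payload
```

Calling `build_nova_payload(prompt)` with the defaults yields the same dictionary as `nova_payload`.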

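# Code Explanation
1. Choose the model ID and set the request and response content types to JSON.
2. Send the payload with invoke_model().
3. Parse the JSON response body.
4. Extract the text of the first content block of the reply.
5. Print it.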


In [None]:
modelId = 'amazon.nova-lite-v1:0' # change this to use a different version from the model provider
accept = 'application/json'
contentType = 'application/json'

response = bedrock_client.invoke_model(
    body=json.dumps(nova_payload),
    modelId=modelId,
    accept=accept,
    contentType=contentType
)

# Parse and extract the response
response_body = json.loads(response.get('body').read())

# Extract just the text from the response
response_text = ''
if 'output' in response_body and 'message' in response_body['output']:
    message_content = response_body['output']['message']['content']
    if message_content and isinstance(message_content, list):
        response_text = message_content[0].get('text', '')

# Print the response text
print(response_text)
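
The cell above reads only the first content block. If a reply arrives as several text blocks, they can be concatenated instead. This is a minimal sketch, assuming the `output.message.content` shape used above; `extract_text` is a hypothetical helper, not part of the original notebook.

```python
def extract_text(response_body):
    """Concatenate every text block found under output.message.content."""
    content = response_body.get("output", {}).get("message", {}).get("content", [])
    if not isinstance(content, list):
        return ""
    return "".join(
        block.get("text", "") for block in content if isinstance(block, dict)
    )
```

With the parsed body from the cell above, `print(extract_text(response_body))` would print the full reply.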