# File Name: advanced_rag_building_part2.ipynb
### Location: Chapter 6
### Purpose: 
#####       1. Example of Sentence Window Retrieval RAG pattern.
#####       2. Example of Reranker RAG pattern.
#####       3. Example of FLARE RAG pattern.
#####       4. Example of MultiStep Query Engine RAG pattern.

##### Dependency: Not Applicable
# <ins>-----------------------------------------------------------------------------------</ins>

# <ins>Amazon SageMaker Studio Classic</ins>
#### If you are new to Amazon SageMaker Studio Classic, see https://docs.aws.amazon.com/sagemaker/latest/dg/studio.html for details.

# <ins>Environment setup of Kernel</ins>
##### Fill "Image" as "Data Science"
##### Fill "Kernel" as "Python 3"
##### Fill "Instance type" as "ml.t3.medium"
##### Fill "Start-up script" as "No Scripts"
##### Click "Select"

###### Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-create-open.html for details.

# <ins>Mandatory installation on the kernel through pip</ins>

##### This lab was tested with the package versions listed below. Newer releases of boto3, awscli, and botocore may break this code, in which case you might need to update the corresponding API calls.

##### You will see pip dependency errors. You can safely ignore them and continue executing the remaining cells.

# CAUTION: The pip installation below can take more than 30 minutes the first time it runs.

In [None]:
%pip install --no-build-isolation --force-reinstall -q \
    "boto3>=1.34.84" \
    "langchain>=0.2.16" \
    "langchain_community>=0.2.17" \
    "awscli>=1.32.84" \
    "botocore>=1.34.84" \
    "PyPDF2" \
    "pypdf" \
    "llama-index" \
    "llama-index-llms-bedrock" \
    "llama-index-embeddings-bedrock" \
    "llama-index-embeddings-huggingface" \
    "llama-index-llms-langchain" \
    "langchain-chroma>=0.1.2" \
    "ipywidgets>=7.6.5" \
    "jupyterlab" \
    "jupyter" \
    "tqdm" \
    "iprogress>=0.4" \
    "llama-index-embeddings-langchain" \
    "ipynb" \
    "langchain-aws>=0.1.7"  

# <ins>Disclaimer</ins>

##### You will see pip dependency errors. You can safely ignore them and continue executing the remaining cells.
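
##### Because the install cell only sets minimum versions, it can help to record which versions were actually installed before continuing. The sketch below uses the standard-library `importlib.metadata`; the helper name `installed_versions` is an assumption made for illustration and is not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# A few of the packages named in the install cell
for name, ver in installed_versions(["boto3", "botocore", "langchain", "llama-index"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

##### If a package shows as not installed, rerun the install cell and restart the kernel before moving on.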

# <ins>Restart the kernel</ins>

In [None]:
# restart kernel
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

# <ins>Python package import</ins>

##### boto3 offers various clients for Amazon Bedrock to execute various actions.
##### botocore is a low-level interface to AWS tools, while boto3 is built on top of botocore and provides additional features

In [None]:
import json
import os
import boto3
import botocore
import warnings
import time
from tqdm import tqdm
import ipywidgets as widgets
from IPython.display import display
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_aws.embeddings.bedrock import BedrockEmbeddings
from langchain_aws import ChatBedrock
from langchain_community.retrievers import AmazonKnowledgeBasesRetriever
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

# Import necessary modules
from langchain.prompts import PromptTemplate

# Define prompt, Amazon Bedrock Foundation model, and Amazon Bedrock embed model

In [None]:
# Define prompt
prompt = "What is Amazon doing and cashflow?"

# Amazon Bedrock foundation model ID (Claude 3 Haiku)
bedrock_model_id = "anthropic.claude-3-haiku-20240307-v1:0"

# Amazon Bedrock embedding model ID (Titan Text Embeddings)
bedrock_embed_model_id = "amazon.titan-embed-text-v1"

## Create the boto3 session and Amazon Bedrock clients

In [None]:
# Try-except block to handle potential errors
try:
    # Create a new Boto3 session to interact with AWS services
    # This session is responsible for managing credentials and region configuration
    boto3_session = boto3.session.Session()

    # Retrieve the current AWS region from the session (e.g., 'us-east-1', 'us-west-2')
    aws_region_name = boto3_session.region_name
    
    # Initialize Bedrock and Bedrock Runtime clients using Boto3
    # These clients will allow interactions with Bedrock-related AWS services
    boto3_bedrock_client = boto3.client('bedrock', region_name=aws_region_name)
    boto3_bedrock_runtime_client = boto3.client('bedrock-runtime', region_name=aws_region_name)

    # Store all relevant variables in a dictionary for easier access and management
    variables_store = {
        "aws_region_name": aws_region_name,                          # AWS region name
        "boto3_bedrock_client": boto3_bedrock_client,                # Bedrock client instance
        "boto3_bedrock_runtime_client": boto3_bedrock_runtime_client,  # Bedrock Runtime client instance
        "boto3_session": boto3_session                               # Current Boto3 session object
    }

    # Print all stored variables for debugging and verification
    for var_name, value in variables_store.items():
        print(f"{var_name}: {value}")

# Handle any exceptions that occur during the execution
except Exception as e:
    # Print the error message if an unexpected error occurs
    print(f"An unexpected error occurred: {e}")

# Modular Python setup for initializing language and embeddings models with error handling using Bedrock client.

### 1. initialize_language_model: This function initializes a language model using the Bedrock client and model ID. If successful, it returns the model; otherwise, it returns None.

### 2. initialize_embeddings_model: Similar to the first function, this initializes the embeddings model using the Bedrock client and embeddings model ID, with error handling in case of failure.

### 3. setup_models: This is the main function that calls the previous two functions to initialize both the language model and embeddings model. It handles errors, ensuring that if either initialization fails, the process is halted, and appropriate messages are shown.

In [None]:
# Importing necessary packages from langchain
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

# Function to initialize the language model with the Bedrock client
def initialize_language_model(client, model_id):
    """
    Initializes the language model using the provided Bedrock client and model ID.
    
    Args:
        client: The Bedrock client for model invocation.
        model_id (str): The ID of the language model to be initialized.
    
    Returns:
        ChatBedrock: The initialized language model or None if an error occurs.
    """
    try:
        # Initialize the language model
        llm = ChatBedrock(model_id=model_id, client=client)
        print("Successfully initialized the language model.")
        return llm
    except Exception as e:
        # Handle any errors during initialization
        print(f"Error initializing the language model: {e}")
        return None

# Function to initialize the Bedrock Embeddings model
def initialize_embeddings_model(client, embed_model_id):
    """
    Initializes the embeddings model using the provided Bedrock client and embeddings model ID.
    
    Args:
        client: The Bedrock client for model invocation.
        embed_model_id (str): The ID of the embeddings model to be initialized.
    
    Returns:
        BedrockEmbeddings: The initialized embeddings model or None if an error occurs.
    """
    try:
        # Initialize the embeddings model
        embeddings_model = BedrockEmbeddings(client=client, model_id=embed_model_id)
        print("Successfully initialized the Bedrock Embeddings model.")
        return embeddings_model
    except Exception as e:
        # Handle any errors during initialization
        print(f"Error initializing the Bedrock Embeddings model: {e}")
        return None

# Main function to set up both models with error handling
def setup_models(bedrock_client, bedrock_model_id, embed_model_id):
    """
    Sets up the language model and embeddings model, handling errors during the setup.
    
    Args:
        bedrock_client: The Bedrock client used to interact with the models.
        bedrock_model_id (str): The ID of the language model.
        embed_model_id (str): The ID of the embeddings model.
    
    Returns:
        tuple: (llm, embeddings_model); an element is None if its initialization failed or was skipped.
    """
    try:
        # Initialize the language model
        llm = initialize_language_model(bedrock_client, bedrock_model_id)
        if not llm:
            # Return None if language model initialization failed
            print("Failed to initialize language model, exiting setup.")
            return None, None

        # Initialize the embeddings model
        embeddings_model = initialize_embeddings_model(bedrock_client, embed_model_id)
        if not embeddings_model:
            # Return None if embeddings model initialization failed
            print("Failed to initialize embeddings model, exiting setup.")
            return llm, None

        print("Both models initialized successfully.")
        return llm, embeddings_model
    except Exception as e:
        # Handle any unexpected errors during the setup process
        print(f"Unexpected error in setup process: {e}")
        return None, None

# Example usage: setting up both models
llm, embeddings_model = setup_models(boto3_bedrock_runtime_client, bedrock_model_id, bedrock_embed_model_id)

if llm and embeddings_model:
    # Indicate that both models are ready for use
    print("Language model and embeddings model are ready for use.")
else:
    # Indicate that there was an issue with initialization
    print("One or both models failed to initialize.")

# Import the LlamaIndex packages

In [None]:
from llama_index.core import StorageContext
from llama_index.core import Settings
from llama_index.core.postprocessor import MetadataReplacementPostProcessor
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.query_engine import FLAREInstructQueryEngine
from llama_index.llms.bedrock import Bedrock


from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    get_response_synthesizer,
)

from llama_index.core.indices.query.query_transform.base import (
    StepDecomposeQueryTransform,
)

from llama_index.core.query_engine import MultiStepQueryEngine


Settings.llm = llm
Settings.embed_model = embeddings_model
Settings.chunk_size = 2056

### Ignore warning 

In [None]:
warnings.filterwarnings('ignore')

# Locate the data directory

#### 1. Retrieves the current working directory and prints it.
#### 2. Builds a path that navigates up one directory and appends 'data/rag_use_cases' to the path, then prints this resulting path.

In [None]:
try:
    # Get the current working directory
    current_directory = os.getcwd()
    
    # Print the current working directory
    print(f"Current working directory: {current_directory}")
    
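    # Build the path one directory up from the current one, then into 'data/rag_use_cases'
    data_directory = os.path.join(os.path.dirname(current_directory), 'data', 'rag_use_cases')

    # Fail early if the directory is missing so the handler below can report it
    if not os.path.isdir(data_directory):
        raise FileNotFoundError(data_directory)

    # Print the resulting path
    print(f"Data directory path: {data_directory}")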
    
except FileNotFoundError as e:
    # Handle the case where the directory path does not exist
    print(f"Error: The specified path does not exist - {e}")
    
except Exception as e:
    # General exception handler for any other errors
    print(f"An unexpected error occurred: {e}")

# Disclaimer
##### Make sure that data_directory points to the correct path and that the data files are present. Otherwise, change the code above.

# Prepare dataset 

##### This function, load_documents_from_directory, attempts to load documents from a specified directory path using SimpleDirectoryReader. It returns a list of documents if successful, printing the count of documents loaded. The function includes exception handling for cases such as missing directories (FileNotFoundError), insufficient permissions (PermissionError), and other general errors, with each case providing a clear error message. In the example usage, it checks if documents were loaded successfully, printing a confirmation message if they are ready for further processing.

In [None]:
# Function to load documents from a directory
def load_documents_from_directory(directory_path):
    """
    Loads documents from a specified directory path.
    
    Parameters:
    - directory_path (str): The path to the directory containing document files.
    
    Returns:
    - list: A list of loaded documents if successful; otherwise, None.
    """
    try:
        # Attempt to load documents from the specified directory
        documents = SimpleDirectoryReader(directory_path).load_data()
        
        # Confirm successful document loading
        print(f"Successfully loaded {len(documents)} documents from {directory_path}.")
        print()  # Add extra line breaks for clarity in output
        return documents

    # Handle any exceptions during loading process
    except FileNotFoundError:
        # Handle the case where directory or files are not found
        print(f"Error: Directory '{directory_path}' not found.")
        return None
    
    except PermissionError:
        # Handle cases where permissions are insufficient to read files
        print(f"Error: Insufficient permissions to read from '{directory_path}'.")
        return None
    
    except Exception as e:
        # General error handling for unexpected issues
        print(f"Error loading documents from {directory_path}: {e}")
        return None

# Example usage

llamaindex_documents = load_documents_from_directory(data_directory)

# Check if documents were loaded successfully
if llamaindex_documents:
    print("Documents loaded and ready for processing.")
else:
    print("Document loading failed.")
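
##### After loading, it is often useful to see how many Document objects each source file produced (PDFs are typically split into one Document per page). The sketch below assumes each loaded document carries a `file_name` entry in its `metadata` dictionary; the helper `count_documents_per_file` and the sample objects are hypothetical and stand in for real loaded documents.

```python
from collections import Counter
from types import SimpleNamespace


def count_documents_per_file(documents):
    """Count Document objects per source file using their 'file_name' metadata."""
    return Counter(doc.metadata.get("file_name", "<unknown>") for doc in documents)


# Stand-in objects that mimic loaded documents (illustrative file names only)
sample = [
    SimpleNamespace(metadata={"file_name": "a.pdf"}),
    SimpleNamespace(metadata={"file_name": "a.pdf"}),
    SimpleNamespace(metadata={"file_name": "b.pdf"}),
]
print(count_documents_per_file(sample))
```

##### In this notebook the call would be `count_documents_per_file(llamaindex_documents)`.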

# 1. Sentence Window Retrieval patterns

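##### Sentence window retrieval embeds and matches individual sentences, then hands the LLM a wider window of the surrounding sentences so that the answer is generated with more context.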

#### Refer:
https://github.com/run-llama/llama-hub/blob/main/llama_hub/llama_packs/sentence_window_retriever/sentence_window.ipynb

## Create a VectorStoreIndex from the previously loaded documents. A VectorStoreIndex is used to store and index document embeddings, enabling efficient similarity searches and queries.
### It uses the VectorStoreIndex.from_documents() method to create the index, incorporating both the document data and an embedding model for the indexing process.

In [None]:
# Attempt to create a VectorStoreIndex from the loaded documents
try:
    # The LLM is taken from Settings.llm (set earlier); only the embedding model is passed here
    llama_index = VectorStoreIndex.from_documents(
        llamaindex_documents,
        embed_model=embeddings_model,
        show_progress=True    # Display progress during index creation
    )
    print("Successfully created the VectorStoreIndex.")

except Exception as e:
    print(f"Error creating VectorStoreIndex: {e}")
    llama_index = None  # Set to None in case of an error

# Creating and executing a query engine based on the successful creation of a VectorStoreIndex

### 1. similarity_top_k=2: Specifies that the query engine should retrieve the 2 nodes (document chunks) from the index that are most similar to the query.
### 2. node_postprocessors: A list of postprocessors applied to the retrieved nodes. Here, MetadataReplacementPostProcessor replaces each node's text with the value stored under its 'window' metadata key; nodes that have no such key keep their original text.

In [None]:
# If the index was successfully created, proceed to create a query engine
if llama_index:
    try:
        query_engine = llama_index.as_query_engine(
            similarity_top_k=2,  # Retrieve the 2 most similar nodes
            node_postprocessors=[
                MetadataReplacementPostProcessor(target_metadata_key="window")  # Swap node text for its 'window' metadata, if present
            ],
        )
        print("Successfully created the query engine.")

    except Exception as e:
        print(f"Error creating query engine: {e}")
        query_engine = None  # Set to None in case of an error

    # If the query engine was successfully created, execute the query
    if query_engine:
        try:
            window_response = query_engine.query(prompt)  # Execute the query
            print("Query executed successfully.")

        except Exception as e:
            print(f"Error executing query: {e}")
            window_response = None  # Set to None in case of an error

        # Print the response if it was successful
        if window_response:
            print("Response:", window_response)
        else:
            print("No response received from the query.")
    else:
        print("Query engine was not created; cannot execute query.")
else:
    print("VectorStoreIndex was not created; cannot create query engine.")
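
##### The 'window' metadata that the postprocessor reads has to be produced when the documents are parsed. In LlamaIndex this is normally the job of `SentenceWindowNodeParser` (in `llama_index.core.node_parser`); an index built with the default parser, as above, will not contain that key. The following library-free sketch only illustrates the idea: the function `build_sentence_windows` and the sample text are assumptions for illustration, and the naive regex split is far cruder than a real sentence tokenizer.

```python
import re


def build_sentence_windows(text, window_size=1):
    """Split text into sentences and attach each one's surrounding window."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    records = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        records.append({
            "text": sentence,                           # what is embedded and matched
            "window": " ".join(sentences[start:end]),   # what the LLM reads after replacement
        })
    return records


records = build_sentence_windows("Revenue grew. Costs fell. Cash flow improved. Debt was repaid.")
print(records[1]["text"])
print(records[1]["window"])
```

##### Matching on the single sentence keeps retrieval precise, while swapping in the window gives the LLM enough surrounding context to answer.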

# 2. Reranker patterns

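##### A reranker pattern first retrieves a broad set of candidate nodes and then reorders or filters that set with a second, more selective pass before the answer is generated.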

#### Refer:
https://python.langchain.com/docs/templates/rag-pinecone-rerank/

# Process to Initialize Retriever, Create Temporary Index, Configure Query Engine, and Execute Query with Synthesized Response

### Approach to querying an indexed document store, with various functions focused on retrieving, configuring, and processing data. It incorporates error handling and logs to ensure that any issues encountered during the process are clearly communicated. Below is a detailed breakdown of each function and its role in the overall process.

### 1. initialize_retriever Function: Initializes the retriever, which is responsible for retrieving the most relevant nodes based on the user’s query (prompt).
### 2. create_temp_index Function: Creates a temporary VectorStoreIndex from a list of nodes.
### 3. configure_retriever Function: Configures the retriever for querying the llama_index based on the provided query mode and top-k similarity threshold.
### 4. initialize_response_synthesizer Function: Initializes a response synthesizer to generate responses based on the retrieved information.
### 5. query_engine_with_retriever Function: Assembles the query engine using the retriever and the response synthesizer, and then executes the query based on the user's prompt.
### 6. reranker Function (Main Execution Process): The main function that ties all the components together and executes the complete process of retrieving, indexing, and generating a response.

In [None]:
# Function to initialize the retriever and retrieve nodes based on a prompt
def initialize_retriever(llama_index, prompt, top_k=50):
    """
    Initializes the retriever with a specified similarity threshold and retrieves nodes based on the prompt.

    Parameters:
    - llama_index: The index used for retrieving nodes.
    - prompt (str): The user query.
    - top_k (int): The number of top documents to retrieve for similarity matching.
    
    Returns:
    - list: A list of nodes retrieved from the index, or None if retrieval fails.
    """
    try:
        # Initialize retriever and retrieve nodes based on similarity
        retriever = llama_index.as_retriever(similarity_top_k=top_k)
        nodes = retriever.retrieve(prompt)
        print(f"Successfully retrieved {len(nodes)} nodes.")
        return [node.node for node in nodes]
    except Exception as e:
        print(f"Error initializing retriever or retrieving nodes: {e}")
        return None

# Function to create a temporary index from a list of nodes
def create_temp_index(node_list):
    """
    Creates a VectorStoreIndex from a list of nodes.

    Parameters:
    - node_list (list): The list of nodes for the index.
    
    Returns:
    - VectorStoreIndex: The created index, or None if index creation fails.
    """
    try:
        # Initialize VectorStoreIndex with node list and show progress
        temp_index = VectorStoreIndex(node_list, show_progress=True)
        print("Temporary VectorStoreIndex created successfully.")
        return temp_index
    except Exception as e:
        print(f"Error creating VectorStoreIndex: {e}")
        return None

# Function to set up and configure the retriever with query mode
def configure_retriever(llama_index, top_k=10, query_mode="mmr"):
    """
    Configures the retriever with the specified query mode and similarity threshold.

    Parameters:
    - llama_index: The index used for configuring the retriever.
    - top_k (int): The number of top documents for similarity search.
    - query_mode (str): Mode for querying the vector store.

    Returns:
    - VectorIndexRetriever: Configured retriever or None if configuration fails.
    """
    try:
        retriever = VectorIndexRetriever(
            index=llama_index,
            similarity_top_k=top_k,
            vector_store_query_mode=query_mode,
        )
        print("Retriever configured successfully.")
        return retriever
    except Exception as e:
        print(f"Error configuring the retriever: {e}")
        return None

# Function to initialize the response synthesizer
def initialize_response_synthesizer(mode="tree_summarize"):
    """
    Initializes the response synthesizer with the specified mode.

    Parameters:
    - mode (str): The mode used for synthesizing responses.

    Returns:
    - ResponseSynthesizer: Configured response synthesizer or None if initialization fails.
    """
    try:
        response_synthesizer = get_response_synthesizer(response_mode=mode)
        print("Response synthesizer initialized successfully.")
        return response_synthesizer
    except Exception as e:
        print(f"Error initializing response synthesizer: {e}")
        return None

# Function to assemble the query engine and query a prompt
def query_engine_with_retriever(retriever, response_synthesizer, prompt):
    """
    Assembles the query engine with the retriever and synthesizer and executes a query.

    Parameters:
    - retriever: The retriever configured with the index.
    - response_synthesizer: The synthesizer configured for response generation.
    - prompt (str): The query or prompt to send to the query engine.

    Returns:
    - Response: The response object from the query engine, or None if querying fails.
    """
    try:
        # Assemble query engine
        query_engine = RetrieverQueryEngine(
            retriever=retriever,
            response_synthesizer=response_synthesizer,
        )
        
        # Query with the assembled engine
        response = query_engine.query(prompt)
        print("Query executed successfully.")
        return response
    except Exception as e:
        print(f"Error executing query: {e}")
        return None

# Main execution process
def reranker(prompt):
    try:
        # Load and retrieve nodes
        nodes = initialize_retriever(llama_index, prompt)
        if not nodes:
            print("No nodes retrieved, exiting.")
            return

        # Create temporary index from nodes
        temp_index = create_temp_index(nodes)
        if not temp_index:
            print("Temporary index creation failed, exiting.")
            return

        # Configure retriever over the temporary index so the second pass re-ranks only the retrieved candidates
        retriever = configure_retriever(temp_index)
        if not retriever:
            print("Retriever configuration failed, exiting.")
            return

        # Initialize response synthesizer
        response_synthesizer = initialize_response_synthesizer()
        if not response_synthesizer:
            print("Response synthesizer initialization failed, exiting.")
            return

        # Run query
        response = query_engine_with_retriever(retriever, response_synthesizer, prompt)
        if response:
            print("Final response:", response)
        else:
            print("No response received from the query engine.")
            
    except Exception as e:
        print(f"An unexpected error occurred: {e}")

# Execute the main process with a prompt
reranker(prompt)
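
##### The second pass above uses the "mmr" query mode. Maximal marginal relevance (MMR) greedily picks items that are relevant to the query but not redundant with items already picked. The sketch below is a minimal, library-free illustration of that idea; the function names, the `lambda_mult` parameter, and the toy vectors are assumptions for illustration, and the implementation inside LlamaIndex may differ in its details.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, top_k=2, lambda_mult=0.5):
    """Greedy MMR: return indices of doc_vecs balancing relevance and diversity."""
    remaining = list(range(len(doc_vecs)))
    selected = []
    while remaining and len(selected) < top_k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected item (0 if none selected yet)
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]  # docs 0 and 1 are near-duplicates
print(mmr_select(query, docs, top_k=2, lambda_mult=0.3))  # favours diversity
print(mmr_select(query, docs, top_k=2, lambda_mult=1.0))  # pure relevance ranking
```

##### With a low `lambda_mult` the near-duplicate of the first pick is skipped in favour of a less similar item; with `lambda_mult=1.0` the selection reduces to plain similarity ranking.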

# 3. FLARE patterns

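##### FLARE (Forward-Looking Active REtrieval augmented generation) retrieves iteratively during generation: the model drafts upcoming text, and retrieval is triggered for the parts that need supporting information.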

#### Refer:
https://docs.llamaindex.ai/en/stable/examples/query_engine/flare_query_engine/

# Initializing and Executing LlamaIndex and FLARE Query Engines with Error Handling

### The run_query_engines function initializes and runs two query engines for a given query: a standard LlamaIndex query engine and a FLARE query engine built on top of it. It includes error handling for failures during engine initialization or query execution. The flow is:

### 1. Initialize the LlamaIndex query engine with the given index.
### 2. Initialize the FLARE query engine using LlamaIndex as its base.
### 3. Execute the provided query using the FLARE query engine and return the results.
### 4. Provide error handling throughout the initialization and query execution stages to ensure the process is robust.

In [None]:
# Function to initialize and run the query engines with error handling
def run_query_engines(llama_index, query):
    """
    Initializes and runs the LlamaIndex and FLARE query engines with the specified query.
    
    Parameters:
    - llama_index: The LlamaIndex object containing the indexed documents.
    - query (str): The query string for retrieving relevant documents.
    
    Returns:
    - response: The response generated from the FLARE query engine, or None if an error occurs.
    """
    try:
        # Initialize the Llama query engine with specified similarity_top_k
        index_query_engine = llama_index.as_query_engine(similarity_top_k=2)
        print("Llama query engine initialized successfully.")

        # Initialize the FLARE query engine with the Llama query engine as the base
        flare_query_engine = FLAREInstructQueryEngine(
            query_engine=index_query_engine,
            max_iterations=7,
            verbose=True
        )
        print("FLARE query engine initialized successfully.")
        
        # Execute the query using the FLARE query engine
        response = flare_query_engine.query(query)
        print("Query executed successfully.")
        
        return response

    except Exception as e:
        # Handle any exceptions during the setup or query process
        print(f"An error occurred while executing the query: {e}")
        return None

# Example usage
response = run_query_engines(llama_index, prompt)

# Print the response if available
if response:
    print("Query response:", response)
else:
    print("Failed to retrieve a response.")
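
##### The iterative behaviour that `max_iterations` bounds can be pictured with a toy loop: the generator proposes the next piece of the answer, any embedded search request is resolved through retrieval, and the loop repeats until the generator is done or the iteration limit is reached. The sketch below is an illustration under stated assumptions, not the LlamaIndex implementation: the `[Search(...)]` tag format, the function names, and the sample facts are all hypothetical.

```python
import re

SEARCH_TAG = re.compile(r"\[Search\((.*?)\)\]")


def fill_search_tags(draft, lookup):
    """Replace each [Search(query)] tag in a draft with the retrieved snippet."""
    return SEARCH_TAG.sub(lambda match: lookup(match.group(1)), draft)


def flare_answer(generate_next, lookup, max_iterations=7):
    """Grow an answer piece by piece, retrieving whenever a draft contains a search tag."""
    answer = ""
    for _ in range(max_iterations):
        draft = generate_next(answer)
        if draft is None:  # the generator signals that the answer is complete
            break
        answer += fill_search_tags(draft, lookup)
    return answer


# Stand-ins for the LLM and the retriever (illustrative data only)
drafts = iter([
    "Paris is the capital of [Search(country of Paris)]. ",
    "Its currency is the [Search(currency of France)].",
    None,
])
facts = {"country of Paris": "France", "currency of France": "euro"}

result = flare_answer(lambda so_far: next(drafts), lambda q: facts.get(q, "<no result>"))
print(result)
```

##### Retrieval happens only where the draft asks for it, which is the main difference from retrieving once before generation.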

# 4. MultiStep Query Engine patterns

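##### A multi-step query engine decomposes a complex question into a sequence of simpler sub-questions, answers each against the index, and combines the intermediate answers into the final response.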

# Initializing and Executing Multi-Step Query Engine with Bedrock LLM Model

### This section initializes a Bedrock LLM, configures a query engine, and executes a multi-step query using step decomposition. The whole process is wrapped in error handling. The steps are:

### 1. Initializing the Bedrock LLM Model: The first step is to initialize a Bedrock LLM model using a specified model ID (bedrock_model_id).
### 2. Setting Up Step Decomposition Transform: To handle complex queries by breaking them into smaller, more manageable steps, a step decomposition transform is created.
### 3. Defining the Index Summary: An index summary is defined to provide a concise description of the content within the index. In this case, it is set to "Cash Flow", which will be used as part of the query process.
### 4. Configuring the Primary Query Engine: The query engine is configured by transforming the existing VectorStoreIndex into a query engine that integrates with the Bedrock LLM model.
### 5. Setting Up the Multi-Step Query Engine: A multi-step query engine is created to allow for advanced query processing that includes the step decomposition transform and the defined index summary.
### 6. Executing the Query: Once the multi-step query engine is set up, the next step is to execute a query using the engine. The query is provided as the prompt variable, which is passed into the query_engine.query(prompt) method.

### 7. Considerations:
##### Verbose Logging: The verbose=True flag in StepDecomposeQueryTransform ensures that detailed logs are generated, which can help in debugging the decomposition process.
##### Complex Queries: The use of a multi-step query engine enables the handling of more complex queries by breaking them down into smaller steps.
##### Model Integration: The integration of the Bedrock LLM into the process allows the query engine to leverage powerful language model capabilities for better response generation.

In [None]:
try:
    # Initialize Bedrock LLM model with the specified model ID
    llama_llm = Bedrock(model=bedrock_model_id)
    print("Bedrock LLM model initialized successfully.")

    # Set up a step decomposition transform for query processing
    step_decompose_transform = StepDecomposeQueryTransform(llm=llama_llm, verbose=True)
    print("Step decomposition transform initialized successfully.")

    # Define the index summary, a concise description of the index's content
    index_summary = "Cash Flow"

    # Configure the primary query engine with Bedrock LLM
    query_engine = llama_index.as_query_engine(llm=llama_llm)
    print("Primary query engine created successfully.")

    # Set up a multi-step query engine with decomposition capabilities
    query_engine = MultiStepQueryEngine(
        query_engine=query_engine,
        query_transform=step_decompose_transform,
        index_summary=index_summary,
    )
    print("Multi-step query engine created successfully.")

    # Execute the query using the multi-step query engine
    response = query_engine.query(prompt)
    print("Query executed successfully. Response received.")
    print(response)

except Exception as e:
    # Catch any errors that occur during initialization or querying
    print(f"An error occurred: {e}")
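
##### The control flow behind step decomposition can be sketched without any LLM: a decomposer proposes the next sub-question given what has been learned so far, each sub-question is answered, and the loop stops when the decomposer has nothing more to ask or a step limit is reached. The function `multi_step_answer`, its parameters, and the sample plan below are hypothetical and only illustrate the pattern; they are not the LlamaIndex implementation.

```python
def multi_step_answer(question, decompose, answer_step, max_steps=3):
    """Answer a question by asking sub-questions in turn and collecting the findings."""
    findings = []  # (sub_question, answer) pairs gathered so far
    for _ in range(max_steps):
        sub_question = decompose(question, findings)
        if sub_question is None:  # the decomposer judges the findings sufficient
            break
        findings.append((sub_question, answer_step(sub_question)))
    return findings


# Stand-ins for the LLM-based decomposer and the query engine (illustrative data only)
plan = ["Who wrote the report?", "When was it published?"]
answers = {"Who wrote the report?": "The finance team", "When was it published?": "In 2023"}


def decompose(question, findings):
    """Return the next planned sub-question, or None when the plan is exhausted."""
    return plan[len(findings)] if len(findings) < len(plan) else None


steps = multi_step_answer("Who wrote the report and when?", decompose, answers.get)
for sub_question, answer in steps:
    print(f"{sub_question} -> {answer}")
```

##### In the real engine, the collected findings are passed to a response synthesizer that writes the final answer.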


# End of Notebook

## Please ensure that you shut down the kernel after using this notebook to avoid any potential charges to your account.

## Process: Open the "Kernel" menu at the top and choose "Shut Down Kernel".
##### Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/studio-ui.html