## Query Decomposition: Avoid semantic dilution in queries
 * "Semantic dilution" refers to the loss of meaning or relevance when chunks of text, intended to represent information for retrieval, become too large or lack semantic cohesion, leading to less accurate and less useful results.
 * Semantic dilution can be mitigated on the indexed-document side by applying a chunking strategy optimized for each use case.
 * Query text can also suffer from semantic dilution when it combines multiple contexts or topics. Reformulating the query can reduce this.
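
The effect of mixing topics in one query can be illustrated with a toy bag-of-words similarity. This is a minimal sketch, not how the knowledge base scores text: real retrieval uses dense embeddings, and the `bow` and `cosine` helpers and the sample strings are made up for illustration.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    """Toy 'embedding': lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Illustrative strings only; they are not taken from the knowledge base.
chunk = "operating income for fiscal year 2022"
focused_query = "operating income 2022"
mixed_query = "operating income 2022 what is text-to-sql"

focused_score = cosine(bow(focused_query), bow(chunk))
mixed_score = cosine(bow(mixed_query), bow(chunk))

# The second topic adds terms that do not match the chunk, which lowers the score.
print(f"focused: {focused_score:.3f}  mixed: {mixed_score:.3f}")
```

The mixed query scores lower against the same chunk even though it contains every term of the focused query, which is the dilution effect in miniature.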

This notebook demonstrates the "query decomposition" strategy for improving search quality against a knowledge base.

![Query Decomposition](https://raw.githubusercontent.com/aws-samples/langgraph-agents-with-amazon-bedrock/refs/heads/main/assets/lab3_2.png "https://github.com/aws-samples/langgraph-agents-with-amazon-bedrock/tree/main/Lab_3")

#### Prerequisites

In [None]:
import warnings
warnings.filterwarnings('ignore')

!pip install -U boto3

In [None]:
import json
with open("../variables.json", "r") as f:
    variables = json.load(f)

variables
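
The cells in this notebook read four keys from `variables.json`. A quick check like the one below can surface a missing key before any AWS call is made. It is a sketch: the `missing_keys` helper is hypothetical, and the sample values are placeholders rather than real identifiers.

```python
# Keys that later cells in this notebook read from variables.json.
REQUIRED_KEYS = (
    "kbSemanticChunk",
    "accountNumber",
    "regionName",
    "sagemakerLLMEndpoint",
)


def missing_keys(config: dict) -> list:
    """Return the required keys that are absent or empty in config."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Placeholder values for illustration only.
example = {
    "kbSemanticChunk": "KB_ID_PLACEHOLDER",
    "accountNumber": "111122223333",
    "regionName": "us-west-2",
}
print(missing_keys(example))
```

Calling `missing_keys(variables)` right after loading the file would report any keys still to be filled in.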

In [None]:
# Configurations for Knowledge Base retrievals

kb_id = variables["kbSemanticChunk"]
model_id = f"arn:aws:bedrock:us-west-2:{variables['accountNumber']}:inference-profile/us.amazon.nova-lite-v1:0"

number_of_results = 5
generation_configuration = {
    'inferenceConfig': {
        'textInferenceConfig': {
            'maxTokens': 1024,
            'stopSequences': [],
            'temperature': 0.0,
            'topP': 0.2
        }
    },
}


### How Model Size Affects Table Interpretation  

When querying Amazon’s Operating Income for 2022, **smaller models (Nova Lite, Llama 3B)** tend to pick the **"At Prior Year Rates"** value (\$11,387), while **larger models (Nova Pro)** correctly select the **"As Reported"** value (\$12,248).

#### Possible Reasons:
- **Table Parsing Limitations:** Smaller models may not accurately align column headers to values.  
- **Context Misinterpretation:** They might default to the last numerical column or fail to strongly associate **"As Reported"** with the correct column.  
- **Stronger Reasoning in Larger Models:** Nova Pro better understands structured data, leading to more accurate retrieval.  

![Image](./operating_income.png)
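
The two behaviours described above can be contrasted in plain Python. This is only an illustration of the failure mode, not a model of how an LLM reads tables: the column order is an assumption, and the two values are the ones quoted in this section.

```python
# Assumed column order, for illustration only; values are the ones quoted above.
headers = ["As Reported", "At Prior Year Rates"]
operating_income_2022 = [12248, 11387]


def last_numeric_column(values: list) -> int:
    """Naive heuristic: take whatever number appears last in the row."""
    return values[-1]


def value_for_header(column_names: list, values: list, wanted: str) -> int:
    """Header-aware lookup: pair each value with its column header first."""
    return dict(zip(column_names, values))[wanted]


naive = last_numeric_column(operating_income_2022)
aligned = value_for_header(headers, operating_income_2022, "As Reported")
print(naive, aligned)
```

Under the assumed layout, the naive heuristic returns the "At Prior Year Rates" figure, while the header-aware lookup returns the "As Reported" figure the question asks for.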

### Basic RAG query
The query below retrieves chunks for only one of the topics it mentions. Combining several questions in a single query dilutes each topic, which is the "semantic dilution" described above.
Real-world questions often require pulling multiple chunks with different contexts, for example:
* How did Amazon's net income increase from 2018 to 2024?
* What is the difference between RAG and text-to-SQL?
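
For intuition, the compound query used in this notebook contains three separate questions. The sketch below splits it at question marks. This is a crude stand-in: the approaches shown later rely on an LLM to decompose the query, and the `split_questions` helper is hypothetical.

```python
import re


def split_questions(query: str) -> list:
    """Split a compound query into sub-questions at each question mark."""
    return [part.strip() for part in re.findall(r"[^?]+\?", query)]


query = (
    "What was Amazon's Operating Income as reported for the fiscal year "
    "ending December 31, 2022? What is text-to-SQL? "
    "How did text-to-SQL contribute to Amazon's earnings, if any?"
)

sub_questions = split_questions(query)
for number, question in enumerate(sub_questions, start=1):
    print(number, question)
```

Each sub-question covers a single topic, so it can be matched against the knowledge base without competing with the others.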

In [None]:
# WITHOUT QUERY DECOMPOSITION
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime", region_name=variables["regionName"])

# Query example
query = "What was Amazon’s Operating Income as reported for the fiscal year ending December 31, 2022? What is text-to-SQL? How did text-to-SQL contribute to Amazon’s earnings, if any?"


response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        "text": query
    },
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            'knowledgeBaseId': kb_id,
            "modelArn": model_id,
            "generationConfiguration": generation_configuration,
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults": number_of_results
                } 
            }
        }
    }
)
print('----------------- Answer ---------------------')
print(response['output']['text'],end='\n'*2)
print('----------------- Citations ------------------')
# Print only the citations rather than the whole response object.
print(json.dumps(response['citations'], indent=2))

### Use the Amazon Bedrock API to decompose a query
The RetrieveAndGenerate API supports a built-in query decomposition feature.
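
The only structural difference between the two requests is the `orchestrationConfiguration` block. The sketch below makes that explicit by building the `knowledgeBaseConfiguration` dictionary with a flag. The `build_kb_config` helper and the placeholder IDs are hypothetical; the keys mirror the ones used in this notebook's `retrieve_and_generate` calls.

```python
def build_kb_config(kb_id: str, model_arn: str, decompose: bool = False,
                    number_of_results: int = 5) -> dict:
    """Build a knowledgeBaseConfiguration dict, optionally with query decomposition."""
    config = {
        "knowledgeBaseId": kb_id,
        "modelArn": model_arn,
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": number_of_results}
        },
    }
    if decompose:
        # This block is what turns on the built-in decomposition.
        config["orchestrationConfiguration"] = {
            "queryTransformationConfiguration": {"type": "QUERY_DECOMPOSITION"}
        }
    return config


# Placeholder identifiers for illustration only.
baseline = build_kb_config("KB_ID_PLACEHOLDER", "MODEL_ARN_PLACEHOLDER")
decomposed = build_kb_config("KB_ID_PLACEHOLDER", "MODEL_ARN_PLACEHOLDER", decompose=True)
print(sorted(set(decomposed) - set(baseline)))
```

Building both variants from one function keeps every other setting identical, which makes the two runs easier to compare.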

In [None]:
# WITH QUERY DECOMPOSITION
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime", region_name=variables["regionName"])

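# Note: this cell also switches the generation model from Nova Lite to Nova Pro,
# so differences from the previous run reflect both the model and query decomposition.
model_id = f"arn:aws:bedrock:us-west-2:{variables['accountNumber']}:inference-profile/us.amazon.nova-pro-v1:0"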

# Query example
query = "What was Amazon’s Operating Income as reported for the fiscal year ending December 31, 2022? What is text-to-SQL? How did text-to-SQL contribute to Amazon’s earnings, if any?"

response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        "text": query
    },
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            'knowledgeBaseId': kb_id,
            "modelArn": model_id,
            "generationConfiguration": generation_configuration,
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults": number_of_results
                } 
            },
            # Enable the built-in query decomposition feature.
            'orchestrationConfiguration': {
                'queryTransformationConfiguration': {
                    'type': 'QUERY_DECOMPOSITION'
                }
            }
        }
    }
)
print('----------------- Answer ---------------------')
print(response['output']['text'],end='\n'*2)
print('----------------- Citations ------------------')
# Print only the citations rather than the whole response object.
print(json.dumps(response['citations'], indent=2))
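
To compare which documents each run drew on, the source locations can be pulled out of the response. The sketch below assumes the knowledge base uses an S3 data source, so each retrieved reference carries `location.s3Location.uri`. The `cited_sources` helper and the sample response are hand-written for illustration.

```python
def cited_sources(response: dict) -> list:
    """Collect unique S3 URIs from a RetrieveAndGenerate-style response, in order."""
    uris = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in uris:
                uris.append(uri)
    return uris


# Hand-written stand-in for a real response; bucket and file names are made up.
sample_response = {
    "output": {"text": "..."},
    "citations": [
        {"retrievedReferences": [
            {"location": {"s3Location": {"uri": "s3://example-bucket/10k.pdf"}}},
            {"location": {"s3Location": {"uri": "s3://example-bucket/text2sql.pdf"}}},
        ]},
        {"retrievedReferences": [
            {"location": {"s3Location": {"uri": "s3://example-bucket/10k.pdf"}}},
        ]},
    ],
}
print(cited_sources(sample_response))
```

Running `cited_sources(response)` after each cell shows whether the decomposed run cites documents for more of the topics in the query.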

## Query Decomposition with Agentic RAG using SageMaker and LangChain

#### Prerequisites

In [None]:
import boto3
# Reuse the same LLM endpoint deployed to SageMaker in the previous notebook.
from langchain_aws.llms import SagemakerEndpoint
from langchain_aws.llms.sagemaker_endpoint import LLMContentHandler

class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"
    def transform_input(self, prompt: str, model_kwargs: dict) -> bytes:
        input_str = json.dumps({"inputs": prompt, "parameters": model_kwargs})
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> str:
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json["generated_text"]

sagemaker_runtime = boto3.client(
    "sagemaker-runtime"
)

generation_configuration = {"temperature": 0,
                            "top_p": 0.3,
                            "max_new_tokens": 512,
                            "stop":["<|eot_id|>"]
                            }

llm = SagemakerEndpoint(
        endpoint_name=variables["sagemakerLLMEndpoint"],
        client=sagemaker_runtime,
        model_kwargs=generation_configuration,
        content_handler=ContentHandler(),
    )


# RAG config
number_of_results = 5


from langchain_aws.retrievers import AmazonKnowledgeBasesRetriever

retriever = AmazonKnowledgeBasesRetriever(
    knowledge_base_id=kb_id,
    region_name=variables["regionName"],
    retrieval_config={"vectorSearchConfiguration": {"numberOfResults": number_of_results}},

)
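
The content handler above expects the endpoint to return a JSON object with a `generated_text` key. Some serving containers wrap that object in a list instead. The sketch below tolerates both shapes. It is an assumption about possible payload shapes, not a statement about the endpoint used in this notebook, and `parse_generated_text` is a hypothetical helper.

```python
import io
import json


def parse_generated_text(stream) -> str:
    """Read a JSON payload from a file-like object and return its generated text."""
    payload = json.loads(stream.read().decode("utf-8"))
    # Assumed alternative shape: a one-element list wrapping the result object.
    if isinstance(payload, list):
        payload = payload[0]
    return payload["generated_text"]


# io.BytesIO stands in for the streaming body returned by the endpoint.
as_dict = io.BytesIO(json.dumps({"generated_text": "hello"}).encode("utf-8"))
as_list = io.BytesIO(json.dumps([{"generated_text": "hello"}]).encode("utf-8"))

dict_result = parse_generated_text(as_dict)
list_result = parse_generated_text(as_list)
print(dict_result, list_result)
```

If the endpoint raises a `TypeError` or `KeyError` inside `transform_output`, printing the raw payload once is a quick way to see which shape it returns.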

#### Test a complex query with a plain Q&A chain

In [None]:
import json
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough


prompt = PromptTemplate.from_template(
"""
Human:

You are an assistant who answers questions using only the following pieces of retrieved context.
If you cannot find the answer in the retrieved context, do not make one up; just say that you don't know.

Question: {question}

Context: {context}

Answer: Based on the context given, my answer to your question is as follows:
""")

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

qa_chain = (
    {
        "context": retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)
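
The data flow of the chain above can be traced without LangChain or AWS. In this sketch the retriever is replaced by a stub and the prompt by a shortened template, both hypothetical. It shows only how the question fans out into the `context` and `question` slots before reaching the LLM.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Stand-in for a retrieved document; only page_content is used."""
    page_content: str


def fake_retriever(question: str) -> list:
    """Stub retriever that returns the same two chunks for any question."""
    return [FakeDoc("chunk A"), FakeDoc("chunk B")]


def format_docs(docs) -> str:
    return "\n\n".join(doc.page_content for doc in docs)


# Shortened template for illustration; the notebook's prompt is longer.
TEMPLATE = "Question: {question}\n\nContext: {context}\n\nAnswer:"


def build_prompt(question: str) -> str:
    """Mirror the first two chain stages: fill both slots, then format the prompt."""
    inputs = {
        "context": format_docs(fake_retriever(question)),  # retriever | format_docs
        "question": question,                              # RunnablePassthrough()
    }
    return TEMPLATE.format(**inputs)


prompt_text = build_prompt("What is text-to-SQL?")
print(prompt_text)
```

Because the whole query is passed to the retriever once, a multi-topic query gets a single set of chunks, which is where dilution enters this chain.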



In [None]:
# Query
query = "What was the Operating Income of Amazon As Reported for the Year Ending December 31, 2022? What is text-to-SQL? How did text-to-SQL contribute to Amazon's earnings, if any?"

# Invoke RAG chain
answer = qa_chain.invoke(query)

#print("Question:", query)
print("Answer:", answer)
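
One manual alternative to a single retrieval call is to retrieve once per sub-question and merge the results. The sketch below shows that fan-out with a stub in place of the knowledge base. The `retrieve_for_all` helper, `fake_index`, and `fake_retrieve` are all hypothetical. In the notebook, a function returning `[doc.page_content for doc in retriever.invoke(question)]` could serve as `retrieve_fn`.

```python
def retrieve_for_all(sub_questions, retrieve_fn) -> list:
    """Run retrieval once per sub-question and merge unique chunks in order."""
    seen = set()
    merged = []
    for question in sub_questions:
        for chunk in retrieve_fn(question):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged


# Stub index for illustration: keyword -> chunk text.
fake_index = {
    "operating income": "Chunk about operating income.",
    "text-to-sql": "Chunk about text-to-SQL.",
}


def fake_retrieve(question: str) -> list:
    """Return chunks whose keyword appears in the question."""
    return [text for keyword, text in fake_index.items() if keyword in question.lower()]


merged_chunks = retrieve_for_all(
    ["What was the Operating Income?", "What is text-to-SQL?", "Did text-to-SQL help?"],
    fake_retrieve,
)
print(merged_chunks)
```

The merged list covers both topics and contains each chunk once, even though two sub-questions hit the same chunk.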

### Query Decomposition using Agentic RAG with LangChain

In [None]:
from langchain_core.exceptions import OutputParserException
from langchain_core.output_parsers import BaseOutputParser


# Minimal pass-through parser: it prints the raw LLM output (useful for debugging)
# and returns the text unchanged.
class CustomOutputParser(BaseOutputParser):
    """Custom parser."""

    def parse(self, text: str) -> str:
        print(text)
        return text

    @property
    def _type(self) -> str:
        return "custom_output_text"

In [None]:
# Import necessary libraries
from langchain.agents import AgentType, initialize_agent, Tool


# 1. Define the RAG tools
def fn_search(question):
    """
    Search the knowledge base and return the text of the chunks retrieved for the question.
    """
    chunks = [doc.page_content for doc in retriever.invoke(question)]
    return chunks

def no_action(_input=None) -> None:
    """Use this when no action needs to be taken for your thought."""
    return None

kb_tool_finance = Tool(
    name="SearchFinancialStatements",
    func=fn_search,
    description="Use this tool to find answers for financial data."
)

kb_tool_technology = Tool(
    name="SearchTechnologyDocuments",
    func=fn_search,
    description="Use this tool to find answers for technologies."
)

# Named differently from the function above so that the function is not shadowed.
noop_tool = Tool(
    name="None",
    func=no_action,
    description="Use this when no action needs to be taken for your thought."
)

tools = [kb_tool_finance, kb_tool_technology, noop_tool]


# 2. Create the agent
# initialize_agent takes the agent type through the `agent` keyword (not `agent_type`).
# AgentType.SELF_ASK_WITH_SEARCH ('self-ask-with-search') breaks a complex question into
# a series of simpler questions and looks up each answer with a search tool, but it
# expects exactly one tool named "Intermediate Answer", so it does not fit the three
# tools defined above. A ReAct-style agent is used here instead; it chooses a tool for
# each step of its reasoning.
# Reference: https://api.python.langchain.com/en/latest/agents/langchain.agents.agent_types.AgentType.html
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True
)

# 3. Test the agent
query = "What was Amazon’s Operating Income as reported for the fiscal year ending December 31, 2022? What is text-to-SQL? How did text-to-SQL contribute to Amazon’s earnings, if any?"
# invoke() returns a dict; the final answer is under the "output" key.
result = agent.invoke({"input": query})
print(result["output"])
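
Conceptually, the agent handles one sub-question at a time, picks a tool based on its description, and collects the observations. The sketch below imitates that loop deterministically. A keyword rule stands in for the LLM's tool choice and a stub stands in for the knowledge base search, so `pick_tool`, `search_stub`, and `run_steps` are hypothetical and do not reflect LangChain internals.

```python
def search_stub(question: str) -> str:
    """Stand-in for the knowledge base search shared by both tools."""
    return f"(chunks for: {question})"


# Tool names match the notebook; both wrap the same search function.
TOOLS = {
    "SearchFinancialStatements": search_stub,
    "SearchTechnologyDocuments": search_stub,
}


def pick_tool(sub_question: str) -> str:
    """Keyword rule standing in for the LLM's tool choice."""
    financial_terms = ("income", "earnings", "revenue")
    if any(term in sub_question.lower() for term in financial_terms):
        return "SearchFinancialStatements"
    return "SearchTechnologyDocuments"


def run_steps(sub_questions) -> list:
    """Return (sub-question, tool name, observation) for each step."""
    trace = []
    for question in sub_questions:
        tool_name = pick_tool(question)
        trace.append((question, tool_name, TOOLS[tool_name](question)))
    return trace


steps = run_steps([
    "What was Amazon's Operating Income as reported for 2022?",
    "What is text-to-SQL?",
    "How did text-to-SQL contribute to Amazon's earnings, if any?",
])
for question, tool_name, _ in steps:
    print(f"{tool_name}: {question}")
```

With `verbose=True`, the real agent prints a comparable trace of thoughts, tool calls, and observations, except that the LLM decides both the sub-questions and the tool for each one.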
