# RAG Retrieval Optimization - Sentence Window Parsing Technique with Amazon Bedrock and LlamaIndex

### Small to big retrieval

Sentence window retrieval is another technique that enhances the retrieval process by matching on individual sentences while still providing surrounding context. In this approach, documents are parsed into single sentences, and each sentence is stored with a "window" of neighboring sentences. During retrieval, the system finds the most relevant individual sentences. Instead of passing only those sentences to the LLM, it replaces each one with its window, which includes a specified number of sentences before and after the retrieved sentence. This allows fine-grained retrieval of specific information while still supplying the necessary context, which can improve the relevance and coherence of the generated responses.
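The windowing idea described above can be sketched in plain Python. This is a minimal illustration, not the LlamaIndex implementation: the regex sentence splitter, the helper names `split_sentences` and `build_windows`, and the sample text are all assumptions made for the example.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_windows(sentences, window_size=1):
    """Pair each sentence with up to `window_size` neighbors on each side."""
    records = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        records.append(
            {"sentence": sentence, "window": " ".join(sentences[start:end])}
        )
    return records


# Hypothetical sample text, used only for illustration.
text = "Alpha builds rockets. Beta sells tea. Gamma owns part of Beta. Delta makes chips."
records = build_windows(split_sentences(text), window_size=1)

for record in records:
    print(record["sentence"], "->", record["window"])
```

Sentences at the start or end of a document simply get a shorter window, because the slice is clamped to the document boundaries.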

In this lab, we demonstrate how to use the sentence window technique as a post-retrieval step with LlamaIndex. Specifically, we use the SentenceWindowNodeParser module to split Amazon's SEC filing documents into individual sentences, creating a node for each sentence and storing a configurable "window" of surrounding sentences in the node's metadata. At query time, the MetadataReplacementPostProcessor module replaces each retrieved sentence with its associated 'window' metadata, which gives the LLM more context for final response generation.
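The replacement step can be pictured with a small stand-in. The sketch below does not use LlamaIndex: `ToyNode` and `replace_with_metadata` are hypothetical names, and the sample nodes are invented, but the logic mirrors the idea of swapping a retrieved sentence for the window stored in its metadata.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a retrieved node: text plus a metadata dict."""
    text: str
    metadata: dict = field(default_factory=dict)


def replace_with_metadata(nodes, target_key="window"):
    """Return copies whose text is swapped for metadata[target_key] when that key exists."""
    return [
        ToyNode(text=n.metadata.get(target_key, n.text), metadata=dict(n.metadata))
        for n in nodes
    ]


# Invented retrieval results, for illustration only.
retrieved = [
    ToyNode(
        "Gamma owns part of Beta.",
        {
            "window": "Beta sells tea. Gamma owns part of Beta. Delta makes chips.",
            "original_text": "Gamma owns part of Beta.",
        },
    ),
    ToyNode("No window here."),
]

expanded = replace_with_metadata(retrieved)
for node in expanded:
    print(node.text)
```

Keeping the original sentence under a separate metadata key, as the lab does with `original_text`, makes it possible to inspect what was actually matched after the text has been replaced.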

The lab uses the following components:

- Vector Database (Faiss / local)
- LLM (Amazon Bedrock - Claude 3 Sonnet)
- Embeddings Model (Amazon Bedrock - Titan Text Embeddings V2)
- Datasets (Amazon's 10-K SEC filings from 2022 and 2023)
- LlamaIndex SentenceWindowNodeParser (this example builds on the LlamaIndex reference documentation available at https://docs.llamaindex.ai/en/stable/examples/node_postprocessor/MetadataReplacementDemo/)


### > Setup
We start by importing the necessary LlamaIndex libraries.

In [None]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler
from llama_index.core import Settings

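We select Anthropic Claude 3 Sonnet as our LLM and Amazon Titan Text Embeddings V2 as our embedding model. The same cell creates the sentence window node parser and a plain sentence splitter that serves as the baseline.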

In [None]:
import json
from typing import Sequence, List
from llama_index.core.settings import Settings
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding, Models
from llama_index.core.node_parser import SentenceWindowNodeParser
from llama_index.core.node_parser import SentenceSplitter

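# create the sentence window node parser: one node per sentence, with 3 sentences
# on each side stored under the "window" metadata key (these are the default settings)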
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)

# base node parser is a sentence splitter
text_splitter = SentenceSplitter()

llm = Bedrock(model = "anthropic.claude-3-sonnet-20240229-v1:0")
embed_model = BedrockEmbedding(model = "amazon.titan-embed-text-v2:0")

Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 256
Settings.text_splitter = text_splitter

from llama_index.core.llms import ChatMessage
from llama_index.core.tools import BaseTool, FunctionTool
import nest_asyncio
nest_asyncio.apply()

### > Document Ingestion
We ingest and index the data stored in the data directory. The amazon folder contains SEC 10-K filings from 2022 and 2023.

In [None]:
# load data
amazon_secfiles = SimpleDirectoryReader(input_dir="../data/lab03/amazon/").load_data()

In [None]:
# sentence-level nodes, each carrying its surrounding window in metadata
nodes = node_parser.get_nodes_from_documents(amazon_secfiles)

In [None]:
# baseline nodes: standard chunks from the plain sentence splitter
base_nodes = text_splitter.get_nodes_from_documents(amazon_secfiles)

In [None]:
from llama_index.core import VectorStoreIndex

sentence_index = VectorStoreIndex(nodes)

In [None]:
base_index = VectorStoreIndex(base_nodes)

In [None]:
from llama_index.core.postprocessor import MetadataReplacementPostProcessor

query_engine = sentence_index.as_query_engine(
    similarity_top_k=5,
    # the target key defaults to `window` to match the node_parser's default
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)
window_response = query_engine.query(
    "What is Amazon's ownership stake in Rivian?"
)
print(window_response)

In [None]:
window = window_response.source_nodes[0].node.metadata["window"]
sentence = window_response.source_nodes[0].node.metadata["original_text"]

print(f"Window: {window}")
print("------------------")
print(f"Original Sentence: {sentence}")

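### > Contrast with normal VectorStoreIndex

A naive RAG pipeline over standard chunks may not pinpoint the specific detail the question asks for. We run the same question against the baseline index to compare.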
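One possible intuition for this difference is that a larger chunk mixes the relevant sentence with unrelated text, which can dilute its similarity to the query. The sketch below illustrates that effect with a toy bag-of-words cosine similarity. It is only an assumption-laden illustration: the lab uses Titan embeddings, which behave differently from word counts, and the sample text and helper names are invented.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Invented sample data: one sentence, and a chunk that contains it.
sentence = "Gamma owns part of Beta."
chunk = "Alpha builds rockets. Beta sells tea. Gamma owns part of Beta. Delta makes chips."
query = bag_of_words("Who owns part of Beta?")

sentence_score = cosine(query, bag_of_words(sentence))
chunk_score = cosine(query, bag_of_words(chunk))

print(f"sentence-level score: {sentence_score:.3f}")
print(f"chunk-level score:    {chunk_score:.3f}")
```

In this toy setup the single sentence scores higher than the chunk that contains it, which is the behavior sentence-level indexing tries to exploit before the window restores context.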

In [None]:
query_engine = base_index.as_query_engine(similarity_top_k=2)
vector_response = query_engine.query(
    "What is Amazon's ownership stake in Rivian?"
)
print(vector_response)

In [None]:
for source_node in window_response.source_nodes:
    print(source_node.node.metadata["original_text"])
    print("--------")

In [None]:
for node in vector_response.source_nodes:
    print(node.node.text)
    print("--------")