# Assignment 2 Solution: Advanced RAG Techniques
## Day 6 Session 2 - Advanced RAG Fundamentals

This notebook contains the complete solution for Assignment 2.

**Solution covers:**
- Node postprocessors for filtering and reranking
- Response synthesizers (TreeSummarize, Refine)
- Structured outputs with Pydantic models
- Advanced RAG pipelines combining multiple techniques


In [1]:
# Import required libraries for advanced RAG
import os
from pathlib import Path
from typing import Dict, List, Optional, Any
from pydantic import BaseModel, Field

# Core LlamaIndex components
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext, Settings
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever

# Vector store
from llama_index.vector_stores.lancedb import LanceDBVectorStore

# Embeddings and LLM
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openrouter import OpenRouter

# Advanced RAG components
from llama_index.core.postprocessor import SimilarityPostprocessor
from llama_index.core.response_synthesizers import TreeSummarize, Refine, CompactAndRefine
from llama_index.core.output_parsers import PydanticOutputParser

print("✅ Advanced RAG libraries imported successfully!")




✅ Advanced RAG libraries imported successfully!


In [2]:
# Configure Advanced RAG Settings (Using OpenRouter)
def setup_advanced_rag_settings():
    """Configure LlamaIndex with optimized settings for advanced RAG."""
    api_key = os.getenv("OPENROUTER_API_KEY")
    if not api_key:
        print("⚠️  OPENROUTER_API_KEY not found - LLM operations will be limited")
    else:
        print("✅ OPENROUTER_API_KEY found - full advanced RAG functionality available")
        Settings.llm = OpenRouter(
            api_key=api_key,
            model="gpt-4o-mini",
            temperature=0.1
        )
    
    Settings.embed_model = HuggingFaceEmbedding(
        model_name="BAAI/bge-small-en-v1.5",
        trust_remote_code=True
    )
    Settings.chunk_size = 512
    Settings.chunk_overlap = 50
    
    print("✅ Advanced RAG settings configured")

setup_advanced_rag_settings()


✅ OPENROUTER_API_KEY found - full advanced RAG functionality available
✅ Advanced RAG settings configured
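
The `chunk_size = 512` and `chunk_overlap = 50` settings above control how documents are cut into nodes before embedding. The sketch below is a simplified illustration of what the two numbers mean, not LlamaIndex's splitter: it assumes the text is already a flat token list and uses a plain sliding window, whereas the library counts tokens with a tokenizer and tries to respect sentence boundaries. The function name `chunk_with_overlap` is hypothetical.

```python
from typing import List


def chunk_with_overlap(
    tokens: List[str], chunk_size: int = 512, chunk_overlap: int = 50
) -> List[List[str]]:
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the text
    return chunks


# Toy input: 1000 placeholder tokens
tokens = [f"tok{i}" for i in range(1000)]
chunks = chunk_with_overlap(tokens)
print([len(c) for c in chunks])
```

Each chunk repeats the last 50 tokens of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.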


In [3]:
# Setup: Create index (reusing Assignment 1 concepts)
def setup_basic_index(data_folder: str = "../data"):
    """Create a basic vector index for advanced RAG demonstrations."""
    # Check the data folder first so no empty vector store is created on failure
    if not Path(data_folder).exists():
        print(f"❌ Data folder not found: {data_folder}")
        return None

    vector_store = LanceDBVectorStore(
        uri="./advanced_rag_vectordb",
        table_name="documents"
    )
        
    reader = SimpleDirectoryReader(input_dir=data_folder, recursive=True)
    documents = reader.load_data()
    
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = VectorStoreIndex.from_documents(
        documents, 
        storage_context=storage_context,
        show_progress=True
    )
    
    print(f"✅ Basic index created with {len(documents)} documents")
    return index

# Create the basic index (setup_basic_index returns None if the data folder is missing)
index = setup_basic_index()
if index is not None:
    print("🚀 Ready for advanced RAG techniques!")


Table documents doesn't exist yet. Please add some data to create it.
Parsing nodes: 100%|██████████| 42/42 [00:00<00:00, 151.85it/s]
Generating embeddings: 100%|██████████| 94/94 [00:05<00:00, 16.22it/s]
2025-09-21 09:33:18,639 - INFO - Create new table documents adding data.


✅ Basic index created with 42 documents
🚀 Ready for advanced RAG techniques!


[2025-09-21T04:03:18Z WARN lance::dataset::write::insert] No existing dataset at /Users/ishandutta/Documents/code/ai-accelerator/Day_6/session_2/solutions/advanced_rag_vectordb/documents.lance, it will be created
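
To make the retrieval step behind the index concrete, here is a minimal sketch of top-k vector search. It assumes cosine similarity over toy 3-dimensional vectors; the actual metric and any approximate-nearest-neighbour indexing are decided by the vector store, and the names `cosine_similarity`, `top_k`, and `toy_vectors` are illustrative only.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: List[float], doc_vecs: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings standing in for real 384-dimensional model output
toy_vectors = {
    "agents": [1.0, 0.0, 0.0],
    "planning": [0.6, 0.8, 0.0],
    "cooking": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.0, 0.0], toy_vectors, k=2)
print(results)
```

The `similarity_top_k` argument used in the following cells plays the role of `k` here.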


In [4]:
def create_query_engine_with_similarity_filter(index, similarity_cutoff: float = 0.3, top_k: int = 10):
    """Create a query engine that filters results based on similarity scores."""
    # Create similarity postprocessor with the cutoff threshold
    similarity_processor = SimilarityPostprocessor(similarity_cutoff=similarity_cutoff)
    
    # Create query engine with similarity filtering
    query_engine = index.as_query_engine(
        similarity_top_k=top_k,
        node_postprocessors=[similarity_processor]
    )
    return query_engine

# Test the function
filtered_engine = create_query_engine_with_similarity_filter(index, similarity_cutoff=0.3)
print("✅ Query engine with similarity filtering created")

# Test query
test_query = "What are the benefits of AI agents?"
print(f"\n🔍 Testing query: '{test_query}'")
response = filtered_engine.query(test_query)
print(f"📝 Filtered Response: {response}")


✅ Query engine with similarity filtering created

🔍 Testing query: 'What are the benefits of AI agents?'


2025-09-21 09:33:19,901 - INFO - query_type :, vector
2025-09-21 09:33:22,229 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:33:25,195 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:33:28,504 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"


📝 Filtered Response: AI agents provide numerous advantages, such as enhanced reasoning, planning, and execution capabilities for complex tasks. They can engage with external tools and data sources, facilitating the resolution of intricate problems that necessitate multi-step reasoning. Both single-agent and multi-agent systems show robust performance in achieving complex objectives, especially when they utilize clear feedback mechanisms, task decomposition, and iterative refinement. Furthermore, multi-agent systems benefit from collaboration among different agents, enabling parallel task execution and more adaptive problem-solving strategies. This adaptability significantly improves their overall effectiveness across a range of applications.
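
The filtering idea behind `similarity_cutoff` can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the library's code: `ScoredNode` is a hypothetical stand-in for a retrieved node, the scores are made up, and nodes without a score are assumed to be dropped.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoredNode:
    """Hypothetical stand-in for a retrieved node and its similarity score."""
    text: str
    score: Optional[float]


def apply_similarity_cutoff(nodes: List[ScoredNode], similarity_cutoff: float) -> List[ScoredNode]:
    """Keep only nodes whose score is present and at least similarity_cutoff."""
    return [n for n in nodes if n.score is not None and n.score >= similarity_cutoff]


# Made-up retrieval results, ordered by score
retrieved = [
    ScoredNode("agents plan and call tools", 0.72),
    ScoredNode("multi-agent collaboration", 0.41),
    ScoredNode("unrelated recipe", 0.12),
    ScoredNode("node without score", None),
]
kept = apply_similarity_cutoff(retrieved, similarity_cutoff=0.3)
print([n.text for n in kept])
```

Raising the cutoff trades recall for precision. If it is set too high, every node is removed and the synthesizer has nothing to work with, so the threshold is worth tuning against real scores from the index.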


In [5]:
def create_query_engine_with_tree_summarize(index, top_k: int = 5):
    """Create a query engine that uses TreeSummarize for comprehensive responses."""
    # Create TreeSummarize response synthesizer
    tree_synthesizer = TreeSummarize()
    
    # Create query engine with the synthesizer
    query_engine = index.as_query_engine(
        response_synthesizer=tree_synthesizer,
        similarity_top_k=top_k
    )
    return query_engine

# Test the function
tree_engine = create_query_engine_with_tree_summarize(index)
print("✅ Query engine with TreeSummarize created")

# Test with a complex analytical query
analytical_query = "Compare the advantages and disadvantages of different AI agent frameworks"
print(f"\n🔍 Testing analytical query: '{analytical_query}'")
response = tree_engine.query(analytical_query)
print(f"📝 TreeSummarize Response:\n{response}")


✅ Query engine with TreeSummarize created

🔍 Testing analytical query: 'Compare the advantages and disadvantages of different AI agent frameworks'


2025-09-21 09:33:30,633 - INFO - query_type :, vector
2025-09-21 09:33:31,867 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"


📝 TreeSummarize Response:
Different AI agent frameworks offer various advantages and disadvantages based on their design and intended use cases:

1. **Autonomous Agents**:
   - **AutoGPT**:
     - *Advantages*: High capability for autonomous task execution, making it suitable for complex, independent operations.
     - *Disadvantages*: Steep learning curve, which may be challenging for beginners.
   - **BabyAGI**:
     - *Advantages*: Simplified approach to autonomous AI, making it more accessible.
     - *Disadvantages*: May lack the advanced features needed for more complex tasks.
   - **AgentGPT**:
     - *Advantages*: Web-based platform, which can enhance accessibility and ease of use.
     - *Disadvantages*: Potential limitations in customization compared to more robust frameworks.

2. **Tool-Using Agents**:
   - **LangChain**:
     - *Advantages*: Comprehensive framework for LLM applications, providing a versatile toolset.
     - *Disadvantages*: Medium complexity may require som
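
The bottom-up idea behind tree summarization can be sketched without an LLM. The version below is a simplification under stated assumptions: it groups a fixed number of chunks per call (`group_size`, a hypothetical parameter), whereas the library, as far as I understand it, packs as much text as fits into the prompt. That would be consistent with the single LLM request in the log above for five retrieved chunks. `toy_summarize` merely concatenates and stands in for the LLM call.

```python
from typing import Callable, List


def tree_summarize(
    chunks: List[str], summarize: Callable[[List[str]], str], group_size: int = 2
) -> str:
    """Summarize groups of chunks, then summarize the summaries, until one remains."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    if not chunks:
        return ""
    level = list(chunks)
    while True:
        groups = [level[i:i + group_size] for i in range(0, len(level), group_size)]
        level = [summarize(group) for group in groups]  # one call per group
        if len(level) == 1:
            return level[0]


calls: List[int] = []  # records how many texts each call received


def toy_summarize(texts: List[str]) -> str:
    """Stand-in for an LLM call: concatenates its inputs."""
    calls.append(len(texts))
    return "".join(texts)


result = tree_summarize(["A", "B", "C", "D", "E"], toy_summarize)
print(result, len(calls))
```

Because each level only sees summaries of the level below, the number of calls grows with the number of chunks, but no single call has to hold all of the retrieved text.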

In [6]:
# Define the Pydantic models for structured outputs
class ResearchPaperInfo(BaseModel):
    """Structured information about a research paper or AI concept."""
    title: str = Field(description="The main title or concept name")
    key_points: List[str] = Field(description="3-5 main points or findings")
    applications: List[str] = Field(description="Practical applications or use cases")
    summary: str = Field(description="Brief 2-3 sentence summary")

# LLMTextCompletionProgram sends the prompt to the LLM and parses the reply with the output parser
from typing import Type
from llama_index.core.program import LLMTextCompletionProgram

def create_structured_output_program(output_model: Type[BaseModel] = ResearchPaperInfo):
    """Create a structured output program using Pydantic models."""
    # Create output parser with the Pydantic model
    output_parser = PydanticOutputParser(output_model)
    
    # Create the structured output program
    program = LLMTextCompletionProgram.from_defaults(
        output_parser=output_parser,
        prompt_template_str=(
            "Extract structured information from the following context:\n"
            "{context}\n\n"
            "Question: {query}\n\n"
            "Provide the information in the specified JSON format."
        )
    )
    return program

# Test the function
structured_program = create_structured_output_program(ResearchPaperInfo)
print("✅ Structured output program created")

# Test with retrieval and structured extraction
structure_query = "Tell me about AI agents and their capabilities"
print(f"\n🔍 Testing structured query: '{structure_query}'")

# Get context for structured extraction
retriever = VectorIndexRetriever(index=index, similarity_top_k=3)
nodes = retriever.retrieve(structure_query)
context = "\n".join([node.text for node in nodes])

# Extract structured information
response = structured_program(context=context, query=structure_query)
print(f"📊 Structured Response:\n{response}")

print("\n💡 The response follows the ResearchPaperInfo model structure.")


✅ Structured output program created

🔍 Testing structured query: 'Tell me about AI agents and their capabilities'


2025-09-21 09:33:42,162 - INFO - query_type :, vector
2025-09-21 09:33:43,474 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"


📊 Structured Response:
title='The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey' key_points=['AI agents show enhanced capabilities in reasoning, planning, and tool execution.', 'Single-agent architectures perform well with defined personas and iterative feedback.', 'Multi-agent systems benefit from clear leadership, defined planning phases, and dynamic team composition.', 'Current agent evaluation benchmarks are inconsistent, complicating comparisons across implementations.', 'Addressing biases and improving reliability are critical for future AI agent development.'] applications=['Autonomous decision-making systems', 'Complex task execution in collaborative environments', 'Enhanced user interaction through intelligent agents', 'Dynamic team formation for specific problem-solving tasks'] summary='This survey explores advancements in AI agent architectures, highlighting their capabilities and limitations. It emphasizes the importance of
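
The printed result above is a `ResearchPaperInfo` instance, so its fields can be read as attributes (for example `response.title`). To show what the parsing step involves, here is a minimal sketch that pulls a JSON object out of raw LLM text and checks it against the expected fields. It uses a stdlib dataclass as a stand-in for the Pydantic model, so `PaperInfo` and `parse_llm_json` are hypothetical names and the validation is far simpler than what Pydantic performs.

```python
import json
from dataclasses import dataclass, fields
from typing import List


@dataclass
class PaperInfo:
    """Hypothetical stdlib mirror of the ResearchPaperInfo model."""
    title: str
    key_points: List[str]
    applications: List[str]
    summary: str


def parse_llm_json(raw: str) -> PaperInfo:
    """Extract the outermost JSON object from raw LLM text and validate its fields."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in LLM output")
    data = json.loads(raw[start:end + 1])
    expected = {f.name for f in fields(PaperInfo)}
    missing = expected - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for name in ("title", "summary"):
        if not isinstance(data[name], str):
            raise ValueError(f"{name} must be a string")
    for name in ("key_points", "applications"):
        if not isinstance(data[name], list) or not all(isinstance(x, str) for x in data[name]):
            raise ValueError(f"{name} must be a list of strings")
    return PaperInfo(**{name: data[name] for name in expected})


# Made-up LLM reply with chatter around the JSON payload
raw = (
    "Here is the result:\n"
    '{"title": "AI agents", "key_points": ["reasoning", "planning"], '
    '"applications": ["automation"], "summary": "Agents plan and act."}\n'
    "Done."
)
info = parse_llm_json(raw)
print(info.title, info.key_points)
```

Failing loudly on a missing or mistyped field is the point of structured outputs: downstream code can rely on the shape of the data instead of re-parsing free text.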

In [7]:
def create_advanced_rag_pipeline(index, similarity_cutoff: float = 0.3, top_k: int = 10):
    """Create a comprehensive advanced RAG pipeline combining multiple techniques."""
    # Create similarity postprocessor
    similarity_processor = SimilarityPostprocessor(similarity_cutoff=similarity_cutoff)
    
    # Create TreeSummarize for comprehensive responses
    tree_synthesizer = TreeSummarize()
    
    # Create the comprehensive query engine combining both techniques
    advanced_engine = index.as_query_engine(
        response_synthesizer=tree_synthesizer,
        node_postprocessors=[similarity_processor],
        similarity_top_k=top_k
    )
    
    return advanced_engine

# Test the comprehensive pipeline
advanced_pipeline = create_advanced_rag_pipeline(index)
print("✅ Advanced RAG pipeline created successfully!")
print("   🔧 Similarity filtering: ✅")
print("   🌳 TreeSummarize synthesis: ✅")

# Test with complex query
complex_query = "Analyze the current state and future potential of AI agent technologies"
print(f"\n🔍 Testing complex query: '{complex_query}'")
response = advanced_pipeline.query(complex_query)
print(f"🚀 Advanced RAG Response:\n{response}")

print("\n🎯 This provides:")
print("   - Filtered relevant results only")
print("   - Comprehensive analytical response")
print("   - Combined postprocessing and synthesis")


✅ Advanced RAG pipeline created successfully!
   🔧 Similarity filtering: ✅
   🌳 TreeSummarize synthesis: ✅

🔍 Testing complex query: 'Analyze the current state and future potential of AI agent technologies'


2025-09-21 09:33:47,449 - INFO - query_type :, vector
2025-09-21 09:33:48,585 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:33:53,696 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:34:00,487 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"


🚀 Advanced RAG Response:
AI agent technologies are currently advancing significantly, particularly in their ability to achieve complex goals through improved reasoning, planning, and tool execution. Both single-agent and multi-agent systems have shown strong performance across various benchmarks and problem types. However, challenges remain, including the need for comprehensive evaluation benchmarks, real-world applicability, and addressing biases in language models.

Looking to the future, there is substantial potential for AI agents, especially in enhancing reliability and addressing existing limitations. Key areas for improvement include the development of standardized evaluation metrics, robust integration with existing systems, and a focus on scalability and real-time performance. The financial sector, in particular, has seen specialized frameworks that have led to notable productivity gains, with multi-agent systems showing promise in complex tasks like algorithmic trading and fr
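
The order of operations in the combined pipeline (retrieve, postprocess, synthesize) can be sketched with plain callables. Everything here is a toy under assumptions: `run_pipeline`, the stage functions, the scores, and the fallback message are hypothetical, and the explicit empty-context branch is a design suggestion rather than something the notebook's engine is configured to do.

```python
from typing import Callable, List, Tuple

Node = Tuple[str, float]  # (text, similarity score), a hypothetical stand-in


def run_pipeline(
    query: str,
    retrieve: Callable[[str], List[Node]],
    postprocessors: List[Callable[[List[Node]], List[Node]]],
    synthesize: Callable[[str, List[str]], str],
    empty_answer: str = "No sufficiently relevant context found.",
) -> str:
    """Retrieve nodes, run each postprocessor in order, then synthesize an answer."""
    nodes = retrieve(query)
    for postprocess in postprocessors:
        nodes = postprocess(nodes)
    if not nodes:
        return empty_answer  # nothing survived filtering, so skip synthesis
    return synthesize(query, [text for text, _ in nodes])


def toy_retrieve(query: str) -> List[Node]:
    """Made-up retrieval results with made-up scores."""
    return [("agents use tools", 0.8), ("agents plan tasks", 0.5), ("pasta recipe", 0.1)]


def make_cutoff(threshold: float) -> Callable[[List[Node]], List[Node]]:
    """Build a postprocessor that drops nodes scoring below threshold."""
    return lambda nodes: [n for n in nodes if n[1] >= threshold]


def toy_synthesize(query: str, texts: List[str]) -> str:
    """Stand-in for the LLM synthesis step."""
    return f"{query} -> " + "; ".join(texts)


answer = run_pipeline("What do agents do?", toy_retrieve, [make_cutoff(0.3)], toy_synthesize)
strict = run_pipeline("What do agents do?", toy_retrieve, [make_cutoff(0.9)], toy_synthesize)
print(answer)
print(strict)
```

Filtering before synthesis means the synthesizer only spends tokens on chunks that cleared the threshold, which is why the two techniques compose cleanly.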

In [8]:
# Final demonstration: Advanced RAG Techniques
print("🚀 Advanced RAG Techniques - Complete Solution")
print("=" * 60)

# Test queries for comparison
test_queries = [
    "What are the key capabilities of AI agents?",
    "How do you evaluate agent performance metrics?",
    "Explain the benefits and challenges of multimodal AI systems"
]

# Create basic query engine for comparison
basic_engine = index.as_query_engine(similarity_top_k=5)

print("\n🆚 COMPARISON: Basic vs Advanced RAG")
print("=" * 60)

for i, query in enumerate(test_queries, 1):
    print(f"\n📋 Test Query {i}: '{query}'")
    print("-" * 50)
    
    # Basic RAG
    print("🔹 Basic RAG:")
    basic_response = basic_engine.query(query)
    print(f"   Response: {str(basic_response)[:200]}...")
    
    # Advanced RAG
    print("\n🔸 Advanced RAG:")
    advanced_response = advanced_pipeline.query(query)
    print(f"   Response: {advanced_response}")

print("\n" + "=" * 60)
print("🎉 Advanced RAG Solution Complete!")
print("   ✅ Node postprocessors for result filtering")
print("   ✅ Response synthesizers for better answers")
print("   ✅ Structured outputs for reliable data")
print("   ✅ Advanced pipeline combining filtering and synthesis")

print("\n🚀 Key Benefits Demonstrated:")
print("   - Better precision through similarity filtering")
print("   - Comprehensive analysis with TreeSummarize")
print("   - Reliable structured outputs for integration")
print("   - Production-ready advanced RAG systems")


🚀 Advanced RAG Techniques - Complete Solution

🆚 COMPARISON: Basic vs Advanced RAG

📋 Test Query 1: 'What are the key capabilities of AI agents?'
--------------------------------------------------
🔹 Basic RAG:


2025-09-21 09:34:05,490 - INFO - query_type :, vector
2025-09-21 09:34:06,938 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:34:09,959 - INFO - query_type :, vector


   Response: AI agents require several key capabilities to effectively solve real-world problems. These include:

1. **Reasoning**: This is essential for making decisions, solving problems, and interacting with co...

🔸 Advanced RAG:


2025-09-21 09:34:10,803 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:34:15,975 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:34:20,312 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"


   Response: AI agents possess several key capabilities that enable them to effectively address real-world challenges. These include:

1. **Reasoning**: They have strong reasoning abilities that allow them to make decisions, solve problems, and learn new tasks quickly, even in uncertain environments.

2. **Planning**: Effective planning involves breaking down tasks into manageable sub-tasks and refining plans based on new information, enhancing their task execution.

3. **Tool Calling**: AI agents can utilize multiple tools to interact with external environments, leveraging various functions and data sources to tackle complex problems.

4. **Iterative Execution**: They follow a plan-act-evaluate process, allowing them to refine their approach based on feedback and results from previous actions.

5. **Collaboration**: In multi-agent systems, they can work together, sharing information and tasks to enhance overall effectiveness, especially in scenarios requiring diverse skills.

6. **Rol

2025-09-21 09:34:25,080 - INFO - query_type :, vector
2025-09-21 09:34:26,437 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:34:28,458 - INFO - query_type :, vector


   Response: Agent performance metrics can be evaluated using several key performance indicators (KPIs). These include:

1. **Accuracy**: This measures the correctness of agent predictions and decisions in various...

🔸 Advanced RAG:


2025-09-21 09:34:29,803 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:34:34,388 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:34:38,768 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"


   Response: Agent performance metrics can be evaluated through a variety of key performance indicators (KPIs) that focus on different aspects of the agent's capabilities. Important metrics include:

1. **Accuracy**: Assessing the correctness of the agent's predictions and decisions.
2. **Efficiency**: Evaluating the speed and resource consumption in task completion.
3. **Robustness**: Measuring the agent's ability to handle noisy or incomplete data and unexpected events.
4. **Explainability**: Determining how well the agent's decisions can be justified.

Objective evaluation metrics may involve success rates, similarity of outputs to human responses, and overall efficiency. More nuanced measures, such as the effectiveness of tool use and reliability in planning, are also significant but can be harder to quantify and may require expert evaluation.

Regular monitoring and evaluation are essential for continuous improvement. This includes assessing reasoning abilities, communication effe

2025-09-21 09:34:43,623 - INFO - query_type :, vector
2025-09-21 09:34:44,632 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:34:47,578 - INFO - query_type :, vector


   Response: The discussion primarily focuses on multi-agent architectures, which can be seen as a form of multimodal AI systems. The benefits of such architectures include the intelligent division of labor based ...

🔸 Advanced RAG:


2025-09-21 09:34:49,030 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:34:52,159 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"
2025-09-21 09:34:57,643 - INFO - HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 200 OK"


   Response: Multimodal AI systems offer several benefits, including enhanced understanding by integrating various types of data, which leads to improved performance in tasks such as image captioning and video analysis. They also facilitate more natural and intuitive user interactions, allowing engagement through multiple channels like voice commands paired with visual feedback. Additionally, these systems have a broader application range, making them suitable for diverse fields such as healthcare, education, and entertainment.

However, there are notable challenges associated with multimodal AI systems. The complexity of integrating data from different modalities necessitates sophisticated architectures and algorithms, complicating the development process. Data alignment issues can arise, making it difficult to ensure that information from various sources is interpretable and coherent, especially when dealing with different structures and formats. Furthermore, these systems often requ
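
For a more even-handed comparison than eyeballing printed text, the loop above could collect the same measurements for every engine. The harness below is a sketch under assumptions: `compare_engines` is a hypothetical helper, the two toy engines return canned strings, and latency and answer length are only rough proxies for quality.

```python
import time
from typing import Callable, Dict, List


def compare_engines(
    queries: List[str],
    engines: Dict[str, Callable[[str], object]],
    preview_chars: int = 200,
) -> List[dict]:
    """Run every query through every engine and record timing and answer size."""
    rows = []
    for query in queries:
        for name, ask in engines.items():
            started = time.perf_counter()
            answer = str(ask(query))  # str() also works for response objects
            elapsed = time.perf_counter() - started
            rows.append({
                "query": query,
                "engine": name,
                "seconds": elapsed,
                "chars": len(answer),
                "preview": answer[:preview_chars],
            })
    return rows


def toy_basic(query: str) -> str:
    """Canned stand-in for a basic engine."""
    return f"Short answer to {query}"


def toy_advanced(query: str) -> str:
    """Canned stand-in for an advanced engine."""
    return f"Longer, more detailed answer to {query} with supporting points"


rows = compare_engines(["q1", "q2"], {"basic": toy_basic, "advanced": toy_advanced})
for row in rows:
    print(row["engine"], row["chars"], row["preview"])
```

With the notebook's objects, the engines dictionary could be `{"basic": basic_engine.query, "advanced": advanced_pipeline.query}`, so both answers are previewed at the same length.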