# Assignment 2: Advanced RAG Techniques
## Day 6 Session 2 - Advanced RAG Fundamentals

**OBJECTIVE:** Implement advanced RAG techniques including postprocessors, response synthesizers, and structured outputs.

**LEARNING GOALS:**
- Understand and implement node postprocessors for filtering and reranking
- Learn different response synthesis strategies (TreeSummarize, Refine)
- Create structured outputs using Pydantic models
- Build advanced retrieval pipelines with multiple processing stages

**DATASET:** Use the same data folder as Assignment 1 (`Day_6/session_2/data/`)

**PREREQUISITES:** Complete Assignment 1 first

**INSTRUCTIONS:**
1. Complete each function by replacing the TODO comments with actual implementation
2. Run each cell after completing the function to test it
3. The answers can be found in the `03_advanced_rag_techniques.ipynb` notebook
4. Each technique builds on the previous one


In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
!pip install -r '/content/drive/MyDrive/requirements.txt'



In [5]:
# Import required libraries for advanced RAG
import os
from pathlib import Path
from typing import Dict, List, Optional, Any
from pydantic import BaseModel, Field

# Core LlamaIndex components
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext, Settings
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever

# Vector store
from llama_index.vector_stores.lancedb import LanceDBVectorStore

# Embeddings and LLM
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openrouter import OpenRouter

# Advanced RAG components (we'll use these in the assignments)
from llama_index.core.postprocessor import SimilarityPostprocessor
from llama_index.core.response_synthesizers import TreeSummarize, Refine, CompactAndRefine
from llama_index.core.output_parsers import PydanticOutputParser
from google.colab import userdata

print("✅ Advanced RAG libraries imported successfully!")


✅ Advanced RAG libraries imported successfully!


In [6]:
# Configure Advanced RAG Settings (Using OpenRouter)
def setup_advanced_rag_settings():
    """
    Configure LlamaIndex with optimized settings for advanced RAG.
    Uses local embeddings and OpenRouter for LLM operations.
    """
    # Check for OpenRouter API key
    api_key = userdata.get('OPENROUTER_API_KEY')
    if not api_key:
        print("⚠️  OPENROUTER_API_KEY not found - LLM operations will be limited")
        print("   You can still complete postprocessor and retrieval exercises")
    else:
        print("✅ OPENROUTER_API_KEY found - full advanced RAG functionality available")

        # Configure OpenRouter LLM
        Settings.llm = OpenRouter(
            api_key=api_key,
            model="gpt-4o",
            temperature=0.1  # Lower temperature for more consistent responses
        )

    # Configure local embeddings (no API key required)
    Settings.embed_model = HuggingFaceEmbedding(
        model_name="BAAI/bge-small-en-v1.5",
        trust_remote_code=True
    )

    # Advanced RAG configuration
    Settings.chunk_size = 512  # Smaller chunks for better precision
    Settings.chunk_overlap = 50

    print("✅ Advanced RAG settings configured")
    print("   - Chunk size: 512 (optimized for precision)")
    print("   - Using local embeddings for cost efficiency")
    if api_key:
        # Only report the LLM as ready when it was actually configured above
        print("   - OpenRouter LLM ready for response synthesis")

# Setup the configuration
setup_advanced_rag_settings()


✅ OPENROUTER_API_KEY found - full advanced RAG functionality available
✅ Advanced RAG settings configured
   - Chunk size: 512 (optimized for precision)
   - Using local embeddings for cost efficiency
   - OpenRouter LLM ready for response synthesis


In [8]:
# Setup: Create index from Assignment 1 (reuse the basic functionality)
def setup_basic_index(data_folder: str = "/content/drive/MyDrive/data", force_rebuild: bool = False):
    """
    Create a basic vector index that we'll enhance with advanced techniques.
    This reuses the concepts from Assignment 1.
    """
    # Create vector store
    vector_store = LanceDBVectorStore(
        uri="./advanced_rag_vectordb",
        table_name="documents"
    )

    # Load documents
    if not Path(data_folder).exists():
        print(f"❌ Data folder not found: {data_folder}")
        return None

    reader = SimpleDirectoryReader(input_dir=data_folder, recursive=True)
    documents = reader.load_data()

    # Create storage context and index
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = VectorStoreIndex.from_documents(
        documents,
        storage_context=storage_context,
        show_progress=True
    )

    print(f"✅ Basic index created with {len(documents)} documents")
    print("   Ready for advanced RAG techniques!")
    return index

# Create the basic index
print("📁 Setting up basic index for advanced RAG...")
index = setup_basic_index()

if index:
    print("🚀 Ready to implement advanced RAG techniques!")
else:
    print("❌ Failed to create index - check data folder path")




📁 Setting up basic index for advanced RAG...


100%|███████████████████████████████████████| 139M/139M [00:12<00:00, 11.9MiB/s]


Parsing nodes:   0%|          | 0/42 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/89 [00:00<?, ?it/s]

✅ Basic index created with 42 documents
   Ready for advanced RAG techniques!
🚀 Ready to implement advanced RAG techniques!


## 1. Node Postprocessors - Similarity Filtering

**Concept:** Postprocessors refine retrieval results after the initial vector search. The `SimilarityPostprocessor` filters out chunks that fall below a relevance threshold.

**Why it matters:** Raw vector search often returns some irrelevant results. Filtering improves precision and response quality.
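
To make the filtering step concrete, here is a minimal, library-free sketch of what a similarity cutoff does to a list of scored chunks. The `ScoredChunk` class, the `filter_by_similarity` helper, and the scores are all hypothetical and exist only for illustration; this is not LlamaIndex's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredChunk:
    """Hypothetical stand-in for a retrieved node with a similarity score."""
    text: str
    score: float  # similarity to the query; higher means more relevant


def filter_by_similarity(
    chunks: List[ScoredChunk],
    similarity_cutoff: float = 0.3,
    top_k: int = 10,
) -> List[ScoredChunk]:
    """Keep the top_k highest-scoring chunks, then drop those below the cutoff."""
    ranked = sorted(chunks, key=lambda c: c.score, reverse=True)[:top_k]
    return [c for c in ranked if c.score >= similarity_cutoff]


# Made-up retrieval results for illustration only
retrieved = [
    ScoredChunk("Agents plan and call tools.", 0.82),
    ScoredChunk("Recipe for banana bread.", 0.12),
    ScoredChunk("Multi-agent systems run tasks in parallel.", 0.55),
]

kept = filter_by_similarity(retrieved, similarity_cutoff=0.3)
print([c.text for c in kept])
```

Note the trade-off this exposes: a higher cutoff raises precision, but a cutoff set too high can discard every chunk and leave the synthesizer with no context.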

Complete the function below to create a query engine with similarity filtering.


In [10]:
def create_query_engine_with_similarity_filter(index, similarity_cutoff: float = 0.3, top_k: int = 10):
    """
    Create a query engine that filters results based on similarity scores.

    TODO: Complete this function to create a query engine with similarity postprocessing.
    HINT: Use index.as_query_engine() with node_postprocessors parameter containing SimilarityPostprocessor

    Args:
        index: Vector index to query
        similarity_cutoff: Minimum similarity score (0.0 to 1.0)
        top_k: Number of initial results to retrieve before filtering

    Returns:
        Query engine with similarity filtering
    """
    # Postprocessor that drops nodes scoring below the cutoff
    similarity_processor = SimilarityPostprocessor(similarity_cutoff=similarity_cutoff)

    # Retrieve top_k candidates, then filter them before synthesis
    query_engine = index.as_query_engine(
        similarity_top_k=top_k,
        node_postprocessors=[similarity_processor],
    )

    return query_engine

# Test the function
if index:
    filtered_engine = create_query_engine_with_similarity_filter(index, similarity_cutoff=0.3)

    if filtered_engine:
        print("✅ Query engine with similarity filtering created")

        # Test query
        test_query = "What are the benefits of AI agents?"
        print(f"\n🔍 Testing query: '{test_query}'")

        # Run the test query through the filtered engine
        response = filtered_engine.query(test_query)
        print(f"📝 Response: {response}")
        print("   (Complete the function above to test the response)")
    else:
        print("❌ Failed to create filtered query engine")
else:
    print("❌ No index available - run previous cells first")


✅ Query engine with similarity filtering created

🔍 Testing query: 'What are the benefits of AI agents?'
📝 Response: AI agents offer several benefits, including enhanced problem-solving capabilities and the ability to autonomously manage complex tasks. They are equipped with reasoning and planning skills, which allow them to interact effectively with complex environments and assist humans in various tasks. AI agents can operate within single or multi-agent architectures, providing flexibility in task execution and collaboration. Multi-agent systems, in particular, excel in executing tasks in parallel and incorporating diverse feedback, which improves performance in scenarios without predefined examples. Additionally, AI agents can adapt to new information and iteratively refine their actions, leading to more robust decision-making.
   (Complete the function above to test the response)


## 2. Response Synthesizers - TreeSummarize

**Concept:** Response synthesizers control how retrieved information becomes final answers. `TreeSummarize` builds responses hierarchically, ideal for complex analytical questions.

**Why it matters:** Different synthesis strategies work better for different query types. TreeSummarize excels at comprehensive analysis and long-form responses.
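
The hierarchical idea can be sketched without any LLM: summarize small groups of chunks, then summarize the summaries, until a single text remains. In the sketch below, `tree_summarize`, `fake_llm_summary`, and the fixed `fan_in` grouping are assumptions made for illustration. As far as the verbose log suggests ("text chunks after repacking"), LlamaIndex's `TreeSummarize` packs chunks by prompt size rather than by a fixed count.

```python
from typing import Callable, List


def tree_summarize(
    chunks: List[str],
    summarize: Callable[[List[str]], str],
    fan_in: int = 2,
) -> str:
    """Bottom-up summarization: merge groups of `fan_in` texts per level."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    if not chunks:
        return ""
    level = list(chunks)
    while True:
        # Summarize each group of up to `fan_in` texts into one text
        level = [
            summarize(level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
        if len(level) == 1:
            return level[0]


def fake_llm_summary(texts: List[str]) -> str:
    """Stand-in for an LLM call; it only shows how texts get grouped."""
    return "(" + " + ".join(texts) + ")"


result = tree_summarize(["A", "B", "C", "D"], fake_llm_summary)
print(result)  # ((A + B) + (C + D))
```

The nesting in the printed result mirrors the tree: leaves are merged pairwise first, and the root call sees only the intermediate summaries.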

Complete the function below to create a query engine with TreeSummarize response synthesis.


In [11]:
def create_query_engine_with_tree_summarize(index, top_k: int = 5):
    """
    Create a query engine that uses TreeSummarize for comprehensive responses.

    TODO: Complete this function to create a query engine with TreeSummarize synthesis.
    HINT: Create a TreeSummarize instance, then use index.as_query_engine() with response_synthesizer parameter

    Args:
        index: Vector index to query
        top_k: Number of results to retrieve

    Returns:
        Query engine with TreeSummarize synthesis
    """
    # Synthesizer that builds the answer hierarchically from the retrieved chunks
    tree_synthesizer = TreeSummarize(verbose=True)

    # Query engine that retrieves top_k chunks and hands them to the synthesizer
    query_engine = index.as_query_engine(
        response_synthesizer=tree_synthesizer,
        similarity_top_k=top_k,
    )

    return query_engine

# Test the function
if index:
    tree_engine = create_query_engine_with_tree_summarize(index)

    if tree_engine:
        print("✅ Query engine with TreeSummarize created")

        # Test with a complex analytical query
        analytical_query = "Compare the advantages and disadvantages of different AI agent frameworks"
        print(f"\n🔍 Testing analytical query: '{analytical_query}'")

        # Run the analytical query through the TreeSummarize engine
        response = tree_engine.query(analytical_query)
        print(f"📝 TreeSummarize Response:\n{response}")
        print("   (Complete the function above to test comprehensive analysis)")
    else:
        print("❌ Failed to create TreeSummarize query engine")
else:
    print("❌ No index available - run previous cells first")


✅ Query engine with TreeSummarize created

🔍 Testing analytical query: 'Compare the advantages and disadvantages of different AI agent frameworks'
1 text chunks after repacking
📝 TreeSummarize Response:
Different AI agent frameworks offer various advantages and disadvantages based on their design and intended use cases:

1. **Autonomous Agents:**
   - **Advantages:** Frameworks like AutoGPT and BabyAGI are pioneers in autonomous task execution, allowing for high levels of automation and independence in task handling.
   - **Disadvantages:** They often come with a steep learning curve and can be complex to implement, making them less suitable for beginners.

2. **Tool-Using Agents:**
   - **Advantages:** Frameworks such as LangChain and LlamaIndex are designed for specific applications like LLM applications and document understanding, respectively. They offer a more straightforward setup and are generally easier to use, making them ideal for beginners.
   - **Disadvantages:** Th

## 3. Structured Outputs with Pydantic Models

**Concept:** Structured outputs ensure predictable, parseable responses using Pydantic models. This is essential for API endpoints and data pipelines.

**Why it matters:** Instead of free-text responses, you get type-safe, validated data structures that applications can reliably process.
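
To show what validation buys you, here is a standard-library sketch of the kind of check a Pydantic model performs automatically when it parses an LLM's JSON reply. The `PaperInfo` dataclass, the `parse_paper_info` helper, and the sample reply are hypothetical; the dataclass simply mirrors the fields of the `ResearchPaperInfo` model defined in the next cell.

```python
import json
from dataclasses import dataclass, fields
from typing import List


@dataclass
class PaperInfo:
    """Hypothetical stdlib mirror of the ResearchPaperInfo fields."""
    title: str
    key_points: List[str]
    applications: List[str]
    summary: str


def parse_paper_info(raw: str) -> PaperInfo:
    """Parse a JSON reply and reject it if fields are missing or mistyped."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name for f in fields(PaperInfo)}
    missing = expected - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for name in ("title", "summary"):
        if not isinstance(data[name], str):
            raise ValueError(f"{name} must be a string")
    for name in ("key_points", "applications"):
        value = data[name]
        if not (isinstance(value, list) and all(isinstance(x, str) for x in value)):
            raise ValueError(f"{name} must be a list of strings")
    return PaperInfo(**{name: data[name] for name in expected})


# Made-up LLM reply for illustration only
llm_reply = (
    '{"title": "AI Agents", "key_points": ["plan", "use tools"], '
    '"applications": ["support bots"], "summary": "Agents act autonomously."}'
)
info = parse_paper_info(llm_reply)
print(info.key_points)
```

With Pydantic, the field declarations replace all of the manual checks above, which is the main reason to prefer it for LLM output.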

Complete the function below to create a structured output system for extracting research paper information.


In [21]:
# First, define the Pydantic models for structured outputs
class ResearchPaperInfo(BaseModel):
    """Structured information about a research paper or AI concept."""
    title: str = Field(description="The main title or concept name")
    key_points: List[str] = Field(description="3-5 main points or findings")
    applications: List[str] = Field(description="Practical applications or use cases")
    summary: str = Field(description="Brief 2-3 sentence summary")

# Import the missing component
from llama_index.core.program import LLMTextCompletionProgram

def create_structured_output_program(output_model: type[BaseModel] = ResearchPaperInfo):
    """
    Create a structured output program using Pydantic models.

    TODO: Complete this function to create a structured output program.
    HINT: Use LLMTextCompletionProgram.from_defaults() with PydanticOutputParser and a prompt template

    Args:
        output_model: Pydantic model class for structured output

    Returns:
        LLMTextCompletionProgram that returns structured data
    """
    # Parser that validates the LLM output against the Pydantic model
    output_parser = PydanticOutputParser(output_cls=output_model)

    # Program that prompts the LLM and parses its reply into output_model
    program = LLMTextCompletionProgram.from_defaults(
        output_parser=output_parser,
        prompt_template_str="Extract structured information from the following text:\n{input_text}\n",
        verbose=True,
    )

    print("✅ Structured output program created")

    return program

# Test the function
if index:
    structured_program = create_structured_output_program(ResearchPaperInfo)

    if structured_program:
        print("✅ Structured output program created")

        # Test with retrieval and structured extraction
        structure_query = "Tell me about AI agents and their capabilities"
        print(f"\n🔍 Testing structured query: '{structure_query}'")

        # Retrieve the top 3 chunks and join them into one context string
        retriever = VectorIndexRetriever(index=index, similarity_top_k=3)
        nodes = retriever.retrieve(structure_query)
        context = "\n".join([node.text for node in nodes])

        # Extract a structured ResearchPaperInfo object from the context
        response = structured_program(input_text=context)
        print(f"📊 Structured Response:\n{response}")
        print("   (Complete the function above to get structured JSON output)")

        print("\n💡 Expected output format:")
        print("   - title: String")
        print("   - key_points: List of strings")
        print("   - applications: List of strings")
        print("   - summary: String")
    else:
        print("❌ Failed to create structured output program")
else:
    print("❌ No index available - run previous cells first")


✅ Structured output program created
✅ Structured output program created

🔍 Testing structured query: 'Tell me about AI agents and their capabilities'
📊 Structured Response:
title='The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey' key_points=['AI-driven agents have notable limitations and areas for future improvement.', 'Multi-agent architectures are categorized into vertical and horizontal structures.', 'Reasoning, planning, and tool calling are critical to agent success.', 'The survey provides insights into single-agent and multi-agent architectures.', 'The paper outlines key themes for selecting agentic architectures and their impact.'] applications=['Enhancing AI agent capabilities for real-world problem-solving.', 'Developing autonomous agent-based systems.', 'Improving AI communication and collaboration through multi-agent architectures.'] summary='This survey paper examines advancements in AI agent implementations, foc

## 4. Advanced Pipeline - Combining All Techniques

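**Concept:** Combine the techniques above into a single query engine: similarity filtering plus TreeSummarize response synthesis. Structured output (Section 3) remains a separate extraction step applied to retrieved context rather than part of the query engine itself.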

**Why it matters:** Production RAG systems often need multiple techniques working together for optimal results.
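
The staging can be sketched as plain function composition: retrieve, run each postprocessor in order, then synthesize. Everything in this sketch (`run_pipeline`, `toy_retrieve`, `cutoff`, `toy_synthesize`, the scores, and the fallback message) is hypothetical and stands in for the real retriever, postprocessors, and synthesizer.

```python
from typing import Callable, List, Tuple

ScoredText = Tuple[str, float]  # (chunk text, similarity score)


def run_pipeline(
    query: str,
    retrieve: Callable[[str], List[ScoredText]],
    postprocessors: List[Callable[[List[ScoredText]], List[ScoredText]]],
    synthesize: Callable[[str, List[str]], str],
) -> str:
    """Retrieve, apply each postprocessor in order, then synthesize an answer."""
    nodes = retrieve(query)
    for postprocess in postprocessors:
        nodes = postprocess(nodes)
    if not nodes:
        # Guard: a strict filter can remove every candidate
        return "No sufficiently relevant context found."
    return synthesize(query, [text for text, _ in nodes])


def toy_retrieve(query: str) -> List[ScoredText]:
    """Made-up retrieval results for illustration only."""
    return [("Agents plan multi-step tasks.", 0.7), ("Unrelated footer text.", 0.1)]


def cutoff(threshold: float) -> Callable[[List[ScoredText]], List[ScoredText]]:
    """Build a postprocessor that keeps nodes scoring at or above threshold."""
    return lambda nodes: [n for n in nodes if n[1] >= threshold]


def toy_synthesize(query: str, texts: List[str]) -> str:
    """Stand-in for an LLM synthesizer; it just joins the surviving texts."""
    return f"{query} -> {' '.join(texts)}"


answer = run_pipeline("What do agents do?", toy_retrieve, [cutoff(0.3)], toy_synthesize)
print(answer)
```

Keeping each stage as a separate callable makes it easy to reorder or swap stages, and the empty-context guard covers the case where the cutoff is stricter than any retrieved score.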

Complete the function below to create a comprehensive advanced RAG pipeline.


In [22]:
def create_advanced_rag_pipeline(index, similarity_cutoff: float = 0.3, top_k: int = 10):
    """
    Create a comprehensive advanced RAG pipeline combining multiple techniques.

    TODO: Complete this function to create the ultimate advanced RAG query engine.
    HINT: Combine SimilarityPostprocessor + TreeSummarize using index.as_query_engine()

    Args:
        index: Vector index to query
        similarity_cutoff: Minimum similarity score for filtering
        top_k: Number of initial results to retrieve

    Returns:
        Advanced query engine with filtering and synthesis combined
    """
    # Postprocessor that drops nodes scoring below the cutoff
    similarity_processor = SimilarityPostprocessor(similarity_cutoff=similarity_cutoff)

    # Synthesizer that builds the answer hierarchically
    tree_synthesizer = TreeSummarize(verbose=True)

    # Query engine that retrieves, filters, then synthesizes
    advanced_engine = index.as_query_engine(
        similarity_top_k=top_k,
        node_postprocessors=[similarity_processor],
        response_synthesizer=tree_synthesizer,
    )

    return advanced_engine

# Test the comprehensive pipeline
if index:
    advanced_pipeline = create_advanced_rag_pipeline(index)

    if advanced_pipeline:
        print("✅ Advanced RAG pipeline created successfully!")
        print("   🔧 Similarity filtering: ✅")
        print("   🌳 TreeSummarize synthesis: ✅")

        # Test with complex query
        complex_query = "Analyze the current state and future potential of AI agent technologies"
        print(f"\n🔍 Testing complex query: '{complex_query}'")

        # Run the complex query through the combined pipeline
        response = advanced_pipeline.query(complex_query)
        print(f"🚀 Advanced RAG Response:\n{response}")
        print("   (Complete the function above to test the full pipeline)")

        print("\n🎯 This should provide:")
        print("   - Filtered relevant results only")
        print("   - Comprehensive analytical response")
        print("   - Combined postprocessing and synthesis")
    else:
        print("❌ Failed to create advanced RAG pipeline")
else:
    print("❌ No index available - run previous cells first")


✅ Advanced RAG pipeline created successfully!
   🔧 Similarity filtering: ✅
   🌳 TreeSummarize synthesis: ✅

🔍 Testing complex query: 'Analyze the current state and future potential of AI agent technologies'
2 text chunks after repacking
1 text chunks after repacking
🚀 Advanced RAG Response:
AI agent technologies are currently in a promising phase, characterized by significant advancements that enable these systems to achieve complex goals through improved reasoning, planning, and tool execution capabilities. They are evolving from simple chat interfaces to more dynamic, autonomous systems capable of performing general-purpose tasks. Single-agent systems are effective in well-defined scenarios, while multi-agent systems are advantageous for tasks requiring collaboration and diverse skills.

In the financial sector, AI agent technologies are particularly advancing, leveraging multi-agent systems and large language models to create intelligent financial agents that can c

## 5. Final Test - Compare Basic vs Advanced RAG

Once you've completed all the functions above, run this cell to compare basic RAG with your advanced techniques.


In [23]:
# Final comparison: Basic vs Advanced RAG
print("🚀 Advanced RAG Techniques Assignment - Final Test")
print("=" * 60)

# Test queries for comparison
test_queries = [
    "What are the key capabilities of AI agents?",
    "How do you evaluate agent performance metrics?",
    "Explain the benefits and challenges of multimodal AI systems"
]

# Check if all components were created
components_status = {
    "Basic Index": index is not None,
    "Similarity Filter": 'filtered_engine' in locals() and filtered_engine is not None,
    "TreeSummarize": 'tree_engine' in locals() and tree_engine is not None,
    "Structured Output": 'structured_program' in locals() and structured_program is not None,
    "Advanced Pipeline": 'advanced_pipeline' in locals() and advanced_pipeline is not None
}

print("\n📊 Component Status:")
for component, status in components_status.items():
    status_icon = "✅" if status else "❌"
    print(f"   {status_icon} {component}")

# Create basic query engine for comparison
if index:
    print("\n🔍 Creating basic query engine for comparison...")
    basic_engine = index.as_query_engine(similarity_top_k=5)

    print("\n" + "=" * 60)
    print("🆚 COMPARISON: Basic vs Advanced RAG")
    print("=" * 60)

    for i, query in enumerate(test_queries, 1):
        print(f"\n📋 Test Query {i}: '{query}'")
        print("-" * 50)

        # Basic RAG
        print("🔹 Basic RAG:")
        if basic_engine:
            # Uncomment when testing:
            # basic_response = basic_engine.query(query)
            # print(f"   Response: {str(basic_response)[:200]}...")
            print("   (Standard vector search + simple response)")

        # Advanced RAG (if implemented)
        print("\n🔸 Advanced RAG:")
        if components_status["Advanced Pipeline"]:
            # Uncomment when testing:
            # advanced_response = advanced_pipeline.query(query)
            # print(f"   Response: {advanced_response}")
            print("   (Filtered + TreeSummarize + Structured output)")
        else:
            print("   Complete the advanced pipeline function to test")

# Final status
print("\n" + "=" * 60)
print("🎯 Assignment Status:")
completed_count = sum(components_status.values())
total_count = len(components_status)

print(f"   Completed: {completed_count}/{total_count} components")

if completed_count == total_count:
    print("\n🎉 Congratulations! You've mastered Advanced RAG Techniques!")
    print("   ✅ Node postprocessors for result filtering")
    print("   ✅ Response synthesizers for better answers")
    print("   ✅ Structured outputs for reliable data")
    print("   ✅ Advanced pipelines combining all techniques")
    print("\n🚀 You're ready for production RAG systems!")
else:
    missing = total_count - completed_count
    print(f"\n📝 Complete {missing} more components to finish the assignment:")
    for component, status in components_status.items():
        if not status:
            print(f"   - {component}")

print("\n💡 Key learnings:")
print("   - Postprocessors improve result relevance and precision")
print("   - Different synthesizers work better for different query types")
print("   - Structured outputs enable reliable system integration")
print("   - Advanced techniques can be combined for production systems")


🚀 Advanced RAG Techniques Assignment - Final Test

📊 Component Status:
   ✅ Basic Index
   ✅ Similarity Filter
   ✅ TreeSummarize
   ✅ Structured Output
   ✅ Advanced Pipeline

🔍 Creating basic query engine for comparison...

🆚 COMPARISON: Basic vs Advanced RAG

📋 Test Query 1: 'What are the key capabilities of AI agents?'
--------------------------------------------------
🔹 Basic RAG:
   (Standard vector search + simple response)

🔸 Advanced RAG:
   (Filtered + TreeSummarize + Structured output)

📋 Test Query 2: 'How do you evaluate agent performance metrics?'
--------------------------------------------------
🔹 Basic RAG:
   (Standard vector search + simple response)

🔸 Advanced RAG:
   (Filtered + TreeSummarize + Structured output)

📋 Test Query 3: 'Explain the benefits and challenges of multimodal AI systems'
--------------------------------------------------
🔹 Basic RAG:
   (Standard vector search + simple response)

🔸 Advanced RAG: