# 07 - Adaptive RAG (Query Complexity Router)

**Complexity:** ⭐⭐⭐⭐

**Use Cases:** Mixed workloads, cost optimization, performance/quality balance

**Key Feature:** Classifies each query's complexity with an LLM and routes it to a matching retrieval strategy.

**Routing Logic:**
```
SIMPLE queries    → Fast similarity search
MEDIUM queries    → MMR for diversity
COMPLEX queries   → HyDE (Hypothetical Document Embeddings) for better semantic matching
```

## 1. Setup

In [1]:
import sys
sys.path.append('../..')

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from shared.config import OPENAI_VECTOR_STORE_PATH, DEFAULT_MODEL
from shared.utils import load_vector_store, print_section_header, format_docs
from shared.prompts import COMPLEXITY_CLASSIFIER_PROMPT, ADAPTIVE_RAG_PROMPT, HYDE_PROMPT
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda

print_section_header("Setup: Adaptive RAG")

embeddings = OpenAIEmbeddings()
vectorstore = load_vector_store(OPENAI_VECTOR_STORE_PATH, embeddings)
llm = ChatOpenAI(model=DEFAULT_MODEL, temperature=0)

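# Create retrievers
# Plain similarity search: return the 4 nearest chunks
similarity_retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
# MMR: fetch 20 candidates by similarity, then keep 4 that balance
# relevance and diversity (lambda_mult: 1 = least diverse, 0 = most diverse)
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 20, "lambda_mult": 0.5}
)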

print("✅ Setup complete!")


SETUP: ADAPTIVE RAG

✓ Loaded vector store from /Users/gianlucamazza/Workspace/notebooks/llm_rag/notebooks/advanced_architectures/../../data/vector_stores/openai_embeddings
✅ Setup complete!
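
The `mmr_retriever` above depends on three parameters (`k`, `fetch_k`, `lambda_mult`) whose interaction is easier to see in code. The following is a minimal, dependency-free sketch of maximal marginal relevance selection, not LangChain's own implementation: `cosine` and `mmr_select` are hypothetical helper names, and `doc_vecs` stands in for the `fetch_k` candidates that a vector store would already have fetched by similarity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=4, lambda_mult=0.5):
    """Return the indices of up to k candidates chosen by maximal marginal relevance."""
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:

        def score(i):
            # Redundancy is the highest similarity to anything already selected.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy candidates: the first two are near-duplicates, the third is less relevant but different.
query = [1.0, 0.0]
docs = [[1.0, 0.0], [1.0, 0.01], [0.6, 0.8]]
print(mmr_select(query, docs, k=2, lambda_mult=1.0))  # relevance only
print(mmr_select(query, docs, k=2, lambda_mult=0.3))  # diversity weighted more heavily
```

With `lambda_mult=1.0` the near-duplicate is picked second; lowering it to `0.3` swaps in the less similar but non-redundant candidate, which is the behaviour the MEDIUM route relies on.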


## 2. Complexity Classifier

In [2]:
print_section_header("Complexity Classifier")

complexity_classifier = COMPLEXITY_CLASSIFIER_PROMPT | llm | StrOutputParser()

# Test classifier
test_queries = [
    "What is FAISS?",  # SIMPLE
    "Compare OpenAI vs HuggingFace embeddings",  # MEDIUM
    "How to architect production RAG with privacy and cost constraints?"  # COMPLEX
]

for query in test_queries:
    complexity = complexity_classifier.invoke({"query": query}).strip()
    print(f"{complexity:8} | {query}")


COMPLEXITY CLASSIFIER

SIMPLE   | What is FAISS?
MEDIUM   | Compare OpenAI vs HuggingFace embeddings
COMPLEX  | How to architect production RAG with privacy and cost constraints?
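
The classifier returns free-form text, so the label it prints may arrive with different casing, punctuation, or extra words. One way to make downstream routing less brittle is to normalise the raw string first. The sketch below is an assumption-laden illustration: `normalize_label` is a hypothetical helper, and defaulting to `COMPLEX` is a choice made here to mirror the router's catch-all branch, not something the notebook prescribes.

```python
import re

VALID_LABELS = ("SIMPLE", "MEDIUM", "COMPLEX")


def normalize_label(raw: str, default: str = "COMPLEX") -> str:
    """Map free-form classifier output onto one of VALID_LABELS.

    The first whole-word label found wins; if none is found, `default` is returned.
    """
    match = re.search(r"\b(" + "|".join(VALID_LABELS) + r")\b", raw.upper())
    return match.group(1) if match else default


for raw in [" simple\n", "Label: MEDIUM.", "Complexity: medium", "I am not sure"]:
    print(repr(raw), "->", normalize_label(raw))
```

The word-boundary match means a preamble such as "Complexity:" is not mistaken for the `COMPLEX` label.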


## 3. Adaptive Router

In [3]:
print_section_header("Adaptive Router")

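# HyDE (Hypothetical Document Embeddings) generator for complex queries
hyde_generator = HYDE_PROMPT | llm | StrOutputParser()

def adaptive_route(query: str):
    """Classify the query, retrieve with the matching strategy, and return the prompt inputs."""
    complexity = complexity_classifier.invoke({"query": query}).strip()

    if "SIMPLE" in complexity:
        # Plain top-k similarity search, no LLM call beyond classification
        docs = similarity_retriever.invoke(query)
        strategy = "SIMPLE-Similarity"
    elif "MEDIUM" in complexity:
        # MMR trades some relevance for diversity across the returned chunks
        docs = mmr_retriever.invoke(query)
        strategy = "MEDIUM-MMR"
    else:  # COMPLEX, plus any output that contains neither SIMPLE nor MEDIUM
        # Search with an LLM-written hypothetical answer instead of the raw query
        hypo_doc = hyde_generator.invoke({"question": query})
        docs = vectorstore.similarity_search(hypo_doc, k=4)
        strategy = "COMPLEX-HyDe"

    return {"context": format_docs(docs), "input": query, "strategy": strategy}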

print("✓ Adaptive router configured")
print("  - SIMPLE → Similarity (fast)")
print("  - MEDIUM → MMR (diverse)")
print("  - COMPLEX → HyDe (semantic)")


ADAPTIVE ROUTER

✓ Adaptive router configured
  - SIMPLE → Similarity (fast)
  - MEDIUM → MMR (diverse)
  - COMPLEX → HyDe (semantic)
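
The router's specialised paths can fail independently of the basic one: the COMPLEX route, for example, needs an extra LLM call before it can search. A hedged sketch of a fallback wrapper follows. It uses plain callables instead of the notebook's retrievers so that it runs on its own; `route_with_fallback`, `similarity_stub`, and `failing_hyde_stub` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Retriever = Callable[[str], List[str]]


def route_with_fallback(
    label: str,
    query: str,
    strategies: Dict[str, Retriever],
    fallback: Retriever,
) -> Tuple[str, List[str]]:
    """Run the strategy registered for `label`.

    Falls back when the label is unknown, the strategy raises, or it returns nothing.
    """
    strategy = strategies.get(label)
    if strategy is not None:
        try:
            docs = strategy(query)
            if docs:
                return label, docs
        except Exception:
            # A failure in the specialised path drops through to the fallback.
            pass
    return "FALLBACK", fallback(query)


def similarity_stub(query: str) -> List[str]:
    """Stand-in for a similarity retriever."""
    return [f"similarity hit for: {query}"]


def failing_hyde_stub(query: str) -> List[str]:
    """Stand-in for a HyDE path whose generation step fails."""
    raise RuntimeError("hypothetical document generation failed")


strategies = {"SIMPLE": similarity_stub, "COMPLEX": failing_hyde_stub}
print(route_with_fallback("SIMPLE", "What is FAISS?", strategies, similarity_stub))
print(route_with_fallback("COMPLEX", "How to scale RAG?", strategies, similarity_stub))
```

Returning the label that was actually used, rather than the one requested, keeps fallback events visible in logs.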


## 4. Adaptive RAG Chain

In [4]:
print_section_header("Adaptive RAG Chain")

adaptive_chain = (
    RunnableLambda(adaptive_route)
    | ADAPTIVE_RAG_PROMPT
    | llm
    | StrOutputParser()
)

print("✓ Adaptive RAG chain created\n")

# Test with different complexity queries
test_cases = [
    ("What is a retriever?", "SIMPLE"),
    ("Compare similarity vs MMR", "MEDIUM"),
    ("How to handle ambiguous queries in multi-domain systems?", "COMPLEX")
]

for query, expected in test_cases:
    print(f"\nQuery ({expected}): '{query}'")
    print("=" * 80)
    response = adaptive_chain.invoke(query)
    # Show only the first 200 characters
    print((response[:200] + "...") if len(response) > 200 else response)
    print()


ADAPTIVE RAG CHAIN

✓ Adaptive RAG chain created


Query (SIMPLE): 'What is a retriever?'
A retriever is a component in a Retrieval Augmented Generation (RAG) system that is responsible for searching and retrieving relevant documents or information from a storage system based on a user's i...


Query (MEDIUM): 'Compare similarity vs MMR'
Similarity and Maximum Marginal Relevance (MMR) are both techniques used in information retrieval and document summarization, but they serve different purposes and operate based on different principle...


Query (COMPLEX): 'How to handle ambiguous queries in multi-domain systems?'
Handling ambiguous queries in multi-domain systems can be challenging, but there are several strategies that can be employed to improve clarity and accuracy in responses:

1. **Clarification Questions...



## Summary

**Flow:**
```
Query → Classify Complexity → Route to Strategy → Retrieve → LLM → Response
```

**Advantages:**
✅ Lower retrieval cost: simple queries take the fast similarity path  
✅ Balances quality and speed by reserving HyDE for complex queries  
✅ Adapts the retrieval strategy to query difficulty  
✅ Scales to mixed workloads  

**Limitations:**
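- Classification overhead: every query costs an extra LLM call before retrieval
- Requires tuning the classifier prompt, since it defines the boundaries between SIMPLE, MEDIUM, and COMPLEX
- More complex to debug: a poor answer may come from misclassification or from the retrieval strategy itself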

**Production Tips:**
- Cache classification results
- Monitor routing distribution
- A/B test routing logic
- Add fallback strategy
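
Two of the tips above, caching classification results and monitoring the routing distribution, can be sketched together. This is a minimal illustration under stated assumptions: `classify` is a hypothetical word-count stand-in for the LLM classifier so that the example runs without an API key, and `cached_classify`, `routing_counts`, and `routing_distribution` are names invented here.

```python
from collections import Counter
from functools import lru_cache

routing_counts = Counter()    # how often each label was routed
classifier_calls = Counter()  # how often the (expensive) classifier actually ran


def classify(query: str) -> str:
    """Hypothetical stand-in for the LLM classifier: labels by word count."""
    classifier_calls["total"] += 1
    words = len(query.split())
    if words <= 4:
        return "SIMPLE"
    if words <= 8:
        return "MEDIUM"
    return "COMPLEX"


@lru_cache(maxsize=1024)
def _classify_normalized(normalized_query: str) -> str:
    """Cached layer: the classifier runs once per distinct normalised query."""
    return classify(normalized_query)


def cached_classify(query: str) -> str:
    """Normalise case and whitespace, classify via the cache, and record the route."""
    label = _classify_normalized(" ".join(query.lower().split()))
    routing_counts[label] += 1
    return label


def routing_distribution() -> dict:
    """Share of traffic per label; an empty dict before any query is seen."""
    total = sum(routing_counts.values())
    return {label: count / total for label, count in routing_counts.items()} if total else {}
```

Caching is most defensible when the classifier is close to deterministic, as with `temperature=0` in this notebook. A distribution that drifts heavily toward one label is a reasonable prompt to review the classifier before A/B testing routing changes.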

**Next:** [08_corrective_rag.ipynb](08_corrective_rag.ipynb) - CRAG with web search