# Demo #8: Agentic RAG - Autonomous Query Planning and Tool Selection

## Overview

This notebook demonstrates **Agentic RAG**, where an autonomous agent dynamically plans retrieval strategies, selects appropriate tools, and performs multi-step reasoning to answer complex queries.

### Key Concepts

1. **Agentic Workflow**: Thought → Action → Observation loop (ReAct pattern)
2. **Dynamic Tool Selection**: Agent chooses the right tool(s) for each query
3. **Multi-Step Reasoning**: Breaks complex queries into sub-tasks
4. **Tool Orchestration**: Coordinates multiple knowledge sources and external APIs

### The ReAct Agent Architecture

```
Complex Query
    ↓
Agent Analyzes → Plans Sub-Tasks
    ↓
For Each Sub-Task:
    ├─ Thought: "I need information about X"
    ├─ Action: Select tool and execute
    │   ├─ ML Knowledge Base
    │   ├─ Finance Knowledge Base
    │   ├─ Internet Search (DuckDuckGo)
    │   ├─ arXiv Search
    │   └─ arXiv Fetch
    └─ Observation: Process results
    ↓
Synthesize All Observations → Final Answer
```
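
The loop in the diagram can be sketched in plain Python. This is a simplified illustration, not LlamaIndex's implementation: `react_loop`, `scripted_policy`, and `toy_tools` are hypothetical names, and a scripted function stands in for the LLM that would normally produce each thought and action.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One decision from the policy: (thought, tool_name, tool_input).
# A tool_name of None means "stop and return tool_input as the final answer".
Step = Tuple[str, Optional[str], str]


def react_loop(
    query: str,
    policy: Callable[[str, List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_iterations: int = 10,
) -> str:
    """Run Thought -> Action -> Observation until the policy answers."""
    observations: List[str] = []
    for _ in range(max_iterations):
        thought, tool_name, tool_input = policy(query, observations)
        if tool_name is None:
            return tool_input  # final answer
        if tool_name not in tools:
            # Feed the mistake back as an observation instead of crashing
            observations.append(f"Unknown tool: {tool_name}")
            continue
        observations.append(tools[tool_name](tool_input))
    return "Stopped: iteration limit reached."


# Hypothetical stand-ins for the real tools.
toy_tools = {
    "ml_knowledge": lambda q: f"[ML notes on {q}]",
    "finance_knowledge": lambda q: f"[Finance notes on {q}]",
}


def scripted_policy(query: str, observations: List[str]) -> Step:
    """Stub for the LLM: consult each knowledge base once, then answer."""
    if len(observations) == 0:
        return ("I need ML background", "ml_knowledge", query)
    if len(observations) == 1:
        return ("I need financial context", "finance_knowledge", query)
    return ("I have enough information", None, " + ".join(observations))


answer = react_loop("RL for portfolios", scripted_policy, toy_tools)
print(answer)
```

The `max_iterations` guard plays the same role as the agent's iteration limit: a policy that never decides to answer is cut off instead of looping forever.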

### Citations

- **Agentic RAG Survey** (arXiv:2501.09136) - Reference #66
- **What is Agentic RAG** | Weaviate - Reference #34
- **ReAct: Synergizing Reasoning and Acting in Language Models** - Yao et al., 2022

## 1. Setup and Imports

In [None]:
import os
import sys
from pathlib import Path
from typing import List, Dict

# LlamaIndex core imports
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    Settings,
)
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import QueryEngineTool, FunctionTool, ToolMetadata
from llama_index.core.agent import ReActAgent

# Azure OpenAI imports
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding

# External tools
from duckduckgo_search import DDGS
import arxiv

# Utilities
from dotenv import load_dotenv
import time

load_dotenv()
print("✓ All imports successful")

## 2. Configure Azure OpenAI

In [None]:
# Azure OpenAI Configuration
api_key = os.getenv("AZURE_OPENAI_API_KEY")
azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
api_version = os.getenv("AZURE_OPENAI_API_VERSION", "2024-02-15-preview")

# Initialize LLM
llm = AzureOpenAI(
    model="gpt-4",
    deployment_name=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
    temperature=0.1,
)

# Initialize Embedding Model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name=os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT"),
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

# Configure global settings
Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512
Settings.chunk_overlap = 50

print("✓ Azure OpenAI configured successfully")
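
`os.getenv` returns `None` for unset variables, so a missing value may only surface later as a less obvious client error. A minimal sketch of an up-front check, using the variable names from the configuration cell; `missing_env_vars` is a hypothetical helper, not part of any library.

```python
import os
from typing import Iterable, List, Mapping

# Variable names read by the configuration cell.
REQUIRED_ENV_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT",
)


def missing_env_vars(
    names: Iterable[str] = REQUIRED_ENV_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names that are unset or blank in `env`."""
    return [name for name in names if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running this before constructing `AzureOpenAI` and `AzureOpenAIEmbedding` points directly at the variable that needs to be set in `.env`.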

## 3. Create Multiple Knowledge Sources

We'll create two distinct knowledge bases:
1. **ML Concepts** - Machine learning algorithms and techniques
2. **Finance Domain** - Financial concepts and strategies

In [None]:
# Load ML concepts documents
ml_data_path = Path("../RAG_v2/data/ml_concepts")
ml_documents = SimpleDirectoryReader(str(ml_data_path)).load_data()
print(f"✓ Loaded {len(ml_documents)} ML concept documents")

# Parse and index ML documents
ml_node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=50)
ml_nodes = ml_node_parser.get_nodes_from_documents(ml_documents)
ml_index = VectorStoreIndex(nodes=ml_nodes, embed_model=embed_model)
print(f"✓ Created ML knowledge index with {len(ml_nodes)} chunks")

# Load finance documents
finance_data_path = Path("../RAG_v2/data/finance_docs")
finance_documents = SimpleDirectoryReader(str(finance_data_path)).load_data()
print(f"✓ Loaded {len(finance_documents)} finance documents")

# Parse and index finance documents
finance_node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=50)
finance_nodes = finance_node_parser.get_nodes_from_documents(finance_documents)
finance_index = VectorStoreIndex(nodes=finance_nodes, embed_model=embed_model)
print(f"✓ Created finance knowledge index with {len(finance_nodes)} chunks")
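
To see what `chunk_size` and `chunk_overlap` control, here is a deliberately simplified sliding window over words. It is not how `SentenceSplitter` works internally (that splitter counts tokens and tries to respect sentence boundaries); `sliding_window_chunks` is a hypothetical function that only illustrates how overlap makes consecutive chunks share content.

```python
from typing import List


def sliding_window_chunks(
    words: List[str], chunk_size: int, chunk_overlap: int
) -> List[List[str]]:
    """Split `words` into windows of `chunk_size` that share `chunk_overlap` items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks: List[List[str]] = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


words = [f"w{i}" for i in range(10)]
for chunk in sliding_window_chunks(words, chunk_size=4, chunk_overlap=1):
    print(chunk)
```

With a window of 4 and an overlap of 1, each chunk starts with the last word of the previous one. The notebook's settings (512 and 50) apply the same idea at a larger scale, so a passage that straddles a chunk boundary still appears intact in at least one chunk.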

## 4. Define Knowledge Base Tools

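Wrap each index in a query engine tool. The agent relies on the tool name and description to decide when to call a tool, so each description states what the knowledge base covers and when to use it.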

In [None]:
# Create ML knowledge tool
ml_tool = QueryEngineTool(
    query_engine=ml_index.as_query_engine(similarity_top_k=3, llm=llm),
    metadata=ToolMetadata(
        name="ml_knowledge",
        description=(
            "Expert knowledge about machine learning algorithms, concepts, and techniques. "
            "Contains information about neural networks, gradient boosting, random forests, "
            "support vector machines, K-means clustering, and other ML fundamentals. "
            "Use this for questions about ML theory, algorithms, and implementations."
        ),
    ),
)

# Create finance knowledge tool
finance_tool = QueryEngineTool(
    query_engine=finance_index.as_query_engine(similarity_top_k=3, llm=llm),
    metadata=ToolMetadata(
        name="finance_knowledge",
        description=(
            "Information about financial products, market analysis, portfolio strategies, "
            "and investment management. Contains knowledge about portfolio diversification, "
            "risk management, quantitative trading, and reinforcement learning in finance. "
            "Use this for questions about financial concepts and strategies."
        ),
    ),
)

print("✓ Knowledge base tools created")

## 5. Define Internet Search Tool

This tool uses DuckDuckGo to retrieve current information that is not in our knowledge bases.

In [None]:
def internet_search(query: str) -> str:
    """Search the internet for current information using DuckDuckGo.
    
    Args:
        query: The search query string
    
    Returns:
        Formatted search results with titles, URLs, and snippets
    """
    try:
        ddgs = DDGS()
        # Materialize the results so the emptiness check also works if an iterator is returned
        results = list(ddgs.text(query, max_results=5))
        
        if not results:
            return "No search results found."
        
        formatted_results = "\n\n".join([
            f"Title: {r['title']}\nURL: {r['href']}\nSnippet: {r['body']}"
            for r in results
        ])
        return formatted_results
        
    except Exception as e:
        return f"Search failed: {str(e)}"

internet_tool = FunctionTool.from_defaults(
    fn=internet_search,
    name="internet_search",
    description=(
        "Search the internet for current information, news, and real-time data using DuckDuckGo. "
        "Use this tool for queries about recent events, current trends, latest developments, "
        "or any information not found in the internal knowledge bases. "
        "Returns web search results with titles, URLs, and content snippets."
    ),
)

# Test the tool
test_result = internet_search("latest AI news")
print("✓ Internet search tool created")
print(f"\nTest search result (truncated):\n{test_result[:200]}...")

## 6. Define arXiv Search Tool

Search academic papers on arXiv for research-related queries.

In [None]:
def arxiv_search(query: str, max_results: int = 5) -> str:
    """Search arXiv for academic papers related to the query.
    
    Args:
        query: The search query (e.g., "retrieval augmented generation")
        max_results: Maximum number of papers to return (default: 5)
    
    Returns:
        Formatted list of papers with titles, authors, IDs, and summaries
    """
    try:
        search = arxiv.Search(
            query=query,
            max_results=max_results,
            sort_by=arxiv.SortCriterion.Relevance
        )
        
        # arxiv>=2.0 deprecates Search.results(); fetch through a Client instead
        client = arxiv.Client()

        results = []
        for paper in client.results(search):
            arxiv_id = paper.entry_id.split('/')[-1]
            # List at most three authors, then "et al."
            authors = ', '.join(a.name for a in paper.authors[:3])
            if len(paper.authors) > 3:
                authors += " et al."
            results.append(
                f"Title: {paper.title}\n"
                f"Authors: {authors}\n"
                f"Published: {paper.published.strftime('%Y-%m-%d')}\n"
                f"arXiv ID: {arxiv_id}\n"
                f"Summary: {paper.summary[:300]}...\n"
            )
        
        return "\n---\n".join(results) if results else "No papers found."
        
    except Exception as e:
        return f"arXiv search failed: {str(e)}"

arxiv_search_tool = FunctionTool.from_defaults(
    fn=arxiv_search,
    name="arxiv_search",
    description=(
        "Search arXiv for academic papers and research publications. "
        "Returns paper titles, authors, arXiv IDs, publication dates, and summaries. "
        "Use this for finding recent research papers, academic literature, "
        "and scientific publications on ML, AI, and related topics. "
        "The arXiv IDs can be used with arxiv_fetch to get more details."
    ),
)

print("✓ arXiv search tool created")

## 7. Define arXiv Fetch Tool

Fetch detailed information about a specific paper using its arXiv ID.

In [None]:
def arxiv_fetch(arxiv_id: str) -> str:
    """Fetch full details of a specific arXiv paper by its ID.
    
    Args:
        arxiv_id: The arXiv identifier (e.g., '2301.12345' or '2301.12345v1')
    
    Returns:
        Detailed paper information including full abstract
    """
    try:
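        # Strip a trailing version suffix such as "v1"; old-style IDs
        # (e.g. "solv-int/9901001") may contain a "v" that is not a version marker
        base, sep, version = arxiv_id.strip().rpartition('v')
        if sep and version.isdigit():
            arxiv_id = base

        search = arxiv.Search(id_list=[arxiv_id])
        # arxiv>=2.0 deprecates Search.results(); fetch through a Client instead
        paper = next(arxiv.Client().results(search))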
        
        return (
            f"Title: {paper.title}\n"
            f"Authors: {', '.join([a.name for a in paper.authors])}\n"
            f"Published: {paper.published.strftime('%Y-%m-%d')}\n"
            f"Updated: {paper.updated.strftime('%Y-%m-%d')}\n"
            f"arXiv ID: {paper.entry_id.split('/')[-1]}\n"
            f"PDF URL: {paper.pdf_url}\n"
            f"Categories: {', '.join(paper.categories)}\n\n"
            f"Abstract:\n{paper.summary}\n"
        )
        
    except StopIteration:
        return f"Paper with arXiv ID '{arxiv_id}' not found."
    except Exception as e:
        return f"Failed to fetch paper: {str(e)}"

arxiv_fetch_tool = FunctionTool.from_defaults(
    fn=arxiv_fetch,
    name="arxiv_fetch",
    description=(
        "Fetch full details and complete abstract of a specific arXiv paper using its arXiv ID. "
        "Use this after arxiv_search to get detailed information about a specific paper. "
        "Provide the arXiv ID (e.g., '2301.12345') to get the full abstract, authors, "
        "publication date, categories, and PDF URL."
    ),
)

print("✓ arXiv fetch tool created")
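
Old-style arXiv identifiers include an archive name (for example `solv-int/9901001`), so a version suffix is safest to detect as a trailing `v` followed by digits rather than by the first `v` in the string. A minimal sketch of that normalization; `strip_arxiv_version` is a hypothetical helper name.

```python
import re

# Matches a trailing version suffix such as "v1" or "v12".
_VERSION_SUFFIX = re.compile(r"v\d+$")


def strip_arxiv_version(arxiv_id: str) -> str:
    """Return the arXiv ID without a trailing version suffix."""
    return _VERSION_SUFFIX.sub("", arxiv_id.strip())


for raw_id in ["2301.12345v1", "2301.12345", "solv-int/9901001v2"]:
    print(raw_id, "->", strip_arxiv_version(raw_id))
```

Anchoring the pattern at the end of the string leaves the archive prefix untouched even when it contains the letter `v`.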

## 8. Create ReAct Agent

Initialize the agent with all available tools.

In [None]:
# Collect all tools
all_tools = [
    ml_tool,
    finance_tool,
    internet_tool,
    arxiv_search_tool,
    arxiv_fetch_tool,
]

# Create ReAct agent
agent = ReActAgent.from_tools(
    tools=all_tools,
    llm=llm,
    verbose=True,
    max_iterations=10,  # Increased for complex multi-step queries
)

print("✓ ReAct Agent initialized with 5 tools:")
print("  1. ml_knowledge - ML concepts and algorithms")
print("  2. finance_knowledge - Financial strategies and concepts")
print("  3. internet_search - Current web information")
print("  4. arxiv_search - Academic paper search")
print("  5. arxiv_fetch - Detailed paper retrieval")

## 9. Test Case 1: Simple Single-Domain Query

Agent should identify and use the correct single tool.

In [None]:
query_1 = "Explain how gradient boosting works and its main advantages."

print("="*80)
print(f"Query 1: {query_1}")
print("="*80)
print("\nExpected: Agent should use ml_knowledge tool\n")

response_1 = agent.chat(query_1)

print("\n" + "="*80)
print("FINAL ANSWER")
print("="*80)
print(response_1.response)

## 10. Test Case 2: Cross-Domain Query

Agent should use multiple knowledge bases.

In [None]:
query_2 = "How can machine learning algorithms be applied to portfolio optimization and risk management in finance?"

print("="*80)
print(f"Query 2: {query_2}")
print("="*80)
print("\nExpected: Agent should use both ml_knowledge AND finance_knowledge tools\n")

response_2 = agent.chat(query_2)

print("\n" + "="*80)
print("FINAL ANSWER")
print("="*80)
print(response_2.response)

## 11. Test Case 3: Current Information Query

Agent should recognize the need for current data and use internet search.

In [None]:
query_3 = "What are the latest developments in large language models as of 2025?"

print("="*80)
print(f"Query 3: {query_3}")
print("="*80)
print("\nExpected: Agent should use internet_search for current information\n")

response_3 = agent.chat(query_3)

print("\n" + "="*80)
print("FINAL ANSWER")
print("="*80)
print(response_3.response)

## 12. Test Case 4: Academic Research Query

Agent should use arXiv search and fetch tools for research papers.

In [None]:
query_4 = "Find recent papers on retrieval-augmented generation and summarize their key findings."

print("="*80)
print(f"Query 4: {query_4}")
print("="*80)
print("\nExpected: Agent should use arxiv_search and possibly arxiv_fetch\n")

response_4 = agent.chat(query_4)

print("\n" + "="*80)
print("FINAL ANSWER")
print("="*80)
print(response_4.response)

## 13. Test Case 5: Complex Multi-Hop Research Query

The most challenging test - requires synthesizing multiple sources.

In [None]:
query_5 = (
    "Compare the effectiveness of reinforcement learning approaches in portfolio management. "
    "What does our knowledge base say about RL in finance, and are there recent research papers "
    "on this topic? Provide a comprehensive analysis."
)

print("="*80)
print(f"Query 5: {query_5}")
print("="*80)
print("\nExpected: Agent should:")
print("  1. Query finance_knowledge for RL in portfolio management")
print("  2. Query ml_knowledge for RL fundamentals")
print("  3. Use arxiv_search to find recent papers")
print("  4. Synthesize all information\n")

response_5 = agent.chat(query_5)

print("\n" + "="*80)
print("FINAL ANSWER")
print("="*80)
print(response_5.response)

## 14. Test Case 6: Advanced Research with Transformer Comparison

Complex query requiring both internal knowledge and external research.

In [None]:
query_6 = (
    "What are the latest breakthroughs in transformer architectures according to recent arXiv papers, "
    "and how do they relate to the fundamental concepts in our ML knowledge base?"
)

print("="*80)
print(f"Query 6: {query_6}")
print("="*80)
print("\nExpected: Agent should:")
print("  1. Use arxiv_search for recent transformer papers")
print("  2. Use arxiv_fetch for detailed abstracts")
print("  3. Query ml_knowledge for transformer fundamentals")
print("  4. Compare and synthesize findings\n")

response_6 = agent.chat(query_6)

print("\n" + "="*80)
print("FINAL ANSWER")
print("="*80)
print(response_6.response)

## 15. Compare with Static RAG

Show how a static RAG pipeline limited to a single knowledge base handles the cross-domain query, compared with the agent's answer from Test Case 2.

In [None]:
# Create a static RAG baseline (single knowledge base)
static_engine = ml_index.as_query_engine(similarity_top_k=5, llm=llm)

print("="*80)
print("COMPARISON: Agentic RAG vs. Static RAG")
print("="*80)

# Use the cross-domain query
test_query = query_2

print(f"\nQuery: {test_query}")
print("\n--- Static RAG (ML KB only) ---")
static_response = static_engine.query(test_query)
print(static_response.response)

print("\n--- Agentic RAG (Multiple Tools) ---")
print(response_2.response)

print("\n" + "="*80)
print("🔍 Key Differences:")
print("="*80)
print("Static RAG:")
print("  - Limited to single knowledge base (ML only)")
print("  - Cannot access finance knowledge or external sources")
print("  - Provides incomplete answer")
print("\nAgentic RAG:")
print("  - Dynamically selects relevant tools (ML + Finance)")
print("  - Can incorporate external sources (web, arXiv)")
print("  - Provides comprehensive, multi-faceted answer")

## 16. Analyze Agent Reasoning Patterns

In [None]:
import pandas as pd

# Create summary of agent decisions
query_summary = pd.DataFrame([
    {
        "Query Type": "Single Domain",
        "Query": query_1[:50] + "...",
        "Expected Tools": "ml_knowledge",
        "Complexity": "Low",
    },
    {
        "Query Type": "Cross-Domain",
        "Query": query_2[:50] + "...",
        "Expected Tools": "ml_knowledge, finance_knowledge",
        "Complexity": "Medium",
    },
    {
        "Query Type": "Current Info",
        "Query": query_3[:50] + "...",
        "Expected Tools": "internet_search",
        "Complexity": "Medium",
    },
    {
        "Query Type": "Academic Research",
        "Query": query_4[:50] + "...",
        "Expected Tools": "arxiv_search, arxiv_fetch",
        "Complexity": "High",
    },
    {
        "Query Type": "Multi-Hop",
        "Query": query_5[:50] + "...",
        "Expected Tools": "finance_knowledge, ml_knowledge, arxiv_search",
        "Complexity": "Very High",
    },
    {
        "Query Type": "Research Synthesis",
        "Query": query_6[:50] + "...",
        "Expected Tools": "arxiv_search, arxiv_fetch, ml_knowledge",
        "Complexity": "Very High",
    },
])

print("\nAgent Query Analysis Summary")
print("="*80)
print(query_summary.to_string(index=False))
print("\n" + "="*80)
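
The summary table records only the tools each query was expected to use. To compare that against what the agent actually called, one option is to read tool names off the response object. The sketch below assumes each response exposes a `sources` list whose items carry a `tool_name` attribute; that matches how LlamaIndex agent responses are commonly structured, but it is worth verifying against the installed version. `tools_used` and `matches_expected` are hypothetical helper names, and the `SimpleNamespace` object only stands in for a real response.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def tools_used(response: Any) -> List[str]:
    """Return tool names found in response.sources, de-duplicated in first-use order."""
    names: List[str] = []
    for source in getattr(response, "sources", None) or []:
        name = getattr(source, "tool_name", None)
        if name and name not in names:
            names.append(name)
    return names


def matches_expected(response: Any, expected: Iterable[str]) -> bool:
    """True when the set of tools called equals the expected set."""
    return set(tools_used(response)) == set(expected)


# Stand-in for an agent response, used only to demonstrate the helpers.
fake_response = SimpleNamespace(
    sources=[
        SimpleNamespace(tool_name="ml_knowledge"),
        SimpleNamespace(tool_name="finance_knowledge"),
        SimpleNamespace(tool_name="ml_knowledge"),
    ]
)
print(tools_used(fake_response))
print(matches_expected(fake_response, ["ml_knowledge", "finance_knowledge"]))
```

Applied to `response_1` through `response_6`, a check like this could add an "Actual Tools" column to the table so that mismatches between expectation and behavior stand out.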

## 17. Key Takeaways

### What We Learned

1. **Dynamic Tool Selection**: The agent picks the tool(s) for each query from the tool names and descriptions, with no hardcoded routing rules.

2. **Multi-Step Reasoning**: Complex queries are automatically decomposed into sub-tasks:
   - Thought: "I need information about X"
   - Action: Select and execute tool
   - Observation: Process results
   - Repeat until complete

3. **Tool Orchestration**: The agent can coordinate multiple tools in a single query:
   - Internal knowledge bases (ML, Finance)
   - External web search (DuckDuckGo)
   - Academic databases (arXiv)

4. **Adaptability**: Unlike static RAG, agentic RAG adapts to:
   - Query complexity
   - Information availability
   - Required knowledge sources

### The ReAct Pattern

```
Reasoning Trace Example:

Thought: "I need to find ML techniques used in finance"
Action: Query ml_knowledge("reinforcement learning basics")
Observation: [ML fundamentals retrieved]

Thought: "Now I need financial context"
Action: Query finance_knowledge("RL in portfolio management")
Observation: [Finance applications retrieved]

Thought: "Let me check recent research"
Action: arxiv_search("reinforcement learning finance")
Observation: [Recent papers found]

Thought: "I have sufficient information to answer"
Final Answer: [Synthesized response]
```

### When to Use Agentic RAG

- **Complex Information Needs**: Multi-hop questions requiring synthesis
- **Multiple Data Sources**: When information is distributed across domains
- **Dynamic Requirements**: When you can't predict which sources are needed
- **Research Tasks**: Literature review, trend analysis, comparative studies
- **Current + Historical**: Combining static KB with real-time data

### Trade-offs

**Advantages**:
- Maximum flexibility and adaptability
- Can handle unpredictable queries
- Comprehensive answers from multiple sources
- Transparent reasoning process (when verbose)

**Disadvantages**:
- Higher latency (multiple LLM calls)
- Increased cost (reasoning + tool execution)
- Potential for reasoning errors
- Complexity in debugging

### Production Considerations

1. **Tool Design**: Clear, unambiguous tool descriptions are critical
2. **Error Handling**: Tools must gracefully handle failures
3. **Rate Limiting**: External APIs need proper throttling
4. **Caching**: Cache tool results for repeated queries
5. **Monitoring**: Track tool usage, success rates, and latencies
6. **Iteration Limits**: Prevent infinite loops with max_iterations
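
Caching and rate limiting from the list above can be layered onto a tool function with standard-library pieces. This is a sketch under simple assumptions (single process, single thread, in-memory cache); `throttled`, `cached_search`, and `stats` are hypothetical names, and the function body stands in for a real external call.

```python
import time
from functools import lru_cache, wraps
from typing import Callable


def throttled(min_interval_s: float) -> Callable:
    """Decorator: sleep so consecutive calls are at least `min_interval_s` apart."""
    def decorator(fn: Callable) -> Callable:
        last_call = [float("-inf")]  # mutable cell holding the last call time

        @wraps(fn)
        def wrapper(*args, **kwargs):
            wait = min_interval_s - (time.monotonic() - last_call[0])
            if wait > 0:
                time.sleep(wait)
            last_call[0] = time.monotonic()
            return fn(*args, **kwargs)

        return wrapper
    return decorator


stats = {"calls": 0}  # counts calls that actually reach the "external" service


@lru_cache(maxsize=128)          # identical queries are answered from memory
@throttled(min_interval_s=0.05)  # cache misses are spaced at least 50 ms apart
def cached_search(query: str) -> str:
    stats["calls"] += 1
    return f"results for {query!r}"  # placeholder for a real search call


cached_search("agentic rag")
cached_search("agentic rag")  # cache hit: no throttling, no external call
print(stats["calls"])
```

Placing `lru_cache` outermost means cache hits skip the throttle entirely. Because the notebook's tools return error strings instead of raising, a cache like this would also store failures, so a real deployment would likely want to skip caching those results.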

## 18. Architecture Visualization

```
┌────────────────────────────────────────────────────────────────────┐
│                      AGENTIC RAG ARCHITECTURE                      │
└────────────────────────────────────────────────────────────────────┘

                         Complex User Query
                                ↓
                    ┌─────────────────────┐
                    │   ReAct Agent       │
                    │   (GPT-4)           │
                    └─────────────────────┘
                                ↓
              ┌─────────────────┴─────────────────┐
              ↓                                     ↓
    ┌──────────────────┐                 ┌──────────────────┐
    │  REASONING       │                 │  ACTION          │
    │  "I need X..."   │────────────────→│  Select Tool     │
    └──────────────────┘                 └──────────────────┘
                                                  ↓
                            ┌─────────────────────┼─────────────────────┐
                            ↓                     ↓                     ↓
                    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
                    │ Internal KB  │    │ External API │    │ Search Tools │
                    ├──────────────┤    ├──────────────┤    ├──────────────┤
                    │ ml_knowledge │    │ arxiv_search │    │ internet_    │
                    │ finance_kb   │    │ arxiv_fetch  │    │   search     │
                    └──────────────┘    └──────────────┘    └──────────────┘
                            ↓                     ↓                     ↓
                            └─────────────────────┼─────────────────────┘
                                                  ↓
                                       ┌──────────────────┐
                                       │  OBSERVATION     │
                                       │  Process Results │
                                       └──────────────────┘
                                                  ↓
                                       ┌──────────────────┐
                                       │  Sufficient?     │
                                       └──────────────────┘
                                          ↓           ↓
                                         No          Yes
                                          ↓           ↓
                                    [Loop Back]   Final Answer
```

## References

1. **Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG** - arXiv:2501.09136
   - Comprehensive survey on agentic design patterns and architectures
   - Reference #66 in workshop curriculum

2. **ReAct: Synergizing Reasoning and Acting in Language Models** - Yao et al., 2022
   - Original paper introducing the ReAct pattern
   - Thought-Action-Observation framework

3. **What is Agentic RAG** | Weaviate - Reference #34
   - Industry perspective on agentic RAG implementations

4. **GFM-RAG: Graph Foundation Model for RAG** - arXiv:2502.01113
   - Advanced graph-based knowledge integration

5. **BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge** - arXiv:2406.19820
   - Multi-source knowledge integration techniques