## 1. Introduction to Agentic RAG <a id="introduction"></a>

### What is Agentic RAG?

**Agentic RAG** extends traditional Retrieval-Augmented Generation by giving the system autonomous decision-making capabilities:

- **Dynamic Query Planning**: The agent decides when and what to retrieve
- **Self-Reflection**: The agent evaluates whether retrieved information is sufficient
- **Query Refinement**: The agent reformulates queries to get better results
- **Multi-Step Reasoning**: The agent breaks down complex questions into sub-queries
- **Adaptive Retrieval**: The agent adjusts retrieval strategies based on context

### Key Differences from Traditional RAG

| Traditional RAG | Agentic RAG |
|----------------|-------------|
| Fixed retrieve-then-generate flow | Dynamic decision-making about retrieval |
| Single retrieval step | Multiple adaptive retrieval steps |
| No query reformulation | Intelligent query refinement |
| Limited error handling | Self-correction and re-retrieval |
| Static context usage | Context-aware strategy selection |

### When to Use Agentic RAG

Agentic RAG is ideal for:
- Complex multi-step questions requiring information synthesis
- Scenarios where initial retrieval may be insufficient
- Cases requiring query disambiguation or refinement
- Applications needing high-quality, verified responses
- Systems that benefit from explanatory reasoning traces

## 2. Environment Setup <a id="setup"></a>

First, let's install the required NuGet packages and set up our connection to Azure AI Foundry.

In [None]:
#r "nuget: Azure.AI.Projects, 1.0.0-beta.4"
#r "nuget: Azure.Identity, 1.13.1"
#r "nuget: DotNetEnv, 3.1.1"
#r "nuget: Azure.AI.Inference, 1.0.0-beta.2"

using Azure;
using Azure.AI.Projects;
using Azure.AI.Inference;
using Azure.Identity;
using DotNetEnv;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Text.RegularExpressions;

// Load environment variables
Env.Load();

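// Fail fast with a clear message if the .env file or environment is incomplete
var projectEndpoint = Environment.GetEnvironmentVariable("PROJECT_ENDPOINT")
    ?? throw new InvalidOperationException("Set PROJECT_ENDPOINT in your .env file or environment.");
var modelName = Environment.GetEnvironmentVariable("MODEL")
    ?? throw new InvalidOperationException("Set MODEL in your .env file or environment.");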

// Initialize the AI Project client
var credential = new DefaultAzureCredential();
var projectClient = new AIProjectClient(new Uri(projectEndpoint), credential);

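// Get the chat completions client (an Azure.AI.Inference ChatCompletionsClient).
// The accessor name has changed across Azure.AI.Projects preview releases; check the installed version.
var chatClient = projectClient.GetChatCompletionsClient();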

Console.WriteLine("Connected to Azure AI Foundry");
Console.WriteLine($"Using model: {modelName}");

In [None]:
// Create a simple in-memory knowledge base for demonstration
// In production, this would be Azure AI Search, vector database, etc.

public class Document
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public string Category { get; set; }
}

var knowledgeBase = new List<Document>
{
    new Document
    {
        Id = "doc1",
        Title = "Azure AI Foundry Overview",
        Content = "Azure AI Foundry is a comprehensive platform for building, deploying, and managing AI applications. It provides tools for prompt engineering, model deployment, evaluation, and monitoring.",
        Category = "platform"
    },
    new Document
    {
        Id = "doc2",
        Title = "RAG Systems Explained",
        Content = "Retrieval-Augmented Generation (RAG) combines information retrieval with text generation. It retrieves relevant documents from a knowledge base and uses them to generate accurate, grounded responses.",
        Category = "rag"
    },
    new Document
    {
        Id = "doc3",
        Title = "Agent Frameworks",
        Content = "Microsoft Agent Framework enables building autonomous agents that can use tools, make decisions, and execute complex workflows. Agents can plan, reflect, and adapt their behavior based on context.",
        Category = "agents"
    },
    new Document
    {
        Id = "doc4",
        Title = "Vector Embeddings",
        Content = "Vector embeddings are numerical representations of text that capture semantic meaning. They enable similarity search by finding documents with similar meaning rather than just matching keywords.",
        Category = "embeddings"
    },
    new Document
    {
        Id = "doc5",
        Title = "Prompt Engineering Best Practices",
        Content = "Effective prompt engineering involves being specific, providing context, using examples (few-shot learning), and structuring prompts clearly. Chain-of-thought prompting helps with complex reasoning tasks.",
        Category = "prompts"
    },
    new Document
    {
        Id = "doc6",
        Title = "Azure AI Search Integration",
        Content = "Azure AI Search provides powerful indexing and retrieval capabilities. It supports vector search, hybrid search, and semantic ranking for improved relevance in RAG applications.",
        Category = "search"
    },
    new Document
    {
        Id = "doc7",
        Title = "Model Evaluation Techniques",
        Content = "Evaluating AI models involves metrics like relevance, groundedness, coherence, and fluency. Azure AI Foundry provides built-in evaluation tools to measure model performance systematically.",
        Category = "evaluation"
    },
    new Document
    {
        Id = "doc8",
        Title = "Agent Tool Usage",
        Content = "Agents can use various tools including search APIs, calculators, code interpreters, and custom functions. Tool selection and usage is a key capability that distinguishes agents from simple chatbots.",
        Category = "agents"
    }
};

Console.WriteLine($"Loaded {knowledgeBase.Count} documents into knowledge base");
Console.WriteLine("\nSample documents:");
foreach (var doc in knowledgeBase.Take(3))
{
    Console.WriteLine($"- {doc.Title} ({doc.Category})");
}

In [2]:
// Helper function for simple keyword-based retrieval (simplified for demo)
// In production, use semantic search with embeddings

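List<Document> SimpleRetrieve(string query, int topK = 3)
{
    // Common question words carry no topical signal, so skip them
    var stopWords = new HashSet<string> { "how", "does", "what", "the", "and", "can", "that", "with", "for", "are" };

    // Split on non-alphanumeric characters so punctuation ("work?") cannot block a match,
    // and keep three-letter terms so acronyms such as "rag" are scored
    var queryTerms = Regex.Split(query.ToLowerInvariant(), "[^a-z0-9]+")
        .Where(term => term.Length >= 3 && !stopWords.Contains(term))
        .Distinct()
        .ToList();

    var scoredDocs = new List<(int score, Document doc)>();

    foreach (var doc in knowledgeBase)
    {
        var contentLower = (doc.Title + " " + doc.Content).ToLowerInvariant();

        // Score = number of substring occurrences of each query term in title + content
        var score = queryTerms.Sum(term => Regex.Matches(contentLower, Regex.Escape(term)).Count);

        if (score > 0)
        {
            scoredDocs.Add((score, doc));
        }
    }

    return scoredDocs
        .OrderByDescending(x => x.score)
        .Take(topK)
        .Select(x => x.doc)
        .ToList();
}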

// Test retrieval
var testQuery = "How does RAG work?";
var retrieved = SimpleRetrieve(testQuery);
Console.WriteLine($"Retrieved {retrieved.Count} documents for query: '{testQuery}'");
foreach (var doc in retrieved)
{
    Console.WriteLine($"- {doc.Title}");
}


## 3. Traditional RAG vs. Agentic RAG <a id="comparison"></a>

Let's compare how traditional RAG and agentic RAG handle the same question.

In [None]:
// Traditional RAG: Simple retrieve-then-generate

async Task<string> TraditionalRAG(string query)
{
    // Step 1: Retrieve relevant documents
    var retrievedDocs = SimpleRetrieve(query, 2);
    
    // Step 2: Build context from retrieved documents
    var context = string.Join("\n\n", retrievedDocs.Select(doc => $"{doc.Title}:\n{doc.Content}"));
    
    // Step 3: Generate response using context
    var messages = new List<ChatRequestMessage>
    {
        new ChatRequestSystemMessage("You are a helpful assistant. Use the provided context to answer questions accurately."),
        new ChatRequestUserMessage($"Context:\n{context}\n\nQuestion: {query}\n\nAnswer based on the context provided:")
    };
    
    var requestOptions = new ChatCompletionsOptions
    {
        Model = modelName,   // the model to call is set on the options object
        Temperature = 0.3f,
        MaxTokens = 300
    };
    
    foreach (var message in messages)
    {
        requestOptions.Messages.Add(message);
    }
    
    // CompleteAsync takes the options object; the model name travels inside it
    var response = await chatClient.CompleteAsync(requestOptions);
    return response.Value.Content;
}

// Test traditional RAG
var question = "How can I build an intelligent agent that uses search capabilities?";
Console.WriteLine("Traditional RAG Response:");
Console.WriteLine(new string('=', 80));
var response = await TraditionalRAG(question);
Console.WriteLine(response);
Console.WriteLine("\n" + new string('=', 80));

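**Key Observations:**

- **Traditional RAG**: A single retrieval followed by direct answer generation, as in `TraditionalRAG` above. If the retrieved documents miss part of the question, the pipeline has no step that notices or recovers.
- **Agentic RAG**: Plans the query, retrieves iteratively, reflects on whether the evidence is sufficient, and only then composes a comprehensive answer.

Compared with the single-pass flow, the agentic approach provides:
1. Better query understanding and decomposition
2. Self-verification of information sufficiency
3. Ability to retrieve additional information if needed
4. Transparent reasoning trace for debugging and trust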
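
The agentic side of this comparison can be sketched as a retrieve, reflect, refine loop. The block below is a minimal illustration under stated assumptions, not the notebook's reference implementation: `AgenticRetrieveAsync` and its three delegate parameters are hypothetical names, and the stub corpus stands in for `SimpleRetrieve` and the model calls so the loop can be tried without an endpoint.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: retrieval, self-reflection, and query refinement are injected
// as delegates. In the notebook they would wrap SimpleRetrieve and chat-completion calls.
async Task<(List<string> Evidence, List<string> Trace)> AgenticRetrieveAsync(
    string userQuestion,
    Func<string, Task<List<string>>> retrieve,               // query -> passages
    Func<string, List<string>, Task<bool>> isSufficient,     // self-reflection step
    Func<string, List<string>, Task<string>> refineQuery,    // query refinement step
    int maxIterations = 3)
{
    var collected = new List<string>();
    var steps = new List<string>();
    var currentQuery = userQuestion;

    for (var i = 1; i <= maxIterations; i++)
    {
        // Retrieve with the current query and keep only passages not seen before
        var passages = await retrieve(currentQuery);
        var added = passages.Where(p => !collected.Contains(p)).Distinct().ToList();
        collected.AddRange(added);
        steps.Add($"iteration {i}: query='{currentQuery}', new passages={added.Count}");

        // Stop as soon as the evidence is judged sufficient
        if (await isSufficient(userQuestion, collected))
        {
            steps.Add($"iteration {i}: evidence judged sufficient");
            break;
        }

        // The iteration cap prevents an endless reflect-and-retry cycle
        if (i == maxIterations)
        {
            steps.Add("iteration limit reached with insufficient evidence");
            break;
        }

        currentQuery = await refineQuery(userQuestion, collected);
    }

    return (collected, steps);
}

// Dry run with stub components (illustrative data only)
var stubCorpus = new Dictionary<string, List<string>>
{
    ["agent search"] = new List<string> { "Agents can use tools including search APIs." },
    ["azure ai search"] = new List<string> { "Azure AI Search supports vector and hybrid search." }
};

Task<List<string>> StubRetrieve(string q) =>
    Task.FromResult(stubCorpus.TryGetValue(q, out var hits) ? hits : new List<string>());
Task<bool> StubIsSufficient(string q, List<string> ev) => Task.FromResult(ev.Count >= 2);
Task<string> StubRefine(string q, List<string> ev) => Task.FromResult("azure ai search");

var result = await AgenticRetrieveAsync("agent search", StubRetrieve, StubIsSufficient, StubRefine);
foreach (var step in result.Trace)
{
    Console.WriteLine(step);
}
```

Passing the three steps in as delegates keeps the control flow testable on its own. Swapping the stubs for model-backed implementations should not require changing the loop.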

## 4. Summary and Best Practices <a id="summary"></a>

### Key Takeaways

**Agentic RAG Capabilities:**
1. **Query Planning**: Break down complex questions into sub-queries
2. **Query Reformulation**: Generate alternative phrasings for better retrieval
3. **Self-Reflection**: Evaluate if retrieved information is sufficient
4. **Corrective Retrieval**: Re-retrieve if initial results are poor
5. **Adaptive Strategy**: Adjust approach based on query type
6. **Transparent Reasoning**: Provide trace of decision-making process

### Implementation Best Practices

1. **Start Simple**: Begin with basic RAG, add agentic features incrementally
2. **Use Proper Retrieval**: In production, use vector search (Azure AI Search, embeddings)
3. **Set Iteration Limits**: Prevent infinite loops in self-reflection cycles
4. **Monitor Costs**: Each planning, reflection, and refinement step is an extra model call; budget for the added cost and latency
5. **Log Everything**: Keep detailed traces for debugging and improvement
6. **Evaluate Performance**: Measure quality improvements vs. cost increases
7. **Handle Failures**: Implement fallbacks when retrieval or generation fails
8. **Optimize Prompts**: Tune the prompt for each agentic step (planning, reflection, refinement) separately
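
As one possible reading of the "Handle Failures" practice, the sketch below retries a primary pipeline a bounded number of times and then falls back to a simpler one. `WithFallbackAsync` and `FailingAgenticAnswerAsync` are hypothetical names introduced for illustration; the failing stub simulates an unavailable retrieval backend.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: try the primary pipeline up to maxAttempts times, then fall back.
async Task<T> WithFallbackAsync<T>(Func<Task<T>> primary, Func<Task<T>> fallback, int maxAttempts = 2)
{
    for (var attempt = 1; attempt <= maxAttempts; attempt++)
    {
        try
        {
            return await primary();
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Log and retry; cancellation is deliberately not swallowed
            Console.WriteLine($"Attempt {attempt} failed: {ex.Message}");
        }
    }

    return await fallback();
}

// Dry run: the primary pipeline always fails, so the fallback answer is returned
var attempts = 0;

Task<string> FailingAgenticAnswerAsync()
{
    attempts++;
    return Task.FromException<string>(new InvalidOperationException("retrieval backend unavailable"));
}

var answer = await WithFallbackAsync<string>(
    FailingAgenticAnswerAsync,
    () => Task.FromResult("answer from the single-pass pipeline"));

Console.WriteLine($"{answer} (after {attempts} failed attempts)");
```

In the notebook's terms, the fallback could be the `TraditionalRAG` function, so a failure in the agentic path degrades to a single-pass answer instead of an error.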

### When to Use Agentic RAG

**Good Use Cases:**
- Complex multi-step questions
- High-stakes applications requiring accuracy
- Scenarios with ambiguous queries
- Applications benefiting from explainability

**Often Unnecessary:**
- Simple factual lookups
- Cost-sensitive applications
- Real-time, low-latency requirements
- Well-defined, straightforward queries

### Production Considerations

1. **Use Vector Search**: Implement proper semantic search with embeddings
2. **Caching**: Cache retrieved documents and intermediate results
3. **Parallel Processing**: Run sub-queries in parallel when possible
4. **Streaming**: Stream responses for better user experience
5. **Monitoring**: Track success rates, iteration counts, costs
6. **A/B Testing**: Compare agentic vs. traditional RAG performance
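
The caching and parallel-processing items can be combined in a few lines. This is a sketch under assumptions: `SearchBackendAsync` is a hypothetical stand-in for a real search call (for example, a query against Azure AI Search), and the cache is a plain in-process dictionary with no expiry.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Cache of in-flight or completed searches, keyed by normalized sub-query.
// Lazy ensures the backend is called at most once per key, even under contention.
var searchCache = new ConcurrentDictionary<string, Lazy<Task<List<string>>>>();
var backendCalls = 0;

// Hypothetical stand-in for a real search backend
async Task<List<string>> SearchBackendAsync(string subQuery)
{
    Interlocked.Increment(ref backendCalls);
    await Task.Delay(50);   // simulated network latency
    return new List<string> { $"result for '{subQuery}'" };
}

Task<List<string>> CachedSearchAsync(string subQuery)
{
    var key = subQuery.Trim().ToLowerInvariant();
    return searchCache
        .GetOrAdd(key, k => new Lazy<Task<List<string>>>(() => SearchBackendAsync(k)))
        .Value;
}

// Run all sub-queries concurrently; the third normalizes to the same key as the first
var subQueries = new[] { "agent tool usage", "Azure AI Search", "Agent Tool Usage " };
var results = await Task.WhenAll(subQueries.Select(q => CachedSearchAsync(q)));

Console.WriteLine($"{results.Length} result sets from {backendCalls} backend calls");
```

Caching the task instead of the finished result lets concurrent requests for the same sub-query share one in-flight call. A failed task stays cached in this sketch, so a production version would likely evict failures and add an expiry policy.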

### Additional Resources

- [Agentic Retrieval in Azure AI Search](https://techcommunity.microsoft.com/blog/azure-ai-foundry-blog/introducing-agentic-retrieval-in-azure-ai-search/4414677)
- [Azure AI Search Documentation](https://learn.microsoft.com/azure/search/)
- [Microsoft Agent Framework](https://learn.microsoft.com/azure/ai-studio/)
- [RAG Patterns and Best Practices](https://learn.microsoft.com/azure/ai-studio/concepts/retrieval-augmented-generation)

In [None]:
// Your practice code here
// Try building agentic RAG enhancements!