# Chapter 14: Retrieval-Augmented Generation (RAG)


Key Takeaways:
- **Augmented Knowledge**: RAG enables agents to access and utilize external knowledge bases that they were not trained on.
- **Contextual Relevance**: By retrieving relevant information before generating a response, agents can provide more accurate and context-aware answers.
- **Simulated Memory**: This example uses a memory service that simulates retrieving documents through semantic search.
- **Tool Integration**: Combining RAG with search tools yields a research assistant that can synthesize internal and external information.

### Heuristic: *Connect memory to logic. RAG grounds generation in retrieved facts.*

## Setup and Initialization

In [None]:
import os
import sys
from dotenv import load_dotenv
from typing import Optional, Dict, List, Any

# Add scripts directory to path to import custom modules
PROJECT_ROOT = os.path.dirname(os.getcwd())
SCRIPTS_DIR = os.path.join(PROJECT_ROOT, "scripts")
sys.path.insert(0, SCRIPTS_DIR)

# Import our custom ADK implementation and RAG memory service
from custom_adk import Agent, google_search
from rag_memory import VertexAiRagMemoryService

# Load environment variables
load_dotenv()

# Verify API key
if not os.getenv("GOOGLE_API_KEY"):
    print("❌ Please set the GOOGLE_API_KEY environment variable.")
else:
    print("✅ Configuration Loaded")

## 1. Initialize RAG Memory Service

We set up the connection to our vector database (simulated here) so that we can retrieve relevant chunks of information.

In [None]:
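# Configuration for RAG
# Note: the corpus resource name is read from the REASONING_ENGINE_ID variable.
RAG_CORPUS_RESOURCE_NAME = os.getenv("REASONING_ENGINE_ID")
# Upper bound on the number of chunks returned per query
SIMILARITY_TOP_K = 5
# Distance cutoff used to filter retrieved chunks
VECTOR_DISTANCE_THRESHOLD = 0.7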

# Initialize the memory service
memory_service = VertexAiRagMemoryService(
    rag_corpus=RAG_CORPUS_RESOURCE_NAME,
    similarity_top_k=SIMILARITY_TOP_K,
    vector_distance_threshold=VECTOR_DISTANCE_THRESHOLD
)
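
To make the two retrieval parameters concrete, here is a minimal, self-contained sketch of how a top-k limit and a distance threshold commonly interact. It does not call `VertexAiRagMemoryService`. The `cosine_distance` and `filter_hits` helpers, the toy vectors, and the choice of cosine distance are all assumptions made for illustration, and the actual service may score and filter differently.

```python
import math
from typing import Dict, List, Tuple


def cosine_distance(a: List[float], b: List[float]) -> float:
    """Return 1 - cosine similarity. Assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def filter_hits(
    query_vec: List[float],
    corpus: Dict[str, List[float]],
    top_k: int,
    max_distance: float,
) -> List[Tuple[str, float]]:
    """Hypothetical helper: keep hits within max_distance, closest first, at most top_k."""
    scored = [(doc_id, cosine_distance(query_vec, vec)) for doc_id, vec in corpus.items()]
    within = [hit for hit in scored if hit[1] <= max_distance]
    within.sort(key=lambda hit: hit[1])
    return within[:top_k]


# Toy 2-D embeddings (illustrative only, not real model output)
toy_corpus = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.8, 0.6],
    "doc_c": [0.0, 1.0],
    "doc_d": [-1.0, 0.0],
}

hits = filter_hits([1.0, 0.0], toy_corpus, top_k=5, max_distance=0.7)
for doc_id, distance in hits:
    print(f"{doc_id}: distance={distance:.2f}")
```

In this toy setup the threshold, not the top-k limit, decides the result: only two of the four documents fall within the cutoff, so fewer than five hits come back.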

## 2. Define Agent

We create a Research Assistant agent equipped with the Google Search tool. In a full RAG system, the agent might also have a tool to query the memory service directly, or the memory service might be used to pre-populate context.

In [None]:
search_agent = Agent(
    name="research_assistant",
    model="gemini-2.5-flash",
    instruction="""
You help users research topics.
Use the google_search tool when you need current information from the web.
Synthesize the information you find into clear, concise summaries.
""",
    tools=[google_search]
)
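
The option of letting the agent query the memory service directly can be sketched as a plain function that wraps `retrieve`. This is an illustration under assumptions: `StubMemoryService` is a hypothetical stand-in with naive keyword matching, `make_memory_tool` and `search_memory` are invented names, and whether the custom `Agent` accepts plain callables in `tools` is not confirmed by this notebook.

```python
from typing import Any, Callable, Dict, List


class StubMemoryService:
    """Hypothetical stand-in for the memory service; matches on shared words."""

    def __init__(self, docs: List[Dict[str, Any]]) -> None:
        self._docs = docs

    def retrieve(self, query: str) -> List[Dict[str, Any]]:
        terms = set(query.lower().split())
        return [d for d in self._docs if terms & set(d["content"].lower().split())]


def make_memory_tool(service: Any) -> Callable[[str], str]:
    """Wrap a service exposing retrieve(query) as a string-in, string-out tool."""

    def search_memory(query: str) -> str:
        """Look up internal documents relevant to the query."""
        docs = service.retrieve(query)
        if not docs:
            return "No matching internal documents."
        return "\n\n".join(
            f"Source: {d['source']}\nContent: {d['content']}" for d in docs
        )

    return search_memory


stub = StubMemoryService(
    [{"source": "handbook.md", "content": "Refund requests are processed within 14 days"}]
)
search_memory = make_memory_tool(stub)
print(search_memory("refund window"))
```

Returning a plain string keeps the tool output easy for a model to read, and the explicit "no match" message gives the agent a signal to fall back to web search.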

## 3. Execution Examples

### Scenario 1: Basic Retrieval
This cell simulates retrieving the context relevant to a user query.

In [None]:
user_query = "What data "

# 1. Retrieve relevant context
retrieved_docs = memory_service.retrieve(user_query)

# 2. Format context for the agent (demonstration)
context_str = "\n\n".join(
    f"Source: {d['source']}\nContent: {d['content']}"
    for d in retrieved_docs
)

print(f"--- Constructed Context ---\n{context_str}\n---------------------------")
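
The formatting step above assumes every retrieved document has both keys and that the joined text fits in the prompt. A slightly more defensive variant might look like the sketch below. The `format_context` name and the `max_chars` budget are assumptions for illustration, and a character count is only a rough proxy for a token limit.

```python
from typing import Any, Dict, List


def format_context(docs: List[Dict[str, Any]], max_chars: int = 2000) -> str:
    """Join retrieved chunks, stopping before the character budget is exceeded."""
    blocks: List[str] = []
    used = 0
    for doc in docs:
        source = doc.get("source", "unknown")
        content = str(doc.get("content", "")).strip()
        if not content:
            continue  # skip empty chunks instead of emitting blank entries
        block = f"Source: {source}\nContent: {content}"
        cost = len(block) + (2 if blocks else 0)  # 2 accounts for the "\n\n" separator
        if used + cost > max_chars:
            break
        blocks.append(block)
        used += cost
    return "\n\n".join(blocks)


sample_docs = [
    {"source": "a.md", "content": "alpha"},
    {"content": "   "},
    {"source": "b.md", "content": "beta"},
]
print(format_context(sample_docs))
```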

### Scenario 2: Agent Execution with Context
Now we pass this context to the agent along with the user's question, allowing it to answer based on the retrieved knowledge.

In [None]:
# Construct the full prompt with context
augmented_prompt = f"""
Context Information:
{context_str}

Question: {user_query}
"""

state = {
    "last_user_message": augmented_prompt
}

response = search_agent.run(state)

# Display the agent's answer
print(response)
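
Retrieval can come back empty, in which case the prompt above would carry a blank context section. One way to handle that is to build the prompt through a small helper, sketched here. The `build_augmented_prompt` name and the fallback wording are assumptions, not part of the notebook's modules.

```python
def build_augmented_prompt(question: str, context: str) -> str:
    """Assemble the prompt, noting explicitly when no context was retrieved."""
    question = question.strip()
    context = context.strip()
    if not context:
        return (
            f"Question: {question}\n\n"
            "(No internal context was retrieved for this question.)\n"
        )
    return (
        "Context Information:\n"
        f"{context}\n\n"
        f"Question: {question}\n"
    )


with_context = build_augmented_prompt(
    "Which sources mention refunds?",
    "Source: handbook.md\nContent: Refund requests are processed within 14 days",
)
without_context = build_augmented_prompt("Which sources mention refunds?", "")
print(with_context)
print(without_context)
```

Stating the absence of context explicitly gives the agent a cue to rely on its search tool rather than guess.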

### Scenario 3: Using External Tools
The agent can also use its tools (Google Search) to find information not present in the RAG corpus.

In [None]:
state_search = {
    "last_user_message": "Search for the latest Agentic Design Patterns released in 2025"
}

response_search = search_agent.run(state_search)

# Display the agent's answer
print(response_search)

## Conclusion

This notebook demonstrated the components of a RAG system: a memory service for retrieval and an agent for generation and tool use. By combining these, we create grounded, capable AI assistants.