<img src="https://opea.dev/wp-content/uploads/sites/9/2024/04/opea-horizontal-color.svg" alt="OPEA Logo">

<div style="text-align: center; margin: 20px 0;">
<a href="https://colab.research.google.com/github/opea-project/Course-Material/blob/main/Curriculas/Kubernetes/AgentQnA/1_Deploy_AgentQnA.ipynb" target="_blank">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> 
</a>
</div>

> **💡 Quick Start:** Click the button above to open this notebook directly in Google Colab - no local setup required!


# <span style="color: #2E86AB; font-weight: bold;">🤖 Building Effective AI Agents</span>

## <span style="color: #A23B72; font-size: 1.2em;">⚡ Deploy OPEA AgentQnA Blueprint on </span><span style="color: #F18F01; font-size: 1.2em;">☸️ Kubernetes</span>

---

### <span style="color: #C73E1D;">🎯 What You'll Learn</span>

<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); padding: 15px; border-radius: 10px; color: white; margin: 10px 0;">
<strong>🚀 By the end of this tutorial, you'll be able to deploy production-ready AI agents that can:</strong>
<ul>
<li>🧠 <strong>Think</strong> - Use advanced reasoning capabilities with Large Language Models (LLMs)</li>
<li>🔍 <strong>Search</strong> - Query knowledge bases intelligently using Retrieval-Augmented Generation (RAG)</li>
<li>⚙️ <strong>Act</strong> - Execute real-world tasks using external tools and APIs</li>
<li>🐳 <strong>Scale</strong> - Run reliably and efficiently on Kubernetes infrastructure</li>
</ul>
</div>


# Prerequisites

Before running this notebook, ensure you have the following components set up:

## 1. **Jupyter Environment**
This notebook can be executed in any of these environments:
   - Local Jupyter Notebook/JupyterLab installation
   - VS Code with Jupyter extension
   - Google Colab (simply upload this notebook file)

## 2. **Required Command-Line Tools**
Make sure these tools are installed and available in your PATH:
   - **`kubectl`** - Kubernetes command-line tool configured with access to your cluster
   - **`helm`** - Helm package manager (version 3.x or later) for deploying OPEA charts
   - **`docker`** - Container runtime (required if using KIND for local testing)
   - **`curl`** and **`jq`** - For testing and parsing API endpoint responses

## 3. **Kubernetes Cluster Access**
   - A running Kubernetes cluster (can be local KIND cluster, managed cloud cluster, etc.)
   - Properly configured `kubeconfig` file to access your cluster
   - Sufficient cluster resources (minimum 4GB RAM, 2 CPU cores recommended)
   - Cluster should have internet access for pulling container images

## 4. **API Keys** (choose one option based on your LLM preference)
   - **OpenAI API key** - If you plan to use OpenAI models (GPT-4, GPT-3.5-turbo, etc.)
   - **HuggingFace API token** - For accessing open-source models via HuggingFace Hub
   - **Remote inference endpoint credentials** - If using custom inference endpoints

## 5. **Network Requirements**
   - Outbound internet access for downloading Helm charts and container images
   - Port-forwarding capabilities for accessing deployed services locally

# 1. Introduction: Why Build an Agent?

## 1.1 🧠 From Talking to Doing: Why We Need More Than Just LLMs

Imagine you ask a **consultant**:

> “Can you help me plan my trip to Paris?”

They reply:

> “Sure! You should book a flight, find a hotel near the Eiffel Tower, and maybe buy museum tickets in advance.”

Helpful — but **you’re still doing all the work**.

That’s what a **plain LLM** does: it provides good advice, but doesn’t take action.


Now imagine asking a **personal assistant** the same question:

> “Can you plan my trip to Paris?”

They respond:

> ✅ Flight booked  
> ✅ Hotel reserved  
> ✅ Museum tickets purchased  
> ✅ Itinerary sent to your inbox

That’s what an **agent** does: it combines language understanding with **tools, APIs, and logic** to get real-world tasks done.

 
- 🧑‍🏫 **LLM = Smart talker (consultant)**  
- 🧑‍💼 **Agent = Smart doer (assistant with tools)**


#### **We Need It All — LLMs, RAG, and Agents Working Together**

| Layer       | What It Brings                              | But It’s Not Enough…                   |
|-------------|----------------------------------------------|----------------------------------------|
| **LLMs**    | Powerful at understanding and generating text | Can’t act or fetch real-time data      |
| **RAG**     | Boosts relevance by pulling in context       | Still just text — no action            |
| **Agents**  | Execute tasks using tools, APIs, and logic   | Need structure, memory, and oversight  |

To build real-world AI systems, we need **everything**, because:

- 🔗 **External sources matter** — APIs, databases, real-time feeds  
- ⚙️ **Action matters** — booking, triggering, integrating  
- 🧑‍💼 **Expertise matters** — agents must specialize by domain

> **Agents aren’t just smarter — they *do*.**  

---


## 1.2 🧰 Tools, Workers, and Agents — The Execution Layer

> 💡 **Agents aren’t just text generators — they _use tools_, delegate tasks to _workers_, and execute complex workflows.**

- **Tools**: Interfaces for querying APIs, triggering functions, or transforming data (e.g., search, calculator, database access).
- **Workers**: Specialized sub-agents or microservices that **retrieve**, **analyze**, or **transform** data for specific tasks.
- **Agents**: Decision-makers that orchestrate tools and workers to achieve a goal. They follow predefined rules or learned policies.

Agents can **interact with users**, **retrieve and process data**, and **execute tasks** based on rules or learning models.
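
To make the three roles concrete, here is a minimal sketch in plain Python. The names `MiniAgent`, `word_count`, and `shout` are hypothetical and do not come from OPEA or LangChain. It assumes a tool is simply a function from text to text, and that the agent has already decided which tool to call.

```python
from typing import Callable, Dict

# Tools: plain callables that take a string and return a string.
def word_count(text: str) -> str:
    return str(len(text.split()))

def shout(text: str) -> str:
    return text.upper()

class MiniAgent:
    """Hypothetical agent that looks up a tool by name and runs it."""

    def __init__(self, tools: Dict[str, Callable[[str], str]]):
        self.tools = tools

    def run(self, tool_name: str, payload: str) -> str:
        tool = self.tools.get(tool_name)
        if tool is None:
            # The agent can only act through the tools it was given.
            return f"No tool named '{tool_name}'"
        return tool(payload)

agent = MiniAgent({"word_count": word_count, "shout": shout})
print(agent.run("word_count", "agents use tools"))
print(agent.run("translate", "hola"))
```

In a real agent the tool name and its input come from the LLM's reasoning rather than from the caller; this sketch only shows the dispatch step.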

### 🔑 Key Features of AI Agents:
- ✅ **Autonomous** – Operate with minimal human intervention  
- ✅ **Perceptual** – Analyze information from text, voice, images, or structured data  
- ✅ **Goal-oriented** – Work to accomplish tasks such as answering questions or summarizing content  
- ✅ **Interactive** – Communicate with humans or with other agents  
- ✅ **Adaptive** – Improve over time through learning or feedback  

### **Types of Agents**  
Agents can be classified based on their reasoning and decision-making capabilities:

| Type | Description | Example |
|------|-------------|---------|
| **Reactive Agents** | Respond only to current inputs, without memory | Chatbot answering direct questions |
| **Goal-Based Agents** | Use goals to guide their decisions | Recommendation systems |
| **Utility-Based Agents** | Optimize for the best possible decision | Self-driving cars adjusting speed |
| **Learning Agents** | Improve with data and experience | AI models fine-tuned for specific tasks |

#### **Example: Difference Between Reactive and Goal-Based Agents**  
🚗 **Reactive Agent:** A self-driving car stops when the traffic light is red but doesn’t plan ahead.  
🛣️ **Goal-Based Agent:** The car also plans a route to its destination and decides when to stop, turn, or change lanes to get there.
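
The contrast can be sketched with two toy functions. Both are hypothetical illustrations, not part of any framework. The reactive one maps the current input straight to an action, while the goal-based one plans a sequence of actions toward a target position on a one-dimensional road.

```python
from typing import List

def reactive_agent(light: str) -> str:
    """Reacts to the current input only: no memory, no plan."""
    return "stop" if light == "red" else "go"

def goal_based_agent(position: int, goal: int) -> List[str]:
    """Plans the moves needed to get from position to goal."""
    direction = "forward" if goal >= position else "backward"
    return [direction] * abs(goal - position)

print(reactive_agent("red"))
print(goal_based_agent(0, 3))
```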



## 1.3 Example of an Agent

In [None]:
# Install the Python packages required by this course
!pip install -r https://raw.githubusercontent.com/opea-project/Course-Material/main/Curriculas/Kubernetes/AgentQnA/requirements.txt

> **💡 NOTE:** To run this example, set your OpenAI API key in the cell below.

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.agents import AgentType, initialize_agent
from langchain.tools import Tool
import os

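# Create a function for mathematical calculations
def calculate(input_str):
    """Evaluates simple mathematical expressions"""
    # WARNING: eval() runs arbitrary Python code; acceptable for this demo only
    try:
        return str(eval(input_str))
    except Exception as e:
        return f"Calculation error: {e}"

# Initialize the language model
llm = ChatOpenAI(
    temperature=0,
    model="gpt-3.5-turbo",
    # Uses the OPENAI_API_KEY environment variable if set, else the placeholder
    openai_api_key=os.environ.get("OPENAI_API_KEY", "your_openai_api_key")
)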

# Use the Tool method
Math_tool = Tool(
    name="Calculator",
    func=calculate,
    description="Useful ONLY for calculating simple math expressions. Expected input: '2+2'"
)

# You can try removing the word ONLY and see the difference.
# The model only has one tool and will try to use it.
# When an agent has limited tools, it may try to use whatever it has—even if it's not appropriate.

# Initialize the agent with the calculator as its only tool
agent = initialize_agent(
    tools=[Math_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True
)

# Test the agent
print(agent.run("What is the capital of France?"))   # This is a non-math question
print(agent.run("What is 12 * 8?"))                  # This is a math question
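
Because `eval` can execute arbitrary Python, a stricter calculator may be preferable outside a demo. The following is a minimal sketch that uses only the standard-library `ast` module. `safe_calculate` is a hypothetical name, and the set of supported operators (`+`, `-`, `*`, `/`, unary minus) is an assumption chosen for illustration.

```python
import ast
import operator

# Only these arithmetic operators are allowed.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_calculate(expression: str) -> str:
    """Evaluate a simple arithmetic expression without calling eval()."""

    def _eval(node):
        # Plain numbers (bool is excluded even though it subclasses int)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval(tree.body))
    except Exception as e:
        return f"Calculation error: {e}"

print(safe_calculate("12 * 8"))
print(safe_calculate("__import__('os')"))
```

It keeps the signature and error format of `calculate`, so it could be passed as `func=` to the `Tool` above.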


## 1.4 🧠 From Language to Action: The Challenge of Real AI Agents

Large Language Models (LLMs) are excellent at generating natural language — but real-world applications demand more than just fluent responses. To truly assist users, AI systems must:

- Use tools and APIs to take action
- Retain context and reason across multiple steps
- Query databases or retrieve relevant documents
- Operate reliably in production environments

> Building these capabilities from scratch is complex. Most frameworks either oversimplify agent logic or lock you into rigid architectures that are hard to scale or customize.

---



# 2. OPEA and Agents



**OPEA (Open Platform for Enterprise AI)** is an open source framework designed to simplify the development and deployment of production-ready GenAI applications.

It gives you the scaffolding to go beyond chat and build systems that can reason, act, and specialize.


## 2.1 Why Use OPEA?

| Feature                      | What It Enables                                |
|-----------------------------|-------------------------------------------------|
| **Agentic Blueprints**       | Skip boilerplate code — use prebuilt, customizable agent workflows (RAG, tool use, multi-agent coordination) |
| **Modular by Design**        | Easily swap in any LLM, memory backend, or external tool without major code changes |
| **Deployment-Ready**         | Includes Docker containers, REST APIs, distributed tracing, and monitoring out-of-the-box |
| **Enterprise-Friendly**      | Secure, observable, and compatible with on-premises or cloud deployments |
| **Community-Driven**         | Actively maintained by contributors from Intel and the broader open-source ecosystem |


### 🔍 Built for Real Applications

OPEA bridges the gap between research prototypes and production-ready deployments:

- **Vector Database Integration**: Connects seamlessly to vector databases like Chroma, Qdrant, and Weaviate for efficient similarity search
- **Stateful Workflows**: Supports persistent memory and multi-step agent workflows with conversation history
- **Flexible Tool Integration**: Enables tool calling via REST APIs, SQL databases, or custom function schemas
- **Production Observability**: Provides comprehensive logging, tracing, and debugging capabilities for agent workflows
- **Scalable Architecture**: Designed to handle enterprise workloads with Kubernetes-native scaling and load balancing

---

# 3. Deploy OPEA AgentQnA on Kubernetes

This notebook assumes you have a Kubernetes cluster already deployed and configured. The steps below apply to any cluster you can reach with `kubectl`, whether a local KIND cluster or a managed cloud cluster.

**What is AgentQnA?** 
AgentQnA is an OPEA blueprint that demonstrates a multi-agent Question & Answer system. It includes:
- A **Supervisor Agent** that routes questions to specialized worker agents
- A **RAG Agent** that retrieves information from knowledge bases  
- A **SQL Agent** that queries databases using natural language
- Tool integration that can be extended with web search, knowledge graph queries, and more

This example showcases a hierarchical multi-agent system for question-answering applications. The architecture diagram below shows a supervisor agent that interfaces with the user and dispatches tasks to two worker agents to gather information and come up with answers. The worker RAG agent uses the retrieval tool to retrieve relevant documents from a knowledge base - a vector database. The worker SQL agent retrieves relevant data from a SQL database. Although not included in this example by default, other tools such as a web search tool or a knowledge graph query tool can be used by the supervisor agent to gather information from additional sources.

![agent_qna_arch.png](https://github.com/opea-project/GenAIExamples/raw/main/AgentQnA/assets/img/agent_qna_arch.png)

## 3.1 Verify Kubernetes Connection

Before proceeding, let's verify your Kubernetes cluster connection:

In [None]:
# Point kubectl at your kubeconfig file.
# `!export` only affects a temporary subshell, so set the variable with %env instead.

%env KUBECONFIG=/path/to/your/kubeconfig

In [None]:
# Verify kubectl connection and cluster info
!kubectl cluster-info
!kubectl get nodes


## 3.2 Download Required Resources

Let's start by cloning the necessary repository and downloading agent configuration files:

In [None]:
# Clone the GenAIExamples repo; later steps use its ingestion script and sample data

!git clone https://github.com/opea-project/GenAIExamples.git

## 3.3 Understanding AgentQnA Architecture

AgentQnA follows a **multi-agent architecture** with specialized roles:

```mermaid
graph TD
    A[User Query] --> B[Supervisor Agent]
    B --> C{Route Decision}
    C -->|Knowledge Question| D[RAG Agent]
    C -->|Database Question| E[SQL Agent]
    C -->|General Question| F[Direct Response]
    D --> G[Vector Database]
    E --> H[SQL Database]
    D --> I[Response]
    E --> I
    F --> I
    I --> J[User]
```

**Key Components:**
- **Supervisor Agent**: Analyzes incoming questions and routes them to appropriate specialized agents
- **RAG Agent**: Handles knowledge-based questions by retrieving relevant documents from vector databases
- **SQL Agent**: Converts natural language to SQL queries for structured data retrieval
- **Tool Integration**: Agents can use external tools like web search, calculators, and APIs
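
The routing step in the diagram can be illustrated with a toy keyword router. This is only a stand-in: in AgentQnA the supervisor uses an LLM to decide where a question goes. The function name `route` and the keyword lists are assumptions made up for this sketch.

```python
def route(question: str) -> str:
    """Toy stand-in for the supervisor's LLM-based routing decision."""
    q = question.lower()
    # Questions about counts or totals look like structured-data lookups.
    if any(k in q for k in ("how many", "count", "total", "database")):
        return "sql_agent"
    # Questions about specific songs or artists go to the knowledge base.
    if any(k in q for k in ("who", "song", "album", "document")):
        return "rag_agent"
    # Everything else is answered directly by the supervisor.
    return "direct_response"

for q in ("How many albums does Iron Maiden have?",
          "Who wrote the song Thriller?",
          "What is Deep Learning?"):
    print(f"{q} -> {route(q)}")
```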

In [None]:
# Download Agents configurations locally  
import os

# Get current working directory
WORKDIR = os.getcwd()

# Define the target directory
tools_dir = os.path.join(WORKDIR, "mnt", "tools")

# Create the directory if it doesn't exist
os.makedirs(tools_dir, exist_ok=True)

# Download the files to the correct directory
!wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/tools/supervisor_agent_tools.yaml -O {tools_dir}/supervisor_agent_tools.yaml
!wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/tools/tools.py -O {tools_dir}/tools.py
!wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/tools/pycragapi.py -O {tools_dir}/pycragapi.py

!wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/tools/worker_agent_tools.yaml -O {tools_dir}/worker_agent_tools.yaml
!wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/tools/worker_agent_tools.py -O {tools_dir}/worker_agent_tools.py

!wget https://raw.githubusercontent.com/lerocha/chinook-database/refs/heads/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite -O {tools_dir}/Chinook_Sqlite.sqlite


## 3.4 Prepare Configuration Files
### Option 1: For KIND (Kubernetes in Docker) Clusters

If you're using a KIND cluster, we need to copy the downloaded tools into the cluster's filesystem:

In [2]:
# Add Agents tools and configuration to control-plane

import os

# Define local tools directory
WORKDIR = os.getcwd()
TOOLS_DIR = os.path.join(WORKDIR, "mnt/tools")

# Create destination folder in the kind node
!docker exec kind-control-plane mkdir -p /mnt/tools

if os.system(f"docker cp {TOOLS_DIR}/. kind-control-plane:/mnt/tools/") == 0:
    print("✅ All tools copied into kind-control-plane:/mnt/tools/")
else:
    print("❌ Failed to copy tools")


Successfully copied 1.1MB to kind-control-plane:/mnt/tools/
✅ All tools copied into kind-control-plane:/mnt/tools/


### Option 2: For Regular Kubernetes Clusters

For non-KIND clusters, you'll need to create a persistent volume or use an alternative method to make the tools available to your pods. This could involve:
- Creating a ConfigMap with the tool files
- Using a persistent volume claim
- Building the tools into your container images 
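
As a rough sketch of the ConfigMap option, the snippet below builds a ConfigMap manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts. The name `agent-tools` and both helper functions are hypothetical, and how the chart would mount the ConfigMap is not covered here. Binary files such as the SQLite database are skipped, and a ConfigMap is limited to about 1 MiB, so large files need one of the other options.

```python
import json
from pathlib import Path

def build_configmap(name: str, files: dict) -> dict:
    """Return a ConfigMap manifest (as a dict) holding text files."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": dict(files),
    }

def load_text_files(directory: str, suffixes=(".yaml", ".py")) -> dict:
    """Read text tool files from a directory; other file types are skipped."""
    return {
        p.name: p.read_text(encoding="utf-8")
        for p in sorted(Path(directory).iterdir())
        if p.is_file() and p.suffix in suffixes
    }

# Placeholder content; in the notebook you could pass load_text_files(tools_dir)
manifest = build_configmap("agent-tools", {"tools.py": "# placeholder\n"})
print(json.dumps(manifest, indent=2))
```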

## 3.5 Configure Your LLM Engine

Next, we'll download the Helm chart and configure it for your specific LLM provider:

In [46]:
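# If an agentqna/ directory is left over from an earlier pull, helm stops with
# "failed to untar"; delete that directory first to fetch a fresh copy.
!helm pull oci://ghcr.io/opea-project/charts/agentqna --version 0-latest --untar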


Pulled: ghcr.io/opea-project/charts/agentqna:1.3.0
Digest: sha256:6ae324080c6af874964d74bf7c9e35bfb599bf00ba5d320ed3a64c6283dd6d5c
Error: failed to untar: a file or directory with the name agentqna already exists


Navigate to the `agentqna/` chart directory created by the pull. You'll find several pre-configured variant files:

![image](./Images/variant.png)

The variant files contain different configurations for various LLM providers and deployment scenarios.

You have multiple options for where to run your LLM:

- **OpenAI**: To use OpenAI models, generate a key following [these instructions](https://platform.openai.com/api-keys)
- **Local Models (Open Source)**: Deploy models locally using TGI, vLLM, or similar inference engines
- **Remote Inference (OpenAI-Compatible)**: Use third-party inference providers with OpenAI-compatible APIs 

**For Remote Inference (OpenAI-Compatible APIs):**

Edit the variant configuration file (`agentqna/variant_openai-values.yaml`) to set all agents to use your remote inference endpoint:

```yaml
supervisor:
  model: "meta-llama/Llama-3.3-70B-Instruct"               # LLM model used by the agent
  llm_endpoint_url: "https://api.inference.denvrdata.com"  # OpenAI-compatible endpoint URL
  llm_engine: openai
  OPENAI_API_KEY: "Your_remote_inference_key"

ragagent:
  model: "meta-llama/Llama-3.3-70B-Instruct" 
  llm_endpoint_url: "https://api.inference.denvrdata.com"
  OPENAI_API_KEY: "Your_remote_inference_key"
  llm_engine: openai

sqlagent:
  model: "meta-llama/Llama-3.3-70B-Instruct" 
  llm_endpoint_url: "https://api.inference.denvrdata.com"
  llm_engine: openai
  OPENAI_API_KEY: "Your_remote_inference_key"
```

**For OpenAI Models:**

If you're using OpenAI, modify all agents in your variant file like this:

```yaml
supervisor:
  model: "gpt-4o-mini-2024-07-18"
  llm_engine: openai
  OPENAI_API_KEY: "your_openai_key"

ragagent:
  model: "gpt-4o-mini-2024-07-18"
  llm_engine: openai
  OPENAI_API_KEY: "your_openai_key"

sqlagent:
  model: "gpt-4o-mini-2024-07-18"
  llm_engine: openai
  OPENAI_API_KEY: "your_openai_key"
```
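
Since all three agents take the same settings, the overrides can also be generated instead of edited by hand. The sketch below is hypothetical (`build_agent_values` is not part of OPEA). It builds a dictionary that mirrors the YAML structure above, using only the keys shown in this section. Writing the result out as a YAML values file, for example with PyYAML, is left out.

```python
import json

AGENTS = ("supervisor", "ragagent", "sqlagent")

def build_agent_values(model: str, api_key: str, endpoint_url: str = None) -> dict:
    """Build per-agent override values that mirror the variant file layout."""
    values = {}
    for agent in AGENTS:
        cfg = {"model": model, "llm_engine": "openai", "OPENAI_API_KEY": api_key}
        if endpoint_url:
            # Only needed for remote OpenAI-compatible endpoints
            cfg["llm_endpoint_url"] = endpoint_url
        values[agent] = cfg
    return values

values = build_agent_values("gpt-4o-mini-2024-07-18", "your_openai_key")
print(json.dumps(values, indent=2))
```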

## 3.6 Deploy the Helm Chart

Now we'll deploy AgentQnA to your Kubernetes cluster using Helm:

In [None]:
# Deploy Helm chart using your configured variant
# Replace "your_hf_key" with your actual HuggingFace API token
!helm upgrade --install agentqna agentqna \
  -f agentqna/variant_openai-values.yaml \
  --set global.HUGGINGFACEHUB_API_TOKEN="your_hf_key" 

Release "agentqna" does not exist. Installing it now.
NAME: agentqna
LAST DEPLOYED: Thu Jun  5 00:59:49 2025
NAMESPACE: default
STATUS: deployed
REVISION: 1


In [3]:
# Check pod status - ALL CONTAINERS MUST BE IN RUNNING STATE (1/1) 
# This can take approximately 10 minutes for all images to download and pods to start
!kubectl get pods

NAME                                        READY   STATUS    RESTARTS   AGE
agentqna-agentqna-ui-757c79dcf9-wq8js       1/1     Running   0          17h
agentqna-crag-6b876d6788-jtq25              1/1     Running   0          17h
agentqna-data-prep-7d6964b5b7-rfq76         1/1     Running   0          17h
agentqna-docretriever-65f8b7dc74-vkxzj      1/1     Running   0          17h
agentqna-embedding-usvc-76959d7967-bf964    1/1     Running   0          17h
agentqna-ragagent-cbcc5d88-8fpvm            1/1     Running   0          17h
agentqna-redis-vector-db-7b5d76c94c-9pcn9   1/1     Running   0          17h
agentqna-reranking-usvc-6f9c775475-tl9jm    1/1     Running   0          17h
agentqna-retriever-usvc-7bd7f8c595-7q6q4    1/1     Running   0          17h
agentqna-sqlagent-6cdcf5b54d-hg2n8          1/1     Running   0          17h
agentqna-supervisor-54cc4648bd-ppd2z        1/1     Running   0          17h
agentqna-tei-55c54b8cbb-txqlj               1/1     Running   0          17h
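
Rather than reading the table by eye, the readiness check can be scripted. This sketch parses the plain-text output of `kubectl get pods` and lists pods that are not yet fully ready. `not_ready_pods` is a hypothetical helper, and the second sample row is invented to show a pod that is still starting.

```python
def not_ready_pods(kubectl_output: str) -> list:
    """Return names of pods that are not Running with all containers ready."""
    pending = []
    for line in kubectl_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 3:
            continue
        name, ready, status = parts[0], parts[1], parts[2]
        ready_count, total = ready.split("/")
        if status != "Running" or ready_count != total:
            pending.append(name)
    return pending

# First row is taken from the output above; the second row is a made-up example
sample = """NAME                                        READY   STATUS              RESTARTS   AGE
agentqna-supervisor-54cc4648bd-ppd2z        1/1     Running             0          17h
agentqna-tei-55c54b8cbb-txqlj               0/1     ContainerCreating   0          2m
"""
print(not_ready_pods(sample))
```

In the notebook, the input could come from `subprocess.run(["kubectl", "get", "pods"], capture_output=True, text=True).stdout`.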

## 3.7 Set Up Port Forwarding

Now we'll set up port forwarding to access the Supervisor microservice locally on port 9090:

In [None]:
import socket
import subprocess
import time

# Check if port 9090 is already open
def is_port_open(host='localhost', port=9090):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1)  # short timeout for responsiveness in notebook
        try:
            sock.connect((host, port))
            return True
        except OSError:  # connection refused or timed out
            return False

if is_port_open():
    print("✅ Port-forward already running on localhost:9090.")
else:
    print("🔁 Port not open. Starting port-forward to svc/agentqna-supervisor...")
    subprocess.Popen(
        ["kubectl", "port-forward", "svc/agentqna-supervisor", "9090:9090"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL
    )
    # Optional: wait a moment to let it start
    time.sleep(2)
    if is_port_open():
        print("✅ Port-forward started successfully.")
    else:
        print("❌ Failed to start port-forward.")
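
A fixed two-second sleep can be too short on a slow machine. As an alternative, the sketch below polls the port until it accepts connections or a deadline passes. `wait_for_port` and its default timeout values are assumptions for illustration, not part of the notebook's original code.

```python
import socket
import time

def wait_for_port(host: str = "localhost", port: int = 9090,
                  timeout: float = 15.0, interval: float = 0.5) -> bool:
    """Poll until host:port accepts a TCP connection or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            # Not listening yet; wait a little and try again
            time.sleep(interval)
    return False
```

Calling `wait_for_port()` right after `subprocess.Popen(...)` could replace the `time.sleep(2)` and second `is_port_open()` check.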

In [6]:
# Check Behavior

!curl -s http://localhost:9090/v1/chat/completions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"messages": "2+2"}' | jq -r '.text'

Handling connection for 9090
4


## 3.8 Ingest Documents for the RAG Agent

The RAG agent can only answer from documents that have been loaded into the vector database. First forward the data-prep service to local port 6007, then run the ingestion script from the GenAIExamples repo:

In [7]:
import subprocess
subprocess.Popen(["kubectl", "port-forward", "svc/agentqna-data-prep", "6007:6007"])

<Popen: returncode: None args: ['kubectl', 'port-forward', 'svc/agentqna-dat...>

Forwarding from 127.0.0.1:6007 -> 5000
Forwarding from [::1]:6007 -> 5000


In [None]:
!export WORKDIR=$(pwd) && \
export host_ip=localhost && \
cd GenAIExamples/AgentQnA/retrieval_tool/ && \
bash run_ingest_data.sh

In [None]:
# This information will be used by the RAG agent
!cat GenAIExamples/AgentQnA/example_data/test_docs_music.jsonl

# 4. Validate Agent Behavior

Let's test the different agent behaviors with various types of questions: 

## 4.1 Test Supervisor with a General Question

In [15]:
!curl -s http://localhost:9090/v1/chat/completions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"messages": "What is Deep Learning?"}' | jq -r '.text'

Handling connection for 9090
Deep learning is a subset of machine learning that uses neural networks with many layers (hence 'deep') to analyze various forms of data. It is particularly effective in tasks such as image and speech recognition, natural language processing, and autonomous driving. Deep learning models learn from large amounts of data, automatically extracting features and patterns without the need for manual feature engineering.


In [None]:
!kubectl logs -l app.kubernetes.io/name=supervisor --tail=30

## 4.2 Test Question That Should Route to RAG Agent

In [18]:
# Check RAG Agent 

!curl -s http://localhost:9090/v1/chat/completions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"messages": "Who wrote the song Thriller?"}' | jq -r '.text'

Handling connection for 9090
Rod Temperton


## 4.3 Test Question That Should Route to SQL Agent

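The first two cells below ask questions that the supervisor answers through the CRAG API tools. The last cell asks a question that should be routed to the SQL agent for a database query: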

In [None]:
# Check CRAG

!curl -s http://localhost:9090/v1/chat/completions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"messages": "Grammy Best New Artist for 2020?"}' | jq -r '.text'

In [None]:
# Check CRAG

!curl -s http://localhost:9090/v1/chat/completions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"messages": "Grammy Best New Artist for 2013?"}' | jq -r '.text'

In [None]:
# Check SQL database

!curl -s http://localhost:9090/v1/chat/completions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"messages": "How many albums does Iron Maiden have?"}' | jq -r '.text'

## 4.4 Change Thread ID for Conversation Management

The thread ID is an important parameter that helps manage separate conversations with the agent. By specifying a thread ID, you can:

- Maintain conversation context across multiple messages
- Have multiple independent conversations simultaneously
- Reference and continue previous conversations
- Track conversation history for specific interactions

In [None]:
!curl -s http://localhost:9090/v1/chat/completions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"role": "user", "messages": "How many albums does Iron Maiden have?", "thread_id": "abc123", "stream": false}' \
  | jq -r '.text'
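
The same request can be sent from Python, which makes it easier to keep several conversations apart. This is a sketch using only the standard library. `build_request` and `ask` are hypothetical helpers, and the payload fields are the ones used in the `curl` call above. It assumes the port-forward from section 3.7 is still running.

```python
import json
import urllib.request

SUPERVISOR_URL = "http://localhost:9090/v1/chat/completions"

def build_request(question: str, thread_id: str = None) -> urllib.request.Request:
    """Build the POST request for the supervisor endpoint."""
    payload = {"role": "user", "messages": question, "stream": False}
    if thread_id is not None:
        payload["thread_id"] = thread_id
    return urllib.request.Request(
        SUPERVISOR_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(question: str, thread_id: str = None, timeout: float = 120):
    """Send the question and return the 'text' field of the JSON response."""
    with urllib.request.urlopen(build_request(question, thread_id), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8")).get("text")

# Example (requires the running deployment):
# ask("How many albums does Iron Maiden have?", thread_id="abc123")
# ask("And which one is the oldest?", thread_id="abc123")   # same conversation
# ask("Who wrote the song Thriller?", thread_id="xyz789")   # separate conversation
```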


# 🎉 Congratulations!

You have successfully deployed and tested OPEA AgentQnA on Kubernetes! You now have:

✅ **A multi-agent system** that can route questions intelligently  
✅ **RAG capabilities** for knowledge-based queries  
✅ **SQL agent** for database interactions  
✅ **Scalable deployment** on Kubernetes infrastructure  
✅ **Production-ready architecture** with monitoring and observability

### Next Steps

- Experiment with different types of questions to see how the supervisor routes them
- Add your own documents to the RAG agent's knowledge base
- Customize the agent prompts for your specific use cases
- Explore the OPEA documentation for more advanced configurations

# Additional Information: How AgentQnA Works Under the Hood

## ReAct Methodology

The Supervisor agent uses the **ReAct (Reasoning and Acting)** methodology, which follows this pattern: 

**Thought → Action → Observation → Thought → ...**

This iterative process allows the agent to:
1. **Think** about what to do next
2. **Act** by using available tools or calling sub-agents
3. **Observe** the results of the action
4. **Think** again about the next step based on new information
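
The loop above can be sketched in a few lines of Python. This is a simplified illustration, not OPEA's implementation: `react_loop`, `scripted_llm`, and `multiply` are hypothetical, and the LLM is replaced by a scripted function that requests one tool call and then answers.

```python
from typing import Callable, Dict, List, Tuple

Decision = Tuple[str, str, str]  # (kind, value, argument)

def react_loop(question: str,
               llm: Callable[[str, List[Tuple[str, str]]], Decision],
               tools: Dict[str, Callable[[str], str]],
               max_steps: int = 5) -> str:
    """Run Thought -> Action -> Observation until the LLM gives a final answer."""
    history: List[Tuple[str, str]] = []  # (action, observation) pairs so far
    for _ in range(max_steps):
        kind, value, arg = llm(question, history)         # Thought
        if kind == "final":
            return value
        observation = tools[value](arg)                   # Action
        history.append((f"{value}({arg})", observation))  # Observation
    return "I don't know"

def multiply(expr: str) -> str:
    """Tiny calculator tool that handles 'a*b' with integer operands."""
    left, right = expr.split("*")
    return str(int(left) * int(right))

def scripted_llm(question: str, history: List[Tuple[str, str]]) -> Decision:
    """Stand-in for a real LLM: call the calculator once, then answer."""
    if not history:
        return ("tool", "calculator", "12*8")
    return ("final", f"The answer is {history[-1][1]}", "")

answer = react_loop("What is 12 * 8?", scripted_llm, {"calculator": multiply})
print(answer)
```

The `max_steps` limit keeps the agent from looping forever when the model never produces a final answer.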

## OPEA Agent Implementation

OPEA implements this supervisor logic in the codebase at:
`GenAIComps/comps/agent/src/integrations/strategy/react/prompt.py`

The implementation follows these steps:
1. **Create the prompt** with the ReAct system message and user query
2. **Call the LLM** using `llm.invoke()` to get reasoning and action decisions
3. **Execute tools** based on the agent's decision
4. **Build the agent's toolkit** with available tools and sub-agents


## Example System Prompt

Here's an example of the ReAct system message used by OPEA agents:
```python
REACT_SYS_MESSAGE = """\
Decompose the user request into a series of simple tasks when necessary and solve the problem step by step.
When you cannot get the answer at first, do not give up. Reflect on the info you have from the tools and try to solve the problem in a different way.
Please follow these guidelines when formulating your answer:
1. If the question contains a false premise or assumption, answer “invalid question”.
2. If you are uncertain or do not know the answer, respond with “I don’t know”.
3. Give concise, factual and relevant answers.
"""
```


This system prompt guides the agent to:
- Break down complex requests into manageable steps
- Persist through initial failures by trying different approaches  
- Maintain accuracy by acknowledging uncertainty when appropriate
- Provide concise, relevant responses

Each agent in AgentQnA has access to specialized tools that enable them to perform their specific functions, whether that's querying vector databases, executing SQL commands, or calling external APIs.

---

