Merged
43 changes: 16 additions & 27 deletions AgentQnA/README.md
@@ -84,7 +84,7 @@ flowchart LR
3. Hierarchical multi-agents can improve performance.
Expert worker agents, such as the RAG agent and the SQL agent, can provide high-quality output for different aspects of a complex query, and the supervisor agent can then aggregate this information into a comprehensive answer. If a single agent were given all the tools, it could get overwhelmed and fail to provide accurate answers.
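
The supervisor/worker split can be sketched in a few lines of Python. This is a toy illustration of the routing-and-aggregation idea only — the function names are hypothetical, and the real agents are separate OPEA microservices reached over HTTP, not local functions.

```python
# Toy sketch of the hierarchical multi-agent pattern (hypothetical names;
# the real RAG/SQL/supervisor agents are separate services).

def rag_worker(query: str) -> str:
    # Stand-in for the RAG agent: answers from retrieved documents.
    return f"[docs] background info for: {query}"

def sql_worker(query: str) -> str:
    # Stand-in for the SQL agent: answers from the structured database.
    return f"[db] figures for: {query}"

def supervisor(query: str) -> str:
    # The supervisor delegates sub-tasks to expert workers and merges their
    # partial answers, instead of juggling every tool in one agent.
    partial_answers = [rag_worker(query), sql_worker(query)]
    return " | ".join(partial_answers)

print(supervisor("How many albums does Iron Maiden have?"))
```

Each worker stays focused on one tool family, which is the performance argument made above.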

## Deployment with docker
## Deploy with docker

1. Build agent docker image [Optional]

@@ -217,13 +217,19 @@ docker build -t opea/agent:latest --build-arg https_proxy=$https_proxy --build-a
:::
::::

## Deploy AgentQnA UI

The AgentQnA UI can be deployed locally or using Docker.

For detailed instructions on deploying AgentQnA UI, refer to the [AgentQnA UI Guide](./ui/svelte/README.md).

## Deploy using Helm Chart

Refer to the [AgentQnA helm chart](./kubernetes/helm/README.md) for instructions on deploying AgentQnA on Kubernetes.

## Validate services

First look at logs of the agent docker containers:
1. First, check the logs of the agent Docker containers:

```
# worker RAG agent
@@ -240,35 +246,18 @@ docker logs react-agent-endpoint

You should see something like "HTTP server setup successful" if the Docker containers started successfully.

Second, validate worker RAG agent:

```
curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
    "messages": "Michael Jackson song Thriller"
}'
```

Third, validate worker SQL agent:

```
curl http://${host_ip}:9096/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
    "messages": "How many employees are in the company"
}'
```

Finally, validate supervisor agent:

```
curl http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
    "messages": "How many albums does Iron Maiden have?"
}'
```

2. Use Python to validate the agent system:

```bash
# RAG worker agent
python tests/test.py --prompt "Tell me about Michael Jackson song Thriller" --agent_role "worker" --ext_port 9095

# SQL agent
python tests/test.py --prompt "How many employees are there in the company?" --agent_role "worker" --ext_port 9096

# supervisor agent: this will test a two-turn conversation
python tests/test.py --agent_role "supervisor" --ext_port 9090
```
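
Under the hood, `tests/test.py` serializes a chat payload and posts it to the agent's `/v1/chat/completions` endpoint. The sketch below mirrors the payload shape used by `test_worker_agent`; the host and port are assumptions (the RAG worker's default external port 9095 on localhost), and the actual HTTP call is left commented out since it needs the container running.

```python
import json

# Rough sketch of the request tests/test.py sends to a worker agent.
# Payload shape follows test_worker_agent; host/port are assumptions.
host_ip = "localhost"
ext_port = "9095"
url = f"http://{host_ip}:{ext_port}/v1/chat/completions"

payload = {
    "role": "user",
    "messages": "Tell me about Michael Jackson song Thriller",
    "stream": "false",
}
# test.py posts the serialized body: requests.post(url, data=json.dumps(payload), ...)
body = json.dumps(payload)

print(url)
print(body)

# To actually send it (requires the agent container to be up):
# import requests
# resp = requests.post(url, data=body, proxies={"http": ""})
# print(resp.json()["text"])
```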

## Deploy AgentQnA UI

The AgentQnA UI can be deployed locally or using Docker.

For detailed instructions on deploying AgentQnA UI, refer to the [AgentQnA UI Guide](./ui/svelte/README.md).

## How to register your own tools with agent

11 changes: 7 additions & 4 deletions AgentQnA/docker_compose/intel/cpu/xeon/compose_openai.yaml
@@ -13,6 +13,7 @@ services:
environment:
ip_address: ${ip_address}
strategy: rag_agent
with_memory: false
recursion_limit: ${recursion_limit_worker}
llm_engine: openai
OPENAI_API_KEY: ${OPENAI_API_KEY}
@@ -35,17 +36,17 @@
image: opea/agent:latest
container_name: sql-agent-endpoint
volumes:
- ${WORKDIR}/TAG-Bench/:/home/user/TAG-Bench # SQL database
- ${WORKDIR}/GenAIExamples/AgentQnA/tests:/home/user/chinook-db # SQL database
ports:
- "9096:9096"
ipc: host
environment:
ip_address: ${ip_address}
strategy: sql_agent
with_memory: false
db_name: ${db_name}
db_path: ${db_path}
use_hints: false
hints_file: /home/user/TAG-Bench/${db_name}_hints.csv
recursion_limit: ${recursion_limit_worker}
llm_engine: openai
OPENAI_API_KEY: ${OPENAI_API_KEY}
@@ -64,21 +65,23 @@
container_name: react-agent-endpoint
depends_on:
- worker-rag-agent
- worker-sql-agent
volumes:
- ${TOOLSET_PATH}:/home/user/tools/
ports:
- "9090:9090"
ipc: host
environment:
ip_address: ${ip_address}
strategy: react_langgraph
strategy: react_llama
with_memory: true
recursion_limit: ${recursion_limit_supervisor}
llm_engine: openai
OPENAI_API_KEY: ${OPENAI_API_KEY}
model: ${model}
temperature: ${temperature}
max_new_tokens: ${max_new_tokens}
stream: false
stream: true
tools: /home/user/tools/supervisor_agent_tools.yaml
require_human_feedback: false
no_proxy: ${no_proxy}
@@ -16,7 +16,7 @@ export WORKER_AGENT_URL="http://${ip_address}:9095/v1/chat/completions"
export SQL_AGENT_URL="http://${ip_address}:9096/v1/chat/completions"
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
export CRAG_SERVER=http://${ip_address}:8080
export db_name=california_schools
export db_path="sqlite:////home/user/TAG-Bench/dev_folder/dev_databases/${db_name}/${db_name}.sqlite"
export db_name=Chinook
export db_path="sqlite:////home/user/chinook-db/Chinook_Sqlite.sqlite"

docker compose -f compose_openai.yaml up -d
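
The `db_path` export above is a SQLAlchemy-style SQLite URL: the scheme separator `sqlite:///` is followed by the absolute path `/home/user/chinook-db/Chinook_Sqlite.sqlite`, which is why four slashes appear in a row. A quick sketch of how it decomposes:

```python
# The sqlite URL from the launch script above: "sqlite:///" + absolute path
# (four consecutive slashes = scheme separator + leading "/" of the path).
db_path = "sqlite:////home/user/chinook-db/Chinook_Sqlite.sqlite"

prefix = "sqlite:///"
assert db_path.startswith(prefix)
file_path = db_path[len(prefix):]  # "/home/user/chinook-db/Chinook_Sqlite.sqlite"
print(file_path)
```

The path on the right must match where the compose file mounts the Chinook database inside the SQL agent container.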
5 changes: 4 additions & 1 deletion AgentQnA/docker_compose/intel/hpu/gaudi/compose.yaml
@@ -13,6 +13,7 @@ services:
environment:
ip_address: ${ip_address}
strategy: rag_agent_llama
with_memory: false
recursion_limit: ${recursion_limit_worker}
llm_engine: vllm
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
@@ -43,6 +44,7 @@ services:
environment:
ip_address: ${ip_address}
strategy: sql_agent_llama
with_memory: false
db_name: ${db_name}
db_path: ${db_path}
use_hints: false
@@ -74,14 +76,15 @@
environment:
ip_address: ${ip_address}
strategy: react_llama
with_memory: true
recursion_limit: ${recursion_limit_supervisor}
llm_engine: vllm
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
llm_endpoint_url: ${LLM_ENDPOINT_URL}
model: ${LLM_MODEL_ID}
temperature: ${temperature}
max_new_tokens: ${max_new_tokens}
stream: false
stream: true
tools: /home/user/tools/supervisor_agent_tools.yaml
require_human_feedback: false
no_proxy: ${no_proxy}
@@ -14,7 +14,7 @@ export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export HF_CACHE_DIR=${HF_CACHE_DIR}
ls $HF_CACHE_DIR
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export LLM_MODEL_ID="meta-llama/Meta-Llama-3.1-70B-Instruct"
export LLM_MODEL_ID="meta-llama/Llama-3.3-70B-Instruct" #"meta-llama/Meta-Llama-3.1-70B-Instruct"
export NUM_SHARDS=4
export LLM_ENDPOINT_URL="http://${ip_address}:8086"
export temperature=0
@@ -11,7 +11,7 @@ export ip_address=$(hostname -I | awk '{print $1}')
export TOOLSET_PATH=$WORKPATH/tools/
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
model="meta-llama/Meta-Llama-3.1-70B-Instruct"
model="meta-llama/Llama-3.3-70B-Instruct" #"meta-llama/Meta-Llama-3.1-70B-Instruct"

export HF_CACHE_DIR=/data2/huggingface
if [ ! -d "$HF_CACHE_DIR" ]; then
@@ -60,23 +60,6 @@ function start_vllm_service_70B() {
echo "Service started successfully"
}


function prepare_data() {
cd $WORKDIR

echo "Downloading data..."
git clone https://github.com/TAG-Research/TAG-Bench.git
cd TAG-Bench/setup
chmod +x get_dbs.sh
./get_dbs.sh

echo "Split data..."
cd $WORKPATH/tests/sql_agent_test
bash run_data_split.sh

echo "Data preparation done!"
}

function download_chinook_data(){
echo "Downloading chinook data..."
cd $WORKDIR
@@ -113,7 +96,7 @@ function validate_agent_service() {
echo "======================Testing worker rag agent======================"
export agent_port="9095"
prompt="Tell me about Michael Jackson song Thriller"
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt")
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port)
# echo $CONTENT
local EXIT_CODE=$(validate "$CONTENT" "Thriller" "rag-agent-endpoint")
echo $EXIT_CODE
@@ -127,7 +110,7 @@ function validate_agent_service() {
echo "======================Testing worker sql agent======================"
export agent_port="9096"
prompt="How many employees are there in the company?"
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt")
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port)
local EXIT_CODE=$(validate "$CONTENT" "8" "sql-agent-endpoint")
echo $CONTENT
# echo $EXIT_CODE
@@ -140,9 +123,8 @@ function validate_agent_service() {
# test supervisor react agent
echo "======================Testing supervisor react agent======================"
export agent_port="9090"
prompt="How many albums does Iron Maiden have?"
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt")
local EXIT_CODE=$(validate "$CONTENT" "21" "react-agent-endpoint")
local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream)
local EXIT_CODE=$(validate "$CONTENT" "Iron" "react-agent-endpoint")
# echo $CONTENT
echo $EXIT_CODE
local EXIT_CODE="${EXIT_CODE:0-1}"
@@ -153,15 +135,6 @@ function validate_agent_service() {

}

function remove_data() {
echo "Removing data..."
cd $WORKDIR
if [ -d "TAG-Bench" ]; then
rm -rf TAG-Bench
fi
echo "Data removed!"
}

function remove_chinook_data(){
echo "Removing chinook data..."
cd $WORKDIR
@@ -189,8 +162,9 @@ function main() {
echo "==================== Agent service validated ===================="
}

remove_data

remove_chinook_data

main
remove_data

remove_chinook_data
79 changes: 50 additions & 29 deletions AgentQnA/tests/test.py
@@ -1,34 +1,20 @@
# Copyright (C) 2024 Intel Corporation
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import argparse
import os
import json
import uuid

import requests


def generate_answer_agent_api(url, prompt):
proxies = {"http": ""}
payload = {
"messages": prompt,
}
response = requests.post(url, json=payload, proxies=proxies)
answer = response.json()["text"]
return answer


def process_request(url, query, is_stream=False):
proxies = {"http": ""}

payload = {
"messages": query,
}

content = json.dumps(query) if query is not None else None
try:
resp = requests.post(url=url, json=payload, proxies=proxies, stream=is_stream)
resp = requests.post(url=url, data=content, proxies=proxies, stream=is_stream)
if not is_stream:
ret = resp.json()["text"]
print(ret)
else:
for line in resp.iter_lines(decode_unicode=True):
print(line)
@@ -38,19 +24,54 @@ def process_request(url, query, is_stream=False):
return ret
except requests.exceptions.RequestException as e:
ret = f"An error occurred:{e}"
print(ret)
return False
return None


def test_worker_agent(args):
url = f"http://{args.ip_addr}:{args.ext_port}/v1/chat/completions"
query = {"role": "user", "messages": args.prompt, "stream": "false"}
ret = process_request(url, query)
print("Response: ", ret)


def add_message_and_run(url, user_message, thread_id, stream=False):
print("User message: ", user_message)
query = {"role": "user", "messages": user_message, "thread_id": thread_id, "stream": stream}
ret = process_request(url, query, is_stream=stream)
print("Response: ", ret)


def test_chat_completion_multi_turn(args):
url = f"http://{args.ip_addr}:{args.ext_port}/v1/chat/completions"
thread_id = f"{uuid.uuid4()}"

# first turn
print("===============First turn==================")
user_message = "Which artist has the most albums in the database?"
add_message_and_run(url, user_message, thread_id, stream=args.stream)
print("===============End of first turn==================")

# second turn
print("===============Second turn==================")
user_message = "Give me a few examples of the artist's albums?"
add_message_and_run(url, user_message, thread_id, stream=args.stream)
print("===============End of second turn==================")


if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--prompt", type=str)
parser.add_argument("--stream", action="store_true")
args = parser.parse_args()
parser.add_argument("--ip_addr", type=str, default="127.0.0.1", help="endpoint ip address")
parser.add_argument("--ext_port", type=str, default="9090", help="endpoint port")
parser.add_argument("--stream", action="store_true", help="streaming mode")
parser.add_argument("--prompt", type=str, help="prompt message")
parser.add_argument("--agent_role", type=str, default="supervisor", help="supervisor or worker")
args, _ = parser.parse_known_args()

ip_address = os.getenv("ip_address", "localhost")
agent_port = os.getenv("agent_port", "9090")
url = f"http://{ip_address}:{agent_port}/v1/chat/completions"
prompt = args.prompt
print(args)

process_request(url, prompt, args.stream)
if args.agent_role == "supervisor":
test_chat_completion_multi_turn(args)
elif args.agent_role == "worker":
test_worker_agent(args)
else:
raise ValueError("Invalid agent role")
2 changes: 1 addition & 1 deletion AgentQnA/tests/test_compose_on_gaudi.sh
@@ -78,7 +78,7 @@ bash step3_ingest_data_and_validate_retrieval.sh
echo "=================== #3 Data ingestion and validation completed===================="

echo "=================== #4 Start agent and API server===================="
bash step4_launch_and_validate_agent_tgi.sh
bash step4_launch_and_validate_agent_gaudi.sh
echo "=================== #4 Agent test passed ===================="

echo "=================== #5 Stop agent and API server===================="