Commit 62e06a0

Update DocIndexRetriever Example to allow user passing in retriever/reranker params (#880)
Signed-off-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent bd32b03 commit 62e06a0

File tree: 8 files changed (+188 lines, -12 lines)


DocIndexRetriever/README.md

Lines changed: 15 additions & 1 deletion

````diff
@@ -1,8 +1,22 @@
 # DocRetriever Application
 
-DocRetriever are the most widely adopted use case for leveraging the different methodologies to match user query against a set of free-text records. DocRetriever is essential to RAG system, which bridges the knowledge gap by dynamically fetching relevant information from external sources, ensuring that responses generated remain factual and current. The core of this architecture are vector databases, which are instrumental in enabling efficient and semantic retrieval of information. These databases store data as vectors, allowing RAG to swiftly access the most pertinent documents or data points based on semantic similarity.
+DocRetriever is the most widely adopted use case for leveraging the different methodologies to match user query against a set of free-text records. DocRetriever is essential to RAG system, which bridges the knowledge gap by dynamically fetching relevant information from external sources, ensuring that responses generated remain factual and current. The core of this architecture are vector databases, which are instrumental in enabling efficient and semantic retrieval of information. These databases store data as vectors, allowing RAG to swiftly access the most pertinent documents or data points based on semantic similarity.
 
 ## We provided DocRetriever with different deployment infra
 
 - [docker xeon version](docker_compose/intel/cpu/xeon/README.md) => minimum endpoints, easy to setup
 - [docker gaudi version](docker_compose/intel/hpu/gaudi/README.md) => with extra tei_gaudi endpoint, faster
+
+## We allow users to set retriever/reranker hyperparams via requests
+
+Example usage:
+
+```python
+url = "http://{host_ip}:{port}/v1/retrievaltool".format(host_ip=host_ip, port=port)
+payload = {
+    "messages": query,
+    "k": 5,  # retriever top k
+    "top_n": 2,  # reranker top n
+}
+response = requests.post(url, json=payload)
+```
````

DocIndexRetriever/docker_compose/intel/cpu/xeon/README.md

Lines changed: 14 additions & 1 deletion

````diff
@@ -79,13 +79,26 @@ Retrieval from KnowledgeBase
 
 ```bash
 curl http://${host_ip}:8889/v1/retrievaltool -X POST -H "Content-Type: application/json" -d '{
-    "text": "Explain the OPEA project?"
+    "messages": "Explain the OPEA project?"
 }'
 
 # expected output
 {"id":"354e62c703caac8c547b3061433ec5e8","reranked_docs":[{"id":"06d5a5cefc06cf9a9e0b5fa74a9f233c","text":"Close SearchsearchMenu WikiNewsCommunity Daysx-twitter linkedin github searchStreamlining implementation of enterprise-grade Generative AIEfficiently integrate secure, performant, and cost-effective Generative AI workflows into business value.TODAYOPEA..."}],"initial_query":"Explain the OPEA project?"}
 ```
 
+**Note**: `messages` is the required field. You can also pass in parameters for the retriever and reranker in the request. The parameters that can be changed are listed below.
+
+1. retriever
+   * search_type: str = "similarity"
+   * k: int = 4
+   * distance_threshold: Optional[float] = None
+   * fetch_k: int = 20
+   * lambda_mult: float = 0.5
+   * score_threshold: float = 0.2
+
+2. reranker
+   * top_n: int = 1
+
 ## 5. Trouble shooting
 
 1. check all containers are alive
````
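A concrete request exercising these knobs can be sketched as follows; the host/port and parameter values here are illustrative assumptions, not part of the commit:

```python
# Assumed defaults for this compose deployment; adjust as needed.
host_ip = "localhost"
port = 8889
url = f"http://{host_ip}:{port}/v1/retrievaltool"

# Any subset of the retriever/reranker fields listed above may be added
# to the request body; omitted fields keep their defaults (k=4, top_n=1, ...).
payload = {
    "messages": "Explain the OPEA project?",
    "search_type": "similarity",
    "k": 8,                  # retriever: number of candidates to fetch
    "score_threshold": 0.2,  # retriever: minimum similarity score
    "top_n": 3,              # reranker: number of documents to keep
}

# To actually send the request against a running deployment:
# import requests
# response = requests.post(url, json=payload, timeout=120)
# print(response.json()["reranked_docs"])
```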

DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml

Lines changed: 18 additions & 1 deletion

```diff
@@ -74,13 +74,30 @@ services:
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
       TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT}
     restart: unless-stopped
+  tei-reranking-service:
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    container_name: tei-reranking-server
+    ports:
+      - "8808:80"
+    volumes:
+      - "./data:/data"
+    shm_size: 1g
+    environment:
+      no_proxy: ${no_proxy}
+      http_proxy: ${http_proxy}
+      https_proxy: ${https_proxy}
+      HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+      HF_HUB_DISABLE_PROGRESS_BARS: 1
+      HF_HUB_ENABLE_HF_TRANSFER: 0
+    command: --model-id ${RERANK_MODEL_ID} --auto-truncate
   reranking:
     image: ${REGISTRY:-opea}/reranking-tei:${TAG:-latest}
     container_name: reranking-tei-xeon-server
+    depends_on:
+      - tei-reranking-service
     ports:
       - "8000:8000"
     ipc: host
-    entrypoint: python local_reranking.py
     environment:
       no_proxy: ${no_proxy}
       http_proxy: ${http_proxy}
```
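With this change the reranking microservice delegates to the new TEI container instead of running `local_reranking.py`. Once the stack is up, the TEI reranker can be smoke-tested directly on its host-mapped port; this sketch assumes the standard text-embeddings-inference `/rerank` HTTP route and a healthy container, neither of which this commit documents:

```shell
# Query the TEI reranking service directly on host port 8808
# (mapped to the container's port 80 in the compose file above).
curl http://localhost:8808/rerank \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"query": "What is OPEA?", "texts": ["OPEA is an open platform.", "Unrelated text."]}'
# Expected shape: a JSON list of {"index": ..., "score": ...} entries.
```

This is a deployment smoke test, not part of the committed test suite.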

DocIndexRetriever/docker_compose/intel/hpu/gaudi/README.md

Lines changed: 14 additions & 1 deletion

````diff
@@ -80,13 +80,26 @@ Retrieval from KnowledgeBase
 
 ```bash
 curl http://${host_ip}:8889/v1/retrievaltool -X POST -H "Content-Type: application/json" -d '{
-    "text": "Explain the OPEA project?"
+    "messages": "Explain the OPEA project?"
 }'
 
 # expected output
 {"id":"354e62c703caac8c547b3061433ec5e8","reranked_docs":[{"id":"06d5a5cefc06cf9a9e0b5fa74a9f233c","text":"Close SearchsearchMenu WikiNewsCommunity Daysx-twitter linkedin github searchStreamlining implementation of enterprise-grade Generative AIEfficiently integrate secure, performant, and cost-effective Generative AI workflows into business value.TODAYOPEA..."}],"initial_query":"Explain the OPEA project?"}
 ```
 
+**Note**: `messages` is the required field. You can also pass in parameters for the retriever and reranker in the request. The parameters that can be changed are listed below.
+
+1. retriever
+   * search_type: str = "similarity"
+   * k: int = 4
+   * distance_threshold: Optional[float] = None
+   * fetch_k: int = 20
+   * lambda_mult: float = 0.5
+   * score_threshold: float = 0.2
+
+2. reranker
+   * top_n: int = 1
+
 ## 5. Trouble shooting
 
 1. check all containers are alive
````

DocIndexRetriever/docker_compose/intel/hpu/gaudi/compose.yaml

Lines changed: 18 additions & 1 deletion

```diff
@@ -77,13 +77,30 @@ services:
       REDIS_URL: ${REDIS_URL}
       INDEX_NAME: ${INDEX_NAME}
     restart: unless-stopped
+  tei-reranking-service:
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    container_name: tei-reranking-gaudi-server
+    ports:
+      - "8808:80"
+    volumes:
+      - "./data:/data"
+    shm_size: 1g
+    environment:
+      no_proxy: ${no_proxy}
+      http_proxy: ${http_proxy}
+      https_proxy: ${https_proxy}
+      HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+      HF_HUB_DISABLE_PROGRESS_BARS: 1
+      HF_HUB_ENABLE_HF_TRANSFER: 0
+    command: --model-id ${RERANK_MODEL_ID} --auto-truncate
   reranking:
     image: ${REGISTRY:-opea}/reranking-tei:${TAG:-latest}
     container_name: reranking-tei-gaudi-server
+    depends_on:
+      - tei-reranking-service
     ports:
       - "8000:8000"
     ipc: host
-    entrypoint: python local_reranking.py
     environment:
       no_proxy: ${no_proxy}
       http_proxy: ${http_proxy}
```

DocIndexRetriever/tests/test.py

Lines changed: 71 additions & 0 deletions (new file)

```python
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import argparse

import requests


def search_knowledge_base(query: str, url: str, request_type="chat_completion") -> str:
    """Search the knowledge base for a specific query."""
    print(url)
    proxies = {"http": ""}
    if request_type == "chat_completion":
        print("Sending chat completion request")
        payload = {
            "messages": query,
            "k": 5,
            "top_n": 2,
        }
    else:
        print("Sending text request")
        payload = {
            "text": query,
        }
    response = requests.post(url, json=payload, proxies=proxies)
    print(response)
    if "documents" in response.json():
        docs = response.json()["documents"]
        context = ""
        for i, doc in enumerate(docs):
            if i == 0:
                context = str(i) + ": " + doc
            else:
                context += "\n" + str(i) + ": " + doc
        # print(context)
        return context
    elif "text" in response.json():
        return response.json()["text"]
    elif "reranked_docs" in response.json():
        docs = response.json()["reranked_docs"]
        context = ""
        for i, doc in enumerate(docs):
            if i == 0:
                context = doc["text"]
            else:
                context += "\n" + doc["text"]
        # print(context)
        return context
    else:
        return "Error parsing response from the knowledge base."


def main():
    parser = argparse.ArgumentParser(description="Index data")
    parser.add_argument("--host_ip", type=str, default="localhost", help="Host IP")
    parser.add_argument("--port", type=int, default=8889, help="Port")
    parser.add_argument("--request_type", type=str, default="chat_completion", help="Test type")
    args = parser.parse_args()
    print(args)

    host_ip = args.host_ip
    port = args.port
    url = "http://{host_ip}:{port}/v1/retrievaltool".format(host_ip=host_ip, port=port)

    response = search_knowledge_base("OPEA", url, request_type=args.request_type)

    print(response)


if __name__ == "__main__":
    main()
```
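The branching in `search_knowledge_base` mirrors the three response shapes the gateway can return (`documents`, `text`, or `reranked_docs`). The sketch below isolates that context-assembly logic with made-up sample data, so it can be followed without a running deployment:

```python
# Sample response in the "reranked_docs" shape shown in the READMEs above;
# the ids and document texts are invented for illustration.
sample_response = {
    "id": "abc123",
    "reranked_docs": [
        {"id": "d1", "text": "OPEA streamlines enterprise GenAI."},
        {"id": "d2", "text": "It is an open platform project."},
    ],
    "initial_query": "Explain the OPEA project?",
}


def build_context(body: dict) -> str:
    """Assemble a context string, checking shapes in the same order as test.py."""
    if "documents" in body:
        # Plain retriever output: number each document.
        return "\n".join(f"{i}: {doc}" for i, doc in enumerate(body["documents"]))
    if "text" in body:
        return body["text"]
    if "reranked_docs" in body:
        # Reranked output: join the document texts.
        return "\n".join(doc["text"] for doc in body["reranked_docs"])
    return "Error parsing response from the knowledge base."


context = build_context(sample_response)
```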

DocIndexRetriever/tests/test_compose_on_gaudi.sh

Lines changed: 17 additions & 2 deletions

```diff
@@ -64,7 +64,7 @@ function validate() {
 }
 
 function validate_megaservice() {
-    echo "Testing DataPrep Service"
+    echo "=========Ingest data=================="
     local CONTENT=$(curl -X POST "http://${ip_address}:6007/v1/dataprep" \
         -H "Content-Type: multipart/form-data" \
         -F 'link_list=["https://opea.dev"]')
@@ -78,7 +78,7 @@ function validate_megaservice() {
     fi
 
     # Curl the Mega Service
-    echo "Testing retriever service"
+    echo "==============Testing retriever service: Text Request================="
     local CONTENT=$(curl http://${ip_address}:8889/v1/retrievaltool -X POST -H "Content-Type: application/json" -d '{
     "text": "Explain the OPEA project?"
     }')
@@ -93,6 +93,21 @@ function validate_megaservice() {
         docker logs doc-index-retriever-server | tee -a ${LOG_PATH}/doc-index-retriever-service-gaudi.log
         exit 1
     fi
+
+    echo "==============Testing retriever service: ChatCompletion Request================"
+    cd $WORKPATH/tests
+    local CONTENT=$(python test.py --host_ip ${ip_address} --request_type chat_completion)
+    local EXIT_CODE=$(validate "$CONTENT" "OPEA" "doc-index-retriever-service-gaudi")
+    echo "$EXIT_CODE"
+    local EXIT_CODE="${EXIT_CODE:0-1}"
+    echo "return value is $EXIT_CODE"
+    if [ "$EXIT_CODE" == "1" ]; then
+        docker logs tei-embedding-gaudi-server | tee -a ${LOG_PATH}/doc-index-retriever-service-gaudi.log
+        docker logs retriever-redis-server | tee -a ${LOG_PATH}/doc-index-retriever-service-gaudi.log
+        docker logs reranking-tei-server | tee -a ${LOG_PATH}/doc-index-retriever-service-gaudi.log
+        docker logs doc-index-retriever-server | tee -a ${LOG_PATH}/doc-index-retriever-service-gaudi.log
+        exit 1
+    fi
 }
 
 function stop_docker() {
```
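Both test scripts pull the pass/fail flag out of `validate()`'s output with `${EXIT_CODE:0-1}`. A quick illustration of that bash substring expansion, with a made-up sample string:

```shell
# ${VAR:offset} with offset 0-1 (arithmetically -1) starts one character
# from the end of the string, so it yields the last character only.
EXIT_CODE="validate output ... 1"
LAST_CHAR="${EXIT_CODE:0-1}"
echo "$LAST_CHAR"  # prints "1"
```

The `0-1` spelling (rather than a bare `-1`) avoids the clash with the `${VAR:-default}` default-value form.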

DocIndexRetriever/tests/test_compose_on_xeon.sh

Lines changed: 21 additions & 5 deletions

```diff
@@ -63,8 +63,8 @@ function validate() {
 }
 
 function validate_megaservice() {
-    echo "Testing DataPrep Service"
-    local CONTENT=$(curl -X POST "http://${ip_address}:6007/v1/dataprep" \
+    echo "===========Ingest data=================="
+    local CONTENT=$(http_proxy="" curl -X POST "http://${ip_address}:6007/v1/dataprep" \
         -H "Content-Type: multipart/form-data" \
         -F 'link_list=["https://opea.dev"]')
     local EXIT_CODE=$(validate "$CONTENT" "Data preparation succeeded" "dataprep-redis-service-xeon")
@@ -77,16 +77,32 @@ function validate_megaservice() {
     fi
 
     # Curl the Mega Service
-    echo "Testing retriever service"
+    echo "================Testing retriever service: Default params================"
+
     local CONTENT=$(curl http://${ip_address}:8889/v1/retrievaltool -X POST -H "Content-Type: application/json" -d '{
-    "text": "Explain the OPEA project?"
+    "messages": "Explain the OPEA project?"
     }')
     local EXIT_CODE=$(validate "$CONTENT" "OPEA" "doc-index-retriever-service-xeon")
     echo "$EXIT_CODE"
     local EXIT_CODE="${EXIT_CODE:0-1}"
     echo "return value is $EXIT_CODE"
     if [ "$EXIT_CODE" == "1" ]; then
-        docker logs tei-embedding-xeon-server | tee -a ${LOG_PATH}/doc-index-retriever-service-xeon.log
+        docker logs tei-embedding-server | tee -a ${LOG_PATH}/doc-index-retriever-service-xeon.log
+        docker logs retriever-redis-server | tee -a ${LOG_PATH}/doc-index-retriever-service-xeon.log
+        docker logs reranking-tei-server | tee -a ${LOG_PATH}/doc-index-retriever-service-xeon.log
+        docker logs doc-index-retriever-server | tee -a ${LOG_PATH}/doc-index-retriever-service-xeon.log
+        exit 1
+    fi
+
+    echo "================Testing retriever service: ChatCompletion Request================"
+    cd $WORKPATH/tests
+    local CONTENT=$(python test.py --host_ip ${ip_address} --request_type chat_completion)
+    local EXIT_CODE=$(validate "$CONTENT" "OPEA" "doc-index-retriever-service-xeon")
+    echo "$EXIT_CODE"
+    local EXIT_CODE="${EXIT_CODE:0-1}"
+    echo "return value is $EXIT_CODE"
+    if [ "$EXIT_CODE" == "1" ]; then
+        docker logs tei-embedding-server | tee -a ${LOG_PATH}/doc-index-retriever-service-xeon.log
         docker logs retriever-redis-server | tee -a ${LOG_PATH}/doc-index-retriever-service-xeon.log
         docker logs reranking-tei-server | tee -a ${LOG_PATH}/doc-index-retriever-service-xeon.log
         docker logs doc-index-retriever-server | tee -a ${LOG_PATH}/doc-index-retriever-service-xeon.log
```
