# Build Mega Service of MultimodalQnA for AMD ROCm

This document outlines the deployment process for a MultimodalQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on an AMD server with ROCm GPUs. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `multimodal_embedding` (which employs the [BridgeTower](https://huggingface.co/BridgeTower/bridgetower-large-itm-mlm-gaudi) model as the embedding model), `multimodal_retriever`, `lvm`, and `multimodal-data-prep`. We will publish the Docker images to Docker Hub soon, which will simplify the deployment process for this service.

If you are deploying on a cloud instance, choose an instance type that provides AMD GPUs with ROCm support, then proceed with configuring your instance settings, including network configurations, security groups, and storage options.

After launching your instance, you can connect to it using SSH (for Linux instances) or Remote Desktop Protocol (RDP) (for Windows instances). From there, you'll have full access to your server, allowing you to install, configure, and manage your applications as needed.

## Setup Environment Variables

Since the `compose.yaml` file consumes some environment variables, you need to set them up in advance as described below.

Please use the `set_env.sh` script (`. set_env.sh`) to set all needed environment variables.

**Export the value of the public IP address of your server to the `host_ip` environment variable.**

Note: Please set `host_ip` to your external IP address; do not use localhost.

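For example, a minimal sketch of this setup, assuming a Linux host where `hostname -I` reports the external address first (adjust to your network):

```bash
# Export the server's external IP address (do not use localhost).
export host_ip=$(hostname -I | awk '{print $1}')

# Source the script so all required variables are exported into the current shell.
. ./set_env.sh
```
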
## 🚀 Build Docker Images

### 1. Build embedding-multimodal-bridgetower Image

Build the embedding-multimodal-bridgetower Docker image:

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build --no-cache -t opea/embedding-multimodal-bridgetower:latest --build-arg EMBEDDER_PORT=$EMBEDDER_PORT --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/multimodal/bridgetower/Dockerfile .
```

Build the embedding-multimodal microservice image:

```bash
docker build --no-cache -t opea/embedding-multimodal:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/multimodal/multimodal_langchain/Dockerfile .
```

### 2. Build LVM Images

Build the lvm-llava image:

```bash
docker build --no-cache -t opea/lvm-llava:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/lvms/llava/dependency/Dockerfile .
```

### 3. Build retriever-multimodal-redis Image

```bash
docker build --no-cache -t opea/retriever-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/redis/langchain/Dockerfile .
```

### 4. Build dataprep-multimodal-redis Image

```bash
docker build --no-cache -t opea/dataprep-multimodal-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/multimodal/redis/langchain/Dockerfile .
```

### 5. Build MegaService Docker Image

To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the [multimodalqna.py](../../../../multimodalqna.py) Python script. Build the MegaService Docker image with the command below:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/MultimodalQnA
docker build --no-cache -t opea/multimodalqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
cd ../..
```

### 6. Build UI Docker Image

Build the frontend Docker image with the command below:

```bash
cd GenAIExamples/MultimodalQnA/ui/
docker build --no-cache -t opea/multimodalqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
cd ../../../
```

### 7. Pull TGI AMD ROCm Image

```bash
docker pull ghcr.io/huggingface/text-generation-inference:2.4.1-rocm
```

Then run the command `docker images`; you should see the following 8 Docker images:

1. `opea/dataprep-multimodal-redis:latest`
2. `ghcr.io/huggingface/text-generation-inference:2.4.1-rocm`
3. `opea/lvm-llava:latest`
4. `opea/retriever-redis:latest`
5. `opea/embedding-multimodal:latest`
6. `opea/embedding-multimodal-bridgetower:latest`
7. `opea/multimodalqna:latest`
8. `opea/multimodalqna-ui:latest`

## 🚀 Start Microservices

### Required Models

By default, the multimodal-embedding and LVM models are set to the values listed below:

| Service              | Model                                       |
| -------------------- | ------------------------------------------- |
| embedding-multimodal | BridgeTower/bridgetower-large-itm-mlm-gaudi |
| LVM                  | llava-hf/llava-1.5-7b-hf                    |
| LVM                  | Xkev/Llama-3.2V-11B-cot                     |

Note: On AMD ROCm systems, "Xkev/Llama-3.2V-11B-cot" is recommended and should be run on `ghcr.io/huggingface/text-generation-inference:2.4.1-rocm`.

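If you want to switch the served LVM model, you would typically export the corresponding variable before bringing the stack up. A minimal sketch, assuming the compose file reads a variable such as `LVM_MODEL_ID` (the actual variable name is defined in `set_env.sh` / `compose.yaml`, so verify it there):

```bash
# Hypothetical variable name -- confirm against set_env.sh / compose.yaml.
export LVM_MODEL_ID="Xkev/Llama-3.2V-11B-cot"
```
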
### Start all the services Docker Containers

> Before running the `docker compose` command, you need to be in the folder that contains the Docker Compose YAML file.

```bash
cd GenAIExamples/MultimodalQnA/docker_compose/amd/gpu/rocm
. set_env.sh
docker compose -f compose.yaml up -d
```

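Once the stack is up, you can verify that the containers started successfully with standard Docker Compose commands (the service name passed to `logs` is illustrative; take the real names from `compose.yaml`):

```bash
# List the stack's containers with their state and published ports.
docker compose -f compose.yaml ps

# Follow the logs of one service; replace the name with one from compose.yaml.
docker compose -f compose.yaml logs -f multimodalqna-backend-server
```
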
Note: Please set `host_ip` to your external IP address; do not use localhost.

Note: To limit access to a subset of GPUs, pass each device individually using one or more `--device /dev/dri/renderD<node>` flags, where `<node>` is the render node index, starting from 128. See the [ROCm documentation](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html#docker-restrict-gpus) for details.

Example of setting isolation for 1 GPU:

```
- /dev/dri/card0:/dev/dri/card0
- /dev/dri/renderD128:/dev/dri/renderD128
```

Example of setting isolation for 2 GPUs:

```
- /dev/dri/card0:/dev/dri/card0
- /dev/dri/renderD128:/dev/dri/renderD128
- /dev/dri/card1:/dev/dri/card1
- /dev/dri/renderD129:/dev/dri/renderD129
```

Please find more information about accessing and restricting AMD GPUs in the [ROCm documentation](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html#docker-restrict-gpus).

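For orientation, these mappings go under the `devices:` key of the GPU-backed service in `compose.yaml`. A minimal sketch, assuming the TGI container is the one that needs the GPU (the service name here is illustrative; use the one from the actual compose file):

```
services:
  tgi-rocm-server: # illustrative service name
    image: ghcr.io/huggingface/text-generation-inference:2.4.1-rocm
    devices: # restrict the container to a single GPU
      - /dev/dri/card0:/dev/dri/card0
      - /dev/dri/renderD128:/dev/dri/renderD128
```
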
### Validate Microservices

1. embedding-multimodal-bridgetower

```bash
curl http://${host_ip}:${EMBEDDER_PORT}/v1/encode \
    -X POST \
    -H "Content-Type:application/json" \
    -d '{"text":"This is example"}'
```

```bash
curl http://${host_ip}:${EMBEDDER_PORT}/v1/encode \
    -X POST \
    -H "Content-Type:application/json" \
    -d '{"text":"This is example", "img_b64_str": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8/5+hnoEIwDiqkL4KAcT9GO0U4BxoAAAAAElFTkSuQmCC"}'
```

2. embedding-multimodal

```bash
curl http://${host_ip}:$MM_EMBEDDING_PORT_MICROSERVICE/v1/embeddings \
    -X POST \
    -H "Content-Type: application/json" \
    -d '{"text" : "This is some sample text."}'
```

```bash
curl http://${host_ip}:$MM_EMBEDDING_PORT_MICROSERVICE/v1/embeddings \
    -X POST \
    -H "Content-Type: application/json" \
    -d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}'
```

3. retriever-multimodal-redis

```bash
export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(512)]; print(embedding)")
curl http://${host_ip}:7000/v1/multimodal_retrieval \
    -X POST \
    -H "Content-Type: application/json" \
    -d "{\"text\":\"test\",\"embedding\":${your_embedding}}"
```

4. lvm-llava

```bash
curl http://${host_ip}:${LLAVA_SERVER_PORT}/generate \
    -X POST \
    -H "Content-Type:application/json" \
    -d '{"prompt":"Describe the image please.", "img_b64_str": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8/5+hnoEIwDiqkL4KAcT9GO0U4BxoAAAAAElFTkSuQmCC"}'
```

5. lvm-llava-svc

```bash
curl http://${host_ip}:9399/v1/lvm \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"retrieved_docs": [], "initial_query": "What is this?", "top_n": 1, "metadata": [{"b64_img_str": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8/5+hnoEIwDiqkL4KAcT9GO0U4BxoAAAAAElFTkSuQmCC", "transcript_for_inference": "yellow image", "video_id": "8c7461df-b373-4a00-8696-9a2234359fe0", "time_of_frame_ms":"37000000", "source_video":"WeAreGoingOnBullrun_8c7461df-b373-4a00-8696-9a2234359fe0.mp4"}], "chat_template":"The caption of the image is: '\''{context}'\''. {question}"}'
```

```bash
curl http://${host_ip}:9399/v1/lvm \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"image": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8/5+hnoEIwDiqkL4KAcT9GO0U4BxoAAAAAElFTkSuQmCC", "prompt":"What is this?"}'
```

Also, validate the LVM microservice with empty retrieval results:

```bash
curl http://${host_ip}:9399/v1/lvm \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"retrieved_docs": [], "initial_query": "What is this?", "top_n": 1, "metadata": [], "chat_template":"The caption of the image is: '\''{context}'\''. {question}"}'
```

6. dataprep-multimodal-redis

Download a sample video, image, and audio file and create a caption:

```bash
export video_fn="WeAreGoingOnBullrun.mp4"
wget http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/WeAreGoingOnBullrun.mp4 -O ${video_fn}

export image_fn="apple.png"
wget https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true -O ${image_fn}

export caption_fn="apple.txt"
echo "This is an apple." > ${caption_fn}

export audio_fn="AudioSample.wav"
wget https://github.com/intel/intel-extension-for-transformers/raw/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav -O ${audio_fn}
```

Test the dataprep microservice by generating a transcript. This command updates the knowledge base by uploading a local video (.mp4) and an audio (.wav) file:

```bash
curl --silent --write-out "HTTPSTATUS:%{http_code}" \
    ${DATAPREP_GEN_TRANSCRIPT_SERVICE_ENDPOINT} \
    -H 'Content-Type: multipart/form-data' \
    -X POST \
    -F "files=@./${video_fn}" \
    -F "files=@./${audio_fn}"
```

Also, test the dataprep microservice by generating an image caption using the LVM microservice:

```bash
curl --silent --write-out "HTTPSTATUS:%{http_code}" \
    ${DATAPREP_GEN_CAPTION_SERVICE_ENDPOINT} \
    -H 'Content-Type: multipart/form-data' \
    -X POST -F "files=@./${image_fn}"
```

Now, test the microservice by posting a custom caption along with an image:

```bash
curl --silent --write-out "HTTPSTATUS:%{http_code}" \
    ${DATAPREP_INGEST_SERVICE_ENDPOINT} \
    -H 'Content-Type: multipart/form-data' \
    -X POST -F "files=@./${image_fn}" -F "files=@./${caption_fn}"
```

You can also get the list of all files that you uploaded:

```bash
curl -X POST \
    -H "Content-Type: application/json" \
    ${DATAPREP_GET_FILE_ENDPOINT}
```

The response is a Python-style list like the one below. Notice that the name of each uploaded file, e.g., `videoname.mp4`, becomes `videoname_uuid.mp4`, where `uuid` is a unique ID for each uploaded file; the same file uploaded twice will get two different `uuid`s.

```bash
[
    "WeAreGoingOnBullrun_7ac553a1-116c-40a2-9fc5-deccbb89b507.mp4",
    "WeAreGoingOnBullrun_6d13cf26-8ba2-4026-a3a9-ab2e5eb73a29.mp4",
    "apple_fcade6e6-11a5-44a2-833a-3e534cbe4419.png",
    "AudioSample_976a85a6-dc3e-43ab-966c-9d81beef780c.wav"
]
```

To delete all uploaded files along with the data indexed under `$INDEX_NAME` in Redis:

```bash
curl -X POST \
    -H "Content-Type: application/json" \
    ${DATAPREP_DELETE_FILE_ENDPOINT}
```

7. MegaService

```bash
curl http://${host_ip}:8888/v1/multimodalqna \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{"messages": "What is the revenue of Nike in 2023?"}'
```

```bash
curl http://${host_ip}:8888/v1/multimodalqna \
    -H "Content-Type: application/json" \
    -d '{"messages": [{"role": "user", "content": [{"type": "text", "text": "hello, "}, {"type": "image_url", "image_url": {"url": "https://www.ilankelman.org/stopsigns/australia.jpg"}}]}, {"role": "assistant", "content": "opea project! "}, {"role": "user", "content": "chao, "}], "max_tokens": 10}'
```
