From c53c676932028c0e44f83b02e30138ae38075443 Mon Sep 17 00:00:00 2001 From: Spycsh Date: Sun, 13 Apr 2025 00:15:36 -0700 Subject: [PATCH 01/11] refine audioqna readmes --- AudioQnA/README.md | 48 ++-- AudioQnA/README_miscellaneous.md | 40 ++++ .../docker_compose/intel/cpu/xeon/README.md | 189 ++++++++-------- .../docker_compose/intel/hpu/gaudi/README.md | 207 ++++++++++-------- 4 files changed, 270 insertions(+), 214 deletions(-) create mode 100644 AudioQnA/README_miscellaneous.md diff --git a/AudioQnA/README.md b/AudioQnA/README.md index b664d52783..2391e53570 100644 --- a/AudioQnA/README.md +++ b/AudioQnA/README.md @@ -2,6 +2,15 @@ AudioQnA is an example that demonstrates the integration of Generative AI (GenAI) models for performing question-answering (QnA) on audio files, with the added functionality of Text-to-Speech (TTS) for generating spoken responses. The example showcases how to convert audio input to text using Automatic Speech Recognition (ASR), generate answers to user queries using a language model, and then convert those answers back to speech using Text-to-Speech (TTS). +# Table of Contents + +1. [Architecture](#architecture) +2. [Deployment Options](#deployment-options) +3. [Monitoring and Tracing](./README_miscellaneous.md) + + +## Architecture + The AudioQnA example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different microservices for this example. ```mermaid @@ -59,37 +68,14 @@ flowchart LR ``` -## Deploy AudioQnA Service - -The AudioQnA service can be deployed on either Intel Gaudi2 or Intel Xeon Scalable Processor. - -### Deploy AudioQnA on Gaudi - -Refer to the [Gaudi Guide](./docker_compose/intel/hpu/gaudi/README.md) for instructions on deploying AudioQnA on Gaudi. 
- -### Deploy AudioQnA on Xeon -Refer to the [Xeon Guide](./docker_compose/intel/cpu/xeon/README.md) for instructions on deploying AudioQnA on Xeon. - -## Deploy using Helm Chart - -Refer to the [AudioQnA helm chart](./kubernetes/helm/README.md) for instructions on deploying AudioQnA on Kubernetes. - -## Supported Models - -### ASR - -The default model is [openai/whisper-small](https://huggingface.co/openai/whisper-small). It also supports all models in the Whisper family, such as `openai/whisper-large-v3`, `openai/whisper-medium`, `openai/whisper-base`, `openai/whisper-tiny`, etc. - -To replace the model, please edit the `compose.yaml` and add the `command` line to pass the name of the model you want to use: - -```yaml -services: - whisper-service: - ... - command: --model_name_or_path openai/whisper-tiny -``` +## Deployment Options -### TTS +The table below lists currently available deployment options. They outline in detail the implementation of this example on selected hardware. -The default model is [microsoft/SpeechT5](https://huggingface.co/microsoft/speecht5_tts). We currently do not support replacing the model. More models under the commercial license will be added in the future. +| Category | Deployment Option | Description | +| ---------------------- | -------------------- | ----------------------------------------------------------------- | +| On-premise Deployments | Docker compose | [AudioQnA deployment on Xeon](./docker_compose/intel/cpu/xeon) | +| | | [AudioQnA deployment on Gaudi](./docker_compose/intel/hpu/gaudi) | +| | | [AudioQnA deployment on AMD ROCm](./docker_compose/amd/gpu/rocm) | +| | Kubernetes | [Helm Charts](./kubernetes/helm) | diff --git a/AudioQnA/README_miscellaneous.md b/AudioQnA/README_miscellaneous.md new file mode 100644 index 0000000000..8b2b6b66e3 --- /dev/null +++ b/AudioQnA/README_miscellaneous.md @@ -0,0 +1,40 @@ +# Table of Contents + +1. [Build MegaService Docker Image](#build-megaservice-docker-image) +2. 
[Build UI Docker Image](#build-ui-docker-image)
+3. [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token)
+4. [Troubleshooting](#troubleshooting)
+
+## Build MegaService Docker Image
+
+To construct the MegaService of AudioQnA, the [GenAIExamples](https://github.com/opea-project/GenAIExamples.git) repository is utilized. Build the MegaService Docker image with the command below:
+
+```bash
+git clone https://github.com/opea-project/GenAIExamples.git
+cd GenAIExamples/AudioQnA
+docker build --no-cache -t opea/audioqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
+```
+
+## Build UI Docker Image
+
+Build the frontend Docker image with the command below:
+
+```bash
+cd GenAIExamples/AudioQnA/ui
+docker build -t opea/audioqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
+```
+
+## Generate a HuggingFace Access Token
+
+Some HuggingFace resources, such as certain models, are only accessible if the developer has an access token. A developer without a HuggingFace access token can create one by first creating an account at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
+
+## Troubleshooting
+
+1. If you get errors like "Access Denied", [validate the microservices](https://github.com/opea-project/GenAIExamples/tree/main/AudioQnA/docker_compose/intel/cpu/xeon/README.md#validate-microservices) first. A simple example:
+
+   ```bash
+   curl http://${host_ip}:7055/v1/audio/speech -XPOST -d '{"input": "Who are you?"}' -H 'Content-Type: application/json' --output speech.mp3
+   ```
+
+2. (Docker only) If all microservices work well but the service is still unreachable, check whether port ${host_ip}:7777 is already allocated by another process; if so, modify the port mapping in `compose.yaml`.
+3. 
(Docker only) If you get errors like "The container name is in use", change the container name in `compose.yaml`.
diff --git a/AudioQnA/docker_compose/intel/cpu/xeon/README.md b/AudioQnA/docker_compose/intel/cpu/xeon/README.md
index 6fdb3d8fdb..5487fb5b11 100644
--- a/AudioQnA/docker_compose/intel/cpu/xeon/README.md
+++ b/AudioQnA/docker_compose/intel/cpu/xeon/README.md
@@ -1,123 +1,149 @@
-# Build Mega Service of AudioQnA on Xeon
+# Deploying AudioQnA on Intel® Xeon® Processors

-This document outlines the deployment process for a AudioQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Xeon server.
-
-The default pipeline deploys with vLLM as the LLM serving component. It also provides options of using TGI backend for LLM microservice, please refer to [Start the MegaService](#-start-the-megaservice) section in this page.
+This document outlines the single node deployment process for an AudioQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservices on an Intel Xeon server. The steps include pulling Docker images, deploying the containers via Docker Compose, and validating the constituent microservices, such as the `llm` service.

 Note: The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying the application, please make sure either you've requested and been granted the access to it on [Huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) or you've downloaded the model locally from [ModelScope](https://www.modelscope.cn/models).

-## 🚀 Build Docker images
-### 1. Source Code install GenAIComps
+# Table of Contents
-```bash
-git clone https://github.com/opea-project/GenAIComps.git
-cd GenAIComps
-```
+1. [AudioQnA Quick Start Deployment](#audioqna-quick-start-deployment)
+2. [AudioQnA Docker Compose Files](#audioqna-docker-compose-files)
+3. [Validate Microservices](#validate-microservices)
+4. [Conclusion](#conclusion)

-### 2. 
Build ASR Image +## AudioQnA Quick Start Deployment -```bash -docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/whisper/src/Dockerfile . -``` +This section describes how to quickly deploy and test the AudioQnA service manually on an Intel® Xeon® processor. The basic steps are: -### 3. Build vLLM Image +1. [Access the Code](#access-the-code) +2. [Configure the Deployment Environment](#configure-the-deployment-environment) +3. [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose) +4. [Check the Deployment Status](#check-the-deployment-status) +5. [Validate the Pipeline](#validate-the-pipeline) +6. [Cleanup the Deployment](#cleanup-the-deployment) + +### Access the Code + +Clone the GenAIExample repository and access the AudioQnA Intel® Xeon® platform Docker Compose files and supporting scripts: ```bash -git clone https://github.com/vllm-project/vllm.git -cd ./vllm/ -VLLM_VER="v0.8.3" -git checkout ${VLLM_VER} -docker build --no-cache --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f docker/Dockerfile.cpu -t opea/vllm:latest --shm-size=128g . +git clone https://github.com/opea-project/GenAIExamples.git +cd GenAIExamples/AudioQnA ``` -### 4. Build TTS Image +Then checkout a released version, such as v1.2: ```bash -docker build -t opea/speecht5:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/speecht5/src/Dockerfile . - -# multilang tts (optional) -docker build -t opea/gpt-sovits:latest --build-arg http_proxy=$http_proxy --build-arg https_proxy=$https_proxy -f comps/third_parties/gpt-sovits/src/Dockerfile . +git checkout v1.2 ``` -### 5. Build MegaService Docker Image +### Configure the Deployment Environment -To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `audioqna.py` Python script. 
Build the MegaService Docker image using the command below: +To set up environment variables for deploying AudioQnA services, set up some parameters specific to the deployment environment and source the `set_env.sh` script in this directory: ```bash -git clone https://github.com/opea-project/GenAIExamples.git -cd GenAIExamples/AudioQnA/ -docker build --no-cache -t opea/audioqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile . +export host_ip="External_Public_IP" # ip address of the node +export HUGGINGFACEHUB_API_TOKEN="Your_HuggingFace_API_Token" +export http_proxy="Your_HTTP_Proxy" # http proxy if any +export https_proxy="Your_HTTPs_Proxy" # https proxy if any +export no_proxy=localhost,127.0.0.1,$host_ip # additional no proxies if needed +export NGINX_PORT=${your_nginx_port} # your usable port for nginx, 80 for example +source ./set_env.sh ``` -Then run the command `docker images`, you will have following images ready: +Consult the section on [AudioQnA Service configuration](#audioqna-configuration) for information on how service specific configuration parameters affect deployments. -1. `opea/whisper:latest` -2. `opea/vllm:latest` -3. `opea/speecht5:latest` -4. `opea/audioqna:latest` -5. `opea/gpt-sovits:latest` (optional) +### Deploy the Services Using Docker Compose -## 🚀 Set the environment variables - -Before starting the services with `docker compose`, you have to recheck the following environment variables. +To deploy the AudioQnA services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute the command below. It uses the 'compose.yaml' file. 
```bash
-export host_ip= # export host_ip=$(hostname -I | awk '{print $1}')
-export HUGGINGFACEHUB_API_TOKEN=
+cd docker_compose/intel/cpu/xeon
+docker compose -f compose.yaml up -d
+```
-export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
+> **Note**: Developers should build the Docker images from source when:
+>
+> - Developing off the git main branch (the container ports in the repo may differ from those in the published Docker image).
+> - The published Docker image cannot be downloaded.
+> - A specific version of the Docker image is needed.
-export MEGA_SERVICE_HOST_IP=${host_ip}
-export WHISPER_SERVER_HOST_IP=${host_ip}
-export SPEECHT5_SERVER_HOST_IP=${host_ip}
-export LLM_SERVER_HOST_IP=${host_ip}
-export GPT_SOVITS_SERVER_HOST_IP=${host_ip}
+Please refer to the table below to build different microservices from source:
-export WHISPER_SERVER_PORT=7066
-export SPEECHT5_SERVER_PORT=7055
-export GPT_SOVITS_SERVER_PORT=9880
-export LLM_SERVER_PORT=3006
+| Microservice | Deployment Guide |
+| ------------ | -------------------------------------------------------------------------------------------------------------- |
+| vLLM | [vLLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/vllm#build-docker) |
+| LLM | [LLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/llms) |
+| WHISPER | [Whisper build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/src#211-whisper-server-image) |
+| SPEECHT5 | [SpeechT5 build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/tts/src#211-speecht5-server-image) |
+| GPT-SOVITS | [GPT-SOVITS build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/gpt-sovits/src#build-the-image) |
+| MegaService | [MegaService build guide](../../../../README_miscellaneous.md#build-megaservice-docker-image) |
+| UI | [Basic UI build guide](../../../../README_miscellaneous.md#build-ui-docker-image) |
-export 
BACKEND_SERVICE_ENDPOINT=http://${host_ip}:3008/v1/audioqna -``` -or use set_env.sh file to setup environment variables. +### Check the Deployment Status -Note: +After running docker compose, check if all the containers launched via docker compose have started: + +```bash +docker ps -a +``` -- Please replace with host_ip with your external IP address, do not use localhost. -- If you are in a proxy environment, also set the proxy-related environment variables: +For the default deployment, the following 5 containers should have started: ``` -export http_proxy="Your_HTTP_Proxy" -export https_proxy="Your_HTTPs_Proxy" -# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1" -export no_proxy="Your_No_Proxy",${host_ip},whisper-service,speecht5-service,gpt-sovits-service,tgi-service,vllm-service,audioqna-xeon-backend-server,audioqna-xeon-ui-server +1c67e44c39d2 opea/audioqna-ui:latest "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp audioqna-xeon-ui-server +833a42677247 opea/audioqna:latest "python audioqna.py" About a minute ago Up About a minute 0.0.0.0:3008->8888/tcp, :::3008->8888/tcp audioqna-xeon-backend-server +5dc4eb9bf499 opea/speecht5:latest "python speecht5_ser…" About a minute ago Up About a minute 0.0.0.0:7055->7055/tcp, :::7055->7055/tcp speecht5-service +814e6efb1166 opea/vllm:latest "python3 -m vllm.ent…" About a minute ago Up About a minute (healthy) 0.0.0.0:3006->80/tcp, :::3006->80/tcp vllm-service +46f7a00f4612 opea/whisper:latest "python whisper_serv…" About a minute ago Up About a minute 0.0.0.0:7066->7066/tcp, :::7066->7066/tcp whisper-service ``` -## 🚀 Start the MegaService +If any issues are encountered during deployment, refer to the [Troubleshooting](../../../../README_miscellaneous.md#troubleshooting) section. 
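The container check above can also be scripted. A minimal sketch, assuming the default deployment: the five names below come from the sample `docker ps` listing above, and the helper name is illustrative — adjust both if your compose file differs.

```bash
#!/bin/sh
# Sketch: verify that every expected container appears in a listing of
# running container names (one name per line, as produced by
# `docker ps --format '{{.Names}}'`).
EXPECTED="audioqna-xeon-ui-server audioqna-xeon-backend-server speecht5-service vllm-service whisper-service"

# check_containers <file containing one running-container name per line>
check_containers() {
  missing=0
  for name in $EXPECTED; do
    if ! grep -qx "$name" "$1"; then
      echo "MISSING: $name"
      missing=1
    fi
  done
  [ "$missing" -eq 0 ] && echo "all expected containers are up"
  return $missing
}

# Typical use against a live host:
#   docker ps --format '{{.Names}}' > running.txt
#   check_containers running.txt
```

A nonzero exit status makes the helper usable in CI-style wait loops as well.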
+
+### Validate the Pipeline
+
+Once the AudioQnA services are running, test the pipeline using the following command:

```bash
+# Test the AudioQnA megaservice by recording a .wav file, encoding the file into the base64 format, and then sending the base64 string to the megaservice endpoint.
+# The megaservice will return a spoken response as a base64 string. To listen to the response, decode the base64 string and save it as a .wav file.
+wget https://github.com/intel/intel-extension-for-transformers/raw/refs/heads/main/intel_extension_for_transformers/neural_chat/assets/audio/sample_2.wav
+base64_audio=$(base64 -w 0 sample_2.wav)
-If use vLLM as the LLM serving backend:
+# if you are using speecht5 as the tts service, voice can be "default" or "male"
+# if you are using gpt-sovits for the tts service, you can set the reference audio following https://github.com/opea-project/GenAIComps/blob/main/comps/third_parties/gpt-sovits/src/README.md
+curl http://${host_ip}:3008/v1/audioqna \
+  -X POST \
+  -H "Content-Type: application/json" \
+  -d "{\"audio\": \"${base64_audio}\", \"max_tokens\": 64, \"voice\": \"default\"}" \
+  | sed 's/^"//;s/"$//' | base64 -d > output.wav
```
-docker compose up -d
-# multilang tts (optional)
-docker compose -f compose_multilang.yaml up -d
```
+**Note**: Access the AudioQnA UI in a web browser through this URL: `http://${host_ip}:5173`. Please confirm that port `5173` is opened in the firewall. To validate each microservice used in the pipeline, refer to the [Validate Microservices](#validate-microservices) section. 
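The request body used above can also be produced by a small helper, which makes it easy to send arbitrary `.wav` files. A sketch under the same assumptions as the curl example (the `audioqna_payload` function name is illustrative; the endpoint and JSON field names mirror the example above):

```bash
#!/bin/sh
# Sketch: build the JSON body for the /v1/audioqna request from any .wav
# file. Field names follow the curl example above.

# audioqna_payload <wav-file> [voice]
audioqna_payload() {
  b64=$(base64 -w 0 "$1")                       # encode audio as a single base64 line
  printf '{"audio": "%s", "max_tokens": 64, "voice": "%s"}' "$b64" "${2:-default}"
}

# Typical use against a running deployment:
#   audioqna_payload sample_2.wav default > payload.json
#   curl http://${host_ip}:3008/v1/audioqna -X POST \
#     -H "Content-Type: application/json" -d @payload.json \
#     | sed 's/^"//;s/"$//' | base64 -d > output.wav
```

Passing the payload with `-d @payload.json` avoids shell-quoting issues with very long base64 strings.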
-If use TGI as the LLM serving backend: +### Cleanup the Deployment +To stop the containers associated with the deployment, execute the following command: + +```bash +docker compose -f compose.yaml down ``` -docker compose -f compose_tgi.yaml up -d -``` +## AudioQnA Docker Compose Files + +In the context of deploying an AudioQnA pipeline on an Intel® Xeon® platform, we can pick and choose different large language model serving frameworks, or single English TTS/multi-language TTS component. The table below outlines the various configurations that are available as part of the application. These configurations can be used as templates and can be extended to different components available in [GenAIComps](https://github.com/opea-project/GenAIComps.git). + +| File | Description | +| -------------------------------------- | ----------------------------------------------------------------------------------------- | +| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework and redis as vector database | +| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as the default | +| [compose_multilang.yaml](./compose_multilang.yaml) | The TTS component is GPT-SoVITS. All other configurations remain the same as the default | -## 🚀 Test MicroServices + +## Validate MicroServices 1. Whisper Service @@ -161,7 +187,7 @@ docker compose -f compose_tgi.yaml up -d 3. 
TTS Service - ``` + ```bash # speecht5 service curl http://${host_ip}:${SPEECHT5_SERVER_PORT}/v1/audio/speech -XPOST -d '{"input": "Who are you?"}' -H 'Content-Type: application/json' --output speech.mp3 @@ -169,17 +195,6 @@ docker compose -f compose_tgi.yaml up -d curl http://${host_ip}:${GPT_SOVITS_SERVER_PORT}/v1/audio/speech -XPOST -d '{"input": "Who are you?"}' -H 'Content-Type: application/json' --output speech.mp3 ``` -## 🚀 Test MegaService +## Conclusion -Test the AudioQnA megaservice by recording a .wav file, encoding the file into the base64 format, and then sending the -base64 string to the megaservice endpoint. The megaservice will return a spoken response as a base64 string. To listen -to the response, decode the base64 string and save it as a .wav file. - -```bash -# if you are using speecht5 as the tts service, voice can be "default" or "male" -# if you are using gpt-sovits for the tts service, you can set the reference audio following https://github.com/opea-project/GenAIComps/blob/main/comps/third_parties/gpt-sovits/src/README.md -curl http://${host_ip}:3008/v1/audioqna \ - -X POST \ - -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64, "voice":"default"}' \ - -H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav -``` +This guide should enable developers to deploy the default configuration or any of the other compose yaml files for different configurations. It also highlights the configurable parameters that can be set before deployment. 
diff --git a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md
index 6f62fdac55..9582b92bdb 100644
--- a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md
+++ b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md
@@ -1,145 +1,173 @@
-# Build Mega Service of AudioQnA on Gaudi
+# Deploying AudioQnA on Intel® Gaudi® Processors

-This document outlines the deployment process for a AudioQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Gaudi server.
-
-The default pipeline deploys with vLLM as the LLM serving component. It also provides options of using TGI backend for LLM microservice, please refer to [Start the MegaService](#-start-the-megaservice) section in this page.
+This document outlines the single node deployment process for an AudioQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservices on an Intel Gaudi server. The steps include pulling Docker images, deploying the containers via Docker Compose, and validating the constituent microservices, such as the `llm` service.

 Note: The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying the application, please make sure either you've requested and been granted the access to it on [Huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) or you've downloaded the model locally from [ModelScope](https://www.modelscope.cn/models).

-## 🚀 Build Docker images
-### 1. Source Code install GenAIComps
+# Table of Contents
+
+1. [AudioQnA Quick Start Deployment](#audioqna-quick-start-deployment)
+2. [AudioQnA Docker Compose Files](#audioqna-docker-compose-files)
+3. [Validate Microservices](#validate-microservices)
+4. [Conclusion](#conclusion)
+
+## AudioQnA Quick Start Deployment
+
+This section describes how to quickly deploy and test the AudioQnA service manually on an Intel® Gaudi® processor. The basic steps are:
+
+1. 
[Access the Code](#access-the-code) +2. [Configure the Deployment Environment](#configure-the-deployment-environment) +3. [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose) +4. [Check the Deployment Status](#check-the-deployment-status) +5. [Validate the Pipeline](#validate-the-pipeline) +6. [Cleanup the Deployment](#cleanup-the-deployment) + +### Access the Code + +Clone the GenAIExample repository and access the AudioQnA Intel® Gaudi® platform Docker Compose files and supporting scripts: ```bash -git clone https://github.com/opea-project/GenAIComps.git -cd GenAIComps +git clone https://github.com/opea-project/GenAIExamples.git +cd GenAIExamples/AudioQnA ``` -### 2. Build ASR Image +Then checkout a released version, such as v1.2: ```bash -docker build -t opea/whisper-gaudi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/whisper/src/Dockerfile.intel_hpu . +git checkout v1.2 ``` -### 3. Build vLLM Image +### Configure the Deployment Environment -git clone https://github.com/HabanaAI/vllm-fork.git -cd vllm-fork/ -VLLM_VER=v0.6.6.post1+Gaudi-1.20.0 -git checkout ${VLLM_VER} -docker build --no-cache --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile.hpu -t opea/vllm-gaudi:latest --shm-size=128g . - -### 4. Build TTS Image +To set up environment variables for deploying AudioQnA services, set up some parameters specific to the deployment environment and source the `set_env.sh` script in this directory: ```bash -docker build -t opea/speecht5-gaudi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/speecht5/src/Dockerfile.intel_hpu . 
+export host_ip="External_Public_IP" # ip address of the node +export HUGGINGFACEHUB_API_TOKEN="Your_HuggingFace_API_Token" +export http_proxy="Your_HTTP_Proxy" # http proxy if any +export https_proxy="Your_HTTPs_Proxy" # https proxy if any +export no_proxy=localhost,127.0.0.1,$host_ip # additional no proxies if needed +export NGINX_PORT=${your_nginx_port} # your usable port for nginx, 80 for example +source ./set_env.sh ``` -### 5. Build MegaService Docker Image +Consult the section on [AudioQnA Service configuration](#audioqna-configuration) for information on how service specific configuration parameters affect deployments. + +### Deploy the Services Using Docker Compose -To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `audioqna.py` Python script. Build the MegaService Docker image using the command below: +To deploy the AudioQnA services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute the command below. It uses the 'compose.yaml' file. ```bash -git clone https://github.com/opea-project/GenAIExamples.git -cd GenAIExamples/AudioQnA/ -docker build --no-cache -t opea/audioqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile . +cd docker_compose/intel/hpu/gaudi +docker compose -f compose.yaml up -d ``` -Then run the command `docker images`, you will have following images ready: +> **Note**: developers should build docker image from source when: +> +> - Developing off the git main branch (as the container's ports in the repo may be different > from the published docker image). +> - Unable to download the docker image. +> - Use a specific version of Docker image. -1. `opea/whisper-gaudi:latest` -2. `opea/vllm-gaudi:latest` -3. `opea/speecht5-gaudi:latest` -4. 
`opea/audioqna:latest` +Please refer to the table below to build different microservices from source: -## 🚀 Set the environment variables +| Microservice | Deployment Guide | +| ------------ | -------------------------------------------------------------------------------------------------------------- | +| vLLM-gaudi | [vLLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/vllm#build-docker-1) | +| LLM | [LLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/llms) | +| WHISPER | [Whisper build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/src#211-whisper-server-image) | +| SPEECHT5 | [SpeechT5 build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/tts/src#211-speecht5-server-image) | +| MegaService | [MegaService build guide](../../../../README_miscellaneous.md#build-megaservice-docker-image) | +| UI | [Basic UI build guide](../../../../README_miscellaneous.md#build-ui-docker-image) | -Before starting the services with `docker compose`, you have to recheck the following environment variables. -```bash -export host_ip= # export host_ip=$(hostname -I | awk '{print $1}') -export HUGGINGFACEHUB_API_TOKEN= - -export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct" -# set vLLM parameters -export NUM_CARDS=1 -export BLOCK_SIZE=128 -export MAX_NUM_SEQS=256 -export MAX_SEQ_LEN_TO_CAPTURE=2048 - -export MEGA_SERVICE_HOST_IP=${host_ip} -export WHISPER_SERVER_HOST_IP=${host_ip} -export SPEECHT5_SERVER_HOST_IP=${host_ip} -export LLM_SERVER_HOST_IP=${host_ip} - -export WHISPER_SERVER_PORT=7066 -export SPEECHT5_SERVER_PORT=7055 -export LLM_SERVER_PORT=3006 - -export BACKEND_SERVICE_ENDPOINT=http://${host_ip}:3008/v1/audioqna -``` +### Check the Deployment Status -or use set_env.sh file to setup environment variables. 
+After running docker compose, check if all the containers launched via docker compose have started: -Note: +```bash +docker ps -a +``` -- Please replace with host_ip with your external IP address, do not use localhost. -- If you are in a proxy environment, also set the proxy-related environment variables: +For the default deployment, the following 5 containers should have started: ``` -export http_proxy="Your_HTTP_Proxy" -export https_proxy="Your_HTTPs_Proxy" -# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1" -export no_proxy="Your_No_Proxy",${host_ip},whisper-service,speecht5-service,tgi-service,vllm-service,audioqna-gaudi-backend-server,audioqna-gaudi-ui-server +1c67e44c39d2 opea/audioqna-ui:latest "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp audioqna-gaudi-ui-server +833a42677247 opea/audioqna:latest "python audioqna.py" About a minute ago Up About a minute 0.0.0.0:3008->8888/tcp, :::3008->8888/tcp audioqna-gaudi-backend-server +5dc4eb9bf499 opea/speecht5-gaudi:latest "python speecht5_ser…" About a minute ago Up About a minute 0.0.0.0:7055->7055/tcp, :::7055->7055/tcp speecht5-service +814e6efb1166 opea/vllm-gaudi:latest "python3 -m vllm.ent…" About a minute ago Up About a minute (healthy) 0.0.0.0:3006->80/tcp, :::3006->80/tcp vllm-service +46f7a00f4612 opea/whisper-gaudi:latest "python whisper_serv…" About a minute ago Up About a minute 0.0.0.0:7066->7066/tcp, :::7066->7066/tcp whisper-service ``` -## 🚀 Start the MegaService +If any issues are encountered during deployment, refer to the [Troubleshooting](../../../../README_miscellaneous.md#troubleshooting) section. -> **_NOTE:_** Users will need at least three Gaudi cards for AudioQnA. 
+
+### Validate the Pipeline
+
+Once the AudioQnA services are running, test the pipeline using the following command:

```bash
-cd GenAIExamples/AudioQnA/docker_compose/intel/hpu/gaudi/
-```
+# Test the AudioQnA megaservice by recording a .wav file, encoding the file into the base64 format, and then sending the base64 string to the megaservice endpoint.
+# The megaservice will return a spoken response as a base64 string. To listen to the response, decode the base64 string and save it as a .wav file.
+wget https://github.com/intel/intel-extension-for-transformers/raw/refs/heads/main/intel_extension_for_transformers/neural_chat/assets/audio/sample_2.wav
+base64_audio=$(base64 -w 0 sample_2.wav)
-If use vLLM as the LLM serving backend:
+# if you are using speecht5 as the tts service, voice can be "default" or "male"
-
+curl http://${host_ip}:3008/v1/audioqna \
+  -X POST \
+  -H "Content-Type: application/json" \
+  -d "{\"audio\": \"${base64_audio}\", \"max_tokens\": 64, \"voice\": \"default\"}" \
+  | sed 's/^"//;s/"$//' | base64 -d > output.wav
```
-docker compose up -d
```
+**Note**: Access the AudioQnA UI in a web browser through this URL: `http://${host_ip}:5173`. Please confirm that port `5173` is opened in the firewall. To validate each microservice used in the pipeline, refer to the [Validate Microservices](#validate-microservices) section.
-If use TGI as the LLM serving backend:
+### Cleanup the Deployment
+
+To stop the containers associated with the deployment, execute the following command:
+
+```bash
+docker compose -f compose.yaml down
```
-docker compose -f compose_tgi.yaml up -d
```
+## AudioQnA Docker Compose Files
-## 🚀 Test MicroServices
+In the context of deploying an AudioQnA pipeline on an Intel® Gaudi® platform, we can pick and choose between different large language model serving frameworks. The table below outlines the various configurations that are available as part of the application. 
These configurations can be used as templates and can be extended to different components available in [GenAIComps](https://github.com/opea-project/GenAIComps.git).
+
+| File | Description |
+| -------------------------------------- | ----------------------------------------------------------------------------------------- |
+| [compose.yaml](./compose.yaml) | Default compose file using vLLM as the LLM serving framework |
+| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as the default |
+
+## Validate Microservices

 1. Whisper Service

 ```bash
-  curl http://${host_ip}:${WHISPER_SERVER_PORT}/v1/asr \
-    -X POST \
-    -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-    -H 'Content-Type: application/json'
+  wget https://github.com/intel/intel-extension-for-transformers/raw/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav
+  curl http://${host_ip}:${WHISPER_SERVER_PORT}/v1/audio/transcriptions \
+    -H "Content-Type: multipart/form-data" \
+    -F file="@./sample.wav" \
+    -F model="openai/whisper-small"
 ```

 2. LLM backend Service

-  In the first startup, this service will take more time to download, load and warm up the model. After it's finished, the service will be ready and the container (`vllm-gaudi-service` or `tgi-gaudi-service`) status shown via `docker ps` will be `healthy`. Before that, the status will be `health: starting`.
+  On the first startup, this service takes more time to download, load, and warm up the model. Once that is finished, the service will be ready and the container (`vllm-service` or `tgi-service`) status shown via `docker ps` will be `healthy`. Before that, the status will be `health: starting`.

   Or try the command below to check whether the LLM serving is ready. 
```bash # vLLM service - docker logs vllm-gaudi-service 2>&1 | grep complete + docker logs vllm-service 2>&1 | grep complete # If the service is ready, you will get the response like below. INFO: Application startup complete. ``` ```bash # TGI service - docker logs tgi-gaudi-service | grep Connected + docker logs tgi-service | grep Connected # If the service is ready, you will get the response like below. 2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected ``` @@ -156,24 +184,11 @@ docker compose -f compose_tgi.yaml up -d 3. TTS Service - ``` + ```bash # speecht5 service - curl http://${host_ip}:${SPEECHT5_SERVER_PORT}/v1/tts - -X POST \ - -d '{"text": "Who are you?"}' \ - -H 'Content-Type: application/json' + curl http://${host_ip}:${SPEECHT5_SERVER_PORT}/v1/audio/speech -XPOST -d '{"input": "Who are you?"}' -H 'Content-Type: application/json' --output speech.mp3 ``` -## 🚀 Test MegaService - -Test the AudioQnA megaservice by recording a .wav file, encoding the file into the base64 format, and then sending the -base64 string to the megaservice endpoint. The megaservice will return a spoken response as a base64 string. To listen -to the response, decode the base64 string and save it as a .wav file. +## Conclusion -```bash -# voice can be "default" or "male" -curl http://${host_ip}:3008/v1/audioqna \ - -X POST \ - -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64, "voice":"default"}' \ - -H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav -``` +This guide should enable developers to deploy the default configuration or any of the other compose yaml files for different configurations. It also highlights the configurable parameters that can be set before deployment. 
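The validation flow in the patch above encodes a recorded `.wav` into base64, posts it, and decodes the base64 reply back into audio. That round trip can be exercised in isolation, without a running service — a minimal sketch, using stand-in bytes rather than a real recording:

```shell
# Stand-in for a recorded .wav file (hypothetical bytes, not valid audio).
printf 'RIFF0000WAVEfmt-data' > sample_2.wav

# Encode without line wrapping, exactly as the megaservice test does.
base64_audio=$(base64 -w 0 sample_2.wav)

# A deployment returns the reply as a quoted base64 string; emulate that,
# then strip the quotes and decode, mirroring `sed 's/^"//;s/"$//' | base64 -d`.
printf '"%s"' "$base64_audio" | sed 's/^"//;s/"$//' | base64 -d > output.wav

cmp -s sample_2.wav output.wav && echo "base64 round trip OK"
# -> base64 round trip OK
```

If `output.wav` ever comes back corrupted in a real deployment, running this round trip first separates client-side encoding mistakes from service-side failures.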
From c4a5355812099019e30ec3c077f31b69e964ea63 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Sun, 13 Apr 2025 07:25:23 +0000 Subject: [PATCH 02/11] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- AudioQnA/README.md | 14 ++++---- .../docker_compose/intel/cpu/xeon/README.md | 33 +++++++++---------- .../docker_compose/intel/hpu/gaudi/README.md | 21 +++++------- 3 files changed, 30 insertions(+), 38 deletions(-) diff --git a/AudioQnA/README.md b/AudioQnA/README.md index 2391e53570..6797fe6a6b 100644 --- a/AudioQnA/README.md +++ b/AudioQnA/README.md @@ -8,7 +8,6 @@ AudioQnA is an example that demonstrates the integration of Generative AI (GenAI 2. [Deployment Options](#deployment-options) 3. [Monitoring and Tracing](./README_miscellaneous.md) - ## Architecture The AudioQnA example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different microservices for this example. @@ -68,14 +67,13 @@ flowchart LR ``` - ## Deployment Options The table below lists currently available deployment options. They outline in detail the implementation of this example on selected hardware. 
-| Category | Deployment Option | Description | -| ---------------------- | -------------------- | ----------------------------------------------------------------- | -| On-premise Deployments | Docker compose | [AudioQnA deployment on Xeon](./docker_compose/intel/cpu/xeon) | -| | | [AudioQnA deployment on Gaudi](./docker_compose/intel/hpu/gaudi) | -| | | [AudioQnA deployment on AMD ROCm](./docker_compose/amd/gpu/rocm) | -| | Kubernetes | [Helm Charts](./kubernetes/helm) | +| Category | Deployment Option | Description | +| ---------------------- | ----------------- | ---------------------------------------------------------------- | +| On-premise Deployments | Docker compose | [AudioQnA deployment on Xeon](./docker_compose/intel/cpu/xeon) | +| | | [AudioQnA deployment on Gaudi](./docker_compose/intel/hpu/gaudi) | +| | | [AudioQnA deployment on AMD ROCm](./docker_compose/amd/gpu/rocm) | +| | Kubernetes | [Helm Charts](./kubernetes/helm) | diff --git a/AudioQnA/docker_compose/intel/cpu/xeon/README.md b/AudioQnA/docker_compose/intel/cpu/xeon/README.md index 5487fb5b11..a575c9bc70 100644 --- a/AudioQnA/docker_compose/intel/cpu/xeon/README.md +++ b/AudioQnA/docker_compose/intel/cpu/xeon/README.md @@ -4,7 +4,6 @@ This document outlines the single node deployment process for a AudioQnA applica Note: The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying the application, please make sure either you've requested and been granted the access to it on [Huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) or you've downloaded the model locally from [ModelScope](https://www.modelscope.cn/models). - # Table of Contents 1. 
[AudioQnA Quick Start Deployment](#audioqna-quick-start-deployment) @@ -71,16 +70,15 @@ docker compose -f compose.yaml up -d Please refer to the table below to build different microservices from source: -| Microservice | Deployment Guide | -| ------------ | -------------------------------------------------------------------------------------------------------------- | -| vLLM | [vLLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/vllm#build-docker) | -| LLM | [LLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/llms) | -| WHISPER | [Whisper build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/src#211-whisper-server-image) | -| SPEECHT5 | [SpeechT5 build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/tts/src#211-speecht5-server-image) | -| GPT-SOVITS | [GPT-SOVITS build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/gpt-sovits/src#build-the-image) | -| MegaService | [MegaService build guide](../../../../README_miscellaneous.md#build-megaservice-docker-image) | -| UI | [Basic UI build guide](../../../../README_miscellaneous.md#build-ui-docker-image) | - +| Microservice | Deployment Guide | +| ------------ | --------------------------------------------------------------------------------------------------------------------------------- | +| vLLM | [vLLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/vllm#build-docker) | +| LLM | [LLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/llms) | +| WHISPER | [Whisper build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/src#211-whisper-server-image) | +| SPEECHT5 | [SpeechT5 build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/tts/src#211-speecht5-server-image) | +| GPT-SOVITS | [GPT-SOVITS build 
guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/gpt-sovits/src#build-the-image) | +| MegaService | [MegaService build guide](../../../../README_miscellaneous.md#build-megaservice-docker-image) | +| UI | [Basic UI build guide](../../../../README_miscellaneous.md#build-ui-docker-image) | ### Check the Deployment Status @@ -102,7 +100,6 @@ For the default deployment, the following 5 containers should have started: If any issues are encountered during deployment, refer to the [Troubleshooting](../../../../README_miscellaneous.md#troubleshooting) section. - ### Validate the Pipeline Once the AudioQnA services are running, test the pipeline using the following command: @@ -132,16 +129,16 @@ To stop the containers associated with the deployment, execute the following com ```bash docker compose -f compose.yaml down ``` + ## AudioQnA Docker Compose Files In the context of deploying an AudioQnA pipeline on an Intel® Xeon® platform, we can pick and choose different large language model serving frameworks, or single English TTS/multi-language TTS component. The table below outlines the various configurations that are available as part of the application. These configurations can be used as templates and can be extended to different components available in [GenAIComps](https://github.com/opea-project/GenAIComps.git). -| File | Description | -| -------------------------------------- | ----------------------------------------------------------------------------------------- | -| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework and redis as vector database | -| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as the default | -| [compose_multilang.yaml](./compose_multilang.yaml) | The TTS component is GPT-SoVITS. 
All other configurations remain the same as the default | - +| File | Description | +| -------------------------------------------------- | ----------------------------------------------------------------------------------------- | +| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework and redis as vector database | +| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as the default | +| [compose_multilang.yaml](./compose_multilang.yaml) | The TTS component is GPT-SoVITS. All other configurations remain the same as the default | ## Validate MicroServices diff --git a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md index 9582b92bdb..4558b577fa 100644 --- a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md +++ b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md @@ -4,7 +4,6 @@ This document outlines the single node deployment process for a AudioQnA applica Note: The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying the application, please make sure either you've requested and been granted the access to it on [Huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) or you've downloaded the model locally from [ModelScope](https://www.modelscope.cn/models). - # Table of Contents 1. 
[AudioQnA Quick Start Deployment](#audioqna-quick-start-deployment) @@ -71,15 +70,14 @@ docker compose -f compose.yaml up -d Please refer to the table below to build different microservices from source: -| Microservice | Deployment Guide | -| ------------ | -------------------------------------------------------------------------------------------------------------- | -| vLLM-gaudi | [vLLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/vllm#build-docker-1) | -| LLM | [LLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/llms) | -| WHISPER | [Whisper build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/src#211-whisper-server-image) | -| SPEECHT5 | [SpeechT5 build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/tts/src#211-speecht5-server-image) | -| MegaService | [MegaService build guide](../../../../README_miscellaneous.md#build-megaservice-docker-image) | -| UI | [Basic UI build guide](../../../../README_miscellaneous.md#build-ui-docker-image) | - +| Microservice | Deployment Guide | +| ------------ | -------------------------------------------------------------------------------------------------------------------- | +| vLLM-gaudi | [vLLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/vllm#build-docker-1) | +| LLM | [LLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/llms) | +| WHISPER | [Whisper build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/src#211-whisper-server-image) | +| SPEECHT5 | [SpeechT5 build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/tts/src#211-speecht5-server-image) | +| MegaService | [MegaService build guide](../../../../README_miscellaneous.md#build-megaservice-docker-image) | +| UI | [Basic UI build guide](../../../../README_miscellaneous.md#build-ui-docker-image) | ### Check the Deployment Status @@ -101,7 +99,6 @@ For the default 
deployment, the following 5 containers should have started: If any issues are encountered during deployment, refer to the [Troubleshooting](../../../../README_miscellaneous.md#troubleshooting) section. - ### Validate the Pipeline Once the AudioQnA services are running, test the pipeline using the following command: @@ -130,6 +127,7 @@ To stop the containers associated with the deployment, execute the following com ```bash docker compose -f compose.yaml down ``` + ## AudioQnA Docker Compose Files In the context of deploying an AudioQnA pipeline on an Intel® Gaudi® platform, we can pick and choose different large language model serving frameworks. The table below outlines the various configurations that are available as part of the application. These configurations can be used as templates and can be extended to different components available in [GenAIComps](https://github.com/opea-project/GenAIComps.git). @@ -139,7 +137,6 @@ In the context of deploying an AudioQnA pipeline on an Intel® Gaudi® platform, | [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework and redis as vector database | | [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as the default | - ## Validate MicroServices 1. 
Whisper Service From 8e127c1b96a1c19dea0a4104c2b42c4e3f324838 Mon Sep 17 00:00:00 2001 From: Spycsh Date: Sun, 13 Apr 2025 00:38:24 -0700 Subject: [PATCH 03/11] Fix docker ps log --- AudioQnA/docker_compose/intel/hpu/gaudi/README.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md index 9582b92bdb..e8551093f6 100644 --- a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md +++ b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md @@ -92,11 +92,11 @@ docker ps -a For the default deployment, the following 5 containers should have started: ``` -1c67e44c39d2 opea/audioqna-ui:latest "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp audioqna-gaudi-ui-server -833a42677247 opea/audioqna:latest "python audioqna.py" About a minute ago Up About a minute 0.0.0.0:3008->8888/tcp, :::3008->8888/tcp audioqna-gaudi-backend-server -5dc4eb9bf499 opea/speecht5-gaudi:latest "python speecht5_ser…" About a minute ago Up About a minute 0.0.0.0:7055->7055/tcp, :::7055->7055/tcp speecht5-service -814e6efb1166 opea/vllm-gaudi:latest "python3 -m vllm.ent…" About a minute ago Up About a minute (healthy) 0.0.0.0:3006->80/tcp, :::3006->80/tcp vllm-service -46f7a00f4612 opea/whisper-gaudi:latest "python whisper_serv…" About a minute ago Up About a minute 0.0.0.0:7066->7066/tcp, :::7066->7066/tcp whisper-service +23f27dab14a5 opea/whisper-gaudi:latest "python whisper_serv…" 18 minutes ago Up 18 minutes 0.0.0.0:7066->7066/tcp, :::7066->7066/tcp whisper-service +629da06b7fb2 opea/audioqna-ui:latest "docker-entrypoint.s…" 19 minutes ago Up 18 minutes 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp audioqna-gaudi-ui-server +8a74d9806b87 opea/audioqna:latest "python audioqna.py" 19 minutes ago Up 18 minutes 0.0.0.0:3008->8888/tcp, [::]:3008->8888/tcp audioqna-gaudi-backend-server +29324430f42e opea/vllm-gaudi:latest "python3 -m 
vllm.ent…" 19 minutes ago Up 19 minutes (healthy) 0.0.0.0:3006->80/tcp, [::]:3006->80/tcp vllm-gaudi-service +dbd585f0a95a opea/speecht5-gaudi:latest "python speecht5_ser…" 19 minutes ago Up 19 minutes 0.0.0.0:7055->7055/tcp, :::7055->7055/tcp speecht5-service ``` If any issues are encountered during deployment, refer to the [Troubleshooting](../../../../README_miscellaneous.md#troubleshooting) section. From c070eb4ad959d79bd9aba1d7797c2adc52bb1732 Mon Sep 17 00:00:00 2001 From: Spycsh Date: Sun, 13 Apr 2025 21:57:35 -0700 Subject: [PATCH 04/11] fix grammar --- AudioQnA/README_miscellaneous.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/AudioQnA/README_miscellaneous.md b/AudioQnA/README_miscellaneous.md index 8b2b6b66e3..487b319422 100644 --- a/AudioQnA/README_miscellaneous.md +++ b/AudioQnA/README_miscellaneous.md @@ -1,4 +1,4 @@ -# Table of Contents +## Table of Contents 1. [Build MegaService Docker Image](#build-megaservice-docker-image) 2. [Build UI Docker Image](#build-ui-docker-image) @@ -26,7 +26,7 @@ docker build -t opea/audioqna-ui:latest --build-arg https_proxy=$https_proxy --b ## Generate a HuggingFace Access Token -Some HuggingFace resources, such as some models, are only accessible if the developer have an access token. In the absence of a HuggingFace access token, the developer can create one by first creating an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token). +Some HuggingFace resources, such as some models, are only accessible if the developer has an access token. 
In the absence of a HuggingFace access token, the developer can create one by first creating an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token). ## Troubleshooting From d004abd692a4599cb44738d2bc0e2d4d9834d256 Mon Sep 17 00:00:00 2001 From: Spycsh Date: Sun, 13 Apr 2025 22:44:39 -0700 Subject: [PATCH 05/11] fix typo --- AudioQnA/README_miscellaneous.md | 2 +- AudioQnA/docker_compose/intel/cpu/xeon/README.md | 2 +- AudioQnA/docker_compose/intel/hpu/gaudi/README.md | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/AudioQnA/README_miscellaneous.md b/AudioQnA/README_miscellaneous.md index 487b319422..716f6fd813 100644 --- a/AudioQnA/README_miscellaneous.md +++ b/AudioQnA/README_miscellaneous.md @@ -1,4 +1,4 @@ -## Table of Contents +# Table of Contents 1. [Build MegaService Docker Image](#build-megaservice-docker-image) 2. [Build UI Docker Image](#build-ui-docker-image) diff --git a/AudioQnA/docker_compose/intel/cpu/xeon/README.md b/AudioQnA/docker_compose/intel/cpu/xeon/README.md index a575c9bc70..d311269ab2 100644 --- a/AudioQnA/docker_compose/intel/cpu/xeon/README.md +++ b/AudioQnA/docker_compose/intel/cpu/xeon/README.md @@ -120,7 +120,7 @@ curl http://${host_ip}:3008/v1/audioqna \ | sed 's/^"//;s/"$//' | base64 -d > output.wav ``` -**Note** : Access the AudioQnA UI by web browser through this URL: `http://${host_ip}:5173`. Please confirm the `5173` port is opened in the firewall. To validate each microservie used in the pipeline refer to the [Validate Microservices](#validate-microservices) section. +**Note** : Access the AudioQnA UI by web browser through this URL: `http://${host_ip}:5173`. Please confirm the `5173` port is opened in the firewall. To validate each microservice used in the pipeline refer to the [Validate Microservices](#validate-microservices) section. 
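The note above asks to confirm that port `5173` is open in the firewall. A quick reachability probe, sketched with bash's `/dev/tcp` pseudo-device — the `host_ip` value is a placeholder, substitute the deployment host:

```shell
# Probe the AudioQnA UI port; prints a hint rather than failing hard.
host_ip="127.0.0.1"   # placeholder - use the real node address
ui_port=5173
if timeout 2 bash -c "</dev/tcp/${host_ip}/${ui_port}" 2>/dev/null; then
  echo "UI port ${ui_port} reachable"
else
  echo "UI port ${ui_port} not reachable - check the container and firewall"
fi
```

The same probe works for any of the other service ports (`3008`, `7066`, `7055`) by changing `ui_port`.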
### Cleanup the Deployment diff --git a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md index 305e846989..3c7773c217 100644 --- a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md +++ b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md @@ -118,7 +118,7 @@ curl http://${host_ip}:3008/v1/audioqna \ | sed 's/^"//;s/"$//' | base64 -d > output.wav ``` -**Note** : Access the AudioQnA UI by web browser through this URL: `http://${host_ip}:5173`. Please confirm the `5173` port is opened in the firewall. To validate each microservie used in the pipeline refer to the [Validate Microservices](#validate-microservices) section. +**Note** : Access the AudioQnA UI by web browser through this URL: `http://${host_ip}:5173`. Please confirm the `5173` port is opened in the firewall. To validate each microservice used in the pipeline refer to the [Validate Microservices](#validate-microservices) section. ### Cleanup the Deployment From 6d53680db724da808c45944d79d6acea9913d039 Mon Sep 17 00:00:00 2001 From: Ying Hu Date: Fri, 18 Apr 2025 11:51:35 +0800 Subject: [PATCH 06/11] Update README.md for layout --- AudioQnA/docker_compose/intel/cpu/xeon/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AudioQnA/docker_compose/intel/cpu/xeon/README.md b/AudioQnA/docker_compose/intel/cpu/xeon/README.md index d311269ab2..c20eb1543f 100644 --- a/AudioQnA/docker_compose/intel/cpu/xeon/README.md +++ b/AudioQnA/docker_compose/intel/cpu/xeon/README.md @@ -4,7 +4,7 @@ This document outlines the single node deployment process for a AudioQnA applica Note: The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying the application, please make sure either you've requested and been granted the access to it on [Huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) or you've downloaded the model locally from [ModelScope](https://www.modelscope.cn/models). 
-# Table of Contents +## Table of Contents 1. [AudioQnA Quick Start Deployment](#audioqna-quick-start-deployment) 2. [AudioQnA Docker Compose Files](#audioqna-docker-compose-files) From 84fa8e26893eabd6b7c37a59d4bc5e54d6221133 Mon Sep 17 00:00:00 2001 From: Ying Hu Date: Fri, 18 Apr 2025 11:52:12 +0800 Subject: [PATCH 07/11] Update README.md fix layout of Heading level --- AudioQnA/docker_compose/intel/hpu/gaudi/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md index 3c7773c217..b876f5d392 100644 --- a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md +++ b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md @@ -4,7 +4,7 @@ This document outlines the single node deployment process for a AudioQnA applica Note: The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying the application, please make sure either you've requested and been granted the access to it on [Huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) or you've downloaded the model locally from [ModelScope](https://www.modelscope.cn/models). -# Table of Contents +## Table of Contents 1. [AudioQnA Quick Start Deployment](#audioqna-quick-start-deployment) 2. [AudioQnA Docker Compose Files](#audioqna-docker-compose-files) From 09bb3db859e5754451dc9cfe52ba8e88bcf127ef Mon Sep 17 00:00:00 2001 From: Ying Hu Date: Fri, 18 Apr 2025 11:54:06 +0800 Subject: [PATCH 08/11] Update README_miscellaneous.md fix the layout --- AudioQnA/README_miscellaneous.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/AudioQnA/README_miscellaneous.md b/AudioQnA/README_miscellaneous.md index 716f6fd813..56e4cdfeba 100644 --- a/AudioQnA/README_miscellaneous.md +++ b/AudioQnA/README_miscellaneous.md @@ -1,4 +1,5 @@ -# Table of Contents +# AudioQnA Docker Image Build +## Table of Contents 1. 
[Build MegaService Docker Image](#build-megaservice-docker-image) 2. [Build UI Docker Image](#build-ui-docker-image) From 11cc316691249e1f4a5a4fdb38bc3e9790018658 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Fri, 18 Apr 2025 03:54:33 +0000 Subject: [PATCH 09/11] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- AudioQnA/README_miscellaneous.md | 1 + 1 file changed, 1 insertion(+) diff --git a/AudioQnA/README_miscellaneous.md b/AudioQnA/README_miscellaneous.md index 56e4cdfeba..9d1be7202b 100644 --- a/AudioQnA/README_miscellaneous.md +++ b/AudioQnA/README_miscellaneous.md @@ -1,4 +1,5 @@ # AudioQnA Docker Image Build + ## Table of Contents 1. [Build MegaService Docker Image](#build-megaservice-docker-image) From 18512ee2395047de40f635f52829fa80a75025e9 Mon Sep 17 00:00:00 2001 From: Ying Hu Date: Fri, 18 Apr 2025 11:55:08 +0800 Subject: [PATCH 10/11] Update README.md for layout --- AudioQnA/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AudioQnA/README.md b/AudioQnA/README.md index 6797fe6a6b..a36e599bf8 100644 --- a/AudioQnA/README.md +++ b/AudioQnA/README.md @@ -2,7 +2,7 @@ AudioQnA is an example that demonstrates the integration of Generative AI (GenAI) models for performing question-answering (QnA) on audio files, with the added functionality of Text-to-Speech (TTS) for generating spoken responses. The example showcases how to convert audio input to text using Automatic Speech Recognition (ASR), generate answers to user queries using a language model, and then convert those answers back to speech using Text-to-Speech (TTS). -# Table of Contents +## Table of Contents 1. [Architecture](#architecture) 2. 
[Deployment Options](#deployment-options) From 64a5ee39a752518b13e9a4001ada20fa5f172eea Mon Sep 17 00:00:00 2001 From: Spycsh Date: Sun, 20 Apr 2025 19:39:18 -0700 Subject: [PATCH 11/11] fix issues --- AudioQnA/README.md | 1 - AudioQnA/README_miscellaneous.md | 4 ++-- AudioQnA/docker_compose/intel/cpu/xeon/README.md | 2 +- AudioQnA/docker_compose/intel/hpu/gaudi/README.md | 2 +- 4 files changed, 4 insertions(+), 5 deletions(-) diff --git a/AudioQnA/README.md b/AudioQnA/README.md index 6797fe6a6b..3b8538923f 100644 --- a/AudioQnA/README.md +++ b/AudioQnA/README.md @@ -6,7 +6,6 @@ AudioQnA is an example that demonstrates the integration of Generative AI (GenAI 1. [Architecture](#architecture) 2. [Deployment Options](#deployment-options) -3. [Monitoring and Tracing](./README_miscellaneous.md) ## Architecture diff --git a/AudioQnA/README_miscellaneous.md b/AudioQnA/README_miscellaneous.md index 716f6fd813..df0619e0a2 100644 --- a/AudioQnA/README_miscellaneous.md +++ b/AudioQnA/README_miscellaneous.md @@ -7,7 +7,7 @@ ## Build MegaService Docker Image -To construct the Megaservice of AudioQnA, the [GenAIExamples](https://github.com/opea-project/GenAIExamples.git) repository is utilized. Build Megaservice Docker image via command below: +To construct the Megaservice of AudioQnA, the [GenAIExamples](https://github.com/opea-project/GenAIExamples.git) repository is utilized. 
Build Megaservice Docker image using command below: ```bash git clone https://github.com/opea-project/GenAIExamples.git @@ -17,7 +17,7 @@ docker build --no-cache -t opea/audioqna:latest --build-arg https_proxy=$https_p ## Build UI Docker Image -Build frontend Docker image via below command: +Build frontend Docker image using below command: ```bash cd GenAIExamples/AudioQnA/ui diff --git a/AudioQnA/docker_compose/intel/cpu/xeon/README.md b/AudioQnA/docker_compose/intel/cpu/xeon/README.md index d311269ab2..ea8fccecb3 100644 --- a/AudioQnA/docker_compose/intel/cpu/xeon/README.md +++ b/AudioQnA/docker_compose/intel/cpu/xeon/README.md @@ -46,7 +46,7 @@ export host_ip="External_Public_IP" # ip address of the node export HUGGINGFACEHUB_API_TOKEN="Your_HuggingFace_API_Token" export http_proxy="Your_HTTP_Proxy" # http proxy if any export https_proxy="Your_HTTPs_Proxy" # https proxy if any -export no_proxy=localhost,127.0.0.1,$host_ip # additional no proxies if needed +export no_proxy=localhost,127.0.0.1,$host_ip,whisper-service,speecht5-service,vllm-service,tgi-service,audioqna-xeon-backend-server,audioqna-xeon-ui-server # additional no proxies if needed export NGINX_PORT=${your_nginx_port} # your usable port for nginx, 80 for example source ./set_env.sh ``` diff --git a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md index 3c7773c217..0d77e52cd5 100644 --- a/AudioQnA/docker_compose/intel/hpu/gaudi/README.md +++ b/AudioQnA/docker_compose/intel/hpu/gaudi/README.md @@ -46,7 +46,7 @@ export host_ip="External_Public_IP" # ip address of the node export HUGGINGFACEHUB_API_TOKEN="Your_HuggingFace_API_Token" export http_proxy="Your_HTTP_Proxy" # http proxy if any export https_proxy="Your_HTTPs_Proxy" # https proxy if any -export no_proxy=localhost,127.0.0.1,$host_ip # additional no proxies if needed +export 
no_proxy=localhost,127.0.0.1,$host_ip,whisper-service,speecht5-service,vllm-service,tgi-service,audioqna-gaudi-backend-server,audioqna-gaudi-ui-server # additional no proxies if needed export NGINX_PORT=${your_nginx_port} # your usable port for nginx, 80 for example source ./set_env.sh ```
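The `no_proxy` value introduced by this last patch is a comma-joined list of the node address plus every service name in the compose file. When services are added or renamed, the list can be composed programmatically instead of edited by hand — a sketch, with a placeholder `host_ip`:

```shell
host_ip="10.0.0.2"   # placeholder node address
services="whisper-service speecht5-service vllm-service tgi-service audioqna-gaudi-backend-server audioqna-gaudi-ui-server"

no_proxy="localhost,127.0.0.1,${host_ip}"
for svc in $services; do
  no_proxy="${no_proxy},${svc}"   # append each compose service name
done
export no_proxy

echo "$no_proxy"
# -> localhost,127.0.0.1,10.0.0.2,whisper-service,speecht5-service,vllm-service,tgi-service,audioqna-gaudi-backend-server,audioqna-gaudi-ui-server
```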