Use the Intel CPU-optimized TGI image 'ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu' on all Xeon platforms (#444)

zhlsunshine · web-flow · commit c84ac4c74c9d · 2024-09-20T09:19:46.000+08:00
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
diff --git a/helm-charts/common/tgi/values.yaml b/helm-charts/common/tgi/values.yaml
@@ -26,7 +26,8 @@ image:
   repository: ghcr.io/huggingface/text-generation-inference
   pullPolicy: IfNotPresent
   # Overrides the image tag whose default is the chart appVersion.
-  tag: "2.2.0"
+  # `sha-e4201f4-intel-cpu` is the image tag for intel cpu optimized tgi image
+  tag: "sha-e4201f4-intel-cpu"
 
 # empty for CPU
 accelDevice: ""
diff --git a/microservices-connector/config/manifests/tgi.yaml b/microservices-connector/config/manifests/tgi.yaml
@@ -87,7 +87,7 @@ spec:
                 optional: true
           securityContext:
             {}
-          image: "ghcr.io/huggingface/text-generation-inference:2.2.0"
+          image: "ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu"
           imagePullPolicy: IfNotPresent
           volumeMounts:
             - mountPath: /data
diff --git a/microservices-connector/config/samples/ChatQnA/use_cases.md b/microservices-connector/config/samples/ChatQnA/use_cases.md
@@ -19,7 +19,7 @@ The ChatQnA uses the below prebuilt images if you choose a Xeon deployment
 - dataprep-redis: opea/dataprep-redis:latest
 - tei_xeon_service: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
 - tei_embedding_service: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
-- tgi-service: ghcr.io/huggingface/text-generation-inference:2.2.0
+- tgi-service: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
 - redis-vector-db: redis/redis-stack:7.2.0-v9
 
 Should you desire to use the Gaudi accelerator, two alternate images are used for the embedding and llm services.