Commit c84ac4c

'ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu' is the Intel CPU-optimized TGI image; use it for all Xeon platforms. (#444)
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
1 parent 2517e79 commit c84ac4c

3 files changed, +4 -3 lines changed

helm-charts/common/tgi/values.yaml

Lines changed: 2 additions & 1 deletion
@@ -26,7 +26,8 @@ image:
 repository: ghcr.io/huggingface/text-generation-inference
 pullPolicy: IfNotPresent
 # Overrides the image tag whose default is the chart appVersion.
-tag: "2.2.0"
+# `sha-e4201f4-intel-cpu` is the image tag for intel cpu optimized tgi image
+tag: "sha-e4201f4-intel-cpu"
 
 # empty for CPU
 accelDevice: ""

microservices-connector/config/manifests/tgi.yaml

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ spec:
 optional: true
 securityContext:
 {}
-image: "ghcr.io/huggingface/text-generation-inference:2.2.0"
+image: "ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu"
 imagePullPolicy: IfNotPresent
 volumeMounts:
 - mountPath: /data

microservices-connector/config/samples/ChatQnA/use_cases.md

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ The ChatQnA uses the below prebuilt images if you choose a Xeon deployment
 - dataprep-redis: opea/dataprep-redis:latest
 - tei_xeon_service: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
 - tei_embedding_service: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
-- tgi-service: ghcr.io/huggingface/text-generation-inference:2.2.0
+- tgi-service: ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
 - redis-vector-db: redis/redis-stack:7.2.0-v9
 
 Should you desire to use the Gaudi accelerator, two alternate images are used for the embedding and llm services.
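
The three files in this commit must agree on one image reference: the chart supplies `repository` and `tag` separately, while the manifest and the use-case list spell out the full `repository:tag` string. The following is a minimal Python sketch of that relationship, assuming the usual `repository:tag` composition. The helper names `image_ref` and `split_ref` are hypothetical and are not part of the chart or of TGI.

```python
# Values copied from the diff; the helpers below are illustrative only.
REPOSITORY = "ghcr.io/huggingface/text-generation-inference"
TAG = "sha-e4201f4-intel-cpu"


def image_ref(repository, tag):
    """Join a repository and a tag into a `repository:tag` image reference."""
    return f"{repository}:{tag}"


def split_ref(ref):
    """Split a tagged reference on its last colon into (repository, tag).

    Assumes the reference carries a tag; an untagged reference is not handled.
    """
    repository, _, tag = ref.rpartition(":")
    return repository, tag


# Full reference as written in the manifest and the use-case list.
manifest_image = "ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu"

# The chart values and the hard-coded manifest string should describe the same image.
assert image_ref(REPOSITORY, TAG) == manifest_image
assert split_ref(manifest_image) == (REPOSITORY, TAG)
```

A check of this kind can catch the case where one file is bumped to a new tag and another still points at the old one, such as `2.2.0`.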
