6 changes: 3 additions & 3 deletions ChatQnA/kubernetes/manifests/README.md
@@ -64,7 +64,7 @@ kubectl create deployment client-test -n chatqa --image=python:3.8.13 -- sleep i
5. Access the application from the client pod using the URL above

```sh
-export CLIENT_POD=$(kubectl get pod -l app=client-test -o jsonpath={.items..metadata.name})
+export CLIENT_POD=$(kubectl get pod -n chatqa -l app=client-test -o jsonpath={.items..metadata.name})
export accessUrl=$(kubectl get gmc -n chatqa -o jsonpath="{.items[?(@.metadata.name=='chatqa')].status.accessUrl}")
kubectl exec "$CLIENT_POD" -n chatqa -- curl $accessUrl -X POST -d '{"text":"What is the revenue of Nike in 2023?","parameters":{"max_new_tokens":17, "do_sample": true}}' -H 'Content-Type: application/json'
```
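The added `-n chatqa` flag is the point of this change: without it, kubectl searches the current context's default namespace and `CLIENT_POD` comes back empty. A quick sanity check (a minimal sketch, assuming the client-test deployment from the earlier step is running in `chatqa`):

```sh
# The pod only shows up when the query is scoped to its namespace;
# an empty CLIENT_POD means the lookup missed the -n chatqa flag.
kubectl get pod -n chatqa -l app=client-test
echo "CLIENT_POD=$CLIENT_POD"
```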
@@ -79,7 +79,7 @@ For example, to use Llama-2-7b-chat-hf make the following edit:
```yaml
- name: Tgi
  internalService:
-   serviceName: tgi-svc
+   serviceName: tgi-service-m
    config:
      LLM_MODEL_ID: Llama-2-7b-chat-hf
```
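A note for reviewers: the `serviceName` rename only works if a Service with the new name actually exists in the namespace. A hedged way to confirm before applying (the exact service names depend on how the GMC pipeline was deployed):

```sh
# List TGI-related Services in the chatqa namespace; the manifest's
# serviceName (tgi-service-m here) must match one of these entries.
kubectl get svc -n chatqa | grep -i tgi
```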
@@ -92,7 +92,7 @@ kubectl apply -f $(pwd)/chatQnA_xeon.yaml
8. Check that the tgi-service-m-deployment has been changed to use the new LLM model

```sh
-kubectl get deployment tgi-svc-deployment -n chatqa -o jsonpath="{.spec.template.spec.containers[*].env[?(@.name=='LLM_MODEL_ID')].value}"
+kubectl get deployment tgi-service-m-deployment -n chatqa -o jsonpath="{.spec.template.spec.containers[*].env[?(@.name=='LLM_MODEL_ID')].value}"
```
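If the edit took effect, the command prints `Llama-2-7b-chat-hf`. Before re-sending requests, it may also help to wait for the redeployed pods to become ready (a sketch; the deployment name is assumed to follow the renamed service above):

```sh
# Block until the new TGI pods pass their readiness checks, so the
# next request is not served by a pod still loading the old model.
kubectl rollout status deployment/tgi-service-m-deployment -n chatqa
```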

9. Access the updated pipeline from the client pod using the same URL as above
2 changes: 1 addition & 1 deletion DocSum/kubernetes/README.md
@@ -58,7 +58,7 @@ kubectl create deployment client-test -n ${ns} --image=python:3.8.13 -- sleep in
6. Access the pipeline from the client pod using the URL above and execute a request

```bash
-export CLIENT_POD=$(kubectl get pod -l app=client-test -o jsonpath={.items..metadata.name})
+export CLIENT_POD=$(kubectl get pod -n ${ns} -l app=client-test -o jsonpath={.items..metadata.name})
export accessUrl=$(kubectl get gmc -n $ns -o jsonpath="{.items[?(@.metadata.name=='docsum')].status.accessUrl}")
kubectl exec "$CLIENT_POD" -n $ns -- curl $accessUrl -X POST -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' -H 'Content-Type: application/json'
```
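As in the ChatQnA change, the added `-n ${ns}` scopes the pod lookup to the right namespace. A quick check that both lookups resolved (assumes `${ns}` is still exported from the earlier steps):

```sh
# Empty output from either echo means the corresponding lookup failed,
# usually because the namespace flag was missing or $ns is unset.
echo "CLIENT_POD=$CLIENT_POD"
echo "accessUrl=$accessUrl"
```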