Code for AI on GKE guide series #1228
Conversation
ganochenkodg
commented
Apr 4, 2024
- Kustomize patches to run various quantized models in vLLM and TGI runtimes.
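A patch of the kind this PR adds might look roughly like the sketch below. This is an illustrative assumption, not the PR's actual file: the file names, deployment name, and container name are hypothetical, while the `--quantization=awq` argument and `MODEL_ID` environment variable mirror the snippets reviewed in this conversation.

```yaml
# gemma-2b-awq-patch.yaml (hypothetical file name)
# A Kustomize strategic-merge patch that switches a vLLM Deployment
# to serve an AWQ-quantized Gemma checkpoint.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-gemma        # assumed deployment name
spec:
  template:
    spec:
      containers:
      - name: inference-server   # assumed container name
        args:
        - --model=$(MODEL_ID)
        - --quantization=awq
        env:
        - name: MODEL_ID
          value: google/gemma-2b-AWQ
```

Such a patch would typically be referenced from a `kustomization.yaml` via the `patches` field and applied with `kubectl apply -k .`.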
- --quantization=awq
env:
- name: MODEL_ID
  value: dganochenko/gemma-2b-AWQ
Please change this to value: google/gemma-2b-AWQ so it points to the right repository.
it's done already
Small detail, but can we remove this file change?
Sure, done
File recovered
- --quantization=awq
env:
- name: MODEL_ID
  value: dganochenko/gemma-7b-AWQ
Please change to value: google/gemma-7b-AWQ to ensure we're pointing at the right repository.
it's done already
Looks good. @TarasRudko @ganochenkodg Can we mark this as ready for review?
No region tags are edited in this PR. (This comment is generated by snippet-bot.)
Is deleting this file intentional? It looks like deleting it will break this doc:
https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-gemma-gpu-vllm
Looks good from my end