Merge changes from master to release-0.13 branch #3698
Merged
upgrade vllm version Signed-off-by: Johnu George <johnugeorge109@gmail.com>
Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
…Fixes kserve#3452 (kserve#3603)
* feat: Support customizable deployment strategy for RawDeployment mode Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
* regen Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
* lint Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
* Correctly apply rollingupdate Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
* address comments Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
* Add validation Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
* Enable dtype for huggingface server Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Set float16 as default. Fixup linter Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Add small comment to make the changes understandable Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Fixup linter Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Adapt to new huggingfacemodel Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Fixup merge :) Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Explicitly mention the behaviour of dtype flag on auto. Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Default to FP32 for encoder models Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Selectively add --dtype to parser. Use FP16 for GPU and FP32 for CPU Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Fixup linter Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Update poetry Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
* Use torch.float32 for tests explicitly Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
---------
Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
* fix for extract zip from gcs Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
* initial commit for gcs model download unittests Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
* unittests for model download from gcs Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
* black format fix Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
* code verification Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
---------
Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
* update wording for huggingface README small update to make readme easier to understand Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net>
* Update README.md Signed-off-by: Alexa Griffith agriffith50@bloomberg.net
* Update python/huggingfaceserver/README.md Co-authored-by: Filippe Spolti <filippespolti@gmail.com> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net>
* update vllm Signed-off-by: alexagriffith <agriffith50@bloomberg.net>
* Update README.md
---------
Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net>
Signed-off-by: Alexa Griffith agriffith50@bloomberg.net
Signed-off-by: alexagriffith <agriffith50@bloomberg.net>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Filippe Spolti <filippespolti@gmail.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
* fix: HPA equality check should include annotations Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
* Only watch related autoscalerclass annotation Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
* simplify Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
* Add missing delete action Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
* fix logic Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
fix huggingface runtime in chart Signed-off-by: Dan Sun <dsun20@bloomberg.net>
* fix huggingface runtime in chart Signed-off-by: Dan Sun <dsun20@bloomberg.net>
* Allow model_dir to be specified on template Signed-off-by: Dan Sun <dsun20@bloomberg.net>
* Default model_dir to /mnt/models for HF Signed-off-by: Dan Sun <dsun20@bloomberg.net>
* Lint format Signed-off-by: Dan Sun <dsun20@bloomberg.net>
---------
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
)
* Fix: vLLM Model Supported check throwing circular dependency Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
* remove unwanted comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
* remove unwanted comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
* fix return case Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
* fix to check all arch in model config for vllm support Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
* fixlint Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
---------
Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: johnugeorge, yuzisun
oss-prow-bot merged commit 16d391b into kserve:release-0.13 on May 18, 2024
210 of 213 checks passed
All fixes from master are pulled into the release-0.13 branch.