Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: HPA equality check should include annotations #3650

Merged
merged 7 commits into from
May 11, 2024

Conversation

terrytangyuan
Copy link
Member

@terrytangyuan terrytangyuan commented Apr 29, 2024

This PR fixes an issue where:

Correct behavior: If it's a new ISVC with the annotation serving.kserve.io/autoscalerClass: "external", no HPA will be created.

Incorrect behavior: However, if it's an existing ISVC where HPA is already created with annotation serving.kserve.io/autoscalerClass: "hpa", our users want to change the annotation to "external" and expect that the existing HPA to be deleted.

Copy link
Contributor

@spolti spolti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@oss-prow-bot oss-prow-bot bot removed the lgtm label May 3, 2024
return equality.Semantic.DeepEqual(desired.Spec, existing.Spec) && !autoscalerClassChanged
}

func shouldDeleteHPA(desired *autoscalingv2.HorizontalPodAutoscaler) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should only delete if it is changed from enabled to external otherwise delete throws error

Copy link
Member

@yuzisun yuzisun May 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually nvm, shouldDeleteHPA is called only when HPA exists. But when autoscaler is set to external it creates NoOpAutoscaler and does not reconcile

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have you tested this? I think it will not work

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix and test it. Marking as draft for now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuzisun Please take another look. I tested a couple of scenarios and they worked as expected.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Copy link
Contributor

@spolti spolti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@oss-prow-bot oss-prow-bot bot added the lgtm label May 9, 2024
return hpa.NewHPAReconciler(client, scheme, componentMeta, componentExt), nil
case constants.AutoscalerClassExternal:
return &NoOpAutoscaler{}, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should we delete the NoOpAutoscaler class now if it is no longer used?

@yuzisun
Copy link
Member

yuzisun commented May 11, 2024

/approve

Copy link

oss-prow-bot bot commented May 11, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: spolti, terrytangyuan, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@yuzisun yuzisun merged commit 56a2940 into kserve:master May 11, 2024
57 of 58 checks passed
@terrytangyuan terrytangyuan deleted the fix-hpa-check branch May 11, 2024 14:12
terrytangyuan added a commit to terrytangyuan/kserve that referenced this pull request May 13, 2024
* fix: HPA equality check should include annotations

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Only watch related autoscalerclass annotation

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* simplify

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Add missing delete action

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* fix logic

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
terrytangyuan added a commit to terrytangyuan/kserve that referenced this pull request May 13, 2024
* fix: HPA equality check should include annotations

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Only watch related autoscalerclass annotation

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* simplify

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Add missing delete action

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* fix logic

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
openshift-merge-bot bot added a commit to opendatahub-io/kserve that referenced this pull request May 13, 2024
[RHOAIENG-6878][Cherry-pick][RHOAI-2.10] fix: HPA equality check should include annotations (kserve#3650)
openshift-merge-bot bot added a commit to red-hat-data-services/kserve that referenced this pull request May 13, 2024
[RHOAIENG-6577][Cherry-pick][RHOAI-2.8] fix: HPA equality check should include annotations (kserve#3650)
asd981256 pushed a commit to asd981256/kserve that referenced this pull request May 14, 2024
* fix: HPA equality check should include annotations

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Only watch related autoscalerclass annotation

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* simplify

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Add missing delete action

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* fix logic

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: asd981256 <asd981256@gmail.com>
oss-prow-bot bot pushed a commit that referenced this pull request May 18, 2024
* upgrade vllm/transformers version (#3671)

upgrade vllm version

Signed-off-by: Johnu George <johnugeorge109@gmail.com>

* Add openai models endpoint (#3666)

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>

* feat: Support customizable deployment strategy for RawDeployment mode. Fixes #3452 (#3603)

* feat: Support customizable deployment strategy for RawDeployment mode

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* regen

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* lint

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Correctly apply rollingupdate

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* address comments

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Add validation

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Enable dtype support for huggingface server (#3613)

* Enable dtype for huggingface server

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Set float16 as default. Fixup linter

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Add small comment to make the changes understandable

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Fixup linter

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Adapt to new huggingfacemodel

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Fixup merge :)

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Explicitly mention the behaviour of dtype flag on auto.

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Default to FP32 for encoder models

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Selectively add --dtype to parser. Use FP16 for GPU and FP32 for CPU

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Fixup linter

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Update poetry

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Use torch.float32 forr tests explicitly

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

---------

Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>

* Add method for checking model health/readiness (#3673)

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>

* fix for extract zip from gcs (#3510)

* fix for extract zip from gcs

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* initial commit for gcs model download unittests

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* unittests for model download from gcs

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* black format fix

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* code verification

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Update Dockerfile and Readme (#3676)

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface readme (#3678)

* update wording for huggingface README

small update to make readme easier to understand

Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>

* Update README.md

Signed-off-by: Alexa Griffith agriffith50@bloomberg.net

* Update python/huggingfaceserver/README.md

Co-authored-by: Filippe Spolti <filippespolti@gmail.com>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>

* update vllm

Signed-off-by: alexagriffith <agriffith50@bloomberg.net>

* Update README.md

---------

Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: Alexa Griffith agriffith50@bloomberg.net
Signed-off-by: alexagriffith <agriffith50@bloomberg.net>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Filippe Spolti <filippespolti@gmail.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

* fix: HPA equality check should include annotations (#3650)

* fix: HPA equality check should include annotations

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Only watch related autoscalerclass annotation

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* simplify

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Add missing delete action

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* fix logic

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Fix:  huggingface runtime in helm chart (#3679)

fix huggingface runtime in chart

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix: model id and model dir check order (#3680)

* fix huggingface runtime in chart

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Allow model_dir to be specified on template

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Default model_dir to /mnt/models for HF

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Lint format

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix:vLLM Model Supported check throwing circular dependency (#3688)

* Fix:vLLM Model Supported check throwing circular dependency

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* remove unwanted comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* remove unwanted comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix return case

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix to check all arch in model config forr vllm support

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fixlint

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix: Allow null in Finish reason streaming response in vLLM (#3684)

Fix: allow null in Finish reason

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

---------

Signed-off-by: Johnu George <johnugeorge109@gmail.com>
Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: Alexa Griffith agriffith50@bloomberg.net
Signed-off-by: alexagriffith <agriffith50@bloomberg.net>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Curtis Maddalozzo <cmaddalozzo@users.noreply.github.com>
Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Co-authored-by: Datta Nimmaturi <39181234+Datta0@users.noreply.github.com>
Co-authored-by: Andrews Arokiam <87992092+andyi2it@users.noreply.github.com>
Co-authored-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Co-authored-by: Alexa Griffith <agriffith50@bloomberg.net>
Co-authored-by: Filippe Spolti <filippespolti@gmail.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants