Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HPA] Properly update HPA when an additional metric is added to the spec #1462

Merged

Conversation

moh-osman3
Copy link
Contributor

@moh-osman3 moh-osman3 commented Feb 9, 2023

Currently the HPA fails to update properly when a new metric is added. setAutoscalerSpec assumes the HPA spec structure will remain static when it is updated, but this is not the case when a new metric is added to the spec. Now the existing MetricSpec copies the desired MetricSpec directly so that any changes to the structure will also be copied.

Resolves #1439

@moh-osman3 moh-osman3 marked this pull request as ready for review February 10, 2023 10:48
@moh-osman3 moh-osman3 requested a review from a team as a code owner February 10, 2023 10:48
@moh-osman3
Copy link
Contributor Author

@kevinearls @pavolloffay Could you please take a look when you get a chance? This deviates from the client.Patch that is used to update most other objects in this repository. I have been unable to properly patch when the underlying HPA changes to add a new metric. Instead I am using client.Update to perform the update, and this has successfully resolved the issue reported. Any input or suggestions would be appreciated! Thank you!

@kevinearls
Copy link
Member

Hi @moh-osman3 sorry, but I'm no longer working on this project.

@moh-osman3
Copy link
Contributor Author

@kevinearls No worries, thanks for letting me know. @pavolloffay would you have some time to review this PR? It fixes an issue with updating the HPA when a second metric is added to the CRD and reapplied to the cluster. Thanks!

Copy link
Contributor

@iblancasa iblancasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding extra steps to the autoscaling E2E test to verify the fix is working 100%?

@moh-osman3
Copy link
Contributor Author

How about adding extra steps to the autoscaling E2E test to verify the fix is working 100%?

@iblancasa Thanks for the review! I added to the e2e tests to now check the number of metrics in the HPA Metric Spec. Then I install an update with the same name and an additional metric. Let me know if you have any more suggestions on how to improve the e2e test.

pkg/collector/reconcile/horizontalpodautoscaler.go Outdated Show resolved Hide resolved
tests/e2e/autoscale/wait-until-hpa-ready.go Outdated Show resolved Hide resolved
tests/e2e/autoscale/02-install.yaml Show resolved Hide resolved
@moh-osman3 moh-osman3 force-pushed the mohosman/issue-1439/fix-HPA-update branch 6 times, most recently from b9520d6 to a7ea3e8 Compare February 24, 2023 18:56
@moh-osman3 moh-osman3 requested review from pavolloffay and iblancasa and removed request for pavolloffay February 27, 2023 09:58
Copy link
Contributor

@iblancasa iblancasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! Just a minor thing related to simplify the tests.

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
name: simplest-set-utilization
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before your changes, the test was done against the simplest OpenTelemetryCollector instance. That one is just to check the old behavior still works. But it would make more sense to do the test against the simplest-set-utilization OpenTelemetryCollector instance (and will simplify your test).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm is your recommendation to remove simplest collector entirely? I think I'd like to keep it around for now because it also confirms HPA is created with default targetCPUUtilization (i.e. scale up and scale down happens without having to explicitly set the cpu utilization because the default is used).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to continue on e2e tests improvements on my followup PR though, so let me know what you think!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm is your recommendation to remove simplest collector entirely? I think I'd like to keep it around for now because it also confirms HPA is created with default targetCPUUtilization (i.e. scale up and scale down happens without having to explicitly set the cpu utilization because the default is used).

Not removing it completely but not checking if it scales. Just checking if the new approach is the one that scales up and down and can be updated (as you did).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, sounds good to me! I now removed the tracegen and scaleup/scaledown assertions from the test for the simplest collector. Thanks for the suggestions @iblancasa !

@moh-osman3 moh-osman3 force-pushed the mohosman/issue-1439/fix-HPA-update branch from 1f00e33 to 0468733 Compare March 1, 2023 04:49
Copy link
Contributor

@matej-g matej-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good me now, nice job @moh-osman3 🙌

@jaronoff97 jaronoff97 merged commit 278a55e into open-telemetry:main Mar 1, 2023
ItielOlenick pushed a commit to ItielOlenick/opentelemetry-operator that referenced this pull request May 1, 2024
…pec (open-telemetry#1462)

* remove patch logic and replace with update request

* test an update that adds another metric

* minor fixes

* rename chloggen

* split tests up by version

* use patch again instead of update

* add error check

* address review feedback

* delete commented line

* fix chloggen description

* fix lint

* simplify the logic - always replace the existing metrics with desired

* fix lint

* add to e2e tests

* fix e2e bug

* address review feedback

* decrease stabilization window

* cleanup

* simplify e2e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

spec.autoscaler.targetMemoryUtilization does not add memory scaling to autoscaler
6 participants