Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot update HPA objects using object metrics with AverageValue #87733

Open
htaub opened this issue Jan 31, 2020 · 53 comments
Open

Cannot update HPA objects using object metrics with AverageValue #87733

htaub opened this issue Jan 31, 2020 · 53 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@htaub
Copy link

htaub commented Jan 31, 2020

What happened:

When attempting to label a HorizontalPodAutoscaler (autoscaling/v2beta2) object that has an object metric using AverageValue, received the following error:

The HorizontalPodAutoscaler "<hpa-name>" is invalid: spec.metrics[0].object.target.value: Invalid value: resource.Quantity{i:resource.int64Amount{value:0, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"0", Format:"DecimalSI"}: must be positive

What you expected to happen:

The label command should properly label the HPA object with no errors, since the HPA was able to build and run without errors.

How to reproduce it (as minimally and precisely as possible):

  1. Create a hpa.v2beta2.autoscaling object that includes the following metric:
  - object:
      describedObject:
        kind: Service
        name: svc-name
      metric:
        name: metric-name:sum 
      target:
        averageValue: 4650
        type: AverageValue
    type: Object

Make sure that you're using AverageValue and don't have a Value set.

  1. Run the following command:
kubectl label hpa hpa_name -n namespace_name testlabel=test

Anything else we need to know?:

We were able to build and apply the HPA object without encountering any errors, and it appears to be scaling properly. This error only happens when calling kubectl label on the existing HPA object.

Environment:

  • Kubernetes version (use kubectl version): tried in both 1.15.3 and 1.16.4
  • Cloud provider or hardware configuration: AWS
  • OS (e.g: cat /etc/os-release): darwin/amd64
  • Kernel (e.g. uname -a): Darwin Kernel Version 17.7.0: Fri Jul 6 19:54:51 PDT 2018; root:xnu-4570.71.3~2/RELEASE_X86_64 x86_64
@htaub htaub added the kind/bug Categorizes issue or PR as related to a bug. label Jan 31, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jan 31, 2020
@htaub
Copy link
Author

htaub commented Jan 31, 2020

/sig autoscaling

@k8s-ci-robot k8s-ci-robot added sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 31, 2020
@htaub
Copy link
Author

htaub commented Feb 3, 2020

An update: this doesn't only affect the label command. Any attempt to patch the affected HPA object (i.e. changing minReplicas) will return the same error.

@htaub htaub changed the title Cannot label HPA objects using object metrics with AverageValue Cannot update HPA objects using object metrics with AverageValue Feb 3, 2020
@zzzkl
Copy link
Contributor

zzzkl commented Mar 31, 2020

k8s 1.17.2 also meet this error, is this a bug?

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 29, 2020
@garimakemwal
Copy link

Encountering the same error, are there any updates?

@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 15, 2020
@fejta-bot
Copy link

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Copy link
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@garimakemwal
Copy link

/reopen

@k8s-ci-robot
Copy link
Contributor

@garimakemwal: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jebeaudet
Copy link

Encountering the same issue, it breaks our CI pipeline. Can we reopen this?

@L3tum
Copy link

L3tum commented Mar 2, 2021

We have this issue as well. Would be cool to at least reopen it.

@egoroof
Copy link

egoroof commented Mar 29, 2021

Have the same issue :(

@k8s-ci-robot
Copy link
Contributor

@egoroof: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jlsan92
Copy link

jlsan92 commented Apr 27, 2021

Hitting this issue as well. There is a workaround tho

kubectl edit hpa.v2beta2.autoscaling <your-hpa>

Then spot every spec.metrics.*.object that uses AverageValue. Remove the .value ( that in my case is always "0").

After that, you can freely edit w/e you want from the rest of the document. The bad news, the .value: "0" returns after saving, so you need to delete it every time you want to edit again.

@jlsan92
Copy link

jlsan92 commented Apr 27, 2021

Btw hitting this on both K8s v1.17.9 and v1.18.18

@PatrickOsborne
Copy link

same problem here, and the workaround outlined above doesn't work either.

@leoskyrocker
Copy link

My hypothesis is that when the control plane attempts to persist the object in k8s and convert the v2beta2 object to v1, it does so by storing the new interface in annotation. While doing so, it considers "value" as a required attribute and initialized it to 0. This later becomes an issue when we get back the yaml through the v2beta2 endpoint, which transforms the object back into a v2 object, now with both the "value" and "AverageValue" key.

This is still appearing in 1.21 as well. What is the best way to get this re-opened?

@szuecs
Copy link
Member

szuecs commented Oct 29, 2021

/reopen

@k8s-ci-robot
Copy link
Contributor

@szuecs: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@szuecs
Copy link
Member

szuecs commented Sep 5, 2022

/remove-lifecycle rotten
/reopen

@k8s-ci-robot
Copy link
Contributor

@szuecs: Reopened this issue.

In response to this:

/remove-lifecycle rotten
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Sep 5, 2022
@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Sep 5, 2022
@lwangrabbit
Copy link

Also hit this problem on v1.22.10.

@huzhengchuan
Copy link
Contributor

huzhengchuan commented Sep 29, 2022

I meet the same question in k8s 1.22 , and debug it

when create v2beta2 hpa

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app
  namespace: default
spec:
  maxReplicas: 10
  metrics:
  - object:
      describedObject:
        kind: Service
        name: sample-app
      metric:
        name: http_requests_count
      target:
        averageValue: "1"
        type: AverageValue
    type: Object
  minReplicas: 7
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app

and the object send to kube-apiserver , and v2beta2 hpa convert to internal hpa object. when read the internal hpa obj to v1 hpa obj, because metrics switch to annotation to the v1 hpa object, we can found the annotaion

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"False","lastTransitionTime":"2022-09-29T12:31:03Z","reason":"FailedGetScale","message":"the
      HPA controller was unable to get the target''s current scale: deployments/scale.apps
      \"sample-app\" not found"}]'
    autoscaling.alpha.kubernetes.io/metrics: '[{"type":"Object","object":{"target":{"kind":"Service","name":"sample-app"},"metricName":"http_requests_count","targetValue":"0","averageValue":"1"}}]'
  creationTimestamp: "2022-09-29T12:30:48Z"
  name: sample-app
  namespace: default
  resourceVersion: "9314762"

in annotion exist "targetValue":"0"

In v1 hpa define in vendor/k8s.io/api/autoscaling/v1/types.go:227, the TargetValue is resource.Quantity type.

type ObjectMetricSource struct {
	// target is the described Kubernetes object.
	Target CrossVersionObjectReference `json:"target" protobuf:"bytes,1,name=target"`

	// metricName is the name of the metric in question.
	MetricName string `json:"metricName" protobuf:"bytes,2,name=metricName"`
	// targetValue is the target value of the metric (as a quantity).
	TargetValue resource.Quantity `json:"targetValue" protobuf:"bytes,3,name=targetValue"``
---
}

and in v2beta2 hpa define in vendor/k8s.io/api/autoscaling/v2beta2/types.go:320, the TargetValue is *resource.Quantity type.

// MetricTarget defines the target value, average value, or average utilization of a specific metric
type MetricTarget struct {
	// type represents whether the metric type is Utilization, Value, or AverageValue
	Type MetricTargetType `json:"type" protobuf:"bytes,1,name=type"`
	// value is the target value of the metric (as a quantity).
	// +optional
	Value *resource.Quantity `json:"value,omitempty" protobuf:"bytes,2,opt,name=value"`
	// averageValue is the target value of the average of the
	// metric across all relevant pods (as a quantity)
	// +optional
	AverageValue *resource.Quantity `json:"averageValue,omitempty" protobuf:"bytes,3,opt,name=averageValue"`
	// averageUtilization is the target value of the average of the
	// resource metric across all relevant pods, represented as a percentage of
	// the requested value of the resource for the pods.
	// Currently only valid for Resource metric source type
	// +optional
	AverageUtilization *int32 `json:"averageUtilization,omitempty" protobuf:"bytes,4,opt,name=averageUtilization"`
}

the TargetValue is not * v1, and in v2beta2 hpa TargetValue is *.

then , the annotation of v1 hpa object will convert to internal hpa object, the "targetValue":"0" in anntation be convert to internal hpa object, the target value is set to 0 not the nil

debug the problem in k8s 1.22.9 can fix the problem

diff --git a/pkg/apis/autoscaling/v1/conversion.go b/pkg/apis/autoscaling/v1/conversion.go
index bf94d48ae17..09fd8f9f337 100644
--- a/pkg/apis/autoscaling/v1/conversion.go
+++ b/pkg/apis/autoscaling/v1/conversion.go
@@ -90,9 +90,13 @@ func Convert_v1_ObjectMetricSource_To_autoscaling_ObjectMetricSource(in *autosca
 
        out.Target = autoscaling.MetricTarget{
                Type:         metricType,
-               Value:        &in.TargetValue,
                AverageValue: in.AverageValue,
        }
+
+       if in.TargetValue.Value() != 0 {
+               out.Target.Value = &in.TargetValue
+       }
+
        out.DescribedObject = autoscaling.CrossVersionObjectReference{
                Kind:       in.Target.Kind,
                Name:       in.Target.Name,


@pbetkier
Copy link
Contributor

Would be great to know if the problem also affects v2 API.

@pbetkier
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 24, 2022
@abhisek00
Copy link

abhisek00 commented Oct 24, 2022

Issue is replicated in both autoscaling/v2 and autoscaling/v2 versions.

Server Version: v1.25.2

Error
kubectl label hpa php-apache testlabel=test
*The HorizontalPodAutoscaler "php-apache" is invalid: spec.metrics[0].object.target.value: Invalid value: resource.Quantity{i:resource.int64Amount{value:0, scale:0}, d:resource.infDecAmount{Dec:(inf.Dec)(nil)}, s:"0", Format:"DecimalSI"}: must be positive

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Object
    object:
      describedObject:
        kind: Service
        name: php-apache
      metric:
        name: metric-name:sum 
      target:
        averageValue: 4650
        type: AverageValue

@pbetkier
Copy link
Contributor

/assign @abhisek00

@Zaporozhec7
Copy link

Hitting this issue as well. There is a workaround tho

kubectl edit hpa.v2beta2.autoscaling <your-hpa>

Then spot every spec.metrics.*.object that uses AverageValue. Remove the .value ( that in my case is always "0").

After that, you can freely edit w/e you want from the rest of the document. The bad news, the .value: "0" returns after saving, so you need to delete it every time you want to edit again.

Unfortunatelly, this workaround not very aplicable for our workflow with automated HPA updates. So I just set spec.metrics[0].object.target.value property to "1" - and it seems works, satisfying that mistaken validation rule and being just ignored for AverageValue

@ivasilyev-servicetitan-com
Copy link

I'm trying to workaround this as well and I'm adding object.target.value: X, where X is some positive integer (actually setting it to same value as object.target.averageValue has).
I'm doing it with the following yaml

      metrics:
      - type: Object
        object:
          metric:
            name: <metric_name>
          describedObject:
            apiVersion: apps/v1
            kind: Deployment
            name: <hpa_name>
          target:
            type: AverageValue
            averageValue: X
            value: X

And it seems issue is "fixed" with such spec.
However, what concerns me is the fact in HPA yaml spec I have object.targetValue rather then object.target.value.

  metrics:
    - type: Object
      object:
        target:
          kind: Deployment
          name: <hpa_name>
          apiVersion: apps/v1
        metricName: <metric_name>
        targetValue: 'X'
        averageValue: 'X'

Is it expected?

However, there is no more error about

.object.target.value: Invalid value: resource.Quantity{i:resource.int64Amount{value:0, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"0", Format:"DecimalSI"}: must be positive

anymore on patching.

@Zaporozhec7 can you share how you set spec.metrics[0].object.target.value property to "1" exactly ?

@Zaporozhec7
Copy link

I set spec.metrics[0].object.target.value to "1" on HPA creation/update (Helm chart).
Might be difference in Kubernetes versions and used API versions.
I have 1.24.5-gke.600 (GKE) version of K8s and using autoscaling/v2 apiVersion for HPA.
I do not see any difference on saved HPA with that values I provided

@ivasilyev-servicetitan-com
Copy link

  metrics:
    - type: Object
      object:
        target:
          kind: Deployment
          name: <hpa_name>
          apiVersion: apps/v1
          value: '1'
        metricName: <metric_name>
        averageValue: 'X'

so you doing like this?

@ivasilyev-servicetitan-com

Based on docs

With Value, the target is compared directly to the returned metric from the API. With AverageValue, the value returned from the custom metrics API is divided by the number of Pods before being compared to the target.

So you need to modify custom metric value by dividing it to number of pods to have hpa working same way as with averageValue.

@Zaporozhec7
Copy link

I using AverageValue target type, not Value - this is all about what this issue is. For AverageValue target type "averageValue" property used, "value" property is just ignored by AverageValue target type. My metric object look like this:

type: Object
object:
  metric:
    name: nginx_ingress_controller_requests_rate
  describedObject:
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    name: customers
  target:
    type: AverageValue
    averageValue: "50"
    # Workaround for this bug https://github.com/kubernetes/kubernetes/issues/87733
    # remove it when bug will be fixed
    value: "1"

Your example above have different schema, it seems you are using different API version of HPA

@ivasilyev-servicetitan-com

Thx @Zaporozhec7 .
That totally makes sense now.
I didn't notice initially it can have different target.type.
I will go ahead and apply the same workaround.

@kgolab
Copy link
Contributor

kgolab commented May 11, 2023

@pbetkier , you asked at some point if this affects v2 - I'm afraid it does (and @Zaporozhec7 already confirmed this too).

What's worse, I just noticed that the objects gets mangled by using the 'averageValue' workaround.
After I create HPA like this (cannot create it at all w/o averageValue being set):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: repro-87733
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: a-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Value
        averageValue: 256Mi
        value: 1024Mi

What I get back by kubectl get hpa/repro-87733 -o yaml is:

[...]
  - resource:
      name: memory
      target:
        averageValue: 256Mi
        type: AverageValue
    type: Resource
[...]

But it looks like it's a different issue than the one originally raised, w/o setting averageValue I get:

The HorizontalPodAutoscaler "repro-87733" is invalid: spec.metrics[0].resource.target.averageUtilization: Required value: must set either a target raw value or a target utilization

@kgolab
Copy link
Contributor

kgolab commented May 12, 2023

Replying to my own comment - Value is simply not supported for Resource metrics.
I'll try to make sure that this information lands where it should.

Sorry for the confusion.

@pbetkier
Copy link
Contributor

Note: the issue affects HPAs created under k8s prior to 1.27. I believe #114358 fixed this for 1.27.

HPAs created before 1.27 need to be recreated (e.g. reapply kubectl apply -f hpa-yaml) after upgrading to 1.27 to get fixed.

@brconnell4
Copy link

brconnell4 commented Sep 6, 2023

Is there an update to this problem? We were using v2beta2 and switched to v2. We now get the same error we apply a helm deployment to the service that has an HPA with an Object resource.

We've deleted all the HPA's manually and redeploying with helm works. Except its building the Object reference like this

  metrics:
  - object:
      describedObject:
        apiVersion: /v1
        kind: Namespace
        name: service-ui
      metric:
        name: service_requests_per_second
      target:
        averageValue: 500m
        type: AverageValue
        value: "0"

If we try to do a helm upgrade install after this to the same service, it fails with the same error.

The work around suggested by setting the value of value to 1 sounds like it should work and not interferre with our type of AverageValue. But why is value needed if we are not using the type of Value . This seems like a major oversight and redundant to include values if not in use if this is intended.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging a pull request may close this issue.