Question about HPA trigger to scale #1

Closed
vsoch opened this issue May 31, 2023 · 2 comments

@vsoch

vsoch commented May 31, 2023

Hi! I found your slides here https://indico.cern.ch/event/968726/contributions/4118126/attachments/2153775/3632238/k8s-HEP_tedeschi.pdf and was hoping you might have some insight into (what I think is) a missing step. I'm working on similar scaling functionality for our Flux Framework operator (in Kubernetes). I have a metrics server that outputs a metric for node CPUs, plus the APIService and the HPA that is pinging it. What I don't understand is the final step: how the HPA knows to act on a metric to tell a pod to, for example, scale up or down. This works fine for a CPU Resource:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: flux-sample-hpa
  namespace: flux-operator
spec:
  scaleTargetRef:
    apiVersion: flux-framework.org/v1alpha1
    kind: MiniCluster
    name: flux-sample
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      # This is explicitly set to be very low so it triggers
      target:
        type: Utilization
        averageUtilization: 2

But when I use a custom metric (provided by the server) I'm not sure how the custom metric actually advises the autoscaler:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: flux-sample-hpa
  namespace: flux-operator
spec:
  scaleTargetRef:
    apiVersion: flux-framework.org/v1alpha1
    kind: MiniCluster
    name: flux-sample
  minReplicas: 2
  maxReplicas: 4
  metrics:

  # https://docs.openshift.com/container-platform/4.11/rest_api/autoscale_apis/horizontalpodautoscaler-autoscaling-v2.html#spec-metrics-object
  - type: Object
    object:
      # This is the service we created
      target:
        value: 4
        type: "Value"

      # Where to get the data from
      describedObject:
        kind: Service
        name: custom-metrics-apiserver

      # This should scale until we hit 4
      metric:
        name: node_up_count

  # Behavior determines how to do the scaling
  # https://www.kloia.com/blog/advanced-hpa-in-kubernetes
  behavior:

    # selectPolicy chooses among the policies below: "Max", "Min", or "Disabled"
    scaleUp:
      selectPolicy: Max
      stabilizationWindowSeconds: 120
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
      - type: Pods
        value: 4
        periodSeconds: 60

    scaleDown:
      selectPolicy: Max
      stabilizationWindowSeconds: 120
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
      - type: Pods
        value: 4
        periodSeconds: 60
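
For reference, the Kubernetes documentation gives the core HPA calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), with the result clamped to minReplicas and maxReplicas. Below is a minimal Python sketch of that arithmetic for an Object metric with a Value target like the one above. The function name and arguments are hypothetical, the 0.1 tolerance is the documented controller default, and the real controller also accounts for pod readiness and stabilization, which this ignores.

```python
import math

def desired_replicas(current_replicas, metric_value, target_value,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA calculation for an Object metric with a Value target."""
    ratio = metric_value / target_value
    # Within the tolerance band, the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always clamped to [minReplicas, maxReplicas].
    return max(min_replicas, min(max_replicas, desired))

# Metric below target: the raw result is 1, which is clamped up to minReplicas.
print(desired_replicas(2, metric_value=2, target_value=4,
                       min_replicas=2, max_replicas=4))  # 2
# Metric above target: the replica count grows in proportion.
print(desired_replicas(2, metric_value=8, target_value=4,
                       min_replicas=2, max_replicas=4))  # 4
```

If this reading is right, the direction of scaling comes from the ratio of the reported value to the target, so a reported value below the target asks for fewer replicas, not more.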

I thought it was something to do with adding behavior, but looking in your PDF, it seems like there is some other rule that needs to be created? This part:

(image: screenshot from the slides, not reproduced here)

I'm still trying to figure out where that goes. In my case, my exporter is providing the service for the HPA directly, and I see that you are using Prometheus (the adapter) to convert exported metrics to some standard format with a rule? Do you know if there is documentation somewhere about how the adapter is providing metrics and how they trigger scaling? Right now mine look like this:

$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta2/namespaces/flux-operator/services/custom-metrics-service/node_up_count | jq
{
  "items": [
    {
      "metric": {
        "name": "node_up_count"
      },
      "value": 2,
      "timestamp": "2023-05-31T20:21:58+00:00",
      "windowSeconds": 0,
      "describedObject": {
        "kind": "Service",
        "namespace": "flux-operator",
        "name": "custom-metrics-apiserver",
        "apiVersion": "v1beta2"
      }
    }
  ],
  "apiVersion": "custom.metrics.k8s.io/v1beta2",
  "kind": "MetricValueList",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta2"
  }
}
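
As a quick way to check what the HPA would see in a response like this, here is a small Python sketch that pulls the described object, metric name, and value out of a MetricValueList. The payload is abbreviated from the output above, and the helper name metric_values is hypothetical. It assumes the value is a plain number, as it is here; Kubernetes quantities can also arrive as strings such as "500m", which this does not parse.

```python
import json

# Abbreviated copy of the MetricValueList shown above.
payload = """
{
  "items": [
    {
      "metric": {"name": "node_up_count"},
      "value": 2,
      "timestamp": "2023-05-31T20:21:58+00:00",
      "describedObject": {
        "kind": "Service",
        "namespace": "flux-operator",
        "name": "custom-metrics-apiserver"
      }
    }
  ],
  "apiVersion": "custom.metrics.k8s.io/v1beta2",
  "kind": "MetricValueList"
}
"""

def metric_values(metric_list_json):
    """Return (described object name, metric name, value) for each item."""
    doc = json.loads(metric_list_json)
    return [
        (item["describedObject"]["name"], item["metric"]["name"], item["value"])
        for item in doc.get("items", [])
    ]

print(metric_values(payload))
```

The same function could be fed the output of the kubectl get --raw command by piping it to a script that reads standard input.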

Thanks for any advice or pointers you can provide! I'm new at this so apologies if any of these questions are stupid.

@vsoch
Author

vsoch commented May 31, 2023

ah so I think I figured it out. What helped was looking at the autoscaler status to see what was going on:

$ kubectl get hpa -n flux-operator flux-sample-hpa -o json | jq .status.conditions
[
  {
    "lastTransitionTime": "2023-05-31T19:50:20Z",
    "message": "recommended size matches current size",
    "reason": "ReadyForNewScale",
    "status": "True",
    "type": "AbleToScale"
  },
  {
    "lastTransitionTime": "2023-05-31T19:52:35Z",
    "message": "the HPA was able to successfully calculate a replica count from Service metric node_up_count",
    "reason": "ValidMetricFound",
    "status": "True",
    "type": "ScalingActive"
  },
  {
    "lastTransitionTime": "2023-05-31T20:30:54Z",
    "message": "the desired count is within the acceptable range",
    "reason": "DesiredWithinRange",
    "status": "False",
    "type": "ScalingLimited"
  }
]
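
To read these conditions at a glance, a short Python sketch like the following can reduce them to one entry per condition type. The helper name summarize_conditions is hypothetical, and the input is trimmed from the status output above.

```python
def summarize_conditions(conditions):
    """Map each condition type to (is_true, reason)."""
    return {c["type"]: (c["status"] == "True", c["reason"]) for c in conditions}

# Trimmed from the .status.conditions output above.
conditions = [
    {"type": "AbleToScale", "status": "True", "reason": "ReadyForNewScale"},
    {"type": "ScalingActive", "status": "True", "reason": "ValidMetricFound"},
    {"type": "ScalingLimited", "status": "False", "reason": "DesiredWithinRange"},
]

for name, (is_true, reason) in summarize_conditions(conditions).items():
    print(f"{name}: {is_true} ({reason})")
```

Here ScalingActive being true with reason ValidMetricFound indicates the HPA could read the custom metric, and ScalingLimited being false indicates the desired count was not being capped by minReplicas or maxReplicas.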

And then I tweaked that config above:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: flux-sample-hpa
  namespace: flux-operator
spec:
  scaleTargetRef:
    apiVersion: flux-framework.org/v1alpha1
    kind: MiniCluster
    name: flux-sample
  minReplicas: 2
  maxReplicas: 4
  metrics:

  # https://docs.openshift.com/container-platform/4.11/rest_api/autoscale_apis/horizontalpodautoscaler-autoscaling-v2.html#spec-metrics-object
  - type: Object
    object:
      # This is the service we created
      target:
        value: 4
        type: "Value"

      # Where to get the data from
      describedObject:
        kind: Service
        name: custom-metrics-apiserver

      # This should scale until we hit 4
      metric:
        name: node_up_count

  # Behavior determines how to do the scaling
  # Without this, nothing would happen
  # https://www.kloia.com/blog/advanced-hpa-in-kubernetes
  behavior:

    # selectPolicy chooses among the policies below: "Max", "Min", or "Disabled"
    scaleUp:
      selectPolicy: Max
      stabilizationWindowSeconds: 120
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60

    scaleDown:
      selectPolicy: Max
      stabilizationWindowSeconds: 120
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60
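
For comparison, here is a rough Python sketch of how the scaleUp policies bound a single step, based on my understanding of the HPA controller: a Pods policy allows current + value replicas, a Percent policy allows ceil(current * (1 + value/100)), and selectPolicy picks the largest or smallest of those limits. The function name scale_up_limit is hypothetical, and the sketch ignores replicas already added earlier in the same period as well as the stabilization window.

```python
import math

def scale_up_limit(current_replicas, policies, select_policy="Max"):
    """Upper bound on replicas after one scale-up step (simplified)."""
    if select_policy == "Disabled":
        return current_replicas
    proposals = []
    for policy in policies:
        if policy["type"] == "Pods":
            proposals.append(current_replicas + policy["value"])
        elif policy["type"] == "Percent":
            # Rounded up so that a small percentage can still add one pod.
            proposals.append(
                math.ceil(current_replicas * (1 + policy["value"] / 100)))
    return max(proposals) if select_policy == "Max" else min(proposals)

first_config = [{"type": "Percent", "value": 10}, {"type": "Pods", "value": 4}]
tweaked_config = [{"type": "Percent", "value": 100}]

print(scale_up_limit(2, first_config))    # 6
print(scale_up_limit(2, tweaked_config))  # 4
```

Under these assumptions, both sets of policies allow a step from 2 to the maxReplicas of 4, which suggests behavior limits how fast the HPA may move toward its desired count and does not by itself decide what that count is.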

And then finally I saw a response (note I started at 2 pods in the cluster!)

$ kubectl get -n flux-operator pods
NAME                  READY   STATUS    RESTARTS   AGE
flux-sample-0-kg8mq   1/1     Running   0          42m
flux-sample-1-dntwk   1/1     Running   0          42m
flux-sample-2-p8vhn   1/1     Running   0          2m3s
flux-sample-3-pvg6l   1/1     Running   0          2m3s

Feel free to close, but thank you in advance if there is any cool discussion! Sorry for the noise otherwise.

@madestro
Member

madestro commented Jun 7, 2023

glad to help ;-)

@madestro madestro closed this as completed Jun 7, 2023