Keda-operator in HA mode results in timeouts for the metrics #6729

@sthomson-wyn

Report

When running 3 replicas of keda-operator and 3 replicas of keda-operator-metrics-apiserver, requests to the metrics API server time out intermittently.

e.g.

kubectl get --raw \
  "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/<metric>?labelSelector=scaledobject.keda.sh%2Fname%3D<scaled-object>"

will time out. Also seeing

post-timeout activity - time-elapsed: 87.481519ms, GET "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/<metric>" result: runtime error: invalid memory address or nil pointer dereference

in the logs

This results in HPAs reporting unknown metrics and not scaling properly.

Worth noting that this behaviour is inconsistent, and is resolved by scaling keda-operator to 1 replica.
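
For reference, the workaround above can be sketched as a small helper. This is a minimal sketch under assumptions: the `keda` namespace and the `scale_keda_operator` helper name are hypothetical, and the Deployment is assumed to be named `keda-operator` as in this report. Adjust both to match the installation.

```shell
#!/bin/sh
# KUBECTL can be overridden, e.g. to point at a wrapper or a stub.
KUBECTL="${KUBECTL:-kubectl}"
# Assumption: KEDA is installed in the "keda" namespace.
KEDA_NAMESPACE="${KEDA_NAMESPACE:-keda}"

# scale_keda_operator REPLICAS
# Scales the keda-operator Deployment to the requested replica count.
# The body runs in a subshell so its variables stay local.
scale_keda_operator() (
  replicas="$1"
  "$KUBECTL" scale deployment keda-operator \
    --namespace "$KEDA_NAMESPACE" \
    --replicas="$replicas"
)
```

Calling `scale_keda_operator 1` applies the workaround, and `scale_keda_operator 3` restores the HA setup in which the timeouts were observed.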

Expected Behavior

Requests to succeed

Actual Behavior

Requests time out

Steps to Reproduce the Problem

  1. Scale the KEDA components (keda-operator and keda-operator-metrics-apiserver) to 3 replicas
  2. kubectl get --raw
    "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/<metric>?labelSelector=scaledobject.keda.sh%2Fname%3D<scaled-object>"
  3. Observe the timeout
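
Because the failure is intermittent, a single request may well succeed. The steps above can be sketched as a loop that repeats the request and counts the failed attempts. This is only a sketch: the `probe_metric` helper, its defaults (20 attempts, a 5s per-request timeout), and the `KUBECTL` override are hypothetical names introduced for illustration. It relies on kubectl's global `--request-timeout` flag so that a hanging request is counted as a failure instead of blocking the loop.

```shell
#!/bin/sh
# KUBECTL can be overridden, e.g. to point at a wrapper or a stub.
KUBECTL="${KUBECTL:-kubectl}"

# probe_metric NAMESPACE METRIC SCALED_OBJECT [ATTEMPTS] [TIMEOUT]
# Sends the external-metrics request ATTEMPTS times and prints how many
# attempts failed (timeout or any other non-zero exit from kubectl).
# The body runs in a subshell so its variables stay local.
probe_metric() (
  ns="$1"
  metric="$2"
  scaled_object="$3"
  attempts="${4:-20}"
  timeout="${5:-5s}"

  path="/apis/external.metrics.k8s.io/v1beta1/namespaces/${ns}/${metric}"
  path="${path}?labelSelector=scaledobject.keda.sh%2Fname%3D${scaled_object}"

  failures=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if ! "$KUBECTL" get --raw "$path" --request-timeout="$timeout" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done

  echo "failures=${failures}/${attempts}"
)
```

For example, `probe_metric <namespace> <metric> <scaled-object> 50 5s` sends 50 requests. Comparing the failure count with 3 replicas against the count with keda-operator scaled to 1 should show whether the HA setup is what triggers the timeouts.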

Logs from KEDA operator

n/a

KEDA Version

< 2.15.0

Kubernetes Version

1.30

Platform

Google Cloud

Scaler Details

Prometheus

Anything else?

No response

Labels

bug, stale

Status

Ready To Ship