Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Proposal: Allow External Metrics to return metrics for a different namespace #324

Closed
carsonoid opened this issue Oct 5, 2020 · 20 comments

Comments

@carsonoid
Copy link
Contributor

Issue

I have a single nsq installation in a K8s namespace named nsq. But I have many deployments in different namespaces that need to autoscale using a topic and/or channel depth from the central installation.

The external metrics proposal requires that a namespace is in the path. And the HPA will always put it's own namespace into the external metrics query.

This makes it impossible to use the adapter to scale a pod based on metrics from a different namespace than the HPA.

Proposed Solution

Allow a special target label (metrics_target_namespace) to override the namespace that will be used in the Prometheus query.

Example HPA:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example
  namespace: example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: nsq_topic_depth
        selector:
          labelSelector:
            topic: my-topic
            metric_target_namespace: nsq
      target:
        type: Value
        AverageValue: 50

So with the example above, even though the HPA is in the example namespace. It will scale based on the topic in the nsq namespace. This simple override allows much greater flexibility for metrics while still fitting withing the Kubernetes API spec.

I have a fork of the adapter with this code running in production and would be happy to upstream the changes if the feature proposal is accepted. I have only implemented the change in external metrics but it could also be implemented for all the other metric providers supported by the adapter.

@h4ckroot
Copy link

I have a very similar use case, and I wonder if there is any workaround for this !.

@carsonoid
Copy link
Contributor Author

@h4ckroot There is! I have already written code to do this. But I want to make sure that the proposed design is acceptable to @DirectXMan12

@jackwhelpton
Copy link

I'd also be interested in something exactly like this, and am curious if there's a workaround until your code lands. I've tried several things over the last day or so: I wondered if I could use a cluster-scoped CRD as the target of a custom metric, but so far nothing seems to have worked out.

@s-urbaniak
Copy link
Contributor

@carsonoid (note: I took over maintainership of prometheus-adapter): agreed I think we need to support the use cases of non-namespace bound external metrics better in prometheus-adapter. Your solution (I think) though relies on underlying metrics to have a namespace present as a label.

However, if we want to support non-namespaced metrics isn't it legit to demand from the underlying label that it doesn't have a namespace label too?

Taking APIResourceList as an inspiration we could have:

    externalRules:
    - seriesQuery: 'foo'
      resources:
        namespaced: false

And we would leave the adapter to decide the implementation details. More formally it would:
a) accept external metrics requests for all namespaces
b) drop enforcing the namespace label in [1] if namespaced: false, effectively returning the same value across all namespaces.

[1] https://github.com/DirectXMan12/k8s-prometheus-adapter/blob/d091fff18b4782d43d3e015f9cba453a560bdbe2/pkg/naming/metrics_query.go#L153-L164)

cc @dgrisonnet

@dgrisonnet
Copy link
Member

I agree with @s-urbaniak here, external metrics may not have a namespace label present so we would still end up with the same issue. The reason why metrics from different namespaces are not scrapable right now lies down to the namespace label being enforced by the adapter from the HPA query. Introducing a way to drop it, would allow both supporting external metrics from different namespaces and external metrics not having a namespace label.
If somehow, you still want to select the namespace label value in your HPA query, it would be possible by dropping the namespace enforcement in the adapter and set the label value in the HPA label selector.

@carsonoid
Copy link
Contributor Author

@dgrisonnet that's a good idea. My only concern is that it could be a bit confusing. AFAIK, the only place we can pass information to the adapter is in the labels of the metric query, which would also be where we would set the namespace label. So we would end up moving my proposal to something like this right?

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example
  namespace: example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: nsq_topic_depth
        selector:
          labelSelector:
            adapter-drop-namespace: "true"
            topic: my-topic
            namespace: nsq
      target:
        type: Value
        AverageValue: 50

That would absolutely work. But might be a bit confusing to say "ignore the namespace" and then also have a "namespace" key. This is a minor gripe though, and could be solved with a better label key or value.

Label key/value ideas:

  • ignore-hpa-namespace: "true"
  • adapter-ignore-hpa-namespace: "true"
  • use-hpa-namespace: "false"
  • use-metadata-namespace: "false"

Those are just off the top of my head, I would be interested in other proposals.

@dgrisonnet
Copy link
Member

Not exactly, I meant to drop the enforcement by setting namespaced: false in the adapter config as @s-urbaniak suggested. From the adapter perspective, that would mean that the external metric is accepted from all namespaces.
Then, if the metrics needed by your HPA need to be selected with a specific namespace label value, nothing prevents you from setting it in the HPA's label selector.

You would end up with the following adapter and HPA configurations:

    externalRules:
    - seriesQuery: 'nsq_topic_depth'
      resources:
        namespaced: false
  - type: External
    external:
      metric:
        name: nsq_topic_depth
        selector:
          labelSelector:
            topic: my-topic
            namespace: nsq

@carsonoid
Copy link
Contributor Author

carsonoid commented Nov 3, 2020 via email

@s-urbaniak
Copy link
Contributor

@carsonoid @dgrisonnet thank you for the great discussion! Yes @carsonoid if you want to contribute, that would be great! Side note: as we are in the process of moving this repository to kubernetes-sigs, do you mind to sign the CNCF CLA as described in kubernetes/org#2182 (comment) ? (If you haven't done already)

@s-urbaniak
Copy link
Contributor

Just to sum up, enforcing the namespace for this scenario can be achieved imho in many ways:

  • you can use a recording rule which enforces the namespace
  • you can enforce the namespace label even in the series query itself - seriesQuery: 'nsq_topic_depth{namespace="nsq"}'
  • using a HPA label selector as mentioned above

@carsonoid
Copy link
Contributor Author

carsonoid commented Nov 4, 2020

Oh! Yeah the recording rule is an interesting idea. Those should probably all be in a "dealing with namespaces" section in the docs.

@fxposter
Copy link

Hello, I have a similar problem - we have our external metrics work across many namespaces, based only on labels. Is there currently any workarounds that allow to achieve this?

@leosunmo
Copy link

I ran in to this exact issue as well, caused a lot of headscratching as to why all of my HPAs suddenly stopped working when I upgraded from 0.5.0 to 0.6.0. Since #197 was listed as a BUGFIX in the changelog I didn't pay attention, and didn't realise I was relying on a bug for all of my production scaling >.< .

This is a critical feature for us and I will have to stay on 0.5.0 until we figure something out in this thread. Happy to help in any way I can to move this along.

@dgrisonnet
Copy link
Member

Hello @fxposter, unfortunately there is no workaround for this. The only way to allow external metrics to work across different namespaces would be to implement the feature mentioned above.

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 14, 2021
@leosunmo
Copy link

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 14, 2021
@leviathanbadger
Copy link

It seems to me that "external metrics" should be namespaced: false by default. The metrics are coming from outside of the kubernetes cluster; therefore they are not in a kubernetes namespace, by definition. I understand the external metrics spec requires a namespace; but it also indicates the adapter can choose to return the same metrics in multiple namespaces: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/external-metrics-api.md#namespaces

In fact, specifying resources at all for external metrics is very confusing. It's a concept that only really fits with custom metrics, not external metrics (so long as they're not namespaced). Maybe namespaced: true could be inferred from the fact that the resources key exists at all?

I have a use case that requires external metrics. I can get it working for a singular namespace by hardcoding the namespace in the prometheus scraper rule as a label; but this is useless data added to the series that is 100% wrong, and actively misleading; and doing the same in a second namespace is impossible without duplicating the scraper and adapter rules.

@s-urbaniak
Copy link
Contributor

I think @leviathanbadger you have a good point indeed 🤔 having said that my intuition is that we should leave the choice to the user whether external metrics are supposed to be namespaced or not. Hence, my suggestion would be, instead of having this set implicitely, we could have the namespaced setting be an explicit setting declaring whether the given resource metric is supposed to be namespaced or not.

@ryan-dyer-sp
Copy link

This used to be how this worked in 0.5.0; though perhaps that was a "bug". I'm only encountering this problem now that I'm trying to upgrade to the latest version.

@dgrisonnet
Copy link
Member

Fixed by #380

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests