Fix prometheus operator metrics to work out of the box #4617
Signed-off-by: Lauri Nevala <lauri.nevala@gmail.com>
Signed-off-by: Lauri Nevala <lauri.nevala@gmail.com> Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Use electron.clipboard for all clipboard uses (#4535)
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Fix ERR_UNSAFE_PORT from LensProxy (#4558)
* Fix ERR_UNSAFE_PORT from LensProxy - Use the current list of ports from chromium as it is much easier to just reject using one of those instead of trying to handle the ERR_UNSAFE_PORT load error from a BrowserWindow.on("did-fail-load")
Signed-off-by: Sebastian Malton <sebastian@malton.name>
* Move all port handling into LensProxy
Signed-off-by: Sebastian Malton <sebastian@malton.name>
* don't use so many exceptions
Signed-off-by: Sebastian Malton <sebastian@malton.name>
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Fix not being able to clear set cluster icon (#4555)
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Fix extension engine range not working for some ^ ranges (#4554)
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Fix crash on NetworkPolicy when matchLabels is missing (#4500)
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Replace all uses of promiseExec with promiseExecFile (#4514)
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Less noisy metrics-not-available error logging (#4602)
Signed-off-by: Jari Kolehmainen <jari.kolehmainen@gmail.com>
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Fix close button overflow in Preferences (#4611)
* Adding basic colors to tailwind theme
Signed-off-by: Alex Andreev <alex.andreev.email@gmail.com>
* Using tailwind inline to style close button
Signed-off-by: Alex Andreev <alex.andreev.email@gmail.com>
* Make Select look similar to inputs
Signed-off-by: Alex Andreev <alex.andreev.email@gmail.com>
* Moving styles into separate module
Signed-off-by: Alex Andreev <alex.andreev.email@gmail.com>
* Convert tailwind commands to css
Signed-off-by: Alex Andreev <alex.andreev.email@gmail.com>
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Fix prometheus operator metrics work out of the box (#4617)
Signed-off-by: Lauri Nevala <lauri.nevala@gmail.com>
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Fix CRD.getPreferedVersion() to work based on apiVersion (#4553)
* Fix CRD.getPreferedVersion() to work based on apiVersion
Signed-off-by: Sebastian Malton <sebastian@malton.name>
* Add tests
Signed-off-by: Sebastian Malton <sebastian@malton.name>
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Fix crash for KubeObjectStore.loadAll() (#4675)
Signed-off-by: Sebastian Malton <sebastian@malton.name>
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
* Convert CloseButton styles out from css modules (#4723)
* Convert CloseButton styles out from css modules
Signed-off-by: Alex Andreev <alex.andreev.email@gmail.com>
* Fix close button styling
Signed-off-by: Alex Andreev <alex.andreev.email@gmail.com>
* release v5.3.4
Signed-off-by: Jim Ehrismann <jehrismann@mirantis.com>
Co-authored-by: Sebastian Malton <sebastian@malton.name>
Co-authored-by: Jari Kolehmainen <jari.kolehmainen@gmail.com>
Co-authored-by: Alex Andreev <alex.andreev.email@gmail.com>
Co-authored-by: Lauri Nevala <lauri.nevala@gmail.com>
@@ -50,7 +50,7 @@ export class PrometheusOperator extends PrometheusProvider {
       case "memoryAllocatableCapacity":
         return `sum(kube_node_status_allocatable{node=~"${opts.nodes}", resource="memory"})`;
       case "cpuUsage":
-        return `sum(rate(node_cpu_seconds_total{node=~"${opts.nodes}", mode=~"user|system"}[${this.rateAccuracy}]))`;
+        return `sum(rate(node_cpu_seconds_total{mode=~"user|system"}[${this.rateAccuracy}])* on (pod,namespace) group_left(node) kube_pod_info{node=~"${opts.nodes}"})`;
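As a rough illustration of the two `cpuUsage` return statements in the diff above, the sketch below builds both forms of the query as plain strings. The function names and the `CpuQueryOpts` type are hypothetical and not part of Lens; only the query text is taken from the diff.

```typescript
// Hypothetical options type; in the diff these values come from
// `opts.nodes` and `this.rateAccuracy`.
interface CpuQueryOpts {
  nodes: string;        // regex of node names, e.g. "worker-1|worker-2"
  rateAccuracy: string; // PromQL range, e.g. "1m"
}

// Form removed by this PR: filters node_cpu_seconds_total directly on a
// `node` label, so it only matches if the series already carry that label.
function cpuUsageDirect({ nodes, rateAccuracy }: CpuQueryOpts): string {
  return `sum(rate(node_cpu_seconds_total{node=~"${nodes}", mode=~"user|system"}[${rateAccuracy}]))`;
}

// Form added by this PR: matches each series to kube_pod_info on
// (pod, namespace) and copies the `node` label over via group_left.
function cpuUsageWithJoin({ nodes, rateAccuracy }: CpuQueryOpts): string {
  return `sum(rate(node_cpu_seconds_total{mode=~"user|system"}[${rateAccuracy}])* on (pod,namespace) group_left(node) kube_pod_info{node=~"${nodes}"})`;
}

const example: CpuQueryOpts = { nodes: "worker-1|worker-2", rateAccuracy: "1m" };
console.log(cpuUsageDirect(example));
console.log(cpuUsageWithJoin(example));
```

Seeing the two strings side by side shows where the node filter moves: from the selector on `node_cpu_seconds_total` to the selector on `kube_pod_info`.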
The queries are now wrong again 😉
It should be reversed: join on `node` and take the specific labels, `on (node) group_left(pod, namespace)`.
Or does the query work for you?
It does not work for us. We received many reports that the queries you provided were failing, which is why we reverted them.
I agree that they need to be reintroduced to fix the compatibility, but they need to be applied correctly to work.
Ahh, so `node_cpu_seconds_total` should contain `pod` and `namespace`. Maybe a comment about that would be useful. The node exporter in my setup doesn't have those labels. That also belongs in the documentation.
Ah thanks for the info, would you mind making an issue to add that to the documentation?
@marcbachmann how did you install prometheus operator?
I have an external node exporter running natively on the hosts, so it only has the `node` label. I'm doing this because I still have some other services, like databases, outside Kubernetes.
It might be an edge case, and I could migrate that into the cluster.
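To make the label requirement discussed in this thread concrete, here is a minimal in-memory sketch of how a `* on (pod, namespace) group_left(node)` join pairs series. It is a simplified model, not Prometheus' implementation: the `Sample` type, the `joinGroupLeft` helper, and all label values are made up for illustration, and duplicate matches on the right-hand side (which PromQL rejects) are not handled.

```typescript
type Labels = Record<string, string>;
interface Sample { labels: Labels; value: number; }

// Simplified model of `left * on (onKeys) group_left(extra) right`.
// A label that is absent counts as the empty string when matching.
function joinGroupLeft(left: Sample[], right: Sample[], onKeys: string[], extra: string[]): Sample[] {
  const keyOf = (l: Labels): string => onKeys.map((k) => l[k] ?? "").join("\u0000");
  const rightByKey = new Map<string, Sample>();
  for (const r of right) rightByKey.set(keyOf(r.labels), r);

  const out: Sample[] = [];
  for (const l of left) {
    const r = rightByKey.get(keyOf(l.labels));
    if (!r) continue; // no partner on the right: the series is dropped
    const labels: Labels = { ...l.labels };
    for (const k of extra) {
      const v = r.labels[k];
      if (v !== undefined) labels[k] = v; // copy the requested label over
    }
    out.push({ labels, value: l.value * r.value });
  }
  return out;
}

// Assumed label sets: an in-cluster node exporter pod vs. one running on the host.
const podInfo: Sample[] = [
  { labels: { pod: "node-exporter-abc", namespace: "monitoring", node: "worker-1" }, value: 1 },
];
const inCluster: Sample[] = [
  { labels: { pod: "node-exporter-abc", namespace: "monitoring", mode: "user" }, value: 0.5 },
];
const external: Sample[] = [
  { labels: { node: "worker-1", mode: "user" }, value: 0.5 },
];

const joinedInCluster = joinGroupLeft(inCluster, podInfo, ["pod", "namespace"], ["node"]);
const joinedExternal = joinGroupLeft(external, podInfo, ["pod", "namespace"], ["node"]);
console.log(joinedInCluster.length, joinedExternal.length);
```

Under these assumptions the in-cluster series gains `node="worker-1"`, while the external series has no `pod` or `namespace` to match on and drops out of the result, which is consistent with the setup described above returning no data.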
This PR brings back the `group_left` joins removed by #3653. The reason the joins were not working is that the `prometheus-community/kube-prometheus-stack` helm chart has an invalid `kube-state-metrics` serviceMonitor custom resource, so the `kube_pod_info` metric is not available (prometheus-community/helm-charts#1631).
Metrics in Lens IDE should work without any additional relabelings, which is why these `group_left` joins are being brought back.
Fixes: #4435