
[prometheus-kube-stack] "Error on ingesting out-of-order result from rule evaluation" #1177

Closed
antoineozenne opened this issue Jul 19, 2021 · 20 comments · Fixed by #2076
Labels
bug Something isn't working

Comments

@antoineozenne

Describe the bug
There are some warning-level errors in the logs:

level=warn ts=2021-07-19T12:40:08.145Z caller=manager.go:651 component="rule manager" group=kube-apiserver.rules msg="Error on ingesting out-of-order result from rule evaluation" numDropped=8
level=warn ts=2021-07-19T12:40:48.951Z caller=manager.go:651 component="rule manager" group=kube-apiserver-burnrate.rules msg="Error on ingesting out-of-order result from rule evaluation" numDropped=1

This seems to trigger the PrometheusMissingRuleEvaluations alert.

Version of Helm and Kubernetes:

Helm Version:

$ helm version
version.BuildInfo{Version:"v3.5.3", GitCommit:"041ce5a2c17a58be0fcd5f5e16fb3e7e95fea622", GitTreeState:"dirty", GoVersion:"go1.15.8"}

Kubernetes Version:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.10", GitCommit:"98d5dc5d36d34a7ee13368a7893dcb400ec4e566", GitTreeState:"clean", BuildDate:"2021-04-15T03:28:42Z", GoVersion:"go1.15.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.10", GitCommit:"98d5dc5d36d34a7ee13368a7893dcb400ec4e566", GitTreeState:"clean", BuildDate:"2021-04-15T03:20:25Z", GoVersion:"go1.15.10", Compiler:"gc", Platform:"linux/amd64"}

Which chart: kube-prometheus-stack

Which version of the chart: 16.14.1

What happened:

The record cluster_quantile:apiserver_request_duration_seconds:histogram_quantile, defined in both kube-apiserver.rules.yaml and kube-apiserver-histogram.rules.yaml, contains a lot of NaN values (because of some 0 values in the instant vector passed to histogram_quantile). This triggers the PrometheusMissingRuleEvaluations alert.
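
For reference, the expression behind this record is essentially of this shape (a simplified sketch, not the exact rule shipped by the chart; the label selectors here are assumed):

histogram_quantile(
  0.99,
  sum by (le) (rate(apiserver_request_duration_seconds_bucket{verb=~"LIST|GET"}[5m]))
)

When a verb/scope combination receives no requests over the range, every bucket rate is 0 and histogram_quantile returns NaN, which is then stored as the value of the recorded sample.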

What you expected to happen:

The rule should be defined in a way that does not trigger the alert.

Anything else:

I noticed that commit f501c4ed62c9e77cf96b46e83202f6ea17a13b97 redefines this record in a second rule group (kube-apiserver-histogram.rules, in addition to kube-apiserver.rules).
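
One way to check whether the record is indeed defined by more than one rule group (assuming the chart deploys its rules as PrometheusRule resources, which is the default) would be something like:

kubectl get prometheusrules --all-namespaces -o yaml \
  | grep -n 'record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile'

If two groups record series with the same name and identical labels, they both write into the same series, which can lead to the out-of-order samples shown in the logs above.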

antoineozenne added the bug label on Jul 19, 2021
@stale

stale bot commented Aug 18, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@antoineozenne
Author

/no-stale

stale bot removed the lifecycle/stale label on Aug 18, 2021
@antoineozenne
Author

The issue still exists in chart version 18.0.0.

@stale

stale bot commented Sep 23, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@antoineozenne
Author

/no-stale

stale bot removed the lifecycle/stale label on Sep 23, 2021
@stale

stale bot commented Oct 23, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@antoineozenne
Author

/no-stale

stale bot removed the lifecycle/stale label on Oct 24, 2021
@antoineozenne
Author

I can provide more information if needed.

@stale

stale bot commented Dec 4, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

stale bot added the lifecycle/stale label on Dec 4, 2021
@antoineozenne
Author

/no-stale

stale bot removed the lifecycle/stale label on Dec 6, 2021
@stale

stale bot commented Jan 5, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

stale bot added the lifecycle/stale label on Jan 5, 2022
@antoineozenne
Author

/no-stale

stale bot removed the lifecycle/stale label on Jan 5, 2022
@bryanasdev000

bryanasdev000 commented Jan 10, 2022

Can confirm, same here.

Also related to #1283, with a possible workaround.

EDIT: In my case, this is also related to kubernetes-monitoring/kubernetes-mixin#392 and https://docs.microfocus.com/itom/HCMX:2021.05/PrometheusManyToManyMatching.

@antoineozenne
Author

I think setting .Values.defaultRules.rules.kubeApiserver to false isn't really a workaround as it disables the monitoring. :)

@bryanasdev000

bryanasdev000 commented Jan 11, 2022

I think setting .Values.defaultRules.rules.kubeApiserver to false isn't really a workaround as it disables the monitoring. :)

I forgot the quotes around "workaround" :P

In my specific case, resolving the many-to-many matching also resolved the out-of-order results after a pod restart.

In my case it all comes back to an old/dirty installation of Prometheus. In your context, does a fresh cluster have the same problem?

I am running K8S 1.19-1.21 with kube-prometheus-stack-23.3.2.

@antoineozenne
Author

Yes, it is also the case for a new installation.

@stale

stale bot commented Feb 12, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@antoineozenne
Author

/no-stale

stale bot removed the lifecycle/stale label on Feb 14, 2022
@bryanasdev000

bryanasdev000 commented Mar 2, 2022

/no-stale

@antoineozenne take a look at: #1799

Basically, setting .Values.defaultRules.rules.kubeApiserver to false fixes it, as it only disables the duplicated group. You keep the monitoring of the API server and its rules.
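
In values terms, that setting looks roughly like this (a sketch; the key path is the one referenced above and in #1799):

# values.yaml for kube-prometheus-stack
defaultRules:
  rules:
    kubeApiserver: false  # per this thread and #1799: disables the duplicated group, other API server rules remain

Apply it with a normal helm upgrade of the release.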

@antoineozenne
Author

Thank you @bryanasdev000, I will use that.
