Generate wrong alerts when upgrading Prometheus from 2.33.5 to 2.50.0 #13839

Open
zhaojinxin409 opened this issue Mar 26, 2024 · 2 comments

zhaojinxin409 commented Mar 26, 2024

What did you do?

I created an alert rule, but the alerts it generates carry a project label that is not less.

(max_over_time(app:container:kube_pod_container_status_terminated_reason_error{namespace!~".*(test|dev).*",project="lees"}[5m] offset 5m)) > 0

What did you expect to see?

Alerts generated with project=less.

What did you see instead? Under which circumstances?

What I saw

The Prometheus ALERTS metrics:
image

The alert rule query result at the same time:
image

How to reproduce

It happens occasionally; I haven't found a way to reproduce it reliably.
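
Because the mismatch is intermittent, one way to catch it might be to periodically fetch the ALERTS series and flag any whose project label differs from the value in the rule's selector. The following is only a minimal sketch: it assumes the `data.result` list shape returned by the Prometheus HTTP API instant query endpoint (`/api/v1/query`), and the function name and sample data are hypothetical, not taken from the affected setup.

```python
def unexpected_label_values(result, label, expected):
    """Return the label sets of series whose `label` is not `expected`.

    `result` is assumed to be the `data.result` list of an instant-query
    response, i.e. a list of {"metric": {...}, "value": [...]} entries.
    """
    return [
        series.get("metric", {})
        for series in result
        if series.get("metric", {}).get(label) != expected
    ]


# Hypothetical sample shaped like the result of querying ALERTS.
sample = [
    {"metric": {"__name__": "ALERTS", "alertstate": "firing", "project": "lees"},
     "value": [0, "1"]},
    {"metric": {"__name__": "ALERTS", "alertstate": "firing", "project": "other"},
     "value": [0, "1"]},
]

# The expected value is a parameter; "lees" mirrors the posted rule expression.
mismatches = unexpected_label_values(sample, "project", "lees")
```

Logging the mismatching label sets together with a timestamp would make it easier to line them up with the rule evaluation times.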

Other information

I run two Prometheus instances, v2.33.5 and v2.50.0. The v2.33.5 instance works as expected, while the v2.50.0 instance does not.

System information

No response

Prometheus version

Prometheus 2.50.0

Prometheus configuration file

alert: 容器实例异常退出-cdn1s9mrjlj070sa9c40  # "Container instance exited abnormally"
expr: (max_over_time(app:container:kube_pod_container_status_terminated_reason_error{namespace!~".*(test|dev).*",project="lees"}[5m] offset 5m)) > 0
labels:
  category: app
  severity: serious
  source: x

Alertmanager version

No response

Alertmanager configuration file

No response

Logs

No related logs
@zhaojinxin409
Author

I'm very sure this is caused by the upgrade. I downgraded Prometheus from v2.50.0 to 2.33.5 and all alerts work normally.

@bboreham
Member

bboreham commented Apr 9, 2024

There is an inconsistency between your screenshot which says less and your posted config which says lees.
I suspect you have not posted the exact and complete config, which makes it impossible to troubleshoot.

Mildly interested to know why you would run 2.50.0 when there was a bugfix release 2.50.1 a few days later.
