Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix JaegerQueryReqsFailing alert rule missing 'operation' in query #4797

Merged
merged 1 commit into from Oct 3, 2023

Conversation

james-ryans
Copy link
Contributor

Which problem is this PR solving?

Description of the changes

  • Add group by operation to JaegerQueryReqsFailing rule
  • Here's the difference between before and after the changes. With operation group by, the alert will be distinguished by operation
image - Tested manually because PR at #1745 says creating the automated tests are not easy.

How was this change tested?

  • Manually tested in local
  • Ran curl -s http://localhost:9090/api/v1/alerts | jq and returns
{
  "status": "success",
  "data": {
    "alerts": [
      {
        ...
        "annotations": {
          "message": "prometheus localhost:14269 is seeing 100.00% query errors on get_trace.\n"
        },
        ...
      }
    ]
  }
}

Before it was

{
  "status": "success",
  "data": {
    "alerts": [
      {
        ...
        "annotations": {
          "message": "prometheus localhost:14269 is seeing 100.00% query errors on .\n"
        },
        ...
      }
    ]
  }
}
  • Ran pint lint prometheus_alerts.yml
james_ryans@james-ryans ~/g/s/g/j/jaeger (spm_alert_operation)> pint lint monitoring/jaeger-mixin/prometheus_alerts.yml 
level=info msg="Problems found" Information=8
level=info msg="1 problem(s) not visible because of --min-severity=warning flag"

Before changes it was

pint lint monitoring/jaeger-mixin/prometheus_alerts.yml
monitoring/jaeger-mixin/prometheus_alerts.yml:62-64 Bug: template is using "operation" label but the query removes it (alerts/template)
 62 |       "message": |
 63 |         {{ $labels.job }} {{ $labels.instance }} is seeing {{ printf "%.2f" $value }}% query errors on {{ $labels.operation }}.
 64 |     "expr": "100 * sum(rate(jaeger_query_requests_total{result=\"err\"}[1m])) by (instance, job, namespace) / sum(rate(jaeger_query_requests_total[1m])) by (instance, job, namespace)> 1"

level=info msg="Problems found" Bug=1 Information=8
level=info msg="1 problem(s) not visible because of --min-severity=warning flag"
level=fatal msg="Execution completed with error(s)" error="found 1 problem(s) with severity Bug or higher"

Checklist

Signed-off-by: James Ryans <james.ryans2012@gmail.com>
@james-ryans james-ryans requested a review from a team as a code owner October 2, 2023 15:12
@yurishkuro yurishkuro requested review from albertteoh and removed request for joe-elliott October 2, 2023 15:23
Copy link
Contributor

@albertteoh albertteoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

By the way, I don't think this relates to SPM.

@codecov
Copy link

codecov bot commented Oct 3, 2023

Codecov Report

All modified lines are covered by tests ✅

see 1 file with indirect coverage changes

📢 Thoughts on this report? Let us know!.

@james-ryans james-ryans changed the title Fix SPM JaegerQueryReqsFailing alert missing 'operation' in PromQL Fix JaegerQueryReqsFailing promu rule missing 'operation' in query Oct 3, 2023
@james-ryans
Copy link
Contributor Author

james-ryans commented Oct 3, 2023

By the way, I don't think this relates to SPM.

Oops, I've changed the title

@yurishkuro yurishkuro changed the title Fix JaegerQueryReqsFailing promu rule missing 'operation' in query Fix JaegerQueryReqsFailing alert rule missing 'operation' in query Oct 3, 2023
@yurishkuro yurishkuro merged commit b83bda6 into jaegertracing:main Oct 3, 2023
28 of 31 checks passed
@james-ryans james-ryans deleted the spm_alert_operation branch October 3, 2023 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: JaegerQueryReqsFailing alert specifies 'operation' in message and it isn't aggregated by the alert expr
3 participants