
Use Kubernetes Recommended Labels #38

Open
rlex opened this issue Oct 21, 2021 · 8 comments

Comments

@rlex

rlex commented Oct 21, 2021

Hello, and thanks for great tool!
I think it needs two things:

  1. Delete generated rules after a Pyrra object is removed. Right now they are left as-is. What about adding labels like

     managed-by=pyrra
     pyrra-slo-name=prometheus-http-error

     and some controller to ensure the rules are in place / removed?

  2. Adjust the alert name by adding a prefix / suffix to the alertname. At the moment all alerts are named "ErrorBudgetBurn".
@metalmatze
Member

Hey, thanks for trying it out and opening this issue. I'm sure there are plenty more things (I have such a long list in my head too), so feel free to keep opening issues.

  1. I created "Delete generated PrometheusRule and rule files" (#39) for the deletion issue.

Could you give a bit more context about the labels? Is that just adding the common Kubernetes labels to the generated PrometheusRule CRs?

  2. We can certainly make the alertnames configurable. Keep in mind that the alertname is just another label on the alert itself, so alerts are usually identified by their full label set too, like:

     {alertname="ErrorBudgetBurn", job="kube-dns", long="30m", short="3m", slo="coredns-response-errors"}
     {alertname="ErrorBudgetBurn", job="kube-dns", long="3h", short="15m", slo="coredns-response-errors"}

Therefore I'm not super concerned about the alertname itself, but we can still happily make it configurable.

I'll rename this issue to be about the second part of the original comment, as I opened a separate issue for 1.

@rlex
Author

rlex commented Oct 22, 2021

Could you give a bit more context about the labels? Is that just adding the common Kubernetes labels to the generated PrometheusRule CRs?

Yes, precisely
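
For reference, the common labels in question are the app.kubernetes.io/* set recommended by the Kubernetes documentation. A sketch of what a generated PrometheusRule could carry (the pyrra.dev/slo label name is an assumption for illustration, not an existing Pyrra convention):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: prometheus-http-error
  labels:
    # Kubernetes recommended labels
    app.kubernetes.io/name: prometheus-http-error
    app.kubernetes.io/managed-by: pyrra
    app.kubernetes.io/part-of: pyrra
    # hypothetical Pyrra-specific label, usable for lookup / garbage collection
    pyrra.dev/slo: prometheus-http-error
spec:
  groups: []
```

With a managed-by label like this, a controller could list and clean up the rules it owns after the SLO object is deleted.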

@metalmatze metalmatze changed the title Improve generated prometheus-operator CRD objects? Use Kubernetes Recommended Labels Nov 12, 2021
@hsolberg
Contributor

Hi! We have a similar use case. When more than one team uses the same Pyrra and Prometheus Operator, it's difficult to tell the alert rules apart, and Pyrra can get a little cluttered. Some customization to include existing Kubernetes labels, as well as to add new labels, would be great!

type Objective struct {
	Labels      map[string]string `json:"labels"`
	Description string            `json:"description"`
	Target      float64           `json:"target"`
	Window      int64             `json:"window"`
	Config      string            `json:"config"`
	Indicator   *Indicator        `json:"indicator,omitempty"`
}

// NewObjective instantiates a new Objective object
// This constructor will assign default values to properties that have it defined,
// and makes sure properties required by API are set, but the set of arguments
// will change when the set of required properties is changed
func NewObjective(labels map[string]string, description string, target float64, window int64, config string) *Objective {
	this := Objective{}
	this.Labels = labels
	this.Description = description
	this.Target = target
	this.Window = window
	this.Config = config
	return &this
}

From rules.go#L103-L104

        ruleLabels := map[string]string{}
        ruleLabels["slo"] = sloName

Adding a new entry here for custom labels, for instance:

        ruleLabels["team"] = "your-team-name"

The recording rules also seem to be missing namespace: namespace is added automatically to the Pyrra config, but not to the recording rules, and is therefore not visible in the alert rules.
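
A minimal Go sketch of the label merging discussed above (the mergeRuleLabels helper and its signature are assumptions for illustration, not Pyrra's actual code):

```go
package main

import "fmt"

// mergeRuleLabels merges custom labels from an SLO object into the
// generated rule labels. Reserved labels such as "slo" and "namespace"
// are applied last, so user-supplied labels cannot overwrite them.
// Hypothetical helper for illustration only.
func mergeRuleLabels(sloName, namespace string, custom map[string]string) map[string]string {
	ruleLabels := map[string]string{}
	for k, v := range custom {
		ruleLabels[k] = v
	}
	// Reserved labels always win over custom ones.
	ruleLabels["slo"] = sloName
	ruleLabels["namespace"] = namespace
	return ruleLabels
}

func main() {
	labels := mergeRuleLabels("prometheus-http-error", "monitoring", map[string]string{
		"team": "your-team-name",
		"slo":  "attempted-override", // ignored: reserved label wins
	})
	fmt.Println(labels["slo"], labels["namespace"], labels["team"])
}
```

Applying custom labels first and reserved labels last keeps the generated rules consistent even if a user accidentally sets a label Pyrra relies on.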

// MultiBurnrateAlert struct for MultiBurnrateAlert
type MultiBurnrateAlert struct {
	Severity string   `json:"severity"`
	For      int64    `json:"for"`
	Factor   float64  `json:"factor"`
	Short    Burnrate `json:"short"`
	Long     Burnrate `json:"long"`
	State    string   `json:"state"`
}

For Pyrra (frontend), maybe add some (optional) top-level grouping as well. That way you could add more SLOs for the same team and group them under a team page.

@metalmatze
Member

That sounds very reasonable @hsolberg 👍
Can you give an even more concrete example? I'd use that as the baseline for implementing future unit tests to make sure things are properly propagated through all components.

@hsolberg
Contributor

hsolberg commented Mar 3, 2022

Thanks for the fast reply! Maybe I should create a separate issue for the frontend part, since it's a separate thing? I don't know whether you prefer "user story" tickets or tickets specific to the part that needs to change (the frontend, or the logic for generating alerts, for instance).

Frontend
Regarding concrete examples. After conferring with a colleague (developer) to cover some scenarios, something like this for the presentation in Pyrra (frontend) perhaps?

|__<group-1>
|  |__<app-1>
|  |  |__ <metric-1>
|  |  |__ <metric-2>
|  |  .
|  |  .
|  |__<app-2>
|  |  |__<metric-1>
|  |  .
|
|__<group-2>
|  |__<app-1>
|  |  |__<metric-1>

A concrete instance would look something like this:

|__team-api
|  |__api-app-auth
|  |  |__auth-error-rate
|  |  |__latency
|  |  |__system-error-rate
|  |__api-app-integration
|  |  |__system-error-rate
|
|__team-frontend
|  |__frontend-app
|  |  |__frontend-error-rate
|  |  |__frontend-latency

Alerts
As for the alerts, I already see the label slo being added based on what we put in the name field in metadata (see the example below). So that part should be easy to differentiate once the new Prometheus filter lands in the next version (hopefully), along with the possibility of using pre-filtered links from Pyrra in the near future. That only leaves the need to include namespace in the alert rule, as well as the possibility of adding more (custom) labels.

apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  name: your-slo-name-here
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  target: '99.0'
  window: 7d
  indicator:
    ratio:
      errors:
        metric: frontend_request_counter_total{status=~"5..",app="app-name"}
      total:
        metric: frontend_request_counter_total{app="app-name"}

Looking at the alert when firing with the rules above (this is the same as the last entry in the table Pyrra shows):

ALERTS{alertname="ErrorBudgetBurn", alertstate="firing", long="1d", severity="warning", short="1h30m", slo="your-slo-name-here"}

When comparing this to what Pyrra shows in the table, maybe the table should be consistent with the labels added to the alert? Also, namespace is missing; it's only visible as a label under the objective's name and at the top of the objective page you navigate to.


Since the alerts are generated, it's a bit difficult to name them without being generic, as they are now. One solution could be to add the exhaustion to the alertname as well as adding the labels. That way you get a quick overview without having to expand all the alerts in Prometheus, and you can use the labels to filter further. Adding namespace would also help with filtering, so maybe something like this?

ALERTS{alertname="ErrorBudgetBurn1w", alertstate="firing", long="1d", severity="warning", short="1h30m", slo="your-slo-name-here", namespace="your-namespace", exhaustion="1w", threshold="0.010", team="your-team"}

The last label would be a custom one, just to be able to identify the target responder for the alert. I didn't want to change too much of how it works now, as I'm not 100% sure what it ideally would look like. So I'm trying to stick to the bare minimum in the examples and see if anyone else has other views and suggestions. I'll need to think it over some more, but this is the gist of it. 😅

@metalmatze
Member

Perfect, thank you for that very in-depth comment. Indeed, we should probably track that in a separate issue from this one.
All this really comes down to is propagating the labels all the way from the SLO CRD through Pyrra until they show up in the alerts.

Let's imagine this label is added in here:

apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  name: your-slo-name-here
  labels:
    prometheus: k8s
    role: alert-rules
+   team: pyrra
spec:
  target: '99.0'
  window: 7d
  indicator:
    ratio:
      errors:
        metric: frontend_request_counter_total{status=~"5..",app="app-name"}
      total:
        metric: frontend_request_counter_total{app="app-name"}

Then it should show up in the list page of Pyrra too as team=pyrra.

In the end, the same label needs to be part of the alert:

ALERTS{alertname="ErrorBudgetBurn1w", alertstate="firing", long="1d", severity="warning", short="1h30m", slo="your-slo-name-here", namespace="your-namespace", exhaustion="1w", threshold="0.010", team="pyrra"}

I'll open a separate issue and we'll look into it. 👍
Lastly, you need to configure your Alertmanager to route the alerts correctly based on the team label, as per our example.
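
A minimal Alertmanager routing sketch for that, assuming a receiver named team-pyrra exists (receiver names here are placeholders):

```yaml
route:
  receiver: default
  routes:
    # Alerts carrying team=pyrra go to that team's receiver.
    - matchers:
        - team = "pyrra"
      receiver: team-pyrra
receivers:
  - name: default
  - name: team-pyrra
    # slack_configs / email_configs / webhook_configs go here
```

The matchers syntax requires Alertmanager 0.22 or later; older versions use the match / match_re fields instead.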

@hsolberg
Contributor

hsolberg commented Oct 5, 2022

Hi! A follow-up question; I'm not sure if it should be asked here or in the closed ticket.
The namespace label is used in the Pyrra UI etc., but it doesn't seem to be included in the generated alert rules. Is this something that could be added by default? Or maybe it's already part of the recent changes on master that haven't landed in a release yet? 😄

@metalmatze
Member

Good point, I'm not sure why it never came up.
We should indeed add it.
Would you mind opening another issue for this one?
Thanks!
