-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiple AlertConditions proposal #240
base: main
Are you sure you want to change the base?
Multiple AlertConditions proposal #240
Conversation
Discussion question: while writing this proposal I was not sure if it was best to use and/or logic or if we would prefer a different type. For example something like this: apiVersion: openslo/v2alpha1
kind: CompositeAlertCondition
metadata:
name: composite-condition
displayName: Composite Condition
spec:
operation: and # can be and/or
conditions: # inline or using refs
- alertConditionRef: # Ref to AlertCondition
- compositeAlertConditionRef: # Ref to another CompositeAlertCondition I slightly prefer the and/or keyword idea in the original proposal, but I am curious to hear opinions on this. |
Coming back to this I think that at this stage having such logic would be an overkill for the spec implementations and thus spec itself. The AND/OR logic can be still achieved with multiple Alert Policies (OR between them) and multiple conditions for a single policy (AND between them). |
So taking the initial example in this PR:
We would translate it like this? This now assumes that multiple conditions are AND between them, which is not the case in the current spec (but is documented to be maximum 1 at the moment, so in theory it should not be conflicting with current tools). apiVersion: openslo/v1
kind: AlertPolicy
metadata:
name: High-Burn-Rate-ShortWindow
displayName: High burn rate
spec:
conditions:
- kind: AlertCondition
spec:
condition:
kind: burnrate
threshold: 14.4
lookbackWindow: -1h
alertAfter: 5m
- kind: AlertCondition
spec:
condition:
kind: burnrate
threshold: 14.4
lookbackWindow: -1h
alertAfter: 1m
---
apiVersion: openslo/v1
kind: AlertPolicy
metadata:
name: High-Burn-Rate-LongWindow
displayName: High burn rate
spec:
conditions:
- kind: AlertCondition
spec:
condition:
kind: burnrate
threshold: 6
lookbackWindow: -6h
alertAfter: 5m
- kind: AlertCondition
spec:
condition:
kind: burnrate
threshold: 6
lookbackWindow: -30m
alertAfter: 5m I do believe this makes sense, and the implementation in go (and most languages) would much simpler than the initial proposal. The only problem I could see is that, if you link both Alert policies to a notification, it would send 2 notifications for the same problem in an OR condition. But I think working on that problem will be easier than trying to fit a full and/or logic as before |
Do you have some proposal for how we would deal with multiple AlertPolicies bound to a single SLO? I.e. how would we make that OR happen? 😁 |
Anyway, that's kind of irrelevant in form of this PR and as @nieomylnieja mentioned, just allowing multiple conditions is already an improvement. LGTM |
One idea that we have as we are discussing this at work with @Hy3n4 is that we could make a field in the AlertPolicy that would define the operation we are looking to create from some template in the order of the conditions or conditionRefs as defined in the policy. Something like:
... using zero-based indexing (we could do index + 1, idc) What do you think? |
My original idea was to move the alert conditions to a separate files so they could be referenced in the alert policy. Allowing for reuse of the alert conditions when needed. And then in the policy itself support a limited set of operations, and be able to reference to an alert condition. But never though you would want to multiple groups of group of conditions. Interesting. |
Ref: #218
This proposal introduces the capability to compose AlertConditions in an AlertPolicy using and/or logic. This should allow users to define complex conditions composition, which in turn will enable using better SRE practices like multi-window burn-rate alerting
The change is designed to be backward compatible with the current specification.