[Alerting] Provide services to set context for recovered alerts #124972

Merged
merged 44 commits into from
Feb 21, 2022
Changes from 31 commits
Commits
72255bb
Rename alert instance to alert and add create fn to alert factory
ymao1 Feb 2, 2022
93082d6
Rename alert instance to alert and add create fn to alert factory
ymao1 Feb 2, 2022
5875e52
Fixing types
ymao1 Feb 2, 2022
c254b9d
Fixing types
ymao1 Feb 2, 2022
4b0ef8b
Merge branch 'main' into alerting/alert-instance-to-alert
kibanamachine Feb 3, 2022
18ed715
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 4, 2022
6a7e256
Merge branch 'alerting/alert-instance-to-alert' of https://github.com…
ymao1 Feb 4, 2022
a4a56a6
Adding flag for rule types to opt into setting recovery context
ymao1 Feb 4, 2022
a5c2506
Merge branch 'main' into alerting/alert-instance-to-alert
kibanamachine Feb 7, 2022
be68324
Only showing context in action variable menu if flag set to true
ymao1 Feb 7, 2022
ba6a84f
Merge branch 'alerting/alert-instance-to-alert' of https://github.com…
ymao1 Feb 7, 2022
f0046f2
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 8, 2022
41e75cf
Adding recovery functions to createAlertFactory
ymao1 Feb 8, 2022
d6ed52b
Setting recovery in index threshold and fixing types
ymao1 Feb 8, 2022
cbc8402
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 8, 2022
98fd000
Fixing lint issues and some refactoring
ymao1 Feb 8, 2022
3e002aa
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 8, 2022
35be5b6
Cleanup
ymao1 Feb 8, 2022
fe49c49
Functional tests for index threshold rule recovery context
ymao1 Feb 8, 2022
5754075
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 8, 2022
5bb918b
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 9, 2022
8e585fe
Return array of recovered alerts instead of record
ymao1 Feb 9, 2022
e6e85f3
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 9, 2022
002f94b
Fixing types
ymao1 Feb 9, 2022
3bd343b
Fixing types
ymao1 Feb 9, 2022
3666f6c
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 10, 2022
61c6b77
Cleanup
ymao1 Feb 10, 2022
50bbc78
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 10, 2022
fbe5973
Handling nulls and more tests
ymao1 Feb 10, 2022
3b4753b
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 10, 2022
9365756
Updating developer docs
ymao1 Feb 10, 2022
fb54b5b
Merge branch 'main' into alerting/recovery-context-services-2
kibanamachine Feb 14, 2022
30d3f51
Merge branch 'main' into alerting/recovery-context-services-2
kibanamachine Feb 15, 2022
23234e6
Making getRecoveryAlerts non-optional
ymao1 Feb 16, 2022
b857955
Setting unknown in index threshold recovery value
ymao1 Feb 16, 2022
f1f1469
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 16, 2022
31e70c9
Merge branch 'main' into alerting/recovery-context-services-2
kibanamachine Feb 16, 2022
18bf52f
Merge branch 'main' into alerting/recovery-context-services-2
kibanamachine Feb 17, 2022
9372b31
Merge branch 'main' into alerting/recovery-context-services-2
kibanamachine Feb 17, 2022
afe0c45
PR feedback
ymao1 Feb 17, 2022
1a5df73
Merge branch 'main' of https://github.com/elastic/kibana into alertin…
ymao1 Feb 17, 2022
7e1d246
Adding a test
ymao1 Feb 17, 2022
f22c83e
Merge branch 'main' into alerting/recovery-context-services-2
kibanamachine Feb 21, 2022
181657d
Merge branch 'main' into alerting/recovery-context-services-2
kibanamachine Feb 21, 2022
40 changes: 39 additions & 1 deletion x-pack/plugins/alerting/README.md
@@ -19,6 +19,7 @@ Table of Contents
- [Methods](#methods)
- [Executor](#executor)
- [Action variables](#action-variables)
- [Recovered Alerts](#recovered-alerts)
- [Licensing](#licensing)
- [Documentation](#documentation)
- [Tests](#tests)
@@ -100,6 +101,7 @@ The following table describes the properties of the `options` object.
|isExportable|Whether the rule type is exportable from the Saved Objects Management UI.|boolean|
|defaultScheduleInterval|The default interval that will show up in the UI when creating a rule of this rule type.|string|
|minimumScheduleInterval|The minimum interval that will be allowed for all rules of this rule type.|string|
|doesSetRecoveryContext|Whether the rule type will set context variables for recovered alerts. Defaults to `false`. If this is set to true, context variables are made available for the recovery action group and executors will be provided with the ability to set recovery context.|boolean|

### Executor

@@ -170,6 +172,38 @@ This function should take the rule type params as input and extract out any save


This function should take the rule type params (with saved object references) and the saved object references array as input and inject the saved object ID in place of any saved object references in the rule type params. Note that any error thrown within this function will be propagated.

## Recovered Alerts
The Alerting framework automatically determines which alerts are recovered by comparing the active alerts from the previous rule execution to the active alerts in the current rule execution. Alerts that were active previously but not active currently are considered `recovered`. If any actions were specified on the Recovery action group for the rule, they will be scheduled at the end of the execution cycle.
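
The comparison described above amounts to a set difference over alert IDs. The following is a minimal sketch, not the framework's actual implementation; the function name `findRecoveredAlertIds` and the use of plain string IDs are assumptions made for illustration.

```typescript
// Hypothetical helper: returns the IDs that were active in the previous
// execution but are no longer active in the current one.
function findRecoveredAlertIds(
  previousActiveIds: readonly string[],
  currentActiveIds: readonly string[]
): string[] {
  const current = new Set(currentActiveIds);
  return previousActiveIds.filter((id) => !current.has(id));
}

// 'server_1' and 'server_3' were active before but are not active now,
// so they are the recovered alerts. 'server_4' is newly active.
const recoveredIds = findRecoveredAlertIds(
  ['server_1', 'server_2', 'server_3'],
  ['server_2', 'server_4']
);
```

Alerts that appear only in the current execution are new rather than recovered, so they are intentionally ignored by this helper.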

Because this determination occurs after rule type executors have completed execution, the framework provides a mechanism for rule type executors to set contextual information for recovered alerts that can be templated and used inside recovery actions. In order to use this mechanism, the rule type must set the `doesSetRecoveryContext` flag to `true` during rule type registration.

The following code would then be added within a rule type executor. When the rule type has finished creating and scheduling actions for active alerts, it should call `done()` on the `alertFactory`. This gives the executor access to the list of recovered alerts for this execution cycle, which it can iterate over to set context.

In the example, `server` is the context value for each alert. It is passed to `scheduleActions` for active alerts and to `setContext` for recovered alerts, and it is available for templating in actions.

```
// Create and schedule actions for active alerts.
// Each alert gets a distinct ID because scheduleActions should only be
// called once per alert.
for (let i = 0; i < 5; ++i) {
  alertFactory
    .create(`server_${i}`)
    .scheduleActions('default', {
      server: `server_${i}`,
    });
}

// Call done() to gain access to recovery utils
const { getRecoveredAlerts } = alertFactory.done();

if (getRecoveredAlerts) {
  for (const alert of getRecoveredAlerts()) {
    const alertId = alert.getId();
    alert.setContext({
      server: alertId, // placeholder: set something useful here
    });
  }
}
```

Review conversation on this example (marked as resolved by ymao1):

Contributor (Author):

@jasonrhodes This is the PR for providing the ability to set context on recovered alerts. Based on your feedback in the RFC, this is what I've settled on for actually setting the context. Instead of getting a list of recovered alert IDs and using a utility function to set context using the ID string, this returns a list of alerts that you then call `setContext()` on.

We had also talked about changing `scheduleActions()` for active alerts to something clearer like `scheduleActionsAndSetContext()` or `scheduleAction().setContext()`, which is not addressed in this PR, but it would be a straightforward change if we wanted to do it in the future.

Member:

Great, thanks!!!

## Licensing

Currently most rule types are free features. But some rule types are subscription features, such as the tracking containment rule.
@@ -743,6 +777,7 @@ This factory returns an instance of `Alert`. The `Alert` class has the following
|scheduleActions(actionGroup, context)|Call this to schedule the execution of actions. The actionGroup is a string `id` that relates to the group of alert `actions` to execute and the context will be used for templating purposes. `scheduleActions` or `scheduleActionsWithSubGroup` should only be called once per alert.|
|scheduleActionsWithSubGroup(actionGroup, subgroup, context)|Call this to schedule the execution of actions within a subgroup. The actionGroup is a string `id` that relates to the group of alert `actions` to execute, the `subgroup` is a dynamic string that denotes a subgroup within the actionGroup and the context will be used for templating purposes. `scheduleActions` or `scheduleActionsWithSubGroup` should only be called once per alert.|
|replaceState(state)|Used to replace the current state of the alert. This doesn't work like React, the entire state must be provided. Use this feature as you see fit. The state that is set will persist between rule executions whenever you re-create an alert with the same id. The alert state will be erased when `scheduleActions` or `scheduleActionsWithSubGroup` aren't called during an execution.|
|setContext(context)|Call this to set the context for this alert that is used for templating purposes.|

### When should I use `scheduleActions` and `scheduleActionsWithSubGroup`?
The `scheduleActions` or `scheduleActionsWithSubGroup` methods are both used to achieve the same thing: schedule actions to be run under a specific action group.
@@ -758,13 +793,16 @@ Action Subgroups are dynamic, and can be defined on the fly.
This approach enables users to specify actions under specific action groups, but they can't specify actions that are specific to subgroups.
As subgroups fall under action groups, we will schedule the actions specified for the action group, but the subgroup allows the RuleType implementer to reuse the same action group for multiple different active subgroups.

### When should I use `setContext`?
`setContext` is intended to be used for setting context for recovered alerts. While rule type executors make the determination as to which alerts are active for an execution, the Alerting Framework automatically determines which alerts are recovered for an execution. `setContext` empowers rule type executors to provide additional contextual information for these recovered alerts that will be templated into actions.
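
The flow can be tried outside Kibana with a small stand-in for the alert factory. This sketch is a mock built on assumptions, not the framework's implementation: `MockAlert`, `createMockAlertFactory`, `getContext`, `hasScheduledActions`, and the way previously active IDs are passed in are all hypothetical. Only the method names `create`, `scheduleActions`, `setContext`, `getId`, `done`, and `getRecoveredAlerts` come from this README.

```typescript
type AlertContext = Record<string, unknown>;

// Hypothetical stand-in for the framework's Alert class.
class MockAlert {
  private readonly id: string;
  private context: AlertContext = {};
  private scheduled = false;

  constructor(id: string) {
    this.id = id;
  }

  getId(): string {
    return this.id;
  }

  // Marks the alert as active for this execution and records its context.
  scheduleActions(actionGroup: string, context: AlertContext = {}): this {
    if (this.scheduled) {
      throw new Error(`Actions already scheduled for ${this.id} (${actionGroup})`);
    }
    this.scheduled = true;
    this.context = context;
    return this;
  }

  // Sets templating context without scheduling anything.
  setContext(context: AlertContext): this {
    this.context = context;
    return this;
  }

  getContext(): AlertContext {
    return this.context;
  }

  hasScheduledActions(): boolean {
    return this.scheduled;
  }
}

// Hypothetical factory. `previouslyActiveIds` stands in for the alerts the
// framework tracked as active in the previous execution.
function createMockAlertFactory(previouslyActiveIds: string[]) {
  const alerts = new Map<string, MockAlert>();
  return {
    create(id: string): MockAlert {
      let alert = alerts.get(id);
      if (!alert) {
        alert = new MockAlert(id);
        alerts.set(id, alert);
      }
      return alert;
    },
    done() {
      // Recovered: active previously, but no actions scheduled this time.
      const recovered = previouslyActiveIds
        .filter((id) => !alerts.get(id)?.hasScheduledActions())
        .map((id) => new MockAlert(id));
      return { getRecoveredAlerts: (): MockAlert[] => recovered };
    },
  };
}

// Usage: server_1 stays active, server_2 recovers.
const mockFactory = createMockAlertFactory(['server_1', 'server_2']);
mockFactory.create('server_1').scheduleActions('default', { server: 'server_1' });

const mockRecovered = mockFactory.done().getRecoveredAlerts();
for (const alert of mockRecovered) {
  alert.setContext({ server: alert.getId(), status: 'recovered' });
}
```

In this mock the recovered list is computed once when `done()` is called, so context set on a recovered alert stays attached to the same object that a later step would read from.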

## Templating Actions

There needs to be a way to map rule context into action parameters. For this, we started off by adding template support. Any string within the `params` of a rule saved object's `actions` will be processed as a template and can inject context or state values.

When an alert executes, the first argument is the `group` of actions to execute and the second is the context the rule exposes to templates. We iterate through each action parameter's attributes recursively and render any that are strings as templates. Templates have access to the following "variables":

- `context` - provided by context argument of `.scheduleActions(...)` and `.scheduleActionsWithSubGroup(...)` on an alert.
- `context` - provided by context argument of `.scheduleActions(...)`, `.scheduleActionsWithSubGroup(...)` and `setContext(...)` on an alert.
- `state` - the alert's `state` provided by the most recent `replaceState` call on an alert.
- `alertId` - the id of the rule
- `alertInstanceId` - the alert id
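
The recursive rendering described above can be illustrated with a small, self-contained renderer. This is a simplified sketch, not the framework's implementation: `lookup`, `renderTemplate`, `renderParams`, and the regular-expression handling of `{{path}}` placeholders are hypothetical stand-ins for a real template engine. Only the variable names `context`, `state`, `alertId`, and `alertInstanceId` come from the list above.

```typescript
type TemplateVariables = {
  context: Record<string, unknown>;
  state: Record<string, unknown>;
  alertId: string;
  alertInstanceId: string;
};

// Resolve a dotted path such as "context.server" against the variables.
function lookup(variables: TemplateVariables, path: string): unknown {
  let current: unknown = variables;
  for (const key of path.split('.')) {
    if (typeof current !== 'object' || current === null) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Replace every {{path}} placeholder; unknown paths render as an empty string.
function renderTemplate(template: string, variables: TemplateVariables): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (_match, path: string) => {
    const value = lookup(variables, path);
    return value === undefined || value === null ? '' : String(value);
  });
}

// Walk the action params recursively and render only the string values.
function renderParams(params: unknown, variables: TemplateVariables): unknown {
  if (typeof params === 'string') {
    return renderTemplate(params, variables);
  }
  if (Array.isArray(params)) {
    return params.map((item) => renderParams(item, variables));
  }
  if (typeof params === 'object' && params !== null) {
    const rendered: Record<string, unknown> = {};
    for (const key of Object.keys(params)) {
      rendered[key] = renderParams((params as Record<string, unknown>)[key], variables);
    }
    return rendered;
  }
  return params; // numbers, booleans, null, and undefined pass through unchanged
}

// Usage: context set on a recovered alert is injected into the action params.
const renderedParams = renderParams(
  { message: 'Server {{context.server}} recovered', tags: ['{{alertInstanceId}}'], level: 1 },
  { context: { server: 'server_1' }, state: {}, alertId: 'rule-1', alertInstanceId: 'server_1' }
);
```

Non-string values are returned as-is, which mirrors the statement above that templates are rendered only when a parameter is a string.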
1 change: 1 addition & 0 deletions x-pack/plugins/alerting/common/rule_type.ts
@@ -37,6 +37,7 @@ export interface RuleType<
ruleTaskTimeout?: string;
defaultScheduleInterval?: string;
minimumScheduleInterval?: string;
doesSetRecoveryContext?: boolean;
enabledInLicense: boolean;
authorizedConsumers: Record<string, ConsumerPrivileges>;
}
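
Because `doesSetRecoveryContext` is optional and documented as defaulting to `false`, code that reads it needs to handle `undefined`. A minimal sketch, assuming a trimmed-down `RuleTypeLike` interface and a hypothetical helper `setsRecoveryContext` (neither is part of the plugin):

```typescript
// Trimmed-down, hypothetical version of the interface above.
interface RuleTypeLike {
  id: string;
  doesSetRecoveryContext?: boolean;
}

// Treat a missing flag the same as an explicit `false`.
function setsRecoveryContext(ruleType: RuleTypeLike): boolean {
  return ruleType.doesSetRecoveryContext ?? false;
}

// Hypothetical rule types: one opts in, one leaves the flag unset.
const optedIn: RuleTypeLike = { id: 'example.opted-in', doesSetRecoveryContext: true };
const legacy: RuleTypeLike = { id: 'example.legacy' };

const optedInResult = setsRecoveryContext(optedIn);
const legacyResult = setsRecoveryContext(legacy);
```

Keeping the property optional means existing rule types need no changes and behave as if they had set the flag to `false`.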