[SIEM] Check ML Job status on ML Rule execution #61715

rylnd · 2020-03-28T00:49:25Z

Summary

The first leg of this PR (diff) adds the actual enhancement: checking of ML Jobs during rule execution, and setting of rule status appropriately. Things required to get that to work:

Add explicit, optional dependency on ML plugin
Expose mlClient from ML plugin in order to use e.g. ml.jobsServiceProvider

If necessary, I'm happy to open a separate PR against 7.7 with just that work.

However, our rule executor function is large (and contains multitudes?), and the additional ML logic pushed us over the eslint complexity limit, so the second half of this PR is moving logic out of that function, notably:

Extracting a ruleStatusSavedObjectsClient to handle SO calls
Extracting a ruleStatusService to encapsulate logic of creating/updating/GCing RuleStatuses
Adds consistent rule metadata to every status/log message

Checklist

Unit or functional tests were updated or added to match the most common scenarios

For maintainers

This was checked for breaking API changes and was labeled appropriately

And use it during rule execution as well.

This was unintentionally removed in a previous merge commit.

This allows dependent plugins to leverage the exposed services without having to define their own ml paths, e.g. "ml.jobs"

These are pure functions and used on both the client and server.

This works, but unfortunately it pushes this executor function to a complexity of 25. We're gonna refactor this next.

These are used on both the frontend and the backend, and can be shared.

RuleStatusService holds the logic for updating the current status as well as adding an error status. It leverages a simple RuleStatusSavedObjectClient to handle the communication with SavedObjects. This removes the need for our specialized 'writeError', 'writeGap', and 'writeSuccess' functions, which duplicated much of the rule status logic and code. It also fixes a bug with gap failures, with should have been treated the same as other failures. NB that an error does not necessarily prevent the rule from running, as in the case of a gap or an ML Job not running. This also adds a buildRuleMessage helper to reduce the noise of generating logs/messages, and to make them more consistent.

We're not awaiting here, so we can just return the promise.

We weren't doing anything async here, and in fact the returning of a promise was causing a bug when we tried to spread it into our attributes object.

This mapping could be done within the ruleStatusService, but it lives outside it for now. Also renames the object holding these values to the more general 'result,' as creationSuccess implies it always succeeds.

Adds some tests, as well.

Only ruleStatusSavedObjectsClient receives a savedObjectsClient, the other functions receive the ruleStatusSavedObjectsClient * pluralizes savedObjects in ruleStatusSavedObjectsClient * Backfills tests

elasticmachine · 2020-03-28T00:49:27Z

Pinging @elastic/siem (Team:SIEM)

rylnd · 2020-03-29T23:06:08Z

Opened #61753 against 7.7. Once that's merged, this can go into 7.x.

dhurley14

LGTM - this cleans things up very nicely. Appreciate the wrappers around saved objects and creating a clean api for the rule status SO.

We were storing state in our RuleStatusClient, and consequently could get into a situation where that state did not reflect reality, and we would incorrectly try to delete a SavedObject that had already been deleted. Rather than try to store the _correct_ state in the service, we remove state entirely and just fetch our statuses on each action.

rylnd · 2020-03-30T19:47:10Z

@elasticmachine merge upstream

rylnd · 2020-03-30T19:49:10Z

@dhurley14 I added a bugfix/simplification in 7cb5c2c if you could give this one more pass.

I encountered a bug where you can't actually create multiple error statuses if you've already got 5 failures, and the fix was much easier to do here than it would be on #61753, so I think we should get this in 7.7 instead.

alvarezmelissa87

LGTM ⚡️

kibanamachine · 2020-03-30T21:26:17Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 055daee

History

💚 Build #36964 succeeded a7f0a08

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

* Move isMlRule helper to a more general location And use it during rule execution as well. * Add error message back to rule error status This was unintentionally removed in a previous merge commit. * Expose mlClient as part of ML's Setup contract This allows dependent plugins to leverage the exposed services without having to define their own ml paths, e.g. "ml.jobs" * Move ML Job predicates to common folder These are pure functions and used on both the client and server. * WIP: Check ML Job status on ML Rule execution This works, but unfortunately it pushes this executor function to a complexity of 25. We're gonna refactor this next. * Move isMlRule and RuleType to common These are used on both the frontend and the backend, and can be shared. * Refactor Signal Rule executor to use RuleStatusService RuleStatusService holds the logic for updating the current status as well as adding an error status. It leverages a simple RuleStatusSavedObjectClient to handle the communication with SavedObjects. This removes the need for our specialized 'writeError', 'writeGap', and 'writeSuccess' functions, which duplicated much of the rule status logic and code. It also fixes a bug with gap failures, with should have been treated the same as other failures. NB that an error does not necessarily prevent the rule from running, as in the case of a gap or an ML Job not running. This also adds a buildRuleMessage helper to reduce the noise of generating logs/messages, and to make them more consistent. * Remove unneeded 'async' keywords We're not awaiting here, so we can just return the promise. * Make buildRuleStatusAttributes synchronous We weren't doing anything async here, and in fact the returning of a promise was causing a bug when we tried to spread it into our attributes object. * Fix incorrectly-named RuleStatus attributes This mapping could be done within the ruleStatusService, but it lives outside it for now. Also renames the object holding these values to the more general 'result,' as creationSuccess implies it always succeeds. * Move our rule message helpers to a separate file Adds some tests, as well. * Refactor how rule status objects interact Only ruleStatusSavedObjectsClient receives a savedObjectsClient, the other functions receive the ruleStatusSavedObjectsClient * pluralizes savedObjects in ruleStatusSavedObjectsClient * Backfills tests * Handle adding multiple errors during a single rule execution We were storing state in our RuleStatusClient, and consequently could get into a situation where that state did not reflect reality, and we would incorrectly try to delete a SavedObject that had already been deleted. Rather than try to store the _correct_ state in the service, we remove state entirely and just fetch our statuses on each action. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

* Move isMlRule helper to a more general location And use it during rule execution as well. * Add error message back to rule error status This was unintentionally removed in a previous merge commit. * Expose mlClient as part of ML's Setup contract This allows dependent plugins to leverage the exposed services without having to define their own ml paths, e.g. "ml.jobs" * Move ML Job predicates to common folder These are pure functions and used on both the client and server. * WIP: Check ML Job status on ML Rule execution This works, but unfortunately it pushes this executor function to a complexity of 25. We're gonna refactor this next. * Move isMlRule and RuleType to common These are used on both the frontend and the backend, and can be shared. * Refactor Signal Rule executor to use RuleStatusService RuleStatusService holds the logic for updating the current status as well as adding an error status. It leverages a simple RuleStatusSavedObjectClient to handle the communication with SavedObjects. This removes the need for our specialized 'writeError', 'writeGap', and 'writeSuccess' functions, which duplicated much of the rule status logic and code. It also fixes a bug with gap failures, with should have been treated the same as other failures. NB that an error does not necessarily prevent the rule from running, as in the case of a gap or an ML Job not running. This also adds a buildRuleMessage helper to reduce the noise of generating logs/messages, and to make them more consistent. * Remove unneeded 'async' keywords We're not awaiting here, so we can just return the promise. * Make buildRuleStatusAttributes synchronous We weren't doing anything async here, and in fact the returning of a promise was causing a bug when we tried to spread it into our attributes object. * Fix incorrectly-named RuleStatus attributes This mapping could be done within the ruleStatusService, but it lives outside it for now. Also renames the object holding these values to the more general 'result,' as creationSuccess implies it always succeeds. * Move our rule message helpers to a separate file Adds some tests, as well. * Refactor how rule status objects interact Only ruleStatusSavedObjectsClient receives a savedObjectsClient, the other functions receive the ruleStatusSavedObjectsClient * pluralizes savedObjects in ruleStatusSavedObjectsClient * Backfills tests * Handle adding multiple errors during a single rule execution We were storing state in our RuleStatusClient, and consequently could get into a situation where that state did not reflect reality, and we would incorrectly try to delete a SavedObject that had already been deleted. Rather than try to store the _correct_ state in the service, we remove state entirely and just fetch our statuses on each action. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

elasticmachine · 2021-09-23T14:37:27Z

Pinging @elastic/security-solution (Team: SecuritySolution)

rylnd added 12 commits March 27, 2020 19:05

Move isMlRule helper to a more general location

94a501c

And use it during rule execution as well.

Add error message back to rule error status

806be3c

This was unintentionally removed in a previous merge commit.

Expose mlClient as part of ML's Setup contract

449cbd4

This allows dependent plugins to leverage the exposed services without having to define their own ml paths, e.g. "ml.jobs"

Move ML Job predicates to common folder

909f566

These are pure functions and used on both the client and server.

WIP: Check ML Job status on ML Rule execution

3075e51

This works, but unfortunately it pushes this executor function to a complexity of 25. We're gonna refactor this next.

Move isMlRule and RuleType to common

e4b948c

These are used on both the frontend and the backend, and can be shared.

Remove unneeded 'async' keywords

7408de9

We're not awaiting here, so we can just return the promise.

Make buildRuleStatusAttributes synchronous

c4378aa

We weren't doing anything async here, and in fact the returning of a promise was causing a bug when we tried to spread it into our attributes object.

Fix incorrectly-named RuleStatus attributes

d80d0b3

This mapping could be done within the ruleStatusService, but it lives outside it for now. Also renames the object holding these values to the more general 'result,' as creationSuccess implies it always succeeds.

Move our rule message helpers to a separate file

c698169

Adds some tests, as well.

Refactor how rule status objects interact

a7f0a08

Only ruleStatusSavedObjectsClient receives a savedObjectsClient, the other functions receive the ruleStatusSavedObjectsClient * pluralizes savedObjects in ruleStatusSavedObjectsClient * Backfills tests

rylnd added Team:SIEM v8.0.0 release_note:skip Skip the PR/issue when compiling release notes v7.7.0 v7.8.0 labels Mar 28, 2020

rylnd requested review from dhurley14 and FrankHassanabad March 28, 2020 00:49

rylnd self-assigned this Mar 28, 2020

rylnd mentioned this pull request Mar 29, 2020

[SIEM] Check ML Jobs Status During Rule Execution (7.7) #61753

Closed

2 tasks

rylnd removed the v7.7.0 label Mar 29, 2020

dhurley14 approved these changes Mar 30, 2020

View reviewed changes

rylnd added the v7.7.0 label Mar 30, 2020

rylnd marked this pull request as ready for review March 30, 2020 19:46

rylnd requested review from a team as code owners March 30, 2020 19:46

Merge branch 'master' into check_job_status_on_ml_rule_execution

055daee

alvarezmelissa87 approved these changes Mar 30, 2020

View reviewed changes

rylnd merged commit 8b31ce0 into elastic:master Mar 30, 2020

rylnd deleted the check_job_status_on_ml_rule_execution branch March 30, 2020 21:35

rylnd mentioned this pull request Mar 30, 2020

[7.x] [SIEM] Check ML Job status on ML Rule execution (#61715) #61898

Merged

This was referenced Mar 30, 2020

[7.7] [SIEM] Check ML Job status on ML Rule execution (#61715) #61899

Merged

[SIEM] Feature: ML Rules #60301

Closed

MindyRS added the Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. label Sep 23, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SIEM] Check ML Job status on ML Rule execution #61715

[SIEM] Check ML Job status on ML Rule execution #61715

rylnd commented Mar 28, 2020

elasticmachine commented Mar 28, 2020

rylnd commented Mar 29, 2020

dhurley14 left a comment

rylnd commented Mar 30, 2020

rylnd commented Mar 30, 2020

alvarezmelissa87 left a comment

kibanamachine commented Mar 30, 2020

elasticmachine commented Sep 23, 2021

[SIEM] Check ML Job status on ML Rule execution #61715

[SIEM] Check ML Job status on ML Rule execution #61715

Conversation

rylnd commented Mar 28, 2020

Summary

Checklist

For maintainers

elasticmachine commented Mar 28, 2020

rylnd commented Mar 29, 2020

dhurley14 left a comment

Choose a reason for hiding this comment

rylnd commented Mar 30, 2020

rylnd commented Mar 30, 2020

alvarezmelissa87 left a comment

Choose a reason for hiding this comment

kibanamachine commented Mar 30, 2020

💚 Build Succeeded

History

elasticmachine commented Sep 23, 2021