[ResponseOps] Some rules are getting skipped when performing bulkEnableRules and bulkDisableRules #181050

Open
xcrzx opened this issue Apr 17, 2024 · 3 comments
Labels
bug Fixes for quality problems that affect the customer experience Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework sdh-linked Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

xcrzx commented Apr 17, 2024

Related to: #177634

Summary

When performing bulk enable or disable operations, some rules are getting skipped. For the bulk enable operation, it skips all rules that are already enabled, and for bulk disable, it skips those that are already disabled. This behavior creates several issues when using the rules client.

1. Inconsistent responses from RulesClient

If you call rulesClient.bulkEnableRules({ ids: ['ruleId1', 'ruleId2'] }) with the same arguments while the affected rules are in different enabled/disabled states, you can get three different responses, which makes this call non-idempotent. Here are some examples:

Both rules initially disabled:

{ 
  "errors": [], 
  "rules": [{...}, {...}], // <- All rules are correctly returned
  "total": 2 // <- The total is correct
}

One rule disabled, one enabled:

{ 
  "errors": [], 
  "rules": [{...}], // <- Only one rule is returned
  "total": 2 // <- The total is correct
}

Both rules initially enabled:

{ 
  "errors": [], 
  "rules": [], // <- No rules
  "total": 2 // <- The total is correct
}

Expected responses

I expect the rules passed as params to end up either in the errors array or the rules array in the returned object regardless of their initial state.
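
To make the expectation concrete, here is a minimal sketch of a check a caller could run against the response. The `BulkEnableResponse` and `BulkOperationError` shapes and the `findUnaccountedRuleIds` helper are simplified assumptions for illustration, not the actual RulesClient types:

```typescript
// Hypothetical, simplified shapes. The real RulesClient types carry more fields.
interface BulkOperationError {
  rule: { id: string };
  message: string;
}

interface BulkEnableResponse {
  errors: BulkOperationError[];
  rules: Array<{ id: string }>;
  total: number;
}

// Returns the requested ids that appear in neither `rules` nor `errors`.
// With the expected behavior this list would always be empty.
function findUnaccountedRuleIds(
  requestedIds: string[],
  response: BulkEnableResponse
): string[] {
  const accounted = new Set<string>([
    ...response.rules.map((rule) => rule.id),
    ...response.errors.map((error) => error.rule.id),
  ]);
  return requestedIds.filter((id) => !accounted.has(id));
}

// Mirrors the "one rule disabled, one enabled" case above.
// Which of the two ids was already enabled is an assumption.
const mixedStateResponse: BulkEnableResponse = {
  errors: [],
  rules: [{ id: 'ruleId1' }],
  total: 2,
};

const unaccountedIds = findUnaccountedRuleIds(
  ['ruleId1', 'ruleId2'],
  mixedStateResponse
);
console.log(unaccountedIds); // ids that were silently skipped
```

With the current behavior, the already-enabled rule shows up in this list even though `total` still reports 2.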

2. Discrepancy between the rule state and task manager

When enabling rules, the Alerting framework skips rules if their saved object has alert.attributes.enabled: true. This behavior creates issues described here and in the attached SDH. There might be a situation where a rule is marked as enabled, but no corresponding task exists in the Task Manager. In the UI, these rules will appear enabled but will never run. Users expect that all rules affected by the bulk enable action will get a corresponding task created in the Task Manager and be scheduled for execution. Therefore, it would be best to also check if the rules to be enabled have tasks in the Task Manager, instead of relying solely on the rule's current enabled state.
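
A rough sketch of the decision I have in mind is below. The `RuleState` shape, the `EnableAction` values, and the `existingTaskIds` set (standing in for a Task Manager lookup) are all assumptions for illustration, not the Alerting framework's actual implementation:

```typescript
// Hypothetical minimal view of a rule saved object.
interface RuleState {
  id: string;
  enabled: boolean;
  scheduledTaskId?: string;
}

type EnableAction = 'enable-and-schedule' | 'schedule-only' | 'no-op';

// Decides what bulk enable should do for a single rule.
// `existingTaskIds` stands in for whatever lookup confirms that a task
// really exists in Task Manager; how that lookup is done is out of scope here.
function decideEnableAction(
  rule: RuleState,
  existingTaskIds: ReadonlySet<string>
): EnableAction {
  const hasTask =
    rule.scheduledTaskId !== undefined &&
    existingTaskIds.has(rule.scheduledTaskId);

  if (!rule.enabled) {
    // Disabled rule: flip the flag and make sure a task gets scheduled.
    return 'enable-and-schedule';
  }
  // Rule is marked enabled: only safe to leave alone if its task exists.
  return hasTask ? 'no-op' : 'schedule-only';
}

const knownTaskIds = new Set<string>(['task-1']);

// Marked enabled but its task is missing: should be rescheduled, not skipped.
const orphanedRuleAction = decideEnableAction(
  { id: 'ruleId1', enabled: true, scheduledTaskId: 'task-missing' },
  knownTaskIds
);
console.log(orphanedRuleAction);
```

The point is that `enabled: true` alone is not enough to skip a rule. Even in the `no-op` case, the rule should still be returned in the response.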

@xcrzx xcrzx added bug Fixes for quality problems that affect the customer experience Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Apr 17, 2024
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@banderror
Contributor

@xcrzx Thanks a lot for opening this ticket! I would imagine that this bug affects our bulk actions endpoint through which detection rules are being enabled/disabled.

  • Could you describe the impact here and what workaround we use, and where it is in the code?
  • When someone fixes this bug, will they need to also update any code in security_solution?

@xcrzx xcrzx changed the title [ResponseOps] Inconsistent bulkEnableRules and bulkDisableRules response [ResponseOps] Some rules are getting skipped when performing bulkEnableRules and bulkDisableRules Apr 22, 2024
xcrzx commented Apr 22, 2024

  • Could you describe the impact here and what workaround we use, and where it is in the code?
  • When someone fixes this bug, will they need to also update any code in security_solution?

We have implemented a workaround in our code where we loop through the original rule array instead of relying on the response from RulesClient. Yes, it would be best to update this code as well when the issue is resolved.

// We need to go through the original rules array and update rules that were
// not returned as failed from the bulkEnableRules. We cannot rely on the
// results from the bulkEnableRules because the response is not consistent.
// Some rules might be missing in the response if they were skipped by
// Alerting Framework. See this issue for more details:
// https://github.com/elastic/kibana/issues/181050
updatedRules.push(
  ...rules.flatMap((rule) => {
    if (failedRuleIds.includes(rule.id)) {
      return [];
    }
    return {
      ...rule,
      enabled: operation === 'enable',
    };
  })
);
