Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution]Legacy actions are deleted when user tries to save a rule and the run action interval is slower than rule run interval #157462

Closed
vgomez-el opened this issue May 12, 2023 · 13 comments
Assignees
Labels
bug Fixes for quality problems that affect the customer experience fixed impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. Team:Detection Engine Security Solution Detection Engine Area Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.9.1

Comments

@vgomez-el
Copy link

Describe the bug:

  • When a user adds legacy actions to a rule via API and the actions run interval is lower than rule run interval and afterwards edits the rule via kibana UI, after pressing the Save Changes button, legacy action is "migrated", so sidecar SO is deleted, but rule cannot be saved, because an error message is displayed, which leads to lose the legacy action and leaves the rule without migrated action.

Kibana/Elasticsearch Stack version:

  • 8.8.0-BC3

Original install method (e.g. download page, yum, from source, etc.):

  • Cloud Instance

Functional Area (e.g. Endpoint management, timelines, resolver, etc.):

  • Detection Alerts

Initial setup

User must have a rule with one legacy action ( rule run interval should be 5 minutes or more) and a legacy action injected using this PR steps

Steps to reproduce:

  1. Edit the rule that contains the legacy action
  2. Go to Actions tab and assure Action run interval is lower than Rule run Scheduled interval
  3. Press Save Changes button

Current behavior:

  • Error message is displayed and Rule is not saved but legacy action is deleted which means that data is lost

Expected behavior:

  • Legacy action should not be deleted if rule cannot be saved and a warning message about the invalid action interval should be displayed

Screen recording:

REC-20230512113308.mp4

Any additional context (logs, chat logs, magical formulas, etc.):
-This bug will eventually be fixed once this task is done: #155502

@vgomez-el vgomez-el added bug Fixes for quality problems that affect the customer experience triage_needed Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Detection Alerts Security Detection Alerts Area Team labels May 12, 2023
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@yctercero yctercero added Team:Detection Engine Security Solution Detection Engine Area v8.8.0 and removed triage_needed labels May 13, 2023
@yctercero yctercero removed the Team:Detection Alerts Security Detection Alerts Area Team label May 13, 2023
@peluja1012 peluja1012 added impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. v8.9.0 and removed v8.8.0 labels May 16, 2023
@yctercero
Copy link
Contributor

Note: occurring in older versions as well.

e40pud added a commit that referenced this issue Jul 31, 2023
…e a rule and the run action interval is slower than rule run interval (#160798)

## Summary

Original ticket: #157462

With these changes we fix the legacy actions data loss (on migration)
issue. One of the first steps of the migration we retrieve legacy
actions and immediately delete them. Then we do validation which might
throw an exception and all legacy actions will be lost in this case.

As a solution we will do legacy actions validation before deleting them
and throwing exception in case those are broken. This means that in case
legacy action is broken user will need to export the rule, fix it
manually and import it again. Or just re-create it from scratch.


https://github.com/elastic/kibana/assets/2700761/a23f5d43-3758-4ab7-8e63-bd93016e338d
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Jul 31, 2023
…e a rule and the run action interval is slower than rule run interval (elastic#160798)

## Summary

Original ticket: elastic#157462

With these changes we fix the legacy actions data loss (on migration)
issue. One of the first steps of the migration we retrieve legacy
actions and immediately delete them. Then we do validation which might
throw an exception and all legacy actions will be lost in this case.

As a solution we will do legacy actions validation before deleting them
and throwing exception in case those are broken. This means that in case
legacy action is broken user will need to export the rule, fix it
manually and import it again. Or just re-create it from scratch.

https://github.com/elastic/kibana/assets/2700761/a23f5d43-3758-4ab7-8e63-bd93016e338d
(cherry picked from commit 8ca90fb)
ThomThomson pushed a commit to ThomThomson/kibana that referenced this issue Aug 1, 2023
…e a rule and the run action interval is slower than rule run interval (elastic#160798)

## Summary

Original ticket: elastic#157462

With these changes we fix the legacy actions data loss (on migration)
issue. One of the first steps of the migration we retrieve legacy
actions and immediately delete them. Then we do validation which might
throw an exception and all legacy actions will be lost in this case.

As a solution we will do legacy actions validation before deleting them
and throwing exception in case those are broken. This means that in case
legacy action is broken user will need to export the rule, fix it
manually and import it again. Or just re-create it from scratch.


https://github.com/elastic/kibana/assets/2700761/a23f5d43-3758-4ab7-8e63-bd93016e338d
kibanamachine referenced this issue Aug 3, 2023
…to save a rule and the run action interval is slower than rule run interval (#160798) (#162779)

# Backport

This will backport the following commits from `main` to `8.9`:
- [[Security Solution] Legacy actions are deleted when user tries to
save a rule and the run action interval is slower than rule run interval
(#160798)](#160798)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Ievgen
Sorokopud","email":"ievgen.sorokopud@elastic.co"},"sourceCommit":{"committedDate":"2023-07-31T09:51:38Z","message":"[Security
Solution] Legacy actions are deleted when user tries to save a rule and
the run action interval is slower than rule run interval (#160798)\n\n##
Summary\r\n\r\nOriginal ticket:
https://github.com/elastic/kibana/issues/157462\r\n\r\nWith these
changes we fix the legacy actions data loss (on migration)\r\nissue. One
of the first steps of the migration we retrieve legacy\r\nactions and
immediately delete them. Then we do validation which might\r\nthrow an
exception and all legacy actions will be lost in this case.\r\n\r\nAs a
solution we will do legacy actions validation before deleting
them\r\nand throwing exception in case those are broken. This means that
in case\r\nlegacy action is broken user will need to export the rule,
fix it\r\nmanually and import it again. Or just re-create it from
scratch.\r\n\r\n\r\nhttps://github.com/elastic/kibana/assets/2700761/a23f5d43-3758-4ab7-8e63-bd93016e338d","sha":"8ca90fbfc3f5f201e12053d9675c41b0906b8f9e","branchLabelMapping":{"^v8.10.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:fix","Team:ResponseOps","Team:
SecuritySolution","ci:cloud-deploy","Team:Detection
Engine","v8.10.0","v8.9.1"],"number":160798,"url":"https://github.com/elastic/kibana/pull/160798","mergeCommit":{"message":"[Security
Solution] Legacy actions are deleted when user tries to save a rule and
the run action interval is slower than rule run interval (#160798)\n\n##
Summary\r\n\r\nOriginal ticket:
https://github.com/elastic/kibana/issues/157462\r\n\r\nWith these
changes we fix the legacy actions data loss (on migration)\r\nissue. One
of the first steps of the migration we retrieve legacy\r\nactions and
immediately delete them. Then we do validation which might\r\nthrow an
exception and all legacy actions will be lost in this case.\r\n\r\nAs a
solution we will do legacy actions validation before deleting
them\r\nand throwing exception in case those are broken. This means that
in case\r\nlegacy action is broken user will need to export the rule,
fix it\r\nmanually and import it again. Or just re-create it from
scratch.\r\n\r\n\r\nhttps://github.com/elastic/kibana/assets/2700761/a23f5d43-3758-4ab7-8e63-bd93016e338d","sha":"8ca90fbfc3f5f201e12053d9675c41b0906b8f9e"}},"sourceBranch":"main","suggestedTargetBranches":["8.9"],"targetPullRequestStates":[{"branch":"main","label":"v8.10.0","labelRegex":"^v8.10.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/160798","number":160798,"mergeCommit":{"message":"[Security
Solution] Legacy actions are deleted when user tries to save a rule and
the run action interval is slower than rule run interval (#160798)\n\n##
Summary\r\n\r\nOriginal ticket:
https://github.com/elastic/kibana/issues/157462\r\n\r\nWith these
changes we fix the legacy actions data loss (on migration)\r\nissue. One
of the first steps of the migration we retrieve legacy\r\nactions and
immediately delete them. Then we do validation which might\r\nthrow an
exception and all legacy actions will be lost in this case.\r\n\r\nAs a
solution we will do legacy actions validation before deleting
them\r\nand throwing exception in case those are broken. This means that
in case\r\nlegacy action is broken user will need to export the rule,
fix it\r\nmanually and import it again. Or just re-create it from
scratch.\r\n\r\n\r\nhttps://github.com/elastic/kibana/assets/2700761/a23f5d43-3758-4ab7-8e63-bd93016e338d","sha":"8ca90fbfc3f5f201e12053d9675c41b0906b8f9e"}},{"branch":"8.9","label":"v8.9.1","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Ievgen Sorokopud <ievgen.sorokopud@elastic.co>
@e40pud e40pud added the fixed label Aug 3, 2023
@e40pud
Copy link
Contributor

e40pud commented Aug 3, 2023

This fix has been merged into main and 8.9 branches.
cc @vgomez-el @MadameSheema @yctercero @nastasha-solomon

@MadameSheema MadameSheema added v8.9.1 and removed v8.9.0 labels Aug 3, 2023
@ghost
Copy link

ghost commented Aug 14, 2023

Hi @MadameSheema

we have validated this issue by upgrading the 8.8.0 to 8.9.0 and found that issue is still occuring . While duplicate the rule with legacy rule action there are error ❌

Moreover we are not able to add the legacy action directly on 8.9.1 directly as PR steps are not running on the 8.9 rule.

Build Details:

Version: 8.9.1 BC1
Commit: 6c664aeb22673e6eb42348ea50b5a098509f7deb
Build: 64802

Additional observation of 8.9.1 Direct Testing

we are following this pr steps to add legacy rule action ( in our case it was slack ) to rule present on 8.9.1 by running the post_legacy action.js <alert_id> but on 8.9.1 this script is returning error and not adding the rule action to target rule id passed along the script.

tried with different rule ID

image

Screen-Cast

Rules.-.Kibana.Mozilla.Firefox.2023-08-11.17-06-38.mp4

@MadameSheema
Copy link
Member

@e40pud may you please help @karanbirsingh-qasource to validate the fix? Thanks!

@yctercero
Copy link
Contributor

@karanbirsingh-qasource would it be possible to validate by first creating a legacy action in 7.16 then upgrading to latest? I only ask this because even the API to create legacy actions does not fully mock out the actual behavior so the best way to test legacy actions stuff accurately is to go through the flow of creating an action in <7.16 (before the new action migration code was introduced) and then upgrading.

@e40pud
Copy link
Contributor

e40pud commented Aug 14, 2023

@karanbirsingh-qasource It is expected that duplication does not work with the broken legacy actions. Since action is not valid there is no good solution to fix it programatically on migration. Thus, we prevent user from migrating legacy actions in such cases to avoid data loss. Any operation that triggers legacy action migration (enable, disable, rule update, rule duplicate etc.) will throw an exception about invalid actions state.

@e40pud
Copy link
Contributor

e40pud commented Aug 14, 2023

I will have a look why scripts to create legacy actions don't work in 8.9, but as @yctercero mentioned we can always create invalid legacy actions in older kibana and upgrade it to latest.

@ghost
Copy link

ghost commented Aug 16, 2023

thanks @yctercero for the comment sure we have checked the issue by upgrading from 7.16.0 to 8.9.1 ( we have to upgrade 7.16.0 first to 7.17.2 in order to upgrade to 8.9.1) and found the issue still occuring

we are getting Failed to migrate legacy actions for SIEM rule 16243f40-3b2f-11ee-9589-9dd09d7bfb2b: Failed to validate actions due to the following error: Action frequency cannot be shorter than the schedule interval of 5m: default (1m) on editing/enabling the rule with legacy action.

Build Details:

Version: 7.16.0
Commit:b7b724b4f677db8b3ca0b7d5a35a822fef9766a0
Build:46197
Version: 8.9.1
Commit:6c664aeb22673e6eb42348ea50b5a098509f7deb
Build: 64802

Observations:

  • Created slack connector on 7.16.0
  • Ran below query in Dev Tool and fetched the above connector action id
GET .kibana/_search
{
  "query": {
    "term": {
      "type": "action"
    }
  }
}
  • Edited the one_action.json and replace with above fetched the action id under "id" field
  • Created custom rule by the name of cmd_custom_rule on kibana but not added any connector to it
  • Copied the rule id from the browser URL
  • Ran the post_legacy_notification.sh <above copied rule id>
  • Slack action got added to rule
Rules.-.Kibana.Mozilla.Firefox.2023-08-15.11-31-59.mp4
  • Generated some alert to test slack connector working on 7.16.0 instance ✔️
    image

  • Upgraded the instance to 7.17.2 first due to below attached reason
    image

  • Upgraded the instance to 8.9.1

  • Checking legacy action present on upgrade kibana version 8.9.1

Console.-.Dev.Tools.-.Elastic.Mozilla.Firefox.Private.Browsing.2023-08-15.12-01-16.mp4
  • Tried to enable existing rule cmd_custom_rule on kibana 8.9.1 but throws error ❌ Failed to migrate legacy actions for SIEM rule 16243f40-3b2f-11ee-9589-9dd09d7bfb2b: Failed to validate actions due to the following error: Action frequency cannot be shorter than the schedule interval of 5m: default (1m)
Rules.-.Kibana.Mozilla.Firefox.Private.Browsing.2023-08-15.12-01-45.mp4
Rule Enable Error Logs

{
  "name": "Error",
  "body": {
    "statusCode": 500,
    "error": "Internal Server Error",
    "message": "Bulk edit failed",
    "attributes": {
      "errors": [
        {
          "message": "Failed to migrate legacy actions for SIEM rule 16243f40-3b2f-11ee-9589-9dd09d7bfb2b: Failed to validate actions due to the following error: Action frequency cannot be shorter than the schedule interval of 5m: default (1m)",
          "status_code": 400,
          "rules": [
            {
              "id": "16243f40-3b2f-11ee-9589-9dd09d7bfb2b",
              "name": "cmd_custom_rule"
            }
          ]
        }
      ],
      "results": {
        "updated": [],
        "created": [],
        "deleted": [],
        "skipped": []
      },
      "summary": {
        "failed": 1,
        "succeeded": 0,
        "skipped": 0,
        "total": 1
      }
    }
  },
  "stack": "{\n  \"statusCode\": 500,\n  \"error\": \"Internal Server Error\",\n  \"message\": \"Bulk edit failed\",\n  \"attributes\": {\n    \"errors\": [\n      {\n        \"message\": \"Failed to migrate legacy actions for SIEM rule 16243f40-3b2f-11ee-9589-9dd09d7bfb2b: Failed to validate actions due to the following error: Action frequency cannot be shorter than the schedule interval of 5m: default (1m)\",\n        \"status_code\": 400,\n        \"rules\": [\n          {\n            \"id\": \"16243f40-3b2f-11ee-9589-9dd09d7bfb2b\",\n            \"name\": \"cmd_custom_rule\"\n          }\n        ]\n      }\n    ],\n    \"results\": {\n      \"updated\": [],\n      \"created\": [],\n      \"deleted\": [],\n      \"skipped\": []\n    },\n    \"summary\": {\n      \"failed\": 1,\n      \"succeeded\": 0,\n      \"skipped\": 0,\n      \"total\": 1\n    }\n  }\n}",
  "message": "Internal Server Error"
}

  • Editing the rule and clicking on save throws error ❌ But in our case the connector dont get removed from the rule
Editing Rule Error Logs

{
  "name": "Error",
  "body": {
    "message": "Failed to migrate legacy actions for SIEM rule 16243f40-3b2f-11ee-9589-9dd09d7bfb2b: Failed to validate actions due to the following error: Action frequency cannot be shorter than the schedule interval of 5m: default (1m)",
    "status_code": 400
  },
  "message": "Bad Request",
  "stack": "http_fetch_error_HttpFetchError@https://c495067b1d5b440fbbefde3bb3e8acea.europe-west1.gcp.cloud.es.io:9243/64802/bundles/core/core.entry.js:1:270487\nfetchResponse@https://c495067b1d5b440fbbefde3bb3e8acea.europe-west1.gcp.cloud.es.io:9243/64802/bundles/core/core.entry.js:1:274385\n"
}

c.c @MadameSheema @vgomez-el

@e40pud
Copy link
Contributor

e40pud commented Aug 16, 2023

All the described behaviours above are expected.

The main goal is to avoid data loss (legacy actions) during the migration. Thus, we throw an exception when user tries to migrate the rule with invalid legacy actions to prevent legacy actions to be removed.

There will be a known issue added to security docs elastic/security-docs#3572 saying that Rule changes can’t be saved if the rule’s action frequency is shorter than the rule’s run interval. (elastic/security-docs#3722)

@ghost
Copy link

ghost commented Aug 16, 2023

ok @e40pud thanks for looking into the observation. so can we close this issue or not and track other issue separately.

@e40pud
Copy link
Contributor

e40pud commented Aug 17, 2023

I would say yes. Thank you for performing thorough testing!

@ghost ghost closed this as completed Aug 17, 2023
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fixes for quality problems that affect the customer experience fixed impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. Team:Detection Engine Security Solution Detection Engine Area Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.9.1
Projects
None yet
Development

No branches or pull requests

6 participants