Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RAC] [Rule Registry] Update old index mappings and index templates on change #108941

Closed
Tracked by #101016
xcrzx opened this issue Aug 17, 2021 · 14 comments
Closed
Tracked by #101016
Assignees
Labels
bug Fixes for quality problems that affect the customer experience Team:Detection Alerts Security Detection Alerts Area Team Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services Theme: rac label obsolete v7.16.0

Comments

@xcrzx
Copy link
Contributor

xcrzx commented Aug 17, 2021

ℹ️ The description has been updated to reflect to most recent plans and understanding of the @elastic/logs-metrics-ui team.

📓 Summary

It might happen that index template configuration changes on the solutions side (e.g., added new component template):

ruleDataClient = ruleDataService.initializeIndex({
  feature: SERVER_APP_ID,
  registrationContext: 'security',
  dataset: Dataset.alerts,
  componentTemplateRefs: [ECS_COMPONENT_TEMPLATE_NAME],
-  componentTemplates: [],
+  componentTemplates: [
+   {
+     name: 'mappings',
+     version: 0,
+     mappings: mappingFromFieldMap(
+        { ...alertsFieldMap, ...rulesFieldMap, ...ctiFieldMap },
+        false
+      ),
+    },
+  ],
  indexTemplate: {
    version: 0,
  },
  secondaryAlias: config.signalsIndex,
});

Changes like these, currently, have no effect if the index has been previously initialized. After initialization, all the subsequent changes to the index template are ignored. We need to find a solution to updated index templates if changes happen.

part of #101016
supersedes #110945

✔️ Acceptance criteria

  • the namespace-independent initialization steps for a given registration context are performed during the setup phase:
    • the ILM policy is created/updated (which is already the case)
    • the component template are created/updated (which is already the case)
    • namespace-specific index and index template operations don't take place here (need to be removed)
  • the namespace-specific initialization steps for a given namespace are performed when a namespace's writer is first created
    • index template is created/updated (needs to be implemented)
      • composed of the component templates matching the current version
      • with the current version in the _meta object
    • all indices backing the namespace-specific aliases are updated to have the new mappings
  • the writer is cached (keyed by namespace) within the RuleDataClient such that the initialization only happens once per namespace
  • the index update feature flag is removed
  • the writer.bulk (or successor methods) doesn't perform any initialization operation anymore

💡 Implementation hints

@xcrzx xcrzx added Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Theme: rac label obsolete labels Aug 17, 2021
@xcrzx xcrzx self-assigned this Aug 17, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@xcrzx
Copy link
Contributor Author

xcrzx commented Aug 25, 2021

Summary of discussion with @weltenwort @banderror @marshallmain

We are going to implement the following changes to the bootstrapping logic:

  1. Perform the namespaced resources bootstrapping unconditionally when a user calls getWriter():
    1. Install or update index template
    2. Update existing index mappings
    3. Perform index rollover if needed
  2. Rename getWriter() to initializeWriter()
  3. Remove error handling/retry logic from the bulk() method
  4. Remove the mappings update logic from installIndexLevelResources() that runs during the Kibana startup.

@banderror
Copy link
Contributor

Probably "Perform index rollover if needed or create an initial concrete index if needed"

@xcrzx
Copy link
Contributor Author

xcrzx commented Aug 25, 2021

Probably "Perform index rollover if needed or create an initial concrete index if needed"

I think index creation should happen conditionally after the index template update:

@banderror banderror changed the title [RAC] Rule registry index bootstrapping: implement upgrades of existing index templates [RAC][Rule Registry] Index bootstrapping: implement upgrades of existing index templates Sep 1, 2021
@banderror banderror removed the v7.16.0 label Sep 1, 2021
@weltenwort weltenwort added the Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services label Sep 15, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui)

@weltenwort weltenwort added v7.16.0 and removed Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. labels Sep 15, 2021
@weltenwort weltenwort changed the title [RAC][Rule Registry] Index bootstrapping: implement upgrades of existing index templates [RAC] [Rule Registry] Index bootstrapping: implement upgrades of existing index templates Sep 15, 2021
@weltenwort
Copy link
Member

I adopted this issue for the work that the @elastic/logs-metrics-ui team is planning to do for 7.16.0. The description now represents the most recent plans and understanding.

@weltenwort weltenwort changed the title [RAC] [Rule Registry] Index bootstrapping: implement upgrades of existing index templates [RAC] [Rule Registry] Update index templates and aggressive rollover during namespace-level bootstrapping Sep 15, 2021
@weltenwort weltenwort changed the title [RAC] [Rule Registry] Update index templates and aggressive rollover during namespace-level bootstrapping [RAC] [Rule Registry] Update index templates and perform rollover during namespace-level bootstrapping Sep 15, 2021
@weltenwort weltenwort changed the title [RAC] [Rule Registry] Update index templates and perform rollover during namespace-level bootstrapping [RAC] [Rule Registry] Update old index mappings and index templates on change Sep 16, 2021
@Kerry350 Kerry350 self-assigned this Sep 20, 2021
@banderror
Copy link
Contributor

I adopted this issue for the work that the @elastic/logs-metrics-ui team is planning to do for 7.16.0. The description now represents the most recent plans and understanding.

Thank you for the clear AC @weltenwort, looks great 👍

I have 2 points which I'd like to double-check:

@Kerry350
Copy link
Contributor

@banderror

@weltenwort is away on PTO for the next 2 weeks, so I'll do my best to fill in and answer questions.

"the index update feature flag is removed" - are you sure about that, considering the ongoing debate with Brandon around the "moderately aggressive rollover strategy" vs "additive only changes strategy"?

At the very end of last week, in the interest of shipping something for 7.16, we decided to move forward with the additive only changes approach suggested by Brandon. (With the understanding we can still pivot to the "moderately aggressive rollover strategy" in the future, if needed).

This would be simpler as we don't need to build the document downgrade / versioning mechanism yet. Whilst simultaneously providing more time for various parties to discuss the backwards compatibility story further, write RFCs etc.

This ticket: #112329 remains unnecessary for now (but would be needed if we change direction moving forward).

Based on this, it seems appropriate to remove the feature flag. And for now #110795 wouldn't be needed.

Are we on the same page that we're not going to change the current index upgrade logic itself ("update mappings of the write index; rollover if failed") as part of this ticket?

The part that would change (as I understand it) is that we would no longer rollover. If something goes wrong in applying the mapping updates then it is basically a catastrophic failure. We will retry in the case of something....random...going wrong (e.g. not a mapping conflict) but after that we would need to fail in such a way that we sufficiently log the issue, and make sure we don't break the overall alerting framework (e.g. executor runs etc). We still need to discuss this failure recovery more, and also how we can test for incompatible changes from developers (as in someone accidentally adds something non-additive).

cc @jasonrhodes

@mgiota
Copy link
Contributor

mgiota commented Sep 21, 2021

@Kerry350 Very well summary of what we have been discussing over last few days. One more thing I have in my notes is the lazy vs eager bootstrapping phase. Did we agree on one versus the other?

@Kerry350
Copy link
Contributor

@mgiota

One more thing I have in my notes is the lazy vs eager bootstrapping phase. Did we agree on one versus the other?

If I remember correctly it was decided that switching from eager to lazy should be possible, but we didn't necessarily make a concrete decision on whether that would happen. I would say this will become more apparent with implementation.

@jasonrhodes
Copy link
Member

Thank you @Kerry350! @banderror / @xcrzx thanks so much for all of your help with wrangling all of this. I know we are still having some ongoing conversations about how to make this process work for everyone. We're going to start in on this implementation but with the understanding that things could still change, possibly. Thanks again for your patience.

@marshallmain
Copy link
Contributor

Re: eager vs lazy index initialization (#108941 (comment)) @XavierM and @stephmilovic are working on Timeline and are finding that it may be simpler to implement UI elements that interact with alerts as data indices if the indices are created earlier. It's more complicated from a UI perspective if the indices may not be created until the first alert is triggered, whereas if we know the indices will be created when the first rule runs we can simplify things.

@banderror banderror added the Team:Detection Alerts Security Detection Alerts Area Team label Oct 11, 2021
@banderror
Copy link
Contributor

Hey everyone, I removed this ticket from the backlog of the Detection Rules area (@elastic/security-detections-response-rules), and added to the backlog of Detection Alerts area (Team:Detection Alerts label) to keep track of it from the Security side. Please ping @peluja1012 and @marshallmain if you have any questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fixes for quality problems that affect the customer experience Team:Detection Alerts Security Detection Alerts Area Team Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services Theme: rac label obsolete v7.16.0
Projects
None yet
Development

No branches or pull requests

8 participants