Separate "Metricbeat index aliases" from "xpack.enabled fork removal", revert fork removal #26480

Open
jasonrhodes opened this issue Jun 24, 2021 · 8 comments
Labels: Feature:Stack Monitoring, Team:Integrations, Team:Obs-DC

Comments

@jasonrhodes
Member

jasonrhodes commented Jun 24, 2021

UPDATED: 28-10-2021

Related: #24427

Original Context

In this PR: #19747, we outlined a number of requirements for Stack Monitoring data collection.

This PR prepares the elasticsearch/node_stats metricset (without setting xpack.enabled: true) to collect and index data for the Stack Monitoring UI. Concretely, it does the following:

  • introduces new fields, collecting data needed by the Stack Monitoring UI. These fields are named following Metricbeat/ECS naming conventions.
  • introduces field aliases from current fields known to the Stack Monitoring UI to the fields defined by the metricset. This will aid transitioning the Stack Monitoring UI in 7.x to be driven off the metricset with xpack.enabled omitted. In 8.0 the aliases will be removed and the Stack Monitoring UI code will have to be overhauled to reference the new fields directly.
  • removes the data_xpack.go file implementing the xpack.enabled: true code path for the node_stats metricset. Going forward, the node_stats metricset will simply ignore the value of the xpack.enabled setting.

Points (1) and (2) mentioned here involve making sure that the various Stack Monitoring Metricbeat modules store data "correctly" when xpack.enabled: false (the default value), i.e. using ECS paths for all necessary fields, stored in the metricbeat-* indices. They also involve creating the necessary field aliases from the "legacy" paths to the "correct" ECS paths where the data will now live, so that a query against metricbeat-* for my.old.field.path is silently redirected to return the value of ecs.path.to.same.field in the same metricbeat-* indices.
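To make the legacy-to-ECS redirect concrete, here is a minimal Python sketch of what such an alias could look like in a mappings body. It uses the placeholder paths from the paragraph above; the `long` field type, the flat dotted layout, and the `build_mappings` / `resolve` helpers are assumptions for illustration only, not the actual Metricbeat templates.

```python
from typing import Any, Dict

# Placeholder field names taken from the issue text, not real Metricbeat fields.
LEGACY_PATH = "my.old.field.path"
ECS_PATH = "ecs.path.to.same.field"


def build_mappings(concrete: Dict[str, str], aliases: Dict[str, str]) -> Dict[str, Any]:
    """Return a mappings body with concrete fields plus alias fields.

    Dotted names are kept flat here for readability (an illustrative choice).
    """
    properties: Dict[str, Any] = {
        name: {"type": field_type} for name, field_type in concrete.items()
    }
    for legacy, target in aliases.items():
        if target not in concrete:
            raise ValueError(f"alias {legacy!r} points at unknown field {target!r}")
        # Elasticsearch's `alias` field type names its target in `path`.
        properties[legacy] = {"type": "alias", "path": target}
    return {"mappings": {"properties": properties}}


def resolve(mappings: Dict[str, Any], field: str) -> str:
    """Simulate the redirect: return the concrete field a query would read."""
    entry = mappings["mappings"]["properties"].get(field, {})
    return entry["path"] if entry.get("type") == "alias" else field


# Assumed field type `long`; the real type depends on the metricset.
mappings = build_mappings({ECS_PATH: "long"}, {LEGACY_PATH: ECS_PATH})
print(resolve(mappings, LEGACY_PATH))
```

Note that the alias only redirects the field path. It does not change which index the query has to target, which is the distinction that matters for point (3).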

That work needs to ship in the released modules ASAP so that we can test the aliases against real data as easily as possible.

Point (3) above is about removing the xpack.enabled fork altogether, i.e. "the node_stats metricset will simply ignore the value of the xpack.enabled setting". We don't want to ship this. Users who have opted into the xpack.enabled experience (most if not all of our Stack Monitoring users, because this is the currently recommended way to use Metricbeat collection for Stack Monitoring) would suddenly have their data written to a new index, and any visualizations would need to be updated to point to the metricbeat-* index (the field paths are redirected, but the index is not). In addition, the metricbeat-* indices do not apply ILM policies to this data in the same way that the .monitoring indices do, so this change would silently begin retaining data for a different amount of time, which could incur costs for users without their knowledge.

For this reason, we need to separate these two sets of changes so that we can backport/merge points (1) and (2) above, while reverting (3) only.

NOTE: We have decided to continue supporting both internal collection and Metricbeat (xpack.enabled: true) collection in 8.0.0 and beyond, which makes this ticket even more important.

Previous AC

AC:

  • When xpack.enabled is set to false (or not set at all), the Elasticsearch, Kibana, Logstash, and Beats modules each store their data in ECS-appropriate fields, with aliases from the legacy fields to the new ECS fields installed as well.
  • When xpack.enabled is set to true for any of these modules, they will continue to do what they have done in the past, i.e. store their data in the .monitoring-* indices using the legacy paths.
  • When Elastic Agent runs Metricbeat, it should set xpack.enabled to false and not expose that option to the user at all, and it should write to an appropriately named metrics-* data stream that has the mentioned field aliases installed in its mappings.

Updated AC:

  • When a user is running Metricbeat Standalone in 8.0+
    • With xpack.enabled set to true for a given module, that module's data is queryable from the .monitoring-{module}-mb* indices and can be queried using the "non-ECS" paths as filters, aggregations, etc.
    • With xpack.enabled not set, or set to false for a given module, that module's data does whatever you all think is best (we don't mind if this data is not visible in the Stack Monitoring UI, because it hasn't been before)
  • When a user is running Agent (with Metricbeat inside) in 8.0+
    • There is no way for that user to set xpack.enabled at all for these integrations/packages
    • The data is written to the relevant data stream for that integration/package
    • These data can be queried using the "non-ECS" paths as filters, aggregations, etc.
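One way to read the AC above is as a small decision table. The Python sketch below encodes it purely for illustration: `expected_behavior`, the `runner` values, and the `Expectation` type are hypothetical names, and the destination strings simply restate the AC wording rather than any real implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Expectation:
    destination: str
    # True if "non-ECS" (legacy) paths must be queryable; None if the AC leaves it open.
    legacy_paths_queryable: Optional[bool]


def expected_behavior(
    runner: str, module: str, xpack_enabled: Optional[bool] = None
) -> Expectation:
    """Return what the AC expects for a given collection mode in 8.0+."""
    if runner == "agent":
        if xpack_enabled is not None:
            # The AC says Agent users cannot set xpack.enabled at all.
            raise ValueError("xpack.enabled is not configurable under Agent")
        return Expectation("data stream for the integration/package", True)
    if runner == "standalone":
        if xpack_enabled:
            # Index pattern as written in the AC.
            return Expectation(f".monitoring-{module}-mb*", True)
        # Unset or false: behavior is left to the implementing team.
        return Expectation("unspecified by the AC", None)
    raise ValueError(f"unknown runner: {runner!r}")


print(expected_behavior("standalone", "kibana", xpack_enabled=True).destination)
```

The `None` value for the standalone case with xpack.enabled unset or false is deliberate: the AC explicitly leaves that behavior open.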
@botelastic botelastic bot added the needs_team label Jun 24, 2021
@matschaffer
Contributor

All 3 of those examples seem to say that there's a point in time past which Kibana will not display data collected using older mechanisms.

This is surprising to me since I would have expected aliases on the old data to make them available via the new indices/fields.

What am I missing here that requires a hard cut-over? And if we indeed need a hard cut, are we preparing something the user can use to reindex old data into the new format?

@jasonrhodes jasonrhodes added the Team:Integrations and Team:Obs-DC labels Jul 12, 2021
@elasticmachine
Collaborator

Pinging @elastic/obs-dc (Team:Obs-DC)

@elasticmachine
Collaborator

Pinging @elastic/integrations (Team:Integrations)

@botelastic botelastic bot removed the needs_team label Jul 12, 2021
@jasonrhodes
Member Author

@matschaffer the goal was to not carry field aliases with us into the future in such a way that they would need to be maintained through the 9.x release cycle, so using the 8.x breaking change switch was the recommended approach at the time. Since then, because we've run short of time before the 8.x release and because "make it minor" has changed things slightly, we're considering a different approach overall.

@masci / @paulb-elastic We should quickly chat again before anyone picks this up. I noticed it didn't have any labels on it, so I'm not sure if it's on anybody's radar, but @ravikesarwani and I have just drafted a new plan for Stack Monitoring migration that will change some of this.

@elasticmachine
Collaborator

Pinging @elastic/stack-monitoring (Stack monitoring)

@sayden sayden self-assigned this Aug 9, 2021
@jasonrhodes
Member Author

@sayden do you see this making it in for 7.15? Just trying to see if we should get the metrics-* stuff into the UI by then or not -- thanks!

@jasonrhodes jasonrhodes changed the title from 'Separate "Metricbeat index updates" from "xpack.enabled fork removal"' to 'Separate "Metricbeat index aliases" from "xpack.enabled fork removal", revert fork removal' on Oct 1, 2021
@jasonrhodes
Member Author

I've updated this ticket's description and AC to reflect our new strategy that supports more collection modes in 8.x. We still need the work outlined in this ticket so that we can achieve this support.

@ravikesarwani

@sayden Is there any update on this? Looks like this is a precursor to delivering Elasticsearch, Kibana or Logstash packages.

5 participants