Native support for data stream naming scheme and related assets in Kibana #134883

ruflin · 2022-06-22T08:23:34Z

With 7.13, data streams and component templates were introduced in Elasticsearch as a feature. Along with them came the data stream naming scheme and Elasticsearch loading templates for logs-*-*, metrics-*-*, etc. The data stream naming scheme describes how data should be organised and indexed, and Fleet additionally follows conventions for how index templates and ingest pipelines are named and how they are extended.

Up to today, this is all based on conventions and enforced when modifying anything through Fleet. But as soon as users access the stack directly through the Stack Management UI, these conventions are no longer enforced. Currently (8.3), the Stack Management UI is able to show which templates are loaded directly by Elasticsearch or Fleet because a tag "managed: true" exists. But that is where its understanding of how these assets are organised stops.

As the data stream naming scheme is not only a Fleet concept but is now at the core of Elasticsearch and the recommended way to ingest and manage data in Elasticsearch, the Stack Management UI should understand how it works in order to provide a better UX around it. There are many aspects to this inside the Stack Management UI but also outside it, for example in the unified search bar. Here is an example of what could be done in the Stack Management UI.

User story

If a user goes to the Stack Management UI and tries to modify any of the templates belonging to the data stream naming scheme, the Stack Management UI provides guidance. Index templates and component templates that belong together are grouped together in the UI. If a user tries to modify one of the managed assets, Stack Management guides the user to use the @custom template instead and creates it if needed. The same applies to ingest pipelines.

The main goal for all flows is to help users make the right decisions, keep them from shooting themselves in the foot, and provide a good UX around the conventions we have.
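To make the user story a bit more concrete, here is a minimal sketch of the redirect it describes. Everything in it is hypothetical: the function name `resolve_edit_target`, the in-memory `templates` dictionary, and in particular the assumption that the user-editable template is named after the part of the managed template's name before any "@", with "@custom" appended. It only illustrates the flow and is not how Kibana or Fleet implement it.

```python
from typing import Dict, Tuple

# Assumption: the user-editable companion of a managed template ends in "@custom".
CUSTOM_SUFFIX = "@custom"


def resolve_edit_target(name: str, templates: Dict[str, dict]) -> Tuple[str, bool]:
    """Return (name of the template the user should edit, whether it was just created)."""
    template = templates[name]
    managed = template.get("_meta", {}).get("managed") is True
    if not managed:
        # Unmanaged templates can be edited directly.
        return name, False

    # Assumption: strip any existing "@..." suffix, then append "@custom".
    base_name = name.split("@", 1)[0]
    custom_name = base_name + CUSTOM_SUFFIX

    created = custom_name not in templates
    if created:
        # Create an empty, unmanaged template for the user's changes.
        templates[custom_name] = {"template": {}, "_meta": {"managed": False}}
    return custom_name, created
```

With this, an attempt to edit a managed template is pointed at its "@custom" counterpart, which is created on first use and reused afterwards.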

elasticmachine · 2022-06-22T08:23:36Z

Pinging @elastic/platform-deployment-management (Team:Deployment Management)

cjcenizal · 2022-06-22T17:50:36Z

Thanks @ruflin! I have questions about the naming scheme and how we're enforcing it. For clarity, I'm going to use "managed data stream" to refer to the data streams described by the naming scheme docs. According to these docs, a managed data stream must satisfy these criteria:

It must be named {type}-{dataset}-{namespace}.
It must have the field data_stream.type.
The value of the data_stream.type field must equal {type} from the data stream's name.
It must have the field data_stream.dataset.
The value of the data_stream.dataset field must equal {dataset} from the data stream's name.
It must have the field data_stream.namespace.
The value of the data_stream.namespace field must equal {namespace} from the data stream's name.

Is that right?

I just did some testing, and it looks like ES allows the user to create data streams and index templates that only partially meet these criteria. So it's easy for a user to create entities that kinda look like they're managed, but actually aren't. In order to accurately identify a data stream as managed, we'll need logic to assess whether it meets all of the above criteria. This logic will need to live on both the front-end (for any special behavior or presentation specific to managed data streams) and the back-end (for any APIs that perform operations on managed data streams).
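As a rough illustration of the kind of check this would require, here is a sketch that tests the seven criteria above against a data stream name and the values of its data_stream.* fields. The function name `meets_naming_scheme` and the flat dictionary input are made up for this example, and it assumes that none of the three name parts contains a hyphen, so that the name can be split unambiguously.

```python
from typing import Mapping

# The three parts of the name, in order: {type}-{dataset}-{namespace}.
NAME_PARTS = ("type", "dataset", "namespace")


def meets_naming_scheme(name: str, data_stream_fields: Mapping[str, str]) -> bool:
    """Check the name pattern and that each data_stream.* value matches the name.

    `data_stream_fields` holds the values of data_stream.type,
    data_stream.dataset and data_stream.namespace, keyed by the last segment.
    Assumption: no part contains "-", so splitting on "-" is unambiguous.
    """
    parts = name.split("-")
    if len(parts) != len(NAME_PARTS) or not all(parts):
        return False
    expected = dict(zip(NAME_PARTS, parts))
    # A missing field yields None, which never equals a non-empty name part.
    return all(data_stream_fields.get(key) == value for key, value in expected.items())
```

A data stream that only partially meets the criteria, such as one with the right name but no data_stream.namespace field, is rejected by this check.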

I think we can simplify this substantially if we can build a strong validation strategy into ES, to ensure that managed data streams and related entities are easy to create and identify, and behave as expected. For example:

Signifying managed data streams with a reserved _meta.managed: true field. Something like this will make it trivial to identify whether a data stream or other entity is managed, assuming that we have strong validation that ensures it will always meet the above criteria.
Validation that a request to create a managed data stream satisfies the criteria listed above.
Validation that a request to change any of a data stream's special mappings doesn't invalidate these criteria.
Validation that a request to create an index template that would generate a managed data stream also satisfies these criteria.

These are just a few thoughts off the top of my head. We'd probably need to do a more thorough analysis concerning other related entities like component templates and ingest pipelines. For example, a request to change a component template's mappings needs to be validated against the composed result in case it removes a required field. Ingest node pipelines that feed into managed data streams need to be validated to ensure they don't remove a managed data stream's special fields.
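To sketch what "validated against the composed result" could mean: merge the mappings of all component templates first, then check that the required data_stream.* fields are still present. The recursive merge below is a simplified stand-in for however Elasticsearch actually composes templates, and the function names are invented for this example.

```python
from typing import List

REQUIRED_FIELDS = ("type", "dataset", "namespace")


def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge two mapping dicts; values in `override` win on conflict."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def missing_required_fields(component_mappings: List[dict]) -> List[str]:
    """Compose the mappings in order and list required data_stream.* fields that are absent."""
    composed: dict = {}
    for mappings in component_mappings:
        composed = deep_merge(composed, mappings)
    data_stream_props = (
        composed.get("properties", {}).get("data_stream", {}).get("properties", {})
    )
    return [
        f"data_stream.{field}"
        for field in REQUIRED_FIELDS
        if field not in data_stream_props
    ]
```

A request to change one component template would be rejected if running this check on the proposed set of components returns a non-empty list.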

ruflin · 2022-06-24T06:48:58Z

The points you make above @cjcenizal are all correct. I like the idea of building this even deeper into Elasticsearch. But I think there is one aspect that you partially missed, which is around the UX. In an ideal world from my perspective, a user never has to know about the data stream naming scheme and all the conventions around it. If the user is using the Kibana UI or API, we hide it from the user. If a user wants to modify some mappings, we offer the user the option to add / remove / edit a mapping; which template it is stored in is not relevant, it just works.

Taking this to the data streams: A user wants to store nginx log data in Elasticsearch. We can first ask the user whether the data is logs or metrics, then what the user would like to call this data set. We create logs-nginx-default for the user and explain how to ship data there. But the user never created a data stream or had to learn what an index template is and what component templates are.
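As a small sketch of that flow: the two answers from the user are turned into a data stream name, and the user never types the name themselves. The function name, the "default" namespace fallback, and the restriction to the two types asked about here are assumptions for the example, as is the rule that no part may contain a hyphen.

```python
# Assumption: only the two choices offered in the question above; other types exist.
ALLOWED_TYPES = ("logs", "metrics")


def build_data_stream_name(data_type: str, dataset: str, namespace: str = "default") -> str:
    """Build {type}-{dataset}-{namespace} from the user's answers."""
    if data_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {ALLOWED_TYPES}, got {data_type!r}")
    for label, value in (("dataset", dataset), ("namespace", namespace)):
        # Assumption: parts must be non-empty and hyphen-free so the name stays parseable.
        if not value or "-" in value:
            raise ValueError(f"{label} must be non-empty and must not contain '-'")
    return f"{data_type}-{dataset}-{namespace}"


name = build_data_stream_name("logs", "nginx")  # -> "logs-nginx-default"
```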

ruflin added the Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more label Jun 22, 2022

ruflin mentioned this issue Jun 22, 2022

Ingest Pipelines UI should indicate managed pipelines and allow filtering #133382

Closed

yuliacech added the Feature:Index Management Index and index templates UI label Dec 8, 2023
