Automatic tuning of mappings of data streams #87469
Labels: `:Data Management/Data streams`, `>feature`, `Team:Data Management`
I've been in a few discussions about improving index templates to change the index/search trade-off via runtime fields and doc-value-only fields. These discussions are hard to move forward because the right choice depends on how end users leverage their data, which isn't known ahead of time. The same fields might be used very differently depending on whether and how end users leverage SIEM and alerting, e.g. whether there are custom rules.
Since we can't know ahead of time how end users will leverage the data, and since this information can change over time, I'm considering making data streams able to tune their index templates based on usage. The high-level idea I have in mind is that data streams could look at usage statistics upon rollover and update the index template if there is a mismatch between search-time usage of the fields and how they are mapped. This way, the next index should hopefully get mappings that better fit the sort of searches that run.
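To make the rollover-time check more concrete, here is a minimal sketch of one possible mismatch rule: a `keyword` field that was never filtered on becomes doc-value-only (`index: false`) in the mappings for the next backing index. The `FieldUsage` counters, the `tune_on_rollover` function, the `min_filter_queries` threshold and the `event.code` example field are all hypothetical names chosen for illustration, not an existing Elasticsearch API.

```python
from dataclasses import dataclass


@dataclass
class FieldUsage:
    """Hypothetical per-field counters collected since the last rollover."""
    filter_queries: int = 0  # searches that filtered on the field
    aggregations: int = 0    # searches that aggregated or sorted on the field


def tune_on_rollover(mappings, usage, min_filter_queries=1):
    """Return a copy of `mappings` with `index` adjusted to observed usage.

    Only `keyword` fields are touched in this sketch: fields filtered on
    fewer than `min_filter_queries` times (an assumed threshold) become
    doc-value-only, and fields that are filtered on stay indexed.
    """
    tuned = {}
    for field, mapping in mappings.items():
        stats = usage.get(field, FieldUsage())
        new_mapping = dict(mapping)  # leave the current template untouched
        if mapping.get("type") == "keyword":
            new_mapping["index"] = stats.filter_queries >= min_filter_queries
        tuned[field] = new_mapping
    return tuned


current = {
    "event.code": {"type": "keyword"},
    "message": {"type": "text"},
}
observed = {"event.code": FieldUsage(filter_queries=0, aggregations=12)}
next_mappings = tune_on_rollover(current, observed)
```

Because the function returns new mappings instead of mutating the current ones, existing backing indices are unaffected and only the next index picks up the change.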
Some interesting things we could do that way:
- Choosing between `text` and `match_only_text` fields depending on how frequently positional queries run.
- Choosing between `keyword` and `wildcard` depending on the cardinality of the field and whether users run infix queries.
- Setting `eager_global_ordinals` on fields that are frequently used for `terms`/`composite` aggregations.

More thoughts/notes:

- … `host.name` indexed even if it hasn't been used for filtering recently.
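
The three mapping choices listed above could be expressed as simple per-field heuristics. The sketch below is only an illustration: the statistic names (`searches`, `positional_queries`, `infix_queries`, `cardinality`, `terms_or_composite_aggs`) and all thresholds are assumptions, and real usage statistics would likely look different.

```python
def suggest_mapping(current, stats, positional_ratio=0.01, infix_ratio=0.05,
                    high_cardinality=1_000_000, agg_ratio=0.10):
    """Suggest a mapping for the next backing index from usage statistics.

    `current` is the field's current mapping and `stats` holds hypothetical
    counters gathered since the last rollover. All thresholds are made up.
    """
    searches = max(stats.get("searches", 0), 1)  # avoid division by zero
    suggested = dict(current)
    field_type = current.get("type")

    if field_type in ("text", "match_only_text"):
        # Positional queries (e.g. phrase queries) benefit from `text`.
        positional = stats.get("positional_queries", 0) / searches
        suggested["type"] = (
            "text" if positional >= positional_ratio else "match_only_text"
        )
    elif field_type in ("keyword", "wildcard"):
        infix = stats.get("infix_queries", 0) / searches
        cardinality = stats.get("cardinality", 0)
        if infix >= infix_ratio and cardinality >= high_cardinality:
            # High-cardinality field with frequent infix queries.
            suggested["type"] = "wildcard"
            suggested.pop("eager_global_ordinals", None)
        else:
            suggested["type"] = "keyword"
            aggs = stats.get("terms_or_composite_aggs", 0) / searches
            if aggs >= agg_ratio:
                suggested["eager_global_ordinals"] = True
            else:
                suggested.pop("eager_global_ordinals", None)
    return suggested


# A text field that never sees positional queries.
message = suggest_mapping({"type": "text"}, {"searches": 1000})
# A keyword field that is heavily used in terms/composite aggregations.
service = suggest_mapping(
    {"type": "keyword"},
    {"searches": 100, "terms_or_composite_aggs": 50},
)
```

Ratios rather than absolute counts are used here so that the same thresholds behave similarly on busy and quiet data streams; whether that is the right normalization is an open design question.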