Improve flexibility over index.required_pipeline #49247

roncohen · 2019-11-18T14:29:50Z

Describe the feature:

Following up on #46847, there are a couple of cases where we want to ensure that a specific pipeline is run on every document ingested into an index. For example, you may want to set the event.ingested timestamp or ensure that the name of the API Key used is present in the document.

At the same time, we want to give users the flexibility they currently have to use a pipeline of their choosing to process the incoming data. We have index.required_pipeline, but it doesn't come with the flexibility we'd like.

@skearns64 suggested:

Sounds like we need an "append" pipeline, or an option to required to be "run first or run last"

If "append pipeline" means that Elasticsearch will automatically run the "append pipeline" on every indexed document after the pipeline specified with the request has been run, it sounds like the "append pipeline" option would solve the use-cases I'm familiar with.
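To make the ordering described above concrete, here is a minimal Python sketch of the proposed semantics. It is a simulation, not Elasticsearch code: `index_document`, `lowercase_message`, and `set_ingested` are hypothetical names made up for illustration, and the timestamp is a fixed placeholder.

```python
from typing import Callable, Dict, Optional

Doc = Dict[str, object]
Pipeline = Callable[[Doc], Doc]


def lowercase_message(doc: Doc) -> Doc:
    # Hypothetical user-chosen pipeline passed with the index request.
    out = dict(doc)
    out["message"] = str(out["message"]).lower()
    return out


def set_ingested(doc: Doc) -> Doc:
    # Hypothetical "append pipeline": stamps every document.
    # The timestamp is a fixed placeholder so the example is deterministic.
    out = dict(doc)
    out["event.ingested"] = "2019-11-18T14:29:50Z"
    return out


def index_document(
    doc: Doc,
    request_pipeline: Optional[Pipeline] = None,
    append_pipeline: Optional[Pipeline] = None,
) -> Doc:
    # Assumed semantics: the pipeline named in the request (if any) runs
    # first, then the append pipeline always runs, regardless of the request.
    if request_pipeline is not None:
        doc = request_pipeline(doc)
    if append_pipeline is not None:
        doc = append_pipeline(doc)
    return doc


with_request = index_document(
    {"message": "HELLO"},
    request_pipeline=lowercase_message,
    append_pipeline=set_ingested,
)
without_request = index_document({"message": "HELLO"}, append_pipeline=set_ingested)
```

In this sketch the user keeps full control over the request pipeline, while the appended step cannot be skipped by omitting or changing it.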

I've not heard a compelling use case for "run first", but one could exist.

Some questions that come to mind:

- Does "append pipeline" let users specify a list of pipelines to append, or only a single pipeline? You can achieve the same functionality by combining pipelines, but I can imagine it would be convenient to be able to specify a list.
- How will it work with index.default_pipeline and index.required_pipeline?
- I don't know that index.required_pipeline has any use case that index.append_pipeline does not solve, but that could be due to lack of context on my part.

cc @ruflin @webmat @clintongormley @jasontedor @bytebilly

(first Elasticsearch issue! 🎉 )


elasticmachine · 2019-11-18T14:48:08Z

Pinging @elastic/es-core-features (:Core/Features/Ingest)

webmat · 2019-11-18T15:16:16Z

One very important aspect of this added flexibility is specifically to let the user add their own pipelines around stack-provided pipelines without having to modify the existing pipeline.

I've been feeling that need for a while, and one way I've been thinking about this is to provide "before hooks" or "after hooks", where users can insert their own pipelines anywhere they need.

When users are forced to modify pipelines provided by products in the stack -- like Beats modules -- they're signing up to permanently having to re-apply their changes whenever they upgrade the product. Or worse, they won't remember, and whatever they improved will be lost when they upgrade.

Approaching this in a generic fashion, like before/after hooks, would let users work around not only stack-provided pipelines, but also around their own team structure & areas of responsibility.

Consider this example:

- A stack-provided ingest pipeline, like the one in the Filebeat Apache httpd module, is used for multiple web apps.
- The team managing a central pipeline hooks after the module's pipeline to perform additional work relevant to all deployments.
  - E.g. adjusting older Beats module outputs to newer versions of ECS
- The team managing one of the Apache deployments has its own adjustments it wants to make before the default processing kicks in.
  - E.g. In a prior life I would append 20-ish kv after a default apache log

With this in mind, I think it would be great to offer the ability to hook before/after via the API call, and via the index setting.
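The Apache example above can be sketched in Python as a chain of stages. This is only an illustration of the hook idea, not an Elasticsearch API: `run_with_hooks`, `team_before_hook`, `module_pipeline`, and `central_after_hook` are hypothetical names, and the log line is a simplified stand-in for a real Apache log.

```python
from typing import Callable, Dict, Iterable

Doc = Dict[str, object]
Stage = Callable[[Doc], Doc]


def run_with_hooks(
    doc: Doc,
    main: Stage,
    before: Iterable[Stage] = (),
    after: Iterable[Stage] = (),
) -> Doc:
    # Run the before hooks, then the untouched main pipeline, then the after hooks.
    for stage in (*before, main, *after):
        doc = stage(dict(doc))  # copy so no stage mutates its caller's document
    return doc


def team_before_hook(doc: Doc) -> Doc:
    # Deployment team: peel key=value pairs off the line so the
    # stack-provided pipeline sees the default log format.
    tokens = str(doc["message"]).split()
    for token in tokens:
        if "=" in token:
            key, value = token.split("=", 1)
            doc[key] = value
    doc["message"] = " ".join(t for t in tokens if "=" not in t)
    return doc


def module_pipeline(doc: Doc) -> Doc:
    # Stand-in for the stack-provided pipeline; the user never edits it.
    method, path, status = str(doc["message"]).split()
    doc.update({"method": method, "path": path, "status": status})
    return doc


def central_after_hook(doc: Doc) -> Doc:
    # Central team: move the older field name to the ECS field name.
    doc["http.response.status_code"] = int(str(doc.pop("status")))
    return doc


result = run_with_hooks(
    {"message": "GET /index.html 200 team=web region=eu"},
    main=module_pipeline,
    before=[team_before_hook],
    after=[central_after_hook],
)
```

Because the module pipeline is passed through unchanged, upgrading it would not require re-applying either team's customizations in this model.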

webmat · 2019-11-18T15:18:00Z

Modifying a stack pipeline is possible but is nasty, as you can see here (scroll to "Ingest Node Pipeline").

roncohen added >enhancement >feature labels Nov 18, 2019

pgomulka added the :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP label Nov 18, 2019

jasontedor self-assigned this Nov 18, 2019

jasontedor mentioned this issue Nov 21, 2019

Replace required pipeline with final pipeline #49470

Merged

jasontedor closed this as completed in #49470 Nov 22, 2019

This was referenced Feb 3, 2020

[meta] 7.6 release elastic/elasticsearch-net#4340

Closed

[meta] 7.6 release elastic/elasticsearch-net#4341

Closed
