[Feature] Dead letter queue #86170

ruflin · 2022-04-26T08:20:52Z

With Elastic Agent we are fully embracing data streams and the data stream naming scheme. In many scenarios, we control the ingestion data structure and the mappings put in place. But because we encourage everyone to use the data stream naming scheme, and for example put a basic ECS template in place for logs-*-*, conflicts can occur at ingest time. One cause is a field foo that is mapped as an object while someone tries to ingest foo as a keyword.

Currently, Elasticsearch just rejects the data with an error. Instead, it would be nice to be able to configure a dead letter queue where these events end up. This ensures that the client does not have to deal with mapping conflicts and that all data is ingested.
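Until something like this exists on the server side, a client can approximate the idea by inspecting the bulk response and diverting rejected documents itself. The following is a minimal sketch, assuming the usual `_bulk` response shape (an `items` list, in request order, whose entries carry an `error` object on failure). The function name `split_bulk_result`, the sample documents, and the error text are hypothetical, and none of this is an existing Elasticsearch feature.

```python
from typing import Any


def split_bulk_result(
    docs: list[dict[str, Any]], response: dict[str, Any]
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split the docs of one bulk request into accepted and dead-lettered.

    Assumes response["items"] is in the same order as docs.
    """
    accepted: list[dict[str, Any]] = []
    dead_letters: list[dict[str, Any]] = []
    for doc, item in zip(docs, response["items"]):
        # Each item is keyed by its action name ("index", "create", ...).
        result = next(iter(item.values()))
        if "error" in result:
            # Keep the original event plus the reason it was rejected,
            # so it can be debugged and re-ingested later.
            dead_letters.append({
                "original": doc,
                "error_type": result["error"].get("type"),
                "error_reason": result["error"].get("reason"),
            })
        else:
            accepted.append(doc)
    return accepted, dead_letters


# Hypothetical example: foo is mapped as an object, the second doc sends a string.
docs = [{"foo": {"bar": 1}}, {"foo": "plain string"}]
response = {
    "errors": True,
    "items": [
        {"create": {"status": 201}},
        {"create": {"status": 400, "error": {
            "type": "mapper_parsing_exception",
            "reason": "illustrative rejection message for field [foo]",
        }}},
    ],
}
accepted, dead_letters = split_bulk_result(docs, response)
```

The entries in `dead_letters` could then be written to a separate index. The point of the feature request is that the client should not need this logic at all.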

This dead letter queue could be generic or per data stream (up for discussion). An assumption I make is that this dead letter queue by default would not have any mappings specified and queries have to be run with runtime fields.
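To illustrate what querying such an unmapped queue with runtime fields might look like, here is a small sketch that builds a search request body. It relies on the documented behavior that a runtime field declared without a script is read from `_source` under the same name. The helper name `dlq_search_body` and the field and value are hypothetical, and the sketch assumes the queue stores events without dynamically mapping their fields.

```python
import json
from typing import Any


def dlq_search_body(field: str, value: str) -> dict[str, Any]:
    """Build a search body that queries an unmapped field via a runtime field."""
    return {
        # No script: the value is looked up in _source at query time.
        "runtime_mappings": {field: {"type": "keyword"}},
        "query": {"term": {field: value}},
    }


body = dlq_search_body("foo", "plain string")
print(json.dumps(body, indent=2))
```

The resulting body would be sent to the `_search` endpoint of whatever index or data stream backs the queue.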

Users could look at the dead letter queue, use it to debug their ingest pipelines / mappings, and then "reindex" part of the events in the dead letter queue.

elasticmachine · 2022-05-03T08:29:45Z

Pinging @elastic/es-distributed (Team:Distributed)

elasticmachine · 2022-05-03T09:25:51Z

Pinging @elastic/es-data-management (Team:Data Management)

DaveCTurner · 2022-05-03T09:29:40Z

I'm moving this over to the data management team for now. It's definitely on the border between the data-management and distrib areas but I think the data management folks are a better choice to think about this idea first.

felixbarny · 2022-05-03T16:08:22Z

Thanks @DaveCTurner for the help in triaging this. Not sure if that has an impact on responsibilities within the ES team, but it's currently an open question whether DLQs should be exclusive to data streams or should also apply to regular indices. @jsvd brought up some good arguments in favor of also adding DLQs to regular indices, to facilitate non-time-series use cases in Logstash and Enterprise Search that would benefit from a DLQ.

felixbarny · 2022-05-04T09:05:09Z

DLQs could also help with a lot of the use cases that are mentioned in "ignore_malformed to support ignoring JSON objects ingested into fields of the wrong type" (#12366).

threatangler-jp · 2022-07-13T16:47:30Z

This is great. We have achieved the same thing using a different method.

We set ignore_malformed to true on Filebeat and Elastic Agent integration index templates. Some native index templates will not accept the ignore_malformed: true setting, though, so that is a blind spot.

We also populate error.message any time a pipeline processor fails, and we set ignore_failure to true on all processors. We occasionally enable logging on our agents (this is too costly to do all the time). We then have a quality assurance process that searches a * index pattern for the string below:

(message : mapper_parsing_exception) OR (error.message : *) OR (_ignored : *) OR (message : dropping)

This lets us catch these issues reactively and then fix them (those we are not blind to, at least). We have been shocked, though, to see the volume of these issues coming from native index template settings in new Filebeat modules and Agent integrations. So we are asking the question: is the root issue a lack of discipline in aligning with ECS when Elastic is building new modules and integrations? A minority of the issues are not ECS alignment related but are field char limitation related but the fields this is happening to are easily identifiable as a field that would need a larger char limit.
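For readers who want to run the same check outside Kibana, the search string in the comment above can be approximated in the Elasticsearch query DSL. This is a rough translation and not the commenter's actual setup: the helper name `qa_query` is hypothetical, and how `message : ...` matches depends on how `message` is mapped.

```python
from typing import Any


def qa_query() -> dict[str, Any]:
    """Approximate the quality assurance search string as a bool query."""
    return {
        "bool": {
            "should": [
                # message : mapper_parsing_exception
                {"match": {"message": "mapper_parsing_exception"}},
                # error.message : *  (a pipeline processor failed)
                {"exists": {"field": "error.message"}},
                # _ignored : *  (a value was skipped, e.g. by ignore_malformed)
                {"exists": {"field": "_ignored"}},
                # message : dropping
                {"match": {"message": "dropping"}},
            ],
            # At least one clause must match, mirroring the OR chain.
            "minimum_should_match": 1,
        }
    }


query = qa_query()
```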

felixbarny · 2022-07-25T10:28:33Z

"A minority of the issues are not ECS alignment related but are field char limitation related but the fields this is happening to are easily identifiable as a field that would need a larger char limit."

Could you elaborate on which field char limit you are talking about and how you've fixed it in your mapping?

We're currently working on improved default mappings for logs that are more resilient i.e. not prone to mapping conflicts and field explosions.

threatangler-jp · 2022-07-25T11:16:08Z

Here are some examples of fields that would cause data loss without ignore_malformed set to true when using native index templates. These are due to char limitation issues: process.command_line.caseless, process.command_line, process.args, winlog.event_data.ObjectProperties, winlog.event_data.AttributeValue, winlog.event_data.TaskContentNew.

felixbarny · 2022-07-25T11:29:15Z

Where does the char limitation come from and what's the default value? Do you have a link handy to the Elasticsearch docs?

threatangler-jp · 2022-07-25T12:26:22Z

We are finding fields set to a 1024 char limit that are commonly populated with a much higher number of characters. We are customizing this configuration to 8191, which is the max recommended in the Kibana interface, and so far that has been sufficient. I had originally posted that the char limitation issue is not due to ECS misalignment, but we are now thinking it is, at least in some instances. If an ECS field is not mapped natively, then it is up to dynamic mapping, and dynamic mapping can create a field mapping with a char limit too low for the field. So there seems to be an indirect relationship between the char limit issue and native index templates not aligning to ECS.

felixbarny · 2022-07-25T13:54:46Z

I still don't understand which char limit you are talking about and what the impact of this is 🙂

Are there any exceptions on ingest that you can share? Are you referring to the ignore_above option? If fields are longer than that, it shouldn't stop ingestion, but the fields aren't indexed. But maybe that's the issue you are facing?

threatangler-jp · 2022-07-25T15:04:40Z

Correct, ignore_above.

From the documentation: "Strings longer than the ignore_above setting will not be indexed or stored."

https://www.elastic.co/guide/en/elasticsearch/reference/current/ignore-above.html
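The effect discussed in the last few comments can be sketched as follows. This is only a model of the rule quoted from the documentation (values longer than ignore_above are skipped for indexing, the document itself is still accepted), using the 1024 and 8191 limits mentioned in the thread. The helper `is_indexed`, the sample command line, and the mapping fragment are hypothetical.

```python
def is_indexed(value: str, ignore_above: int) -> bool:
    """Return True if a keyword value of this length would be indexed."""
    # ignore_above is expressed as a character count.
    return len(value) <= ignore_above


# Hypothetical long command line, well over 1024 characters.
command_line = "powershell.exe -EncodedCommand " + "A" * 2000

indexed_at_1024 = is_indexed(command_line, 1024)
indexed_at_8191 = is_indexed(command_line, 8191)

# Hypothetical mapping fragment raising the limit for one field.
mapping = {
    "properties": {
        "process": {
            "properties": {
                "command_line": {"type": "keyword", "ignore_above": 8191}
            }
        }
    }
}
```

With the lower limit, the value is silently left out of the index, which is why such events look like data loss when searching even though ingestion succeeded.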

dakrone · 2023-04-25T17:05:18Z

Closing this in favor of #95534

jpountz added the discuss label Apr 26, 2022

ruflin mentioned this issue Apr 29, 2022

Synthetic source #85649

Merged

felixbarny added the :Distributed/CRUD label May 3, 2022

elasticmachine added the Team:Distributed label May 3, 2022

DaveCTurner added :Data Management/Data streams and removed :Distributed/CRUD labels May 3, 2022

elasticmachine added Team:Data Management and removed Team:Distributed labels May 3, 2022

felixbarny mentioned this issue May 3, 2022

need a solution for conflicts between ECS specified fields and user-logged fields elastic/ecs-logging-nodejs#68

Open

dakrone closed this as completed Apr 25, 2023
