Output: Backup for bad chunk #1856

repeatedly · 2018-02-14T02:37:04Z

Fluentd's output plugins sometimes hit un-recoverable errors during chunk flush, for example:

- A chunk contains a record that is invalid for the output configuration
- The output plugin has a bug triggered by a specific record
- A broken chunk is generated by a hardware problem
- The destination is set up incorrectly

Currently, we use the retry limit and secondary to handle these chunks, but this approach has several problems:

- A bad chunk occupies flush threads until it reaches the retry limit
- An environment with no retry limit can't rescue a bad chunk

So we should handle bad chunks explicitly, for stability and performance.
The idea is that if an output plugin raises an un-recoverable error during chunk flush, the chunk is routed to a backup directory.

- The un-recoverable errors are TypeError, NoMethodError, ArgumentError, etc.
- Use UnrecoverableError for plugin-specific errors

In addition, the <system> directive provides a backup_dir parameter. The default is /tmp/fluentd.
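A minimal sketch of the proposed flow, to make the routing concrete. It is not Fluentd's implementation: `flush_with_backup`, the local `UnrecoverableError` stand-in class, the `.log` file suffix, and the return symbols are all hypothetical names chosen for illustration. Only the list of error classes and the backup directory idea come from the proposal above.

```ruby
require "fileutils"

# Stand-in for the plugin-specific error class proposed in this issue
# (defined locally so the sketch is self-contained).
class UnrecoverableError < StandardError; end

# Error classes treated as un-recoverable, taken from the list in the proposal.
UNRECOVERABLE_ERRORS = [UnrecoverableError, TypeError, NoMethodError, ArgumentError].freeze

# Hypothetical helper: pass the chunk data to the block (the "flush").
# Returns :flushed on success, :backed_up when an un-recoverable error
# occurred and the data was written to backup_dir, and :retry for any
# other StandardError (the normal retry path).
def flush_with_backup(chunk_id, data, backup_dir)
  yield data
  :flushed
rescue *UNRECOVERABLE_ERRORS
  # Retrying cannot help here, so move the chunk out of the flush path.
  FileUtils.mkdir_p(backup_dir)
  File.binwrite(File.join(backup_dir, "#{chunk_id}.log"), data)
  :backed_up
rescue StandardError
  # Possibly transient (e.g. network trouble): leave it to the retry logic.
  :retry
end
```

With this shape, a bad chunk frees its flush thread on the first failure instead of holding it until the retry limit is reached.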

mururu · 2018-03-01T07:05:30Z

Is secondary ignored when UnrecoverableError is thrown? The behaviour looks good for the "broken chunk" case, but not for the "wrong setup for destination" case.
I think that skipping the retry of the output plugin is enough when UnrecoverableError is thrown, because a secondary output plugin can also raise UnrecoverableError if it can't handle the chunk.

repeatedly · 2018-04-03T17:22:52Z

> not good for "wrong setup for destination" case.

Yes. The backup feature is for bad chunks, not for wrong setup. If a plugin re-raises an error caused by wrong setup, the chunk is routed to the backup directory. I think wrong setup should be detected during configuration or at start, e.g. the S3 plugin checks the API key at start.

> I think that skipping retry of the output plugin is enough when UnrecoverableError is thrown.

There are 2 cases:

- The secondary plugin is the same as the primary. In this case, the secondary should be skipped because the same error will happen inside it.
- The secondary plugin is different from the primary. In this case, the secondary should handle the bad chunk first.

To change how secondary is used, one idea is to add an option: force_secondary or a similar parameter.
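The two cases above can be sketched as a small decision function. This only illustrates the reasoning in this comment and is not Fluentd's implementation: `route_unrecoverable_chunk` and its return symbols are hypothetical, plugin types are plain strings, and the meaning given to `force_secondary` (always try the secondary when one is configured) is an assumption, since the parameter is only floated as an idea here.

```ruby
# Hypothetical helper: decide where a chunk goes after the primary output
# raised an un-recoverable error.
#   primary_type / secondary_type: plugin type names (secondary_type is nil
#   when no secondary is configured).
#   force_secondary: assumed semantics, see the note above.
# Returns :secondary or :backup.
def route_unrecoverable_chunk(primary_type, secondary_type, force_secondary: false)
  # Nothing to fall back to, so the chunk goes to the backup directory.
  return :backup if secondary_type.nil?
  # Explicit opt-in: use the secondary even if it is the same plugin type.
  return :secondary if force_secondary
  # Same plugin type would hit the same error, so skip it.
  primary_type == secondary_type ? :backup : :secondary
end

route_unrecoverable_chunk("s3", "s3")   # same type: skip secondary
route_unrecoverable_chunk("s3", "file") # different type: secondary handles it first
```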

repeatedly · 2018-04-18T05:18:04Z

Patch is here: #1952

artbeglaryan · 2019-07-08T21:47:33Z

Hello @repeatedly. Maybe this is the wrong place, but I didn't find any information in existing issues or in the documentation. I have my own output plugin, which does a specific job with buffered data over the network. Sometimes there is a network issue and my plugin fails to flush data, exhausts retry_wait, retry_max_interval and retry_max_times, and puts the buffer chunk into the secondary output plugin, which is file. So I didn't lose any data. I have those chunks in my backup directory. Is there any way (I really didn't find this documented or answered anywhere) to process those chunks later with the same plugin? By hand, by some external command, or maybe there is an existing plugin which can process that data again?

repeatedly · 2019-07-12T06:01:38Z

@artbeglaryan Currently, writing a script is the better way to do it. Here is an example:

https://groups.google.com/d/msg/fluentd/6Pn4XDOPxoU/CiYFkJXXfAEJ

I will add this to the documentation.
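As a rough idea of what such a script could look like (this is not the example from the link above), here is a minimal sketch. `resend_backups` is a hypothetical helper; how a chunk file is decoded and re-sent depends on the format your secondary wrote, so that part is left to the caller's block.

```ruby
# Hypothetical helper: pass each file in backup_dir to the given block and
# delete the file only when the block finishes without raising.
# Returns the paths that were re-sent and removed.
def resend_backups(backup_dir)
  done = []
  Dir.glob(File.join(backup_dir, "*")).sort.each do |path|
    next unless File.file?(path)
    begin
      # The block decodes and re-sends the chunk (format-specific, assumed).
      yield File.binread(path)
      File.delete(path)
      done << path
    rescue StandardError => e
      # Keep the file so it can be tried again on the next run.
      warn "keeping #{path}: #{e.class}: #{e.message}"
    end
  end
  done
end
```

Deleting only after a successful re-send means the script can be run repeatedly until the directory is empty.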

repeatedly mentioned this issue Feb 28, 2018

buf_file: Skip and delete broken file chunks during resume. fix #1760 #1874

Merged

repeatedly added feature request v1 labels Mar 1, 2018

repeatedly mentioned this issue Apr 3, 2018

[Question] Partial Retries Support #1911

Open

cosmo0920 mentioned this issue Apr 12, 2018

introduce dead letter queue to handle issues unpacking file buffer chunks uken/fluent-plugin-elasticsearch#398

Merged

repeatedly mentioned this issue Apr 18, 2018

output: Add backup feature for bad chunks #1952

Merged

repeatedly closed this as completed in #1952 Apr 29, 2018

breath-co2 mentioned this issue Dec 27, 2018

I suggest backup for bad chunk also save the meta information #2245

Closed
