fix(file source): more robust handling of split reads #4089
Conversation
Signed-off-by: Luke Steensen <luke.steensen@gmail.com>
/// This function will attempt to read a new line from its file, blocking,
/// up to some maximum but unspecified amount of time. `read_line` will open
Does this function still block?
It still does multiple blocking filesystem reads, so yes. This comment predates the addition of the sleep.
Seems great, and the shutdown tests are failing because of another PR, so they can be ignored for this one.
+1.
break;
}

let sz = line.len();
trace!(
This could shift into an internal event perhaps?
The way we do events here is a little different since it's its own crate, but not a bad idea. I'll make a note to do that in the next set of changes.
Can somebody trigger a container build of the version with this fix, please? The latest one on Docker Hub is dated 5 days ago (24 Sep).
@mikhno-s Apologies! We're working on an issue with the nightly builds. Hopefully, it will be fixed tomorrow.
Fixes #2992
This is an initial minimal-ish fix for the file source being too willing to return lines that did not end in a newline. Our original fix in #1236 was to wait up to a millisecond for the rest of a line to show up before considering it complete and returning it anyway. It turns out that in high-throughput situations, it's not uncommon for partial log lines to be visible for periods of time longer than this. This causes Vector to return a single logical line split across multiple events, which obviously breaks many kinds of downstream processing.
The problem with the sleep approach is that we can't know how long we need to sleep to see the rest of the line, and any amount of time we sleep is time we're not doing useful work, potentially lowering our overall throughput. This PR eliminates the sleep, instead opting to make a single logical line read operation able to be suspended and resumed across calls. The core of that change is to switch from a single line buffer to one that's specific to each file watcher. This means the intermediate state won't get overwritten between calls and we're safe to return control to the main loop.
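The mechanism described above can be sketched roughly as follows. This is a minimal, hypothetical illustration rather than Vector's actual code: the `Watcher` struct, field names, and the `read_line` signature here are all assumptions. The key idea is that the partial-line buffer lives on the watcher itself, so a read that ends mid-line simply suspends and the next call resumes where it left off, with no sleep involved.

```rust
use std::io::{self, BufRead};

/// Hypothetical sketch: each file watcher owns its own buffer, so a
/// partial line survives between calls instead of being returned early.
struct Watcher<R> {
    reader: R,
    /// Bytes of a line whose trailing newline we haven't seen yet.
    partial: Vec<u8>,
}

impl<R: BufRead> Watcher<R> {
    /// Try to read one complete line. Returns `Ok(None)` when only a
    /// partial line is available; those bytes stay in `self.partial`
    /// and the next call picks up where this one left off.
    fn read_line(&mut self) -> io::Result<Option<Vec<u8>>> {
        loop {
            let available = self.reader.fill_buf()?;
            if available.is_empty() {
                // No newline yet: suspend, keeping `partial` intact.
                return Ok(None);
            }
            match available.iter().position(|&b| b == b'\n') {
                Some(i) => {
                    // Complete line: emit everything accumulated so far.
                    self.partial.extend_from_slice(&available[..i]);
                    self.reader.consume(i + 1);
                    return Ok(Some(std::mem::take(&mut self.partial)));
                }
                None => {
                    // Mid-line: stash the bytes and keep going.
                    let len = available.len();
                    self.partial.extend_from_slice(available);
                    self.reader.consume(len);
                }
            }
        }
    }
}
```

Because the state is per-watcher, returning control to the main loop between calls is safe: an interrupted write is simply held until the rest of the line (or the next line) appears, which is the behavior the PR describes.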
This opens up a couple of edge cases that will require a larger change to handle, but they generally seem very unlikely to cause issues. Mostly, we are less likely to return legitimately interrupted writes as their own event. These are likely extremely uncommon, and will now be prepended to the following line, just as they appear if you read the file normally. If there is no following line, we won't flush the partial message until the file is deleted, which may not happen. If Vector is shut down before it sees another newline, it doesn't have a mechanism to flush this intermediate state.
There's some further refactoring here that should happen, but I cut it short in favor of getting the fix out sooner:
- `BytesMut` should be used instead of double buffering via `BufReader`
- `read_until_with_max_size` should be inlined into `read_line`, because the boundary doesn't really make sense anymore

Both of these refactorings, and plugging any edge cases, can happen as part of the larger file source cleanup that should be coming up soon.
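For context on the second item, a rough, hypothetical sketch of the kind of logic the name `read_until_with_max_size` implies (this is not Vector's implementation; the function name here and the discard-overlong-lines behavior are assumptions): read up to a delimiter, but drop any line that grows past a size cap rather than emitting it.

```rust
use std::io::{self, BufRead};

// Hypothetical sketch: accumulate bytes up to a newline, discarding any
// line that exceeds `max_size` and moving on to the next one.
fn read_capped_line<R: BufRead>(
    reader: &mut R,
    max_size: usize,
) -> io::Result<Option<Vec<u8>>> {
    let mut buf = Vec::new();
    let mut discarding = false;
    loop {
        let available = reader.fill_buf()?;
        if available.is_empty() {
            // No complete line yet; caller retries later.
            return Ok(None);
        }
        match available.iter().position(|&b| b == b'\n') {
            Some(i) => {
                if !discarding {
                    buf.extend_from_slice(&available[..i]);
                }
                reader.consume(i + 1);
                if discarding || buf.len() > max_size {
                    // Overlong line: drop it and start on the next one.
                    buf.clear();
                    discarding = false;
                } else {
                    return Ok(Some(buf));
                }
            }
            None => {
                let len = available.len();
                if !discarding {
                    buf.extend_from_slice(available);
                    if buf.len() > max_size {
                        buf.clear();
                        discarding = true;
                    }
                }
                reader.consume(len);
            }
        }
    }
}
```

Once per-watcher state exists (as in this PR), the delimiter search and the size cap naturally belong in the same loop as the line assembly, which is why the separate `read_until_with_max_size` boundary stops pulling its weight.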