
fix(splunk_hec source): Flush messages #5069

Merged
jszwedko merged 3 commits into master from flush-sinks on Nov 17, 2020

Conversation

jszwedko
Member

Fixes #5058

#4875 introduced buffering
during `Pipeline.start_send`. Events were sent to the inner sink only
when `start_send` was called again or `poll_complete` was called.
A side-effect of this was that when using `Sink.send_all` and
`Stream.forward`, if an error was returned on a `poll()` of the stream
being sent, the previously written event would never be sent to the
inner sink as `poll_complete()` was never called.

futures01 `Sink.send_all`:
https://docs.rs/futures/0.1.30/src/futures/sink/send_all.rs.html#59-88
futures01 `Stream.forward`:
https://docs.rs/futures/0.1.30/src/futures/stream/forward.rs.html#78-110

We can see that in both cases it returns early if there is an error; it
only calls `poll_complete` / `close` on the sink when the stream is not
ready or is finished.

We use an errorable futures01 stream for the splunk_hec source:
https://github.com/timberio/vector/blob/3bd8076be4097e4e9cf903a4f32e34a8b145a339/src/sources/splunk_hec.rs#L175-L189
which meant that, if there was an error processing a single event, the
previous event was never flushed downstream.
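
To make the interaction concrete, here is a minimal, self-contained sketch of the failure mode using toy types (these are illustrative stand-ins, not the real futures 0.1 traits or Vector's `Pipeline`): the sink only forwards the previously buffered event, and a `forward`-style driver that hits a stream error returns before ever calling `poll_complete`, stranding that event.

```rust
// Toy model of the bug: `start_send` forwards only the *previously* buffered
// event, so an early return on a stream error leaves one event behind.
struct ToySink {
    buffered: Option<u32>,
    delivered: Vec<u32>,
}

impl ToySink {
    fn start_send(&mut self, item: u32) {
        // Flush whatever was buffered by the previous call, then buffer `item`.
        if let Some(prev) = self.buffered.take() {
            self.delivered.push(prev);
        }
        self.buffered = Some(item);
    }

    fn poll_complete(&mut self) {
        // Flush the remaining buffered event, if any.
        if let Some(prev) = self.buffered.take() {
            self.delivered.push(prev);
        }
    }
}

// Mirrors the early return in futures 0.1 `Forward`/`SendAll`: on a stream
// error it bails out without flushing the sink.
fn forward(stream: Vec<Result<u32, &'static str>>, sink: &mut ToySink) -> Result<(), &'static str> {
    for item in stream {
        sink.start_send(item?);
    }
    sink.poll_complete();
    Ok(())
}

fn main() {
    let mut sink = ToySink { buffered: None, delivered: Vec::new() };
    let _ = forward(vec![Ok(1), Err("invalid event")], &mut sink);
    // Event 1 was accepted by `start_send` but never delivered downstream.
    assert!(sink.delivered.is_empty());
}
```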

This adds a `Drop` implementation for `Pipeline` that flushes via
`close()`. I'm not 100% sold on this approach over requiring usages of
`Pipeline` that drive errorable streams to call `.close()` explicitly,
but I'm pushing it up to get feedback.

I also implement `close` to ensure that the inner sink also has
`close()` called on it.
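
As a rough illustration of both pieces (a toy model, not Vector's actual `Pipeline` or the futures 0.1 `Sink` trait): `close` drains our own buffer and then closes the wrapped sink, and `Drop` acts as a safety net that calls `close`.

```rust
// Toy sketch of the two pieces described above.
struct Inner {
    received: Vec<u32>,
    closed: bool,
}

struct ToyPipeline {
    buffered: Option<u32>,
    inner: Inner,
}

impl ToyPipeline {
    fn poll_complete(&mut self) {
        // Drain any buffered event into the inner sink.
        if let Some(event) = self.buffered.take() {
            self.inner.received.push(event);
        }
    }

    fn close(&mut self) {
        // Flush ourselves, then close the inner sink as well.
        self.poll_complete();
        self.inner.closed = true;
    }
}

impl Drop for ToyPipeline {
    fn drop(&mut self) {
        // Safety net: flush on drop so an early-returning caller can't lose
        // an already-accepted event.
        self.close();
    }
}
```

The review discussion below covers why relying on `Drop` for this was ultimately set aside in favor of explicit flushing.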

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
src/pipeline.rs Outdated
match self.close() {
    Ok(Async::Ready(())) => break,
    Ok(Async::NotReady) => {}
    Err(err) => panic!(err),

Member Author

Thoughts welcome on what to do with errors here.

@binarylogic
Contributor

binarylogic commented Nov 16, 2020

Is it possible to test for this? Or is that what the broken test is doing?

@@ -60,6 +60,23 @@ impl Sink for Pipeline {
        debug_assert!(self.enqueued.is_empty());
        self.inner.poll_complete()
    }

    fn close(&mut self) -> Poll<(), Self::SinkError> {
        self.poll_complete()?;

Contributor

Does this change alone fix it by chance?

Member Author

I had the same thought before; it did not.

src/pipeline.rs Outdated
loop {
    match self.close() {
        Ok(Async::Ready(())) => break,
        Ok(Async::NotReady) => {}

Member

This could turn into a pretty tight loop if there's something the inner sink is waiting on before closing. That, plus the error handling question below, makes me think we'd be better off requiring the caller to use `close` than doing this through `Drop`.

Member Author

👍 I'm happy to just call `close()` in this one particular case, but do you have thoughts on ensuring that we call `close()` everywhere `forward` or `send_all` is used?

Member Author

(my concern is that we'll forget about this again)

Member

We will need to move away from the futures 0.1 `send_all`/`forward` methods before too long, so we could add our own method to `Pipeline` directly that makes sure to flush before returning an error.
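
A hypothetical sketch of that idea (the name and shape here are invented for illustration, not code from this PR), again with plain types rather than the futures 0.1 traits: drive the stream into the buffering sink, but always flush before propagating a stream error.

```rust
// Hypothetical helper in the spirit of the suggestion above: like
// `forward`/`send_all`, it drives a fallible stream of events into a
// start_send-style buffer, but it flushes even on the error path so an
// already-accepted event cannot be stranded.
fn forward_and_flush<I, E>(
    items: impl IntoIterator<Item = Result<I, E>>,
    buffered: &mut Option<I>,
    mut deliver: impl FnMut(I),
) -> Result<(), E> {
    let mut result = Ok(());
    for item in items {
        match item {
            Ok(event) => {
                // start_send-style buffering: forward the previous event,
                // hold on to the new one.
                if let Some(prev) = buffered.take() {
                    deliver(prev);
                }
                *buffered = Some(event);
            }
            Err(error) => {
                result = Err(error);
                break;
            }
        }
    }
    // Unlike the futures 0.1 combinators, flush unconditionally before
    // returning, even when the stream produced an error.
    if let Some(prev) = buffered.take() {
        deliver(prev);
    }
    result
}
```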

@Hoverbear
Contributor

@binarylogic In this case, the existing test is saving us. :)

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
@jszwedko jszwedko requested a review from ktff as a code owner November 16, 2020 22:57
@jszwedko
Member Author

jszwedko commented Nov 16, 2020

@lukesteensen @Hoverbear I pushed another commit switching this to explicit flushing in the splunk_hec source.

I'll do a survey tomorrow for other places we are using Stream.forward and Sink.send_all to see if this issue exists there as well, but this at least fixes this specific case.

Given the switch to calling `flush()` directly, I'd say the existing test covers this, @binarylogic.

@jszwedko
Member Author

Noting that I'll do the survey separately from this PR; I'd like to merge this to unblock master.

@jszwedko jszwedko changed the title from "fix(architecture): Ensure that Pipelines are flushed" to "fix(splunk_hec source): Flush messages" on Nov 17, 2020
@jszwedko jszwedko merged commit 6bce5ac into master Nov 17, 2020
@jszwedko jszwedko deleted the flush-sinks branch November 17, 2020 17:20
casserni pushed a commit that referenced this pull request Nov 20, 2020
* fix(architecture): Ensure that Pipelines are flushed

Fixes #5058

#4875 introduced buffering
during `Pipeline.start_send`. Events were sent to the inner sink only
when `start_send` was called again or `poll_complete` was called.
A side-effect of this was that when using `Sink.send_all` and
`Stream.forward`, if an error was returned on a `poll()` of the stream
being sent, the previously written event would never be sent to the
inner sink as `poll_complete()` was never called.

futures01 `Sink.send_all`:
https://docs.rs/futures/0.1.30/src/futures/sink/send_all.rs.html#59-88
futures01 `Stream.forward`:
https://docs.rs/futures/0.1.30/src/futures/stream/forward.rs.html#78-110

We can see in both cases it will return early if there is an error. If
the stream is not ready or is finished, then it will call
`poll_complete` / `close` on the sink.

We use an errorable futures01 stream for the splunk_hec source:
https://github.com/timberio/vector/blob/3bd8076be4097e4e9cf903a4f32e34a8b145a339/src/sources/splunk_hec.rs#L175-L189
which meant that, if there was an error processing a single event, the
previous event was never flushed downstream.

This change ensures that it flushes after the request is handled to send any additional buffered events.
I also implement `close` to ensure that the inner sink also has
`close()` called on it.

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: casserni <nicholascassera@gmail.com>
mengesb pushed a commit to jacobbraaten/vector that referenced this pull request Dec 9, 2020

Successfully merging this pull request may close these issues.

splunk_hec source partial test is hanging