Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incomplete flush in the sender when flushing the client #163

Merged
merged 1 commit into from Aug 25, 2020

Conversation

hush-hush
Copy link
Member

In some case, the sender 'sendLoop' would pop a buffer from its queue
as we are calling its 'flush' method. Since the input queue is empty it
would return instantly without waiting for the current buffer own by the
sender to be sent. This would result in partial flush.

We now synchronize 'flush' and 'sendLoop' to make sure flushing the
client will send everything to the wire. The side effect of this is that
the workers are lock for longer as we keep them locked when flushing the
sender.

@hush-hush hush-hush requested a review from truthbk August 11, 2020 21:07
Copy link
Member

@olivielpeau olivielpeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should address most cases of incomplete flushes, but in ChannelMode I don't think it guarantees that, when Flush is called, a previous metric submission call (e.g. Gauge()) has gone as far as a worker's writeMetricUnsafe call here

A larger refactoring might be necessary so that the pipeline of client -> workers -> sender can be fully flushed in order, by passing a "flush" message down from the client to the workers and ultimately to the sender, each step confirming that it has processed all data that was available before the flush started.

@hush-hush
Copy link
Member Author

Indeed the ChannelMode will not be fully flushed. The goal of it being to decouple sampling metrics and receiving it in the workers so we never block the app. Right now, we have no way to fully flush the input channel without blocking the calling app, which would defeat the main purpose of ChannelMode. I think the current behavior is expected but we might want to improve the Flush documentation to make it clear.

For the flush message idea, I don't think it would really work with multiple workers as we don't control which one will pull the message from the queue. I'm not sure the overhead of synchronizing the workers is worth it just for ChannelMode.

@hush-hush hush-hush force-pushed the maxime/fix-sender-flush branch 2 times, most recently from 60b7418 to 308138d Compare August 12, 2020 21:51
Copy link
Member

@olivielpeau olivielpeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, for now in ChannelMode it sounds reasonable to document that the Flush function doesn't offer guarantees that everything that was submitted is actually flushed.

Left a suggestion inline

statsd/sender.go Outdated Show resolved Hide resolved
In some case, the sender 'sendLoop' would pop a buffer from its queue
as we are calling its 'flush' method. Since the input queue is empty it
would return instantly without waiting for the current buffer own by the
sender to be sent. This would result in partial flush.

We now synchronize 'flush' and 'sendLoop' to make sure flushing the
client will send everything to the wire. The side effect of this is that
the workers are lock for longer as we keep them locked when flushing the
sender.
Copy link
Member

@truthbk truthbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📦

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants