Automatic buffering and preemptive flushing #146
Conversation
Force-pushed from bad52e4 to 0e83f3d
High quality addition to the library! Thanks for your work on this.
I've left a few comments, the most notable one being about the telemetry stuff that I think should probably be done before merging this PR.
lib/datadog/statsd.rb
Outdated
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3

DEFAULT_BUFFER_SIZE = 8 * 1_024
UDP_DEFAULT_BUFFER_SIZE = 8_192
This value (8 KB) won't be optimal on real networks, where UDP is transported in Ethernet frames that most of the time leave only 1432 bytes available (the MTU is 1500 almost everywhere). Setting this to 8 KB means there will be IP fragmentation, and a chance of losing 8 KB of metrics at once.
You've detailed the reasons for this bump; maybe we should consider doing the telemetry improvements in a PR targeting the current branch (kbogtob/automatic-buffering), merge it into this one, and then merge this one into master?
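A quick back-of-the-envelope check of the fragmentation concern (this only accounts for plain IPv4 + UDP headers; the 1432-byte figure quoted above is a more conservative allowance than this arithmetic yields):

```ruby
# Why an 8 KiB datagram fragments on a standard 1500-byte MTU link.
ETHERNET_MTU = 1500 # bytes available for the IP packet
IPV4_HEADER  = 20   # minimum IPv4 header, no options
UDP_HEADER   = 8

max_udp_payload = ETHERNET_MTU - IPV4_HEADER - UDP_HEADER
puts max_udp_payload # => 1472

# An 8 KiB buffer spans several IP fragments; losing any single
# fragment drops the whole datagram, i.e. all 8 KiB of metrics.
fragments = (8 * 1_024).fdiv(max_udp_payload).ceil
puts fragments # => 6
```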
module Datadog
  class Statsd
    class MessageBuffer
      PAYLOAD_SIZE_TOLERANCE = 0.05
I get the idea, but are you sure it helps (versus just using 100% of the buffer size)? I'm not against it; I just want to know whether there is a benefit to adding this extra math in the buffer code (mostly because, whenever we debug the client's output, we'll have to keep it in mind).
With a buffer size of 1472, it represents 73 bytes, which could be enough for one more metric.
My idea was to flush preemptively when we think a new message won't fit. You're right that with a size of 1432 it's quite a lot. Do you think I should remove it?
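As a rough sketch of the tolerance idea under discussion (the class and method names below are hypothetical, not the actual MessageBuffer API): with a tolerance of 0.05 the buffer flushes once it is 95% full, leaving up to `max_payload_size * tolerance` bytes of headroom unused (~73 bytes at 1472).

```ruby
# Hypothetical sketch of the preemptive-flush tolerance, not the gem's code.
class ToleranceBuffer
  def initialize(max_payload_size:, tolerance: 0.05)
    # Soft limit: stop filling once we are within `tolerance` of capacity.
    @limit  = max_payload_size - (max_payload_size * tolerance).round
    @buffer = +''
  end

  # Returns the flushed payload when appending would cross the soft limit,
  # nil otherwise.
  def add(message)
    flushed = nil
    if @buffer.bytesize + message.bytesize + 1 > @limit
      flushed = @buffer
      @buffer = +''
    end
    @buffer << message << "\n"
    flushed
  end
end
```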
Latest changes LGTM.
Very minor, but I think batch would be a better name than pool for what we are doing. Feel free to ignore this if you think otherwise. Other than that, LGTM.
It is now handled by the MessageBuffer
* Isolate forwarding logic into a class. We are also renaming parameters so they are prefixed by their relevant usage.
* Rewrite telemetry to snapshot messages in an array. Also reduce allocations using sprintf to compensate.
* Use telemetry as a message emitter in the forwarder.
Force-pushed from 8acd4c3 to 5fecfbc
sample_rate: nil,
disable_telemetry: false,
It looks like this was a breaking API change that didn't trigger a changelog entry or a major version bump. I suggest at least retroactively adding a changelog entry; I had to dig around in the source to find this out.
It'll go in with: #288
What does this PR do?
Introducing new breaking changes:
* It refactors the batch class into the MessageBuffer class.
* It removes the #batch method on the Statsd class, as buffering is now automatic and always on.
* It flushes preemptively when reaching a given message count (max_buffer_pool_size) or 95% of the maximum buffer size (max_buffer_payload_size).
* If users want to flush data, they can call the new #flush method on Statsd (that method is used a lot in the specs).
It also adds new buffering-specific integration tests to cover edge cases.
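The flushing rules above can be sketched as follows. This is a hypothetical, self-contained illustration based on the description, not the gem's actual MessageBuffer; the sink is anything responding to `<<` (in the real client, a connection):

```ruby
# Sketch: flush when either the message count reaches max_buffer_pool_size
# or the payload would exceed ~95% of max_buffer_payload_size.
class SketchMessageBuffer
  FLUSH_THRESHOLD = 0.95

  def initialize(sink, max_buffer_pool_size:, max_buffer_payload_size:)
    @sink      = sink
    @max_count = max_buffer_pool_size
    @max_bytes = (max_buffer_payload_size * FLUSH_THRESHOLD).floor
    reset
  end

  def add(message)
    # Preemptive flush: send what we have if the new message won't fit.
    flush if @bytes + message.bytesize + 1 > @max_bytes
    @messages << message
    @bytes += message.bytesize + 1 # +1 for the "\n" separator
    flush if @messages.size >= @max_count
  end

  def flush
    return if @messages.empty?
    @sink << @messages.join("\n")
    reset
  end

  private

  def reset
    @messages = []
    @bytes = 0
  end
end
```

With an array as the sink, adding two messages under a pool size of 2 produces one joined payload; an oversized third message triggers a preemptive flush before it is buffered.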
Motivation
Preparation for the version 5 roadmap: automatic buffering, refactored telemetry (buffered like other messages), and async IO in a separate thread.
Notes
We cannot use the optimized buffer size for UDP (1432 bytes) for now because of how telemetry is implemented: it reserves part of the buffer using a worst-case estimate of the size it could take. With a 1432-byte UDP buffer, only a few global tags would make that worst-case estimate bigger than the buffer itself!
I worked around this for now by increasing the default buffer size. I will open another PR after this one is merged to refactor telemetry as a "client" of the buffer, instead of it reserving buffer space and using the low-level Connection class directly.
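Rough arithmetic illustrating why the reservation can outgrow a 1432-byte buffer (the line count and per-line overhead below are assumptions for illustration, not the actual telemetry format; the key point is that the global tag string is repeated on every telemetry line):

```ruby
# Assumed shape: one telemetry line per counter, each carrying the
# client's global tags, plus some fixed name/value overhead per line.
telemetry_lines = 9   # assumed number of telemetry counters
fixed_overhead  = 60  # assumed per-line name + value bytes

global_tags = 'env:production,service:checkout,team:payments,' \
              'region:us-east-1,availability-zone:us-east-1a,pod:web-1234'

worst_case = telemetry_lines * (fixed_overhead + global_tags.bytesize)
puts worst_case # with these assumed tags, already above 1432 bytes
```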