Fix encoder pool issues #116
Conversation
Force-pushed from 25fb2be to 1c72bd9
Force-pushed from 9cb2eca to 3efa359
Changes requested on coding style, but nothing big. This only needs to be battle-tested under heavy load.
tracer/encoder.go (Outdated)
```diff
-	JSON_ENCODER = iota
-	MSGPACK_ENCODER
+	jsonType = iota
+	msgpackType
```
I don't really see the point in having an int to describe the encoder type here. Just use the encoding as the key. Let the constants be `jsonType = "application/json"` and `msgpackType = "application/msgpack"`, and use those directly. An int doesn't actually constrain the value to a finite set, so we always take the risk of passing `-1` or `2`, going out of bounds, and returning `nil` when calling `contentType(int)`. So let's just use the encoding as the key, but still use the constants `jsonType` and `msgpackType` to avoid copy/pasting and typo errors.
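A minimal sketch of what the suggested constants would look like (the `Content-Type` usage below is illustrative, not the tracer's actual code):

```go
// The encoding string itself is the constant, so there is no int key that
// can fall outside the valid set and no contentType(int) lookup to go
// out of bounds.
const (
	jsonType    = "application/json"
	msgpackType = "application/msgpack"
)

// Illustrative use: the constant doubles as the HTTP Content-Type header.
// req.Header.Set("Content-Type", jsonType)
```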
tracer/encoder.go
Outdated
pool := &encoderPool{ | ||
encoderType: encoderType, | ||
pool: make(chan Encoder, size), | ||
func contentType(encoderType int) string { |
I think this function can be removed; use `encoderType` as a key.
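A sketch of the pool once `contentType(int)` is dropped, assuming the `encoderPool` struct and the repository's `Encoder` interface from the diff above:

```go
type encoderPool struct {
	encoderType string // jsonType or msgpackType, used directly as the key
	pool        chan Encoder
}

func newEncoderPool(encoderType string, size int) *encoderPool {
	return &encoderPool{
		encoderType: encoderType,
		pool:        make(chan Encoder, size),
	}
}

// contentType(int) disappears entirely: callers read pool.encoderType.
```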
```go
// can theoretically read the underlying buffer while the encoder has already been returned to the pool.
// Since the underlying bytes.Buffer is not thread-safe, this can make the app panic,
// since this method will later spawn a goroutine referencing this buffer.
// That's why we prefer the less performant yet SAFE implementation of allocating a new encoder every time we flush.
```
One way to measure the encoder overhead would be to write a benchmark that only allocates an encoder. Totally agree that if this happens on the order of "every 2 sec", we'd be fine. Our current implementation tends to flush big payloads rather than make many small calls, which, from a high-level view, is fine.
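Such a measurement could look roughly like this (a sketch; `newJSONEncoder` is an assumed constructor name, not necessarily the one in tracer/encoder.go):

```go
package tracer

import "testing"

// BenchmarkEncoderAllocation measures nothing but the cost of allocating
// a fresh encoder, which is what every flush now pays.
func BenchmarkEncoderAllocation(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = newJSONEncoder()
	}
}
```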
I totally agree that we're good here. It's not a hot path: even with millions of requests, we still allocate only one encoder every 2 seconds. Good to keep this comment to explain why we undo that (premature) optimization.
Force-pushed from 9fcabc6 to 585deea
Force-pushed from 585deea to ea9fbc6
Thanks for the update, LGTM.
Co-Authored-By: cswatt <cecilia.watt@datadoghq.com>
After several race conditions / unexpected panics due to reusing the underlying buffers via a pool of encoders, we need to make the following changes:

- Allocate a new encoder (and its underlying `bytes.Buffer`) at every flush => performance overhead, but necessary since neither the `encoder` nor the `bytes.Buffer` is thread-safe.

Note: Since we only flush every 2s, allocating a new encoder at every flush is acceptable.
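A sketch of the resulting flush path (names such as `flushTraces`, `newEncoder`, and `t.transport.send` are illustrative, not the tracer's exact API):

```go
// Every flush allocates a fresh encoder, and therefore a fresh
// bytes.Buffer, so the goroutine that ships the payload can never race
// with a buffer that was recycled through a pool.
func (t *tracer) flushTraces(spans []*span) {
	enc := newEncoder() // new encoder + new underlying buffer on every call
	if err := enc.Encode(spans); err != nil {
		return
	}
	// Safe: enc's buffer is referenced by this goroutine only.
	go t.transport.send(enc)
}
```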
TODO
This PR needs to be merged to stop the apps using our Go tracer from panicking.
However, we should work on improving our tracer's efficiency.
Here are some ideas:
Note: We should also work on a way to precisely measure the performance impact of our tracer on real apps in production (I tried the out-of-the-box expvar metrics, but this was not conclusive).
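For reference, the out-of-the-box expvar route mentioned above amounts to a blank import (a minimal sketch, unrelated to the tracer's own code):

```go
package main

import (
	_ "expvar" // registers the /debug/vars handler on http.DefaultServeMux
	"log"
	"net/http"
)

func main() {
	// Memory and GC stats are then served as JSON at
	// http://localhost:8080/debug/vars.
	log.Fatal(http.ListenAndServe("localhost:8080", nil))
}
```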
Issues this PR solves
- Unexpected panic
- Race condition (already solved by #112)