chore(bottlecap): reduce lock contention in logs #256
Conversation
It doesn't make sense to send thousands of events to the main thread and then on to the logs agent; inverting the flow makes more sense, so the Telemetry API can send logs directly to the logs agent instead of to the main thread.
…h://github.com/DataDog/datadog-lambda-extension into jordan.gonzalez/bottlecap/reduce-lock-contention
added some debug logs, found that the issue is in serialization of huge payloads
update how we read from stream
@@ -1,11 +1,10 @@
use serde::Serialize;
use std::collections::VecDeque;
use crossbeam::queue::SegQueue;
FWIW, SegQueue is an unbounded multi-producer, multi-consumer queue. Minor issue: the multi-consumer part requires more coordination in the implementation, so you're leaving performance on the table by not using a single-consumer queue. More pressing: this is an unbounded queue and represents a potential site for unbounded allocations.
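For illustration, a minimal sketch of what a bounded alternative looks like with crossbeam's `ArrayQueue`; the capacity and the `LogEvent` type here are hypothetical, not taken from the PR:

```rust
use crossbeam::queue::ArrayQueue;

// Hypothetical event type standing in for the log payloads.
#[derive(Debug)]
struct LogEvent {
    message: String,
}

fn main() {
    // Bounded, lock-free MPMC queue: capacity is fixed up front, so a burst
    // of telemetry events can no longer allocate without limit.
    let queue: ArrayQueue<LogEvent> = ArrayQueue::new(1_024);

    // `push` fails (handing the event back) instead of growing when the
    // queue is full, forcing the producer to decide whether to drop or retry.
    if let Err(dropped) = queue.push(LogEvent { message: "hello".into() }) {
        eprintln!("queue full, dropping {dropped:?}");
    }

    // Consumer side: `pop` returns `None` once the queue is empty.
    while let Some(event) = queue.pop() {
        println!("forwarding {event:?}");
    }
}
```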
I'm aware of the tradeoffs; I still have to clean up this PR and will mark it as draft – the plan is to use a bounded queue here
SegQueue is also theoretically lock-free, but it uses Acquire/Release semantics internally, so in practice the guarantees are equivalent to a mutex. The difference is that Acq/Rel on atomic structures cuts the Linux scheduler out, so your latency will generally be worse, except in specialized circumstances where the software doing the acq/rel already has the highest CPU priority and is always scheduled first.
Tokio's mpsc is good; it plays well with the Linux scheduler and also bridges nicely between async and sync code.
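For illustration, a minimal sketch of the bounded Tokio mpsc approach suggested here; the `TelemetryEvent` type, channel capacity, and task layout are hypothetical, not taken from the PR:

```rust
use tokio::sync::mpsc;

// Hypothetical event type standing in for the Telemetry API payloads.
#[derive(Debug)]
struct TelemetryEvent {
    message: String,
}

#[tokio::main]
async fn main() {
    // Bounded channel: back-pressures producers instead of allowing
    // unbounded allocations the way SegQueue would.
    let (tx, mut rx) = mpsc::channel::<TelemetryEvent>(1_000);

    // Producer: e.g. the Telemetry API listener pushing events straight
    // to the logs side, bypassing the main thread.
    let producer = tokio::spawn(async move {
        for i in 0..10 {
            // `send` awaits when the channel is full, applying back-pressure.
            let event = TelemetryEvent { message: format!("log {i}") };
            if tx.send(event).await.is_err() {
                break; // receiver dropped
            }
        }
    });

    // Single consumer draining events as they arrive; `recv` returns `None`
    // once all senders are dropped.
    while let Some(event) = rx.recv().await {
        println!("forwarding {event:?}");
    }

    producer.await.unwrap();
}
```

From synchronous code, `Sender::blocking_send` covers the sync-to-async direction without blocking the runtime's async tasks.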
Closing due to other PRs doing this
What?
Release the Aggregator lock as soon as possible.