Skip to content

Conversation

@duncanista
Copy link
Contributor

@duncanista duncanista commented May 23, 2024

What?

  1. Drop logs Aggregator lock as soon as possible.
  2. Make the Telemetry API listener to send the events directly to the logs agent, and send events of interest to the main thread.
  • Previously, we were sending first the events to the main thread event bus, which was bounded to 100 elements, and we had to wait for them to be processed there to be forwarded to the logs agent. This is not ideal.

@duncanista duncanista marked this pull request as ready for review May 24, 2024 18:52
@duncanista duncanista requested a review from a team as a code owner May 24, 2024 18:52
astuyve and others added 4 commits May 24, 2024 15:15
added some debug logs, found that issue is in serialization of huge payloads
update how we read from stream
@@ -1,11 +1,10 @@
use serde::Serialize;
use std::collections::VecDeque;
use crossbeam::queue::SegQueue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW SegQueue is an unbounded multi-producer, multi-consumer. Minor issue, the multi-consumer bit requires more coordination in the implementation so you're leaving performance on the table by not using a single-consumer. More pressing, this is an unbounded queue and represents a potential site for unbounded allocations.

Copy link
Contributor Author

@duncanista duncanista May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware of the tradeoffs, I have to clean this PR still, will mark it as draft – plan to use a bounded queue here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SegQueue is also theoretically lock free but it uses Acquire/Release semantics internally and so the semantics are equivalent to a mutex. Except Acq/Rel done on atomic structures cuts the Linux scheduler out so your latency will be worse generally speaking, except in specialized circumstances where the acq/rel using software has the highest CPU priority anyway and is always scheduled first.

Tokio's mpsc is good, will play well with the Linux scheduler and also nicely between async/sync code.

@duncanista
Copy link
Contributor Author

Closing due to other PRs doing this

@duncanista duncanista closed this Aug 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants