Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose network-level telemetry #196

Closed
mlowicki opened this issue Feb 20, 2024 · 4 comments
Closed

Expose network-level telemetry #196

mlowicki opened this issue Feb 20, 2024 · 4 comments

Comments

@mlowicki
Copy link
Contributor

While working with bigger numbers of metrics being emitted it's useful to know things like:

  • bytes sent / dropped
  • packets sent / dropped

in order to e.g. diagnose issues related agent receiving those being hammered. Currently there isn't anything like this from what I've checked - QueuingMetricSink supports few metrics via WorkerStats but those are related to the queue / channel being used there.

One option would be to have those at the very low layers like UnixWriteAdapter or UdpWriteAdapter but then we would need to have an API to access those:

  • we could extend MetricSink trait with something like stats() to get access those. Wrapping sinks would simply call stats() of the underlying sink.
  • add APIs only in few places like BufferedUnixMetricSink and UnixWriteAdapter and in a similar manner for UDP

Any thoughts / ideas?

@56quarters
Copy link
Owner

56quarters commented Feb 21, 2024

While working with bigger numbers of metrics being emitted it's useful to know things like:

  • bytes sent / dropped
  • packets sent / dropped

Where are the "dropped" counters coming from? Implicit when there's an error writing to a socket?

Currently there isn't anything like this from what I've checked - QueuingMetricSink supports few metrics via WorkerStats but those are related to the queue / channel being used there.

io::MultilineWriter keeps some metrics as well but they're mostly for ease of unit testing.

One option would be to have those at the very low layers like UnixWriteAdapter or UdpWriteAdapter but then we would need to have an API to access those:

  • we could extend MetricSink trait with something like stats() to get access those. Wrapping sinks would simply call stats() of the underlying sink.

Adding a new method to MetricSink that returns a SinkMetrics or similar seems pretty reasonable to me. This would likely require adding atomic counters in the various sinks that do the actual writes. My only concern is that we'd have to make it clear that not all sinks emit all metrics (I'm assuming) and this is largely best effort.

  • add APIs only in few places like BufferedUnixMetricSink and UnixWriteAdapter and in a similar manner for UDP
    Any thoughts / ideas?

Seems like a nice addition to me. Thanks for the issue!

@mlowicki
Copy link
Contributor Author

mlowicki commented Feb 21, 2024

While working with bigger numbers of metrics being emitted it's useful to know things like:

  • bytes sent / dropped
  • packets sent / dropped

Where are the "dropped" counters coming from? Implicit when there's an error writing to a socket?

Yea, that was my initial though on it. In theory wrapping sink could have some retry logic (I don't think it's the case currently) so error writing to a socket won't necessary mean that data got dropped on the floor but maybe it would be good enough for start. If sinks would return SinkMetrics then wrapping sink can always implement correct calculation to take retries into account instead of simply returning values from low-level sink like UnixMetricSink or BufferedUnixMetricSink.

Currently there isn't anything like this from what I've checked - QueuingMetricSink supports few metrics via WorkerStats but those are related to the queue / channel being used there.

io::MultilineWriter keeps some metrics as well but they're mostly for ease of unit testing.

One option would be to have those at the very low layers like UnixWriteAdapter or UdpWriteAdapter but then we would need to have an API to access those:

  • we could extend MetricSink trait with something like stats() to get access those. Wrapping sinks would simply call stats() of the underlying sink.

Adding a new method to MetricSink that returns a SinkMetrics or similar seems pretty reasonable to me. This would likely require adding atomic counters in the various sinks that do the actual writes. My only concern is that we'd have to make it clear that not all sinks emit all metrics (I'm assuming) and this is largely best effort.

Right, I can try to prototype it in upcoming days and then we'll see how it looks. Maybe implementing it for few low-level sinks would be just fine and wrapping sink can simply call stats() on the inner sink 🤷 .

  • add APIs only in few places like BufferedUnixMetricSink and UnixWriteAdapter and in a similar manner for UDP
    Any thoughts / ideas?

Seems like a nice addition to me. Thanks for the issue!

Thanks, let me try to come up with some draft and we can further discuss direction over there.

@mlowicki
Copy link
Contributor Author

@56quarters initial stub on #203. This only exposes bytes / packets sent in some parts. Should be enough to discuss on further direction.

@mlowicki
Copy link
Contributor Author

Fixed by #196.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants