Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace logging layer with tracing #870

Closed
wants to merge 2 commits into from
Closed

Replace logging layer with tracing #870

wants to merge 2 commits into from

Conversation

Keksoj
Copy link
Member

@Keksoj Keksoj commented Jan 6, 2023

For now, Sōzu implements its own two layers of logging :

  • a set of logging macros (info!, debug!, etc)
  • a Logger instantiated in each worker to gather the log events and write them onto 5 possible targets:
    • a UDP address
    • a TCP address
    • a unix socket
    • a file
    • stdout

Since the birth of Sōzu, the tokio ecosystem has grown to become a defining standard in terms of asynchronous software architecture. It contains its own tracing crate, tracing, compatible with a number of loggers.

Switching Sōzu's home-made macros and logger will allow for a lot of enhancements down the way (such as structured logging: #437) and remove a great deal of code that is uneasy to maintain.

This PR aims to perform a drop-in replacement of Sōzu's logging logic, including notably:

  • a naive use of logging-subscriber to print logs to stdout throughout Sōzu
  • the ability to change the log level during runtime
  • send logs to all 5 targets (udp, tcp, unix socket, file, stdout)

@Keksoj Keksoj added this to the v0.15.0 milestone Jan 6, 2023
@Keksoj Keksoj added this to In progress in Roadmap via automation Jan 6, 2023
@Keksoj Keksoj self-assigned this Jan 6, 2023
@Keksoj Keksoj changed the title naive integration of tracing Replace logging layer with tracing Jan 6, 2023
@Geal
Copy link
Member

Geal commented Jan 6, 2023

sorry, I have to be annoying again and ask you to think about why the code is written that way instead of quickly replacing it.
The custom logging in sozu is not there due to a lack of libraries for that, they existed long ago. That is because logging is an extremely costly task.
At first, the proxy used the crates log and env-logger. Adding a custom formatter with env-logger destroyed the performance, so I had to vendor the macros and make custom formatting faster https://github.com/sozu-proxy/sozu/blob/main/command/src/logging.rs#L568
Logging was still about 40% of CPU time so I went and optimized it further:

A HTTP proxy is not doing much: reading data into a buffer, parsing it, writing data from a buffer to a socket. So it's not a surprise that tasks like logging and metrics, that can result in a lot of small allocations in the hot path, would be costly.
BTW I'm pretty sure we'll be back on the same topic in a few months when you discover the custom statsd metrics writer that, like logging, was painstakingly written for better performance and avoiding very annoying bugs.

My advice here:

  • I'm not saying tracing is bad, it's actually great and solves a lot of problems, but,
  • do not go and replace the logging code without benchmarking it, because it will flush years of effort on performance down the drain (like recently when I had to call out a PR that introduced way too many allocations)
  • get into the habit of benchmarking. It's easy to assume that Rust is fast enough and sozu alady has a good foundation fr performance, but it actually requires a lot of care. Be indful about the kind of benchmark you run, and the parameters (example: you should reduce the resources given to sozu to a minimum while giving a lot of resources to the backends, otherwise you're testing the backend's performance, not sozu's)
  • remember Chesterton's fence

@Keksoj
Copy link
Member Author

Keksoj commented Jan 6, 2023

Thanks a lot for your advice @Geal , we did miss the sign on that one. As @Wonshtrum suggests, these performance issues may arise as well with the use of anyhow for error management in the Sōzu lib, as it allocates memory on the heap, and errors are quite frequent when connecting to backend for instance.

Your remark put emphasis on benchmarking and that's what we should focus on. We'll definitely look in this direction. Thanks!

@Keksoj Keksoj closed this Jan 6, 2023
Roadmap automation moved this from In progress to Done Jan 6, 2023
@Keksoj Keksoj reopened this Jan 6, 2023
Roadmap automation moved this from Done to In progress Jan 6, 2023
@Keksoj Keksoj closed this Jan 10, 2023
Roadmap automation moved this from In progress to Done Jan 10, 2023
@Keksoj Keksoj deleted the switch-to-tracing branch October 25, 2023 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Roadmap
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants