Skip to content

What makes slog fast

Sudharshan S edited this page Dec 7, 2018 · 17 revisions

This has been written for slog v1. Some slog v2 details are slightly different but attention to performance is the same. Slog v2 is generally marginally faster than slog v1.

Rust is extremely well suited for writing efficient, performance conscious code. Below is the description of the methods that slog implementation is using to achieve great performance.

Static dispatch

Handling logging records in slog is based on Drain trait. By using a trait, it's possible to parametrize structs implementing Drain, with other Drains. Eg. LevelFilter implements Drain by wrapping different Drain and filtering out all the logging records of insufficient logging level. Thanks to the Rust static dispatch, it's a cost-free abstraction. The resulting code is as efficient as hand-written one, while Drains can be nicely combined similarly to LEGO blocks.

Avoiding allocations

Slog rigorously avoids allocation. In fact, the only place in the core slog library that uses allocation, are values associated with loggers, that are placed on the heap with Box and then reference counted with Arc.

thread_local

When particular Drain needs a buffer for serialization, a thread_local reusable vectors are employed and Vec::clear() is used to keep them empty, without reallocating memory. This pattern is easy to use, saving allocation time, keeping the same buffer memory in CPU cache, without a need for synchronization of concurrent calls.

io::Write

Another technique to avoid allocation is to accept io::Write in formatting/serialization APIs, and writing formatted data directly to it, instead of returning objects to the caller, and letting it do it manually.

fn format_me<W : io::Write>(&self, io : &mut W) -> io::Result<()>

vs

fn format_me(&self) -> String

While the second version looks nicer and shorter it has two disadvantages: it allocates, and it does not compose well. Eg. What happens if format_me needs to format a sub-element of &self. If the same type of API is used for sub-element, that's another allocation. The first method can just pass the io to the sub-element method.

Recursive list of generic structs

The combination of avoiding allocation and static dispatch is recursive list of Serialize-implementations. Thanks to this technique just one allocation can be used to handle an arbitrary list of SyncSerialize key-value pairs.

Using lifetimes for temporary data

Every logging statement forms a Record structure that encapsulates all the logging information.

pub struct Record<'a> {
    meta: &'a RecordStatic<'a>,
    msg: fmt::Arguments<'a>,
    values: &'a [BorrowedKeyValue<'a>],
}

Notice the lifetimes and that a part of Record is moved to RecordStatic it is:

pub struct RecordStatic<'a> {
    /// Logging level
    pub level: Level,
    /// File
    pub file: &'static str,
    /// Line
    pub line: u32,
    /// Column (currently not implemented)
    pub column: u32,
    /// Function (currently not implemented)
    pub function: &'static str,
    /// Module
    pub module: &'static str,
    /// Target - for backward compatibility with `log`
    pub target: &'a str,
}

This makes it possible to create RecordStatic as a static variable and thus put the data that is constant for a given logging statement in the read-only section of the binary, instead of re-creating it every time.

It's a big improvement, taking microbenchmark from 6ns to 2ns. Without it, every field of RecordStatic would have to be initialized on stack every time Record is created. With it - Record is created on the stack with only three fields.

Avoid doing anything until it's necessary

Most of the time, most of the logging statements will be ignored due to their low logging levels. That's why it's important to avoid any work when handling them.

Taking current system time is slow. That's why Record does not contain any timestamp. If any Drain wants to print it - it will have to take it manually. This also avoids slog core crate mandate a time library to use by its users and custom Drain implementation.

Formatting logging message takes time, so Record<'a> contains Arguments<'a> and defers its formatting until some drain eg. prints it.

All key-value pairs associated with logging records are taken as a static array of references. All key-value pairs associated with loggers are Arc-chained vectors of Boxed values to avoid having to copy them. Serialize trait makes it possible to defer actual serialization of any key-value pairs to the time when it's actually required.

However, the arguments themselves are normally evaluated and passed down the Drain hierarchy even if the log is discarded. To deal with that a lazy closure can be used:

// Closures can be used for lazy evaluation:
// This `slow_fib` won't be evaluated, as the current drain discards
// "trace" level logging records.
debug!(log, "debug"; "lazy-closure" => FnValue(|_ : &Record| slow_fib(40)));

Compile time logging filtering

Most logging application won't ever print debug or trace level logging messages in production (release builds). That is why by default slog removes trace! and debug! logging statements at compile time. This can be overridden using cargo's "feature flags", and makes both trace! and debug! practically free both performance and code size wise. Hopefully, that will encourage liberal usage, even in performance sensitive code.

For a bit different reasons, trace! messages are by default not compiled in, even in debug builds most of the time they just blow up the log, hiding what's important. When they are really needed, they can be re-enabled with "feature flags".

Support for Async logging

slog-extra::Async is a generic Drain implementation that can offload any logging to a separate logging thread, via channels. Thanks to key-value pairs being wrapped in Arc and a custom struct implementing Serializer it's quite cheap for the main thread to offload serialization, formatting, io accesses and any other logic, to another thread.