Skip to content
Low-Latency C++ Logger Design
C++ CMake
Branch: master
Clone or download

Latest commit

Fetching latest commit…
Cannot retrieve the latest commit at this time.


Type Name Latest commit message Commit time
Failed to load latest commit information.

WoodyLogger Design

  • Low Latency C++11 Asynchronous Logging Library.
  • Lock-Free Optimized for Mulit-Core system, so it is very fast. See Latency Benchmark
  • Supports multiple log levels: FAULT, ERROR, WARN, INFO, DEBUG, TRACE
  • Configurable for output to local file/stdout or both.
  • Colorful output on stdout for different log level on UNIX terminal.


#include "WoodyLogger.h"
#include <thread>
#include <unistd.h>

void PrintLogThread() {
    LOG_ERROR("PrintLogThread Join -->");
    LOG_ERROR("PrintLogThread END <---");

int main() {
    //WOODY_LOGGER_INIT("/dev/null", false);

    LOG_TRACE("Test LOG_TRACE: %d", 12345);  // TRACE will output on DEBUG level
    LOG_DEBUG("Test LOG_DEBUG: %d", 23456);
    LOG_ERROR("Test LOG_ERROR: %d", 6543210);
    LOG_WARN ("Test LOG_WARN : %07d", 123);
    LOG_INFO ("Test LOG_INFO : %.5f", 1.0/3.0);
    LOG_FAULT("Test LOG_FAULT: %p", &PrintLogThread);

    std::thread print_thread1 = std::thread(PrintLogThread);

    return 0;


API Description

Macro Define: WOODY_LOGGER

In "WoodyLogger.h", there is a macro #define WOODY_LOGGER. User can use it to control the Log API calling comiling. If WOODY_LOGGER is not defined, any call of WOODY_LOGGER_XXXX will not be compilied. So user can put WoodyLogger in code for test and debug usage and then undefine the WOODY_LOGGER macro to turn off WoodyLogger.


Start WoodyLogger, which starts the Consumer thread. It is just needed to be called once only. The consumer thread will keep running until WOODY_LOGGER_STOP() is called.
WOODY_LOGGER_START() should be called in main() function, and before any LOG_XXXX call.


WOODY_LOGGER_INIT marco is used to configure the log output, it has 2 parameters:
("file_path_name", is_enalbe_output_to_stdout)
Will configure the log output into "./log.txt" file and also the stdout. If user want to turn-off the stdout:
WOODY_LOGGER_INIT("./log.txt", false); //disable stdout
User can also configure the FILE handler (e.g. stdout, stderr):
User can direct the file name to "/dev/null" and disable the stdout, then the log will NOT output any:
WOODY_LOGGER_INIT("/dev/null", false);
This is used in benchmark test. If WOODY_LOGGER_INIT is not configured, it will output to stdout in default as the demo shows.


There are 6 log levels as Strong to Week order:
Logger will only output the level which is stronger than the current configured level.
If the level is setup to INFO, then DEBUG and TRACE will not be output.
WOODY_LOGGER_LEVEL() can be called in a thread which means the log level can be controlled dynamically. User can write a thread listen to a port as a Server. The client process can connect to the server and configure the level dynamically.



The output format


[YYYY-MM-DD HH:mm:ss.nanoseconds] [LEVEL] [thread_id] [__FILE__: __LINE__] message


[YYYY-MM-DD HH:mm:ss.nanoseconds] [LEVEL] [thread_id] message


WoodyLogger will start a consumer thread print out the log message. The thread will keep running if user does not call WOODY_LOGGER_STOP()


When I search "Low-Latency C++ Log" in github, I find a very fast C++ log project: NanoLog. It has been proved much faster than spdlog and others by itself. So I am trying to use the benchmark test code in NanoLog so that If WoodyLogger is proved fater than NanoLog, WoodyLogger will be fater than the others. NanoLog's design is so much like WoodyLogger, but WoodyLogger is optimized for multi-core system, it can take advantage in multi-thread environment.

Benchmark Code Compiling Option And System environment

The benchmark test code is run under my MacBook Pro Laptop:

Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz
Total Number of Cores:  4
Threads:                8

Test Code

WoodyLogger and NanoLog are both Compiled under -O2 optimization level.

Benchmark Test Result

Threads WoodyLogger(Avg, ns) NanLog(Avg, ns)
1 106 125
2 122.5 197
4 144 250.75
8 274.375 424.375
16 526.9375 936.4375



WoodyLogger's aim is to impletement a Log tool that can be applied in HFT (High Frequent Trading) System. So the design of WoodyLogger is focusing on Log Latency NOT High Throughput, which means WoodyLogger is not going to output as many logs into file as possible, but is going to reduces the latency to nanoseconds as short as possible.

Normal Design

What makes the high latency of a Log tool?

  • As a normal design of a "fast" log tool, I/O denseness firtly comes into developers' mind.
    Usually, developers isolate the log writting operation and log calling operation into two threads. There is a ring buffer storing the messages which the Log API is called. Each caller is a Producer which writes out the log messages into the ring buffer. There is a Consumer thread reads the ring buffer and then print out the msg into file or stdout, etc. Each call of the log API will not be blocked by the I/O dely, so this is a great step to "speed up" (reduce the latency). But it is just the first step, there are more things to do.
  • Many developers did not realize that the String Formatter functions are the very important key of high latency. Such as std::stringstream in C++, sprintf in C, etc. Access FMT for reference.
    The key latency of string foratter is coverting digital numbers into string. Which is very slow in those function, espicially floating-point formatting. Although there are very quick Integer Converting and floating-point converting implementations: double-conversion, but they are not as quick as memory copy. So do NOT convert and format any digital numbers into string in Log calling, but do it in the reading/output thread.

Normally the Log tool is designed as following:
The key problem of latency is using the LOCK. If there are more caller threads, they will block each other even the LOCKER is a spin-lock.


WoodyLogger Design

Lock-Free Queue

The Locker can be removed by using a Lock-Free Queue. When thers are only one producer and one consumer, there is not going to need a lock anymore.
To achieve multi-thread support of a Logger and Lock-Free ring buffer, every thread need a thread_local Lock-Free ring buffer. The consumer thread orginize a std::vector<LockFreeQueue*> to access every ring buffer (LockFreeQueue) in the threads. The life-cycle of LockFreeQueue is maintained by the thead, when the thread exit, the LockFreeQueue is also destoryed and be removed from the vector.
The WoodyLogger design is going to like the following diagram shows:


You can’t perform that action at this time.