
Multithreaded Sink #809

Closed
jezzsantos opened this issue Jul 11, 2016 · 16 comments

@jezzsantos

jezzsantos commented Jul 11, 2016

We are using Serilog for our REST service, between our logger class and an Azure queue.
We have a custom ILogEventSink to write the messages to our Azure queue.

Each REST service call results in roughly 100 logging calls on average (various things like tracing, counting, auditing, etc.), and that has really slowed down the response time of each web service call for us.

If we remove the [final] step in our sink that actually makes the call to write to the Azure queue, a service call completes 5X faster!
So we have concluded that the performance bottleneck for us is having the web service request thread doing the actual writing to the Azure queue in the sink.
We want to offload that I/O bound work to another thread pool thread, and allow the web request thread to get out of logging as soon as possible.

A typical design pattern for this is the Producer/Consumer pattern, where one thread produces data onto a shared [thread-safe] in-memory queue (e.g. BlockingCollection<T> or BufferBlock<T>), and another thread consumes that data and does work with it.

Ideally, with Serilog, this producer/consumer could be implemented as a delegating ILogEventSink that sits between the Serilog LoggerConfiguration (producer) and another sink that does the hard (consumer) work (in our case, our Azure sink).

Does Serilog accommodate anything like this?

Basically, we just don't want the thread that calls ILogger.* to be the same thread that runs our ILogEventSink.

What would be the Serilog way to solve this problem?
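For illustration, here is a minimal sketch of the delegating sink described above, using BlockingCollection<T>. The class name is hypothetical and the error handling is omitted; this is just the shape of the pattern, not a production implementation:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using Serilog.Core;
using Serilog.Events;

// Hypothetical sketch: a delegating sink that queues events so a background
// task can forward them to the slow inner sink (e.g. an Azure queue sink).
class BackgroundDelegatingSink : ILogEventSink, IDisposable
{
    readonly ILogEventSink _inner;
    readonly BlockingCollection<LogEvent> _queue =
        new BlockingCollection<LogEvent>(boundedCapacity: 1000);
    readonly Task _worker;

    public BackgroundDelegatingSink(ILogEventSink inner)
    {
        _inner = inner;
        _worker = Task.Run(() =>
        {
            // Consumer: drains the queue on a thread-pool thread.
            foreach (var logEvent in _queue.GetConsumingEnumerable())
                _inner.Emit(logEvent);
        });
    }

    // Producer: called on the web request thread; returns almost immediately
    // unless the bounded queue is full.
    public void Emit(LogEvent logEvent) => _queue.Add(logEvent);

    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Wait();
    }
}
```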

@jezzsantos
Author

Stephen does a good job of explaining the Producer/Consumer pattern implementation here: http://blog.stephencleary.com/2012/11/async-producerconsumer-queue-using.html

@merbla
Contributor

merbla commented Jul 11, 2016

Hi @jezzsantos,

Nothing directly exists out of the box that I am aware of. As you mentioned, you could implement this via a delegating sink or your own sink implementation.

Just out of interest, was the call to the Azure Queue an I/O issue or a network latency issue? (To give you a 5x improvement.)

@nblumhardt
Member

https://github.com/serilog/serilog-sinks-periodicbatching should go some of the way towards it (it's what the other sinks use).

The difference from what you've suggested is that the queue used is strictly non-blocking (and thus unbounded), but it should otherwise cover what you need, and has the benefit of allowing batched writes for fewer round trips.

(It'd be interesting to look at a fixed-length/blocking queue, but probably significant work.)
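As a sketch of how that package is typically used (assuming the PeriodicBatchingSink API of the time; AzureQueueClient and SendBatch are hypothetical stand-ins for a real queue client):

```csharp
using System;
using System.Collections.Generic;
using Serilog.Events;
using Serilog.Sinks.PeriodicBatching;

// Sketch only: a sink that subclasses PeriodicBatchingSink so writes happen
// in batches on a background timer rather than on the logging thread.
// AzureQueueClient is a hypothetical stand-in for the real queue client.
class AzureQueueBatchingSink : PeriodicBatchingSink
{
    readonly AzureQueueClient _client;

    public AzureQueueBatchingSink(AzureQueueClient client)
        : base(batchSizeLimit: 50, period: TimeSpan.FromSeconds(2))
    {
        _client = client;
    }

    protected override void EmitBatch(IEnumerable<LogEvent> events)
    {
        // One round trip per batch instead of one per event.
        _client.SendBatch(events);
    }
}
```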

@jezzsantos
Author

@merbla thanks for the confirmation.

I have started creating a sink like the one described here, for general-purpose use. It might be useful to anyone in the community with slow sinks, as something they can just plug in.

An Azure queue should in theory be quick to write to (in terms of network latency), but it is still an HTTP call across the network, and that's when running in production in the cloud.
In development, with the Azure emulator in the mix, it is a call to an underlying local SQL database, which evidently (from perf testing) pushes the disk to 100% for us when we are making many API calls during integration testing, which slows everything down 5X.

Primarily for us, the 5X improvement we want to see is locally in the Azure emulator, because that directly affects the speed of our integration and E2E testing feedback loop, which is vital to us and is why we are going after it. We assume any improvement seen locally will translate into some improvement in production (at scale), so there is more value in it.

Besides which, since we call our logging on average about 50-100 times in a single REST API call, it makes sense to batch these logging requests in memory before writing to the queue, so this optimization is long overdue for us anyhow.

@jezzsantos
Author

jezzsantos commented Jul 14, 2016

@merbla @nblumhardt, I have just finished refining and performance testing a BufferedQueueSink : ILogEventSink class that has the following characteristics:

  • Can be inlined with any other ILogEventSink
  • Implements the producer/consumer pattern, buffered and bounded.
  • Can be configured with any buffer size (default is 50). The buffer size controls how many log events can be submitted (produced) before the buffer postpones (asynchronously) any further incoming log events, forcing consumption of the [queued] log events before any postponed log events are accepted. (This is what makes it a useful rolling buffer.) The buffering prevents the memory footprint from exceeding available memory when the production rate is quicker than the consumption rate, which is exactly what you need if writing the log events is slower than the code producing them. The sink uses BufferBlock<T> from the Microsoft TPL Dataflow library.
  • Supports concurrent consumption, so that a [thread pool] thread is used to consume log events while the thread that produces log events returns. This is especially useful if your sink is expensive in time or I/O: the calling thread does not spend its time writing to the sink, it simply queues the log event in a buffered memory queue.
  • In performance testing, it is demonstrated with this sink that the logging thread spends less than 10ms producing 1000 log events before returning, whilst the consuming thread takes about 30s to consume those 1000 events and persist them to local db storage. That's a couple of orders of magnitude faster for the logging thread than if it were using the original sink to write to local storage directly. Particularly useful if you do extensive and heavy logging but don't want the code that is logging to be blocked writing to sinks.
  • It is believed that this sink can help improve the performance of any sink that writes on the calling thread. It does come at the cost of a thread pool thread (per instance of the sink).
  • One improvement would be to dedicate a single thread to be used for all instances of this sink, wherever they are used.
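The bounded producer/consumer behaviour described above can be sketched roughly like this (a simplification under stated assumptions, not the actual gist; the class name and details are illustrative):

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
using Serilog.Core;
using Serilog.Events;

// Rough sketch of a bounded BufferBlock<T>-based sink wrapper.
class BufferedQueueSinkSketch : ILogEventSink, IDisposable
{
    readonly ILogEventSink _inner;
    readonly BufferBlock<LogEvent> _buffer;
    readonly Task _consumer;

    public BufferedQueueSinkSketch(ILogEventSink inner, int bufferSize = 50)
    {
        _inner = inner;
        _buffer = new BufferBlock<LogEvent>(
            new DataflowBlockOptions { BoundedCapacity = bufferSize });
        _consumer = Task.Run(ConsumeAsync);
    }

    // Producer side: when the buffer is full, SendAsync waits until the
    // consumer catches up, so memory use stays bounded. ILogEventSink.Emit
    // is synchronous, so the wait is bridged here for simplicity.
    public void Emit(LogEvent logEvent) =>
        _buffer.SendAsync(logEvent).Wait();

    // Consumer side: drains events on a thread-pool thread and forwards
    // them to the wrapped (slow) sink.
    async Task ConsumeAsync()
    {
        while (await _buffer.OutputAvailableAsync())
            _inner.Emit(await _buffer.ReceiveAsync());
    }

    public void Dispose()
    {
        _buffer.Complete();
        _consumer.Wait();
    }
}
```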

The reasons and motivation for writing this sink are described in the opening post of this issue.

I am happy to share this implementation, along with its unit tests and integration tests, as a standalone component for the community to use, if someone can guide me on how best to do that. The code is less than 200 lines long (2 classes).

Here is the gist of the code: https://gist.github.com/jezzsantos/44bf844b27bebafcb64d9947e1896649

@nblumhardt
Member

Hey Jezz, sounds awesome! I haven't had a chance to check out the gist, but in terms of making this available, what do you think of packaging it in something like Serilog.Sinks.Async?

One possible API is:

Log.Logger = new LoggerConfiguration()
    .WriteTo.Async(wt => wt.File("log.txt"))
    .CreateLogger();

The wt arg would be a LoggerSinkConfiguration, meaning that any existing sink could be adapted this way. Under the hood, the Async() extension method can use a second logging pipeline to achieve this pretty smoothly.

There's enough interest in this kind of thing that I can see potentially being part of Serilog in the long-term, but just standing up a personal repo that we can all help to shape during the development process would be the easiest/quickest way to get started.

What do you think?

@jezzsantos
Author

I am happy to do as you suggest. Wilco.

@jezzsantos
Author

@nblumhardt might need your help here, to get this to where you need it.

I have the extension method:

public static LoggerConfiguration Async(this LoggerSinkConfiguration configuration, Action<LoggerSinkConfiguration> sinkConfiguration)
{
    // What to do here?
}

used like this:

logger = new LoggerConfiguration()
    .WriteTo.Async(x => x.File("log.txt"))
    .CreateLogger();

@nblumhardt
Member

public static LoggerConfiguration Async(this LoggerSinkConfiguration configuration, Action<LoggerSinkConfiguration> sinkConfiguration)
{
    var sublogger = new LoggerConfiguration();

    sinkConfiguration(sublogger.WriteTo);

    var wrapper = new BufferedQueueSink((ILogEventSink)sublogger.CreateLogger());

    return configuration.Sink(wrapper);
}

@nblumhardt
Member

@jezzsantos I haven't thought this through entirely, but Logger implements ILogEventSink explicitly, so it may be better for BufferedQueueSink to accept a Logger directly (and call Write() rather than Emit()).

Accepting Logger would also make it clearer that BufferedQueueSink needs to be IDisposable and dispose the logger at that point.
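In other words, something along these lines (a sketch only; the queueing internals are omitted for brevity, and the exact shape of BufferedQueueSink is an assumption):

```csharp
using System;
using Serilog.Core;
using Serilog.Events;

// Sketch: accept the sub-logger directly so it is clear that this sink
// owns it and must dispose it. Queueing internals omitted for brevity.
class BufferedQueueSink : ILogEventSink, IDisposable
{
    readonly Logger _sublogger;

    public BufferedQueueSink(Logger sublogger) => _sublogger = sublogger;

    public void Emit(LogEvent logEvent) =>
        _sublogger.Write(logEvent); // Write(), not the explicit Emit()

    public void Dispose() => _sublogger.Dispose();
}
```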

Hope this gets the ball rolling, anyway!

@jezzsantos
Author

Thanks, here is my first crack at it: https://github.com/jezzsantos/Serilog.Sinks.Async

I am working on the NuGet package and AppVeyor CI right now.

@jezzsantos
Author

Shall we move the discussion over there?

@nblumhardt
Member

Awesome, sounds great.

@nblumhardt
Member

CC @DmitryNaumov - Dmitry, I'm thinking this could be a good option to provide the background file I/O we were talking about; it would be great to have your thoughts. Cheers!

@DmitryNaumov
Contributor

DmitryNaumov commented Jul 16, 2016

@nblumhardt @jezzsantos I think all of us understand that most sinks are slow, especially in multi-threaded applications. The file sink is slow, the network is also slow, even the console is slow. And while our implementation approaches may differ, it looks like we share the idea that we need a way to make any given sink "async"/background. This includes managing how often it flushes, the maximum memory it may consume, how it behaves if it can't "write" to the destination, etc.

I took a quick look at @jezzsantos's implementation and it seems to match what I just mentioned. One thing which bothers me (though I'm not good at Dataflow) is how it will behave when the buffer is full: will it block the caller or drop events? I especially don't like blocking; that's why we want an "async" sink. So IMO, if we run into a situation where producers are steadily faster than the destination sink, I would prefer to be able to provide a "strategy" for how to deal with that situation: block, drop, or something else.

And one thing which may lead to this situation (I mentioned it somewhere before) is that, to be efficient, we need to provide sinks with a "batching" method, so that when our "async" sink decides it's time to write to the destination, it can do so in the most efficient way.
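The "strategy" idea might look something like this (a purely hypothetical API sketch, not anything that exists in the implementation under discussion):

```csharp
// Hypothetical: let the caller choose what happens when the bounded
// buffer is full, instead of hard-coding blocking behaviour.
enum OverflowStrategy
{
    Block,      // apply back-pressure: wait for the consumer to catch up
    Drop,       // discard the incoming event and keep going
    DropOldest  // evict the oldest queued event to make room for the new one
}
```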

@jezzsantos
Author

Hey @DmitryNaumov,
I am no expert at Dataflow. The sink we now have over at https://github.com/jezzsantos/Serilog.Sinks.Async is based on the work described here: http://blog.stephencleary.com/2012/11/async-producerconsumer-queue-using.html which explains how BufferBlock<T> behaves when the buffer reaches its limit (to prevent memory overflow).

As I understand it, when the buffer is full, the async producer call is awaited until the consumer catches up. So the caller is not blocked per se, just asynchronously waiting.

Can we take this discussion over there: https://github.com/jezzsantos/Serilog.Sinks.Async?

I am actually witnessing problems with this implementation in our REST service that I don't yet understand, and it would be good if we could collaborate on that over there, rather than on this closed thread.
