Add support for QueueHandler and Listeners in logging pipeline #250

Closed
Ktmi opened this issue Jul 20, 2022 · 4 comments
Labels
2022.2 Kytos-ng 2022.2 epic_logging Logging performance improvements

Comments

@Ktmi

Ktmi commented Jul 20, 2022

This issue is about using QueueHandler+QueueListener to potentially improve logging performance.

In #236, @viniarck discussed how loggers can reduce performance in our application hot paths. As part of that discussion, @viniarck suggested that we integrate QueueHandler+QueueListener into our logging pipeline in order to offload the IO operations associated with logging onto a separate thread.

As part of the evaluation of #247, I examined the potential performance gains of changing how we provide the logger to napps by cross-validating it against different methods for handling logging in a synthetic benchmark. Among those methods was a QueueHandler+QueueListener pair for dispatching records to IO-based handlers. I found that using the QueueHandler resulted in the calling thread spending less time logging than using a StreamHandler or RotatingFileHandler directly. Here are the results I collected:

| Provider | Handler | Logger instantiation time | Time to log 10,000 records |
| --- | --- | --- | --- |
| old_provider | file_handler | 0:00:00.005044 | 0:00:02.083217 |
| old_provider | console_handler | 0:00:00.000171 | 0:00:15.878804 |
| old_provider | file_queue_handler | 0:00:00.000184 | 0:00:01.799022 |
| old_provider | console_queue_handler | 0:00:00.000177 | 0:00:01.761531 |
| new_provider | file_handler | 0:00:00.000163 | 0:00:00.474181 |
| new_provider | console_handler | 0:00:00.000240 | 0:00:03.941527 |
| new_provider | file_queue_handler | 0:00:00.000347 | 0:00:00.258122 |
| new_provider | console_queue_handler | 0:00:00.000207 | 0:00:00.246152 |
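
For reference, this is roughly the QueueHandler+QueueListener wiring being compared above (a minimal sketch, not the exact benchmark code; the logger name, file path, and format string are just examples):

```python
import logging
import logging.handlers
import queue

# The calling thread only enqueues records; the listener's background thread
# forwards them to the IO-bound handler(s).
log_queue = queue.SimpleQueue()

io_handler = logging.handlers.RotatingFileHandler(
    "kytos.log", maxBytes=10_000_000, backupCount=3
)
io_handler.setFormatter(
    logging.Formatter("%(asctime)s - %(levelname)s [%(name)s] %(message)s")
)

listener = logging.handlers.QueueListener(
    log_queue, io_handler, respect_handler_level=True
)
listener.start()

logger = logging.getLogger("kytos.napps.example")
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.QueueHandler(log_queue))

logger.info("flow installed")  # returns quickly; disk IO happens in the listener thread
listener.stop()  # flushes queued records before shutting the thread down
```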

@viniarck took part of my synthetic benchmark code and created a test for using QueueHandler+QueueListener in Kytos, to measure the performance in a closer-to-real-world setting. He found that logging through the QueueHandler performed about the same as passing records directly to the IO handlers. Here are his results:

Logging with IO Handlers:

❯ jq -ncM '{method: "POST", url: "http://localhost:8181/api/kytos/flow_manager/v2/flows/00:00:00:00:00:00:00:01", body: { "force": false, "flows": [ { "priority": 10, "match": { "in_port": 1, "dl_vlan": 100 }, "actions": [ { "action_type": "output", "port": 1 } ] } ] } | @base64, header: {"Content-Type": ["application/json"]}}' | vegeta attack -format=json -rate 50/1s -duration=20s | tee results.bin | vegeta report
Requests      [total, rate, throughput]         1000, 50.05, 50.03
Duration      [total, attack, wait]             19.989s, 19.98s, 8.268ms
Latencies     [min, mean, 50, 90, 95, 99, max]  5.501ms, 15.649ms, 10.278ms, 35.621ms, 45.979ms, 63.295ms, 92.038ms
Bytes In      [total, mean]                     37000, 37.00
Bytes Out     [total, mean]                     123000, 123.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      202:1000
Error Set:

Logging with Queue Handlers:

❯ jq -ncM '{method: "POST", url: "http://localhost:8181/api/kytos/flow_manager/v2/flows/00:00:00:00:00:00:00:01", body: { "force": false, "flows": [ { "priority": 10, "match": { "in_port": 1, "dl_vlan": 100 }, "actions": [ { "action_type": "output", "port": 1 } ] } ] } | @base64, header: {"Content-Type": ["application/json"]}}' | vegeta attack -format=json -rate 50/1s -duration=20s | tee results.bin | vegeta report
Requests      [total, rate, throughput]         1000, 50.05, 50.01
Duration      [total, attack, wait]             19.996s, 19.98s, 15.544ms
Latencies     [min, mean, 50, 90, 95, 99, max]  5.546ms, 15.109ms, 10.948ms, 30.644ms, 44.18ms, 64.865ms, 85.668ms
Bytes In      [total, mean]                     37000, 37.00
Bytes Out     [total, mean]                     123000, 123.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      202:1000
Error Set:

Conclusion

It seems to me that queue handlers have some potential to improve performance, though to what degree varies. In @viniarck's tests they showed little improvement compared to the IO handlers, while in my experiments they showed nearly a 2x performance increase. To find out more, I propose we implement a way to use QueueHandlers and QueueListeners with our current logging infrastructure. This could be made an optional feature for systems where logging directly is known to be slow.

Notes

It should be noted that both my benchmark and @viniarck's test contain the same bug: we pass True as an extra positional handler argument to the queue listener instead of passing respect_handler_level=True as a keyword argument (see the sketch below). This didn't seem to majorly affect performance, however. Additionally, @viniarck uses one logger going to two queues, each going to a separate handler, rather than one logger to one queue feeding both handlers, which could potentially increase performance.
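
In isolation, the bug looks like this (a minimal sketch; the queue and handler names are illustrative):

```python
import logging
import logging.handlers
import queue

log_queue = queue.SimpleQueue()
stream_handler = logging.StreamHandler()

# Buggy form: True is consumed as an extra positional *handler*,
# and respect_handler_level keeps its default of False.
listener = logging.handlers.QueueListener(log_queue, stream_handler, True)

# Intended form: pass it as a keyword argument.
listener = logging.handlers.QueueListener(
    log_queue, stream_handler, respect_handler_level=True
)
```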

@viniarck
Member

Thanks for mapping this, David, and all the findings and progress so far.

@Ktmi
Author

Ktmi commented Jul 27, 2022

So I was able to figure out a way we could potentially introduce queueing into our logging pipeline. Python's logging framework allows setting the class of the logger returned by the getLogger function. What we can do is derive a new logger class from the base Logger that has built-in queueing. I constructed a synthetic benchmark using this technique, and the results were fairly comparable to the ones I got from attaching the queue handler to the regular logger. Here are the results I got:

| Provider | Handler | Logger instantiation time | Time to log 10,000 records |
| --- | --- | --- | --- |
| new_provider | file_handler | 0:00:00.003157 | 0:00:00.536122 |
| new_provider | console_handler | 0:00:00.000207 | 0:00:02.267803 |
| queue_provider | file_handler | 0:00:00.000973 | 0:00:00.273065 |
| queue_provider | console_handler | 0:00:00.000479 | 0:00:00.277681 |
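
To illustrate, here is roughly the shape of that queue-backed logger class installed via setLoggerClass (a simplified sketch, not the exact code behind the queue_provider numbers above; the class and logger names are just examples):

```python
import logging
import logging.handlers
import queue


class QueueLogger(logging.Logger):
    """Logger subclass with built-in queueing: records are enqueued by the
    calling thread and forwarded to the real handlers by a listener thread."""

    def __init__(self, name, level=logging.NOTSET):
        super().__init__(name, level)
        self._queue = queue.SimpleQueue()
        self._listener = logging.handlers.QueueListener(
            self._queue, respect_handler_level=True
        )
        # The only handler attached directly to the logger is the QueueHandler.
        super().addHandler(logging.handlers.QueueHandler(self._queue))
        self._listener.start()

    def addHandler(self, handler):
        # IO-bound handlers are registered on the listener, not on the logger,
        # so their emit() calls happen off the calling thread.
        self._listener.handlers += (handler,)


logging.setLoggerClass(QueueLogger)
log = logging.getLogger("kytos.napps.example")
log.addHandler(logging.StreamHandler())
log.info("handled off the hot path")
```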

@viniarck
Member

viniarck commented Jul 27, 2022

> queue_provider, file_handler, 0:00:00.000973, 0:00:00.273065

@Ktmi looks like a very elegant solution. The gains are significant. Another cool thing about this QueueLogger is that it respects the debug level accordingly, and if it performs well under stress tests on REST endpoints on hot paths, that's a double win. In addition to kytos.napps.*, if you could also cover the root logger and the other major kytos-related loggers, i.e. kytos.*, werkzeug, and engineio, that'd be great. Check out this line to see how the logging.ini file is being loaded (it can be customized by network operators) and how the LogManager loads and manages the current handlers. If you could integrate with LogManager (or augment it) while supporting the handlers on logging.ini, that'd be great, since it would ensure that it works with the current log features that the core provides.
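
Just to illustrate roughly what I have in mind, something along these lines on top of the handlers loaded from logging.ini (a rough sketch only; the path and logger names are examples, and the actual integration with LogManager may end up looking different):

```python
import logging
import logging.config
import logging.handlers
import queue

# Load the operator-customizable handler configuration first (illustrative path).
logging.config.fileConfig("etc/kytos/logging.ini", disable_existing_loggers=False)

# Then move each logger's IO-bound handlers behind its own queue + listener thread.
for name in ("", "kytos", "kytos.napps", "werkzeug", "engineio"):
    logger = logging.getLogger(name)
    if not logger.handlers:
        continue  # nothing to offload for this logger
    log_queue = queue.SimpleQueue()
    listener = logging.handlers.QueueListener(
        log_queue, *logger.handlers, respect_handler_level=True
    )
    logger.handlers = [logging.handlers.QueueHandler(log_queue)]
    listener.start()
```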

@viniarck viniarck added 2022.2 Kytos-ng 2022.2 epic_logging Logging performance improvements labels Aug 3, 2022
@viniarck
Member

viniarck commented Aug 5, 2022

This has landed in #257.
