Possible JSON logging performance regression #595

Open
jsiwek opened this issue Sep 24, 2019 · 9 comments · Fixed by #600


@jsiwek (Member) commented Sep 24, 2019

Keith reported to me a possible loss of performance in the JSON logging of Zeek 3.0.0 versus Bro 2.6.4, so I did a quick benchmark using this test.zeek:

# Write each connection record log_mult extra times to amplify logging cost.
option log_mult = 0;

event connection_state_remove(c: connection) &priority=-10
  {
  local n = log_mult;

  while ( n > 0 )
    {
    Log::write(Conn::LOG, c$conn);
    --n;
    }
  }
# Zeek 3.0.0 JSON logging ("%e %M" reports elapsed seconds and max RSS in KB)
time -f "%e %M" zeek -G random.seed -r 2009-M57-day11-18.trace test.zeek log_mult=100 LogAscii::use_json=T
45.00 946668

# Bro 2.6.4 JSON logging
time -f "%e %M"  bro -G random.seed -r 2009-M57-day11-18.trace test.zeek log_mult=100 LogAscii::use_json=T
15.79 140820

# Zeek 3.0.0 ASCII logging
time -f "%e %M" zeek -G random.seed -r 2009-M57-day11-18.trace test.zeek log_mult=100 LogAscii::use_json=F
14.29 156476

# Bro 2.6.4 ASCII logging
time -f "%e %M"  bro -G random.seed -r 2009-M57-day11-18.trace test.zeek log_mult=100 LogAscii::use_json=F
15.77 149264

# Zeek 3.0.0 JSON logging, patched to use default nlohmann json object instead of fifo_map
time -f "%e %M" zeek -G random.seed -r 2009-M57-day11-18.trace test.zeek log_mult=100 LogAscii::use_json=T
21.96 615240

And some flamegraphs:

bro-2.6.4-json.zip

11.79% spent in threading::formatter::JSON::Describe()

zeek-3.0.0-json-fifo-map.zip

63.52% spent in threading::formatter::JSON::Describe()

zeek-3.0.0-json-default.zip (not using fifo_map)

44.72% spent in threading::formatter::JSON::Describe()

So at least a good portion of the time relates to preserving the ordering of keys in the JSON output. Maybe making that ordering a Zeek configuration option would partially mitigate the performance loss here.

@timwoj any other ideas for things that could help here? Is there anything related to the "correctness" of the JSON output in Zeek 3.0.0 versus that of Bro 2.6.4 that helps justify some of the difference? Or is it clearly just a performance difference between two JSON-formatting implementations (ours vs. nlohmann)?

It's also possible my test case is too contrived and hitting some new worst-case scenario we'd never really expect, but I doubt it.

@jsiwek jsiwek added this to Unassigned / Todo in Release 3.1.0 via automation Sep 24, 2019
@jsiwek jsiwek added this to the 3.1.0 milestone Sep 24, 2019
@timwoj (Contributor) commented Sep 24, 2019

My guess (without looking too closely) is that it's almost entirely the difference between our naive append-to-a-string implementation and the nlohmann library. There's time spent in the library creating the objects and then converting them into a string. I can look at whether there are faster ways to do that.

nlohmann's also isn't the fastest JSON library out there: it's in the middle-to-bottom of the pack when it comes to stringifying (https://github.com/miloyip/nativejson-benchmark#stringify-time).
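
For illustration, here's a minimal sketch of the two strategies being compared; the field names are made up and neither block is the literal Bro/Zeek formatter code:

#include <nlohmann/json.hpp>
#include <string>

// (a) Bro 2.6.4 style: format straight into the output buffer, one append
// per field, with no intermediate representation.
std::string render_appending()
  {
  std::string out = "{";
  out += "\"ts\":1569369600.0,";
  out += "\"uid\":\"CHhAvVGS1DHFjwGM9\"";
  out += "}";
  return out;
  }

// (b) Zeek 3.0.0 style: build an nlohmann object first, then stringify it.
// This pays for per-node allocations plus a full tree traversal in dump().
std::string render_via_tree()
  {
  nlohmann::json j;
  j["ts"] = 1569369600.0;
  j["uid"] = "CHhAvVGS1DHFjwGM9";
  return j.dump();
  }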

@sethhall (Member) commented Sep 25, 2019

Thanks for doing the comparison, Jon! Unfortunately, this feels to me like one of those critical issues that we'll need to move on pretty quickly, since under high logging volume it appears to cause so much overhead. (I had been looking at this with Keith recently too.)

@sethhall (Member) commented Sep 25, 2019

Oh, I also think we should target this as a fix to 3.0 (3.0.1) in addition to 3.1.0.

@jsiwek (Member, Author) commented Sep 25, 2019

> I also think we should target this as a fix to 3.0 (3.0.1)

My thought too, I'll create a 3.0.1 project/milestone and add this.

@jsiwek jsiwek added this to Unassigned / Todo in Release 3.0.1 via automation Sep 25, 2019
@jsiwek jsiwek modified the milestones: 3.1.0, 3.0.1 Sep 25, 2019
@timwoj timwoj self-assigned this Sep 25, 2019
@timwoj (Contributor) commented Sep 25, 2019

Part of this is that fifo_map can be very slow:

> Inserting a value (via operator[], insert) and removing a value (erase) rely on std::unordered_map::insert and std::unordered_map::erase which have O(1) average complexity and O(n) worst-case complexity.

We're forced to use fifo_map though to preserve the ordering of keys as they're inserted.

EDIT: There's apparently also https://github.com/Tessil/ordered-map, which appears to be faster than fifo_map, but it fails to build correctly.

Losing the insertion ordering of keys would be a workable temporary fix, but it's going to break a few test cases, and I'm not sure whether people in the wild rely on the absolute ordering (since it's JSON, they really shouldn't).
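
For context on how a replacement map plugs in: nlohmann/json lets you swap in a custom object type via basic_json's first template parameter, which is how fifo_map is wired in today and how an insertion-ordered tsl map could replace it. Below is a minimal sketch of that adapter pattern, modeled on the fifo_map workaround documented in nlohmann's README; the names insertion_order_map and ordered_json are illustrative, not actual Zeek code, and it assumes nlohmann instantiates the object type as ObjectType<Key, Value, Compare, Allocator>:

#include <functional>
#include <memory>
#include <utility>
#include <nlohmann/json.hpp>
#include <tsl/ordered_map.h>

// Adapter with the four-parameter shape nlohmann::basic_json expects for its
// ObjectType parameter. The comparator is ignored because tsl::ordered_map
// keeps keys in insertion order rather than sorting them, and the allocator
// is rebound because ordered_map stores std::pair<Key, T> instead of
// std::pair<const Key, T>.
template<class Key, class T, class IgnoredLess, class Allocator>
using insertion_order_map =
  tsl::ordered_map<Key, T, std::hash<Key>, std::equal_to<Key>,
                   typename std::allocator_traits<Allocator>::template
                       rebind_alloc<std::pair<Key, T>>>;

using ordered_json = nlohmann::basic_json<insertion_order_map>;

With an alias like that, fields come back out of dump() in the order they were added:

ordered_json j;
j["ts"] = 1569369600.0;
j["uid"] = "CHhAvVGS1DHFjwGM9";
// j.dump() emits "ts" before "uid", matching insertion order.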

@timwoj (Contributor) commented Sep 25, 2019

Switching over to use tsl/ordered_map gives close to the same performance as without ordering:

no ordering:

real	0m51.466s
user	1m9.390s
sys	0m7.144s

ordered_map:

real	0m54.723s
user	1m12.826s
sys	0m7.293s

There are also some minor sub-second gains to be found by using emplace() instead of operator[] when adding values, but emplace has the same issue as with any map: if the key already exists, it won't overwrite the value.
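
A tiny sketch of that emplace() caveat with nlohmann's API (the values are made up):

nlohmann::json j;
j["ts"] = 1.0;        // operator[] inserts, and overwrites on a repeated key
j.emplace("ts", 2.0); // constructs in place, but is a no-op here because
                      // "ts" already exists; j["ts"] is still 1.0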

@jsiwek jsiwek moved this from Unassigned / Todo to Needs Review / Merge in Release 3.0.1 Sep 27, 2019
@jsiwek jsiwek referenced this issue Sep 27, 2019
@jsiwek (Member, Author) commented Sep 27, 2019

@timwoj Thanks for taking a look. #600 does seem to help a lot. In a release build, the test commands with log_mult=100 I gave earlier are comparable between 3.0.0 and 2.6.4, but I can still crank it up to log_mult=1000 and see 3.0.0 taking 1.4x longer than 2.6.4.

Not sure exactly what difference we ultimately want to accept. Since the change to the tsl map is just better than fifo_map regardless, I'd be up for merging #600 (after my minor comments are addressed), asking Keith to help test whether it performs well enough, and going from there. Other ideas?

@timwoj (Contributor) commented Sep 27, 2019

> Not sure exactly what difference we ultimately want to accept. Since the change to the tsl map is just better than fifo_map regardless, I'd be up for merging #600 (after my minor comments are addressed), asking Keith to help test whether it performs well enough, and going from there. Other ideas?

Right now the two major bottlenecks are:

  1. Building the JSON tree
  2. Dumping the JSON tree to the ODesc object

Both of these happen a ton of times (~20k times on the 2009-M57-day11-18.trace file). Unfortunately, that repetition is inherent to the way we log values, and changing it would obviously require a ton of changes. The speedups in the PR mostly address the first part, since we were using a map object with potentially slow insertion times. I don't think there's any way to speed up the second part, since it's literally a single call to a method that turns the json object into a string.

Unless there's some magic juice somewhere in nlohmann/json, I don't think we can get much more out of it.
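
To make the shape of that hot path concrete, here's a rough sketch of the two steps as they repeat per record; Record, emit(), and the loop structure are hypothetical stand-ins, not the actual formatter code:

#include <nlohmann/json.hpp>
#include <string>
#include <utility>
#include <vector>

// A log record as a list of (field name, rendered value) pairs.
using Record = std::vector<std::pair<std::string, std::string>>;

void emit(const std::string& line); // hand the finished line to the writer

void write_records(const std::vector<Record>& records)
  {
  for ( const auto& record : records ) // one iteration per Log::write()
    {
    nlohmann::json j; // bottleneck 1: build the JSON tree
    for ( const auto& [name, val] : record )
      j.emplace(name, val);
    emit(j.dump()); // bottleneck 2: the single stringify call
    }
  }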

@jsiwek jsiwek moved this from Unassigned / Todo to Assigned / In Progress in Release 3.1.0 Sep 27, 2019
@jsiwek jsiwek moved this from Assigned / In Progress to Pending Review / Merge in Release 3.1.0 Sep 27, 2019
@jsiwek jsiwek moved this from Pending Review / Merge to Assigned / In Progress in Release 3.1.0 Sep 27, 2019
@jsiwek jsiwek closed this in #600 Oct 1, 2019
Release 3.1.0 automation moved this from Assigned / In Progress to Done Oct 1, 2019
Release 3.0.1 automation moved this from Needs Review / Merge to Done Oct 1, 2019
@jsiwek (Member, Author) commented Oct 1, 2019

Just keeping this open until we get more feedback (e.g. from Keith); then we can decide whether we want to look into more extreme changes to help performance.

@jsiwek jsiwek reopened this Oct 1, 2019
Release 3.1.0 automation moved this from Done to Assigned / In Progress Oct 1, 2019
Release 3.0.1 automation moved this from Done to Assigned / In Progress Oct 1, 2019