Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving List/PList and friends #422

Open
timwoj opened this issue Jun 18, 2019 · 7 comments

Comments

Projects
None yet
4 participants
@timwoj
Copy link
Contributor

commented Jun 18, 2019

I started looking into this because I ran across the loop_over_list method, and thought "why can't that be a ranged-for", which lead me down a rabbit hole of how the list/queue/dict classes could be improved.

The "easy" stuff:

  1. All of the child classes should be templates. This means we can remove the #defines for creating them and just declare them in place using the standard syntax. I have this done already on a branch (topic/timw/template-containers) and it appears to work (all of the tests still pass, anyways).

The "medium" stuff:

  1. The classes should implement the standard iterator interface so that can be taken advantage of with the STL algorithms, ranged-for, etc. I started to work on this and ran into a lot of issues with the dereferencing operator operator* throwing a segfault with List<int>, which lead me to more questions.

The "hard" stuff:

  1. List<(some native type)> is fraught with danger. The only places we use it, we explicitly specify a type that's the same size as a void pointer. This is because we take the value being inserted, consider the value as a pointer, and convert it to void* to be stored in the internal array. This leads to the aforementioned problems with operator*. I'd love to remove this class entirely, but we have a typedef called int_list that's used quite a bit. We'd have to visit each of those uses and see if we could just convert them to vector<int> without any performance penalty. I know the one use in strings.bif can definitely be changed.

  2. Alternatively, we could consider just changing the whole thing over to vector. I know this opens a large can of worms (see the block of text at https://github.com/zeek/zeek/blob/master/src/List.h#L84-L106). I have a few thoughts about this one already.

One is whether we could implement a std::allocator that takes all of those arguments into account and use that with vector to mitigate any performance penalty. How much profiling information do we already have that says there's really a giant gap between our implementation and the STL?

Second is whether we could lean more heavily on move semantics and potentially mitigate those performance issues. I have a feeling that this would lead to a lot more work though, and I'm not entirely sure how we'd go about implementing it just yet.

Thoughts?

@timwoj

This comment has been minimized.

Copy link
Contributor Author

commented Jun 18, 2019

Note that with my comment about std::allocator, the interface for that changes considerably in C++17, moving a bunch of methods and data to std::allocator_traits. If we opt to go that route, it may lead to rework in the future if we get to a point where we can enable C++17 on all of the target platforms.

@jsiwek

This comment has been minimized.

Copy link
Member

commented Jun 19, 2019

The "easy" stuff:

  1. All of the child classes should be templates. This means we can remove the #defines for creating them and just declare them in place using the standard syntax. I have this done already on a branch (topic/timw/template-containers) and it appears to work (all of the tests still pass, anyways).

Sounds good to me -- one of those things that's technically a breaking change, but I expect more people are using the common typedefs directly. e.g. everyone uses val_list, not PList(Val).

The "medium" stuff:

  1. The classes should implement the standard iterator interface so that can be taken advantage of with the STL algorithms, ranged-for, etc.

Yeah, would be a nice improvement.

The "hard" stuff:

  1. List<(some native type)> is fraught with danger. The only places we use it, we explicitly specify a type that's the same size as a void pointer. This is because we take the value being inserted, consider the value as a pointer, and convert it to void* to be stored in the internal array. This leads to the aforementioned problems with operator*. I'd love to remove this class entirely, but we have a typedef called int_list that's used quite a bit. We'd have to visit each of those uses and see if we could just convert them to vector<int> without any performance penalty. I know the one use in strings.bif can definitely be changed.

I only did a quick skim, but didn't look like a whole lot of usages. Most being part of internal implementation details or in the parts of the "API that's public, but not really intended/expected to be used by others". I also don't expect users to be using List<type> directly either, so proceeding to remove it and replace int_list with vector<int> sounds reasonable.

And for judging performance difference in this case, some quick before/after timing measurements on a representative pcap may be a sufficient sanity check.

  1. Alternatively, we could consider just changing the whole thing over to vector. I know this opens a large can of worms (see the block of text at https://github.com/zeek/zeek/blob/master/src/List.h#L84-L106). I have a few thoughts about this one already.

Misc. thoughts:

  • Likely warrants close attention to performance differences if we do change implementations

  • It seems like a Good Thing to at least keep our generic list, dictionary, etc. container types as part of the public API meant for user consumption (e.g. particularly val_list is commonly used in generating events), just we do have wiggle room to tune implementation details without breaking everyone's code

  • Given that last point, the motivation to switch to std::vector should not be high unless it clearly does win a performance contest by default because managing our own block of memory doesn't seem overly complex in terms of maintainability.

One is whether we could implement a std::allocator that takes all of those arguments into account and use that with vector to mitigate any performance penalty. How much profiling information do we already have that says there's really a giant gap between our implementation and the STL?

Don't have any profiling info on the difference, might be fun to check that against our default scripts, but ultimately, usage patterns for lists/dictionaries can depend on a user's custom scripts. Who knows what crazy load they have or thing they want to do. Makes it hard to definitely declare a Best implementation, so probably the general idea should be "let's just not implement it in a way that's difficult to customize/tune later".

Second is whether we could lean more heavily on move semantics and potentially mitigate those performance issues. I have a feeling that this would lead to a lot more work though, and I'm not entirely sure how we'd go about implementing it just yet.

Not sure exactly what you're thinking here. If it's like moving away from "containers of pointers" to "containers of move-able objects", my hunch is that would be Good, but yeah, a lot more work since it's maybe all wrapped up in the general notion of changing "how we do memory management".

@JustinAzoff

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

I would like to benchmark Dict.cc against https://github.com/google/hashtable-benchmarks but since Dict.cc doesn't implement the right interfaces doing so isn't easy.

As a first test of replacing dictionary code, 'Sessions' has this:

PDict(Connection) tcp_conns;
PDict(Connection) udp_conns;
PDict(Connection) icmp_conns;
PDict(FragReassembler) fragments;

Replacing these with a stl map would be easier than trying to replace ALL uses of PDict. I had started trying to replace those with a stl map, but that gets involved when you run into the itercookie stuff.

A Lookup against one of those maps is done for each and every packet received, so improving the perf could be worth it.

@timwoj

This comment has been minimized.

Copy link
Contributor Author

commented Jun 19, 2019

Don't have any profiling info on the difference, might be fun to check that against our default scripts, but ultimately, usage patterns for lists/dictionaries can depend on a user's custom scripts. Who knows what crazy load they have or thing they want to do. Makes it hard to definitely declare a Best implementation, so probably the general idea should be "let's just not implement it in a way that's difficult to customize/tune later".

There's also "if it's not broken, don't fix it". I'll leave that one alone and focus on the templates/iterator support for now.

Not sure exactly what you're thinking here. If it's like moving away from "containers of pointers" to "containers of move-able objects", my hunch is that would be Good, but yeah, a lot more work since it's maybe all wrapped up in the general notion of changing "how we do memory management".

Thinking more about this one, since the containers are already just holding pointers, we probably get the benefits of move semantics since it's just moving pointer locations and not entire objects.

@rsmmr

This comment has been minimized.

Copy link
Member

commented Jun 19, 2019

I havne't fully digested this yet, but couple of quick comments:

(1) Dict has special logic for resizing incrementally to avoid processing spikes; that'd be something to watch out for (and agree with the "not broken" comment).

(2) I've found making something "iterable" quite painful in the past in terms of meeting all the properties, but I believe I also remember some 3rd party header library that provides most of the legwork through a wrapper; might be something to consider

@timwoj

This comment has been minimized.

Copy link
Contributor Author

commented Jun 19, 2019

I've found making something "iterable" quite painful in the past in terms of meeting all the properties

I'm mostly running into the problem implementing operator* because of how BaseList is implemented. Because BaseList uses void* as the internal type for the array of data, you can't typecast the return value of operator* to the type of the list since that makes a temporary lvalue. You're not allowed to return a reference to that temporary lvalue.

I shelved that for a few minutes while I look at whether there's any performance gain/loss for removing int_list/List.

@timwoj

This comment has been minimized.

Copy link
Contributor Author

commented Jun 19, 2019

My goal here isn't purely performance based but more towards modernizing a bit. That said, here's 10 runs of zeek -r 2009-M57-day11-18.trace after changing int_list to be a vector and changing the use of int_list in strings.bif to just be a vector instead.

master:

real    1m3.805s
user    1m2.599s
sys     0m14.744s

with vector:

real    1m3.333s
user    1m2.398s
sys     0m14.585s

A second run of 10 of each results in them running the same time of 1m3.223s, which means the above difference is probably noise as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.