Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of some path expressions #303

Merged
merged 4 commits into from Sep 29, 2015

Conversation

lutter
Copy link
Member

@lutter lutter commented Sep 28, 2015

Path expressions of the form service[75] were very slow when there were a large number of service nodes. These patches try to speed that up.

Tons of thanks to @domcleal for figuring out why this was slow, writing a test that demonstrates the problem and providing lots of detail on what exactly was going on.

Dominic Cleal and others added 3 commits September 28, 2015 11:42
When parsing a file with NagiosObjects and many objects inside, a large
tree forms like this:

    /files/etc/nagios/objects/services.cfg/service[1]
    /files/etc/nagios/objects/services.cfg/service[2]
    ...
    /files/etc/nagios/objects/services.cfg/service[5000]

aug_get performance is pretty terrible at this size when using a path
above, due to the evaluation of the [5000] predicate.  Paths without a
predicate (e.g. using seq or uniquely labelled nodes) are retrieved
quickly.

The time taken to set and get 5,000 such nodes is recorded in
tests/test-perf.log and is currently about:

    testPerfPredicate = 66636ms

A helper to run Valgrind's callgrind tool is added to src/try.
When we evaluate path expressions, we need to construct nodesets, and they
truly need to be sets, i.e. contain each node at most once. This leads to
problems when a node has lots of children with the same name. The duplicate
check in ns_add lead to O(n^2) behavior for nodes with n children.

We simplify this by adding a flag to each tree node that we use only around
repeated calls of ns_add when we construct a node set. This is possible
since the construction of nodesets is very local, and it is therefore easy
to determine where we need to put the call to ns_clear_added when we are
done building the nodeset.
We used to remove nodes that did not match one-by-one by calling memmove
for each of them. Now we batch runs of non-matching nodes and call memmove
only once for each run. The common case of a predicate that matches only
one node at a certain position ('service[17]') now requires two memmoves
rather than size(ns)-1 many memmoves.

This leads to a drastic performance improvement for large nodesets.
@domcleal
Copy link
Member

That's excellent, thanks for taking this up. The optimisations all make sense, and work very well with the test case (68064ms, 18069ms, 1139ms, 318ms per respective commit). Even without the special case it's such a significant improvement that it should be usable for other predicates.

👍

…ions

We know that in a path expression like 'name[42]', only one node can
possibly match, and that we simply need to step through all the nodes and
count until we've reached the 42nd matching node. This greatly simplifies
how we construct the resulting nodeset, and is done in the new function
position_filter.
@lutter lutter merged commit 23d5e48 into hercules-team:master Sep 29, 2015
@lutter
Copy link
Member Author

lutter commented Sep 29, 2015

Thanks for all the work in figuring out where the perf bottlenecks are - that made writing this a ton easier !

@lutter lutter deleted the dev/filter branch October 4, 2015 05:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants