New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of some path expressions #303

Merged
merged 4 commits into from Sep 29, 2015

Conversation

Projects
None yet
2 participants
@lutter
Member

lutter commented Sep 28, 2015

Path expressions of the form service[75] were very slow when there were a large number of service nodes. These patches try to speed that up.

Tons of thanks to @domcleal for figuring out why this was slow, writing a test that demonstrates the problem and providing lots of detail on what exactly was going on.

Dominic Cleal and others added some commits Jul 23, 2015

* test-perf.c (testPerfPredicate): add test showing slow predicates
When parsing a file with NagiosObjects and many objects inside, a large
tree forms like this:

    /files/etc/nagios/objects/services.cfg/service[1]
    /files/etc/nagios/objects/services.cfg/service[2]
    ...
    /files/etc/nagios/objects/services.cfg/service[5000]

aug_get performance is pretty terrible at this size when using a path
above, due to the evaluation of the [5000] predicate.  Paths without a
predicate (e.g. using seq or uniquely labelled nodes) are retrieved
quickly.

The time taken to set and get 5,000 such nodes is recorded in
tests/test-perf.log and is currently about:

    testPerfPredicate = 66636ms

A helper to run Valgrind's callgrind tool is added to src/try.
Speed up the duplicate check when constructing nodesets
When we evaluate path expressions, we need to construct nodesets, and they
truly need to be sets, i.e. contain each node at most once. This leads to
problems when a node has lots of children with the same name. The duplicate
check in ns_add lead to O(n^2) behavior for nodes with n children.

We simplify this by adding a flag to each tree node that we use only around
repeated calls of ns_add when we construct a node set. This is possible
since the construction of nodesets is very local, and it is therefore easy
to determine where we need to put the call to ns_clear_added when we are
done building the nodeset.
* src/pathx.c (ns_filter): speed up by removing nodes in batches
We used to remove nodes that did not match one-by-one by calling memmove
for each of them. Now we batch runs of non-matching nodes and call memmove
only once for each run. The common case of a predicate that matches only
one node at a certain position ('service[17]') now requires two memmoves
rather than size(ns)-1 many memmoves.

This leads to a drastic performance improvement for large nodesets.
@domcleal

This comment has been minimized.

Member

domcleal commented Sep 29, 2015

That's excellent, thanks for taking this up. The optimisations all make sense, and work very well with the test case (68064ms, 18069ms, 1139ms, 318ms per respective commit). Even without the special case it's such a significant improvement that it should be usable for other predicates.

👍

* src/pathx.c (ns_from_locpath): special-case 'name[42]' path express…
…ions

We know that in a path expression like 'name[42]', only one node can
possibly match, and that we simply need to step through all the nodes and
count until we've reached the 42nd matching node. This greatly simplifies
how we construct the resulting nodeset, and is done in the new function
position_filter.

@lutter lutter force-pushed the lutter:dev/filter branch from 5d795e7 to 23d5e48 Sep 29, 2015

@lutter lutter merged commit 23d5e48 into hercules-team:master Sep 29, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@lutter

This comment has been minimized.

Member

lutter commented Sep 29, 2015

Thanks for all the work in figuring out where the perf bottlenecks are - that made writing this a ton easier !

@lutter lutter deleted the lutter:dev/filter branch Oct 4, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment