Improve performance of some path expressions #303
Path expressions of the form
Tons of thanks to @domcleal for figuring out why this was slow, writing a test that demonstrates the problem and providing lots of detail on what exactly was going on.
When parsing a file with NagiosObjects and many objects inside, a large tree forms like this:

/files/etc/nagios/objects/services.cfg/service
/files/etc/nagios/objects/services.cfg/service
...
/files/etc/nagios/objects/services.cfg/service

aug_get performance is pretty terrible at this size when using a path above, due to the evaluation of the predicate. Paths without a predicate (e.g. using seq or uniquely labelled nodes) are retrieved quickly.

The time taken to set and get 5,000 such nodes is recorded in tests/test-perf.log and is currently about:

testPerfPredicate = 66636ms

A helper to run Valgrind's callgrind tool is added to src/try.
When we evaluate path expressions, we need to construct nodesets, and they truly need to be sets, i.e. contain each node at most once. This leads to problems when a node has lots of children with the same name. The duplicate check in ns_add led to O(n^2) behavior for nodes with n children. We simplify this by adding a flag to each tree node that we use only around repeated calls of ns_add when we construct a nodeset. This is possible since the construction of nodesets is very local, and it is therefore easy to determine where we need to put the call to ns_clear_added when we are done building the nodeset.
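A minimal sketch of the flag-based duplicate check, for illustration only. The struct layouts, the field name added and the function signatures are assumptions made for this example; only the names ns_add and ns_clear_added come from the description above, and the real code in the tree may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the real tree and nodeset types. */
struct node {
    const char *label;
    bool added;           /* true while the node sits in the nodeset being built */
};

struct nodeset {
    struct node **nodes;
    size_t used;          /* number of entries in use */
    size_t size;          /* allocated capacity */
};

/* Append node unless its flag says it is already in the set.
 * The membership test is O(1) instead of a scan over ns->nodes.
 * Returns 0 on success, -1 if memory could not be allocated. */
int ns_add(struct nodeset *ns, struct node *node) {
    if (node->added)
        return 0;
    if (ns->used == ns->size) {
        size_t size = ns->size ? 2 * ns->size : 8;
        struct node **nodes = realloc(ns->nodes, size * sizeof(*nodes));
        if (nodes == NULL)
            return -1;
        ns->nodes = nodes;
        ns->size = size;
    }
    node->added = true;
    ns->nodes[ns->used++] = node;
    return 0;
}

/* Reset the flags once the nodeset is complete, so that the next
 * construction starts from a clean state. */
void ns_clear_added(struct nodeset *ns) {
    for (size_t i = 0; i < ns->used; i++)
        ns->nodes[i]->added = false;
}
```

The flag is only meaningful between the first ns_add and the matching ns_clear_added, which is why the locality of nodeset construction matters: forgetting the clear would make a later construction silently skip nodes.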
We used to remove nodes that did not match one-by-one by calling memmove for each of them. Now we batch runs of non-matching nodes and call memmove only once for each run. The common case of a predicate that matches only one node at a certain position ('service') now requires two memmoves rather than size(ns)-1 many memmoves. This leads to a drastic performance improvement for large nodesets.
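The batching can be sketched roughly as follows, on a plain int array rather than a real nodeset. The names filter_batched, remove_run, keep_fn and keep_equal are hypothetical and exist only for this illustration; the predicate callback stands in for whatever the path evaluator actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate type: true if the entry should stay in the set. */
typedef bool (*keep_fn)(int value, void *data);

/* Example predicate: keep entries equal to the int that data points to. */
bool keep_equal(int value, void *data) {
    return value == *(int *) data;
}

/* Remove len entries starting at index ind with a single memmove. */
void remove_run(int *items, size_t *used, size_t ind, size_t len) {
    memmove(items + ind, items + ind + len,
            (*used - (ind + len)) * sizeof(*items));
    *used -= len;
}

/* Filter items in place and return the new length. Consecutive
 * non-matching entries are collected into a run and dropped with one
 * memmove per run instead of one memmove per entry. */
size_t filter_batched(int *items, size_t used, keep_fn keep, void *data) {
    size_t run_start = 0;   /* index where the current run of misses began */
    bool in_run = false;
    size_t i = 0;

    while (i < used) {
        if (keep(items[i], data)) {
            if (in_run) {
                /* The match at i slides down to run_start. */
                remove_run(items, &used, run_start, i - run_start);
                i = run_start;
                in_run = false;
            }
        } else if (!in_run) {
            run_start = i;
            in_run = true;
        }
        i++;
    }
    if (in_run)
        remove_run(items, &used, run_start, used - run_start);
    return used;
}
```

With a single match in the middle of the array, this calls remove_run twice: once for the run before the match and once for the run after it, which corresponds to the two memmoves mentioned above.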
That's excellent, thanks for taking this up. The optimisations all make sense, and work very well with the test case (68064ms, 18069ms, 1139ms, 318ms per respective commit). Even without the special case it's such a significant improvement that it should be usable for other predicates.
…ions We know that in a path expression like 'name', only one node can possibly match, and that we simply need to step through all the nodes and count until we've reached the 42nd matching node. This greatly simplifies how we construct the resulting nodeset, and is done in the new function position_filter.
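The counting idea can be sketched as below. This is an illustration under simplifying assumptions, not the actual position_filter: it works on a flat array of labels instead of tree nodes, and its signature and return convention are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the pos-th (1-based) entry whose label equals
 * name, or -1 if there are fewer than pos such entries. The scan
 * stops at the match, and no intermediate nodeset is built. */
long position_filter(const char *const *labels, size_t n,
                     const char *name, size_t pos) {
    size_t seen = 0;   /* matching labels encountered so far */

    for (size_t i = 0; i < n; i++) {
        if (strcmp(labels[i], name) == 0 && ++seen == pos)
            return (long) i;
    }
    return -1;
}
```

Because at most one node can match, the result is either empty or a single entry, so none of the removal work from the general filtering path is needed.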