Faster infobool expression evaluation #3677

bavison · 2013-11-18T16:18:50Z

Expression infobools are evaluated at runtime from one or more single infobools
and a combination of boolean NOT, AND and OR operators. Previously, parsing
produced a vector of operands (leaf nodes) and operators in postfix
(reverse-Polish) form, and evaluated all leaf nodes every time the expression
was evaluated. But this ignores the fact that in many cases, once one operand
of an AND or OR operation has been evaluated, there is no need to evaluate the
other operand because its value can have no effect on the ultimate result. It
is also worth noting that AND and OR operations are associative and
commutative, meaning their operands can be regrouped and reordered at runtime
to better suit the selected skin.

This patch rewrites the expression parsing and evaluation code. Now the
internal representation is in the form of a tree where leaf nodes represent a
single infobool, and branch nodes represent either an AND or an OR operation
on two or more child nodes.
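
To make the tree form concrete, here is a minimal sketch of leaf and branch nodes with short-circuit evaluation. The names `Node`, `Leaf`, `Branch` and `CountLeafEvaluationsForOr` are hypothetical and chosen for illustration; they are not taken from the patch, and an infobool is modelled simply as a callable returning bool.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node types for illustration; not the classes used in the patch.
struct Node
{
  virtual ~Node() = default;
  virtual bool Evaluate() const = 0;
};

// Leaf: wraps a single infobool, modelled here as a callable returning bool.
struct Leaf : Node
{
  explicit Leaf(std::function<bool()> f) : fetch(std::move(f)) {}
  bool Evaluate() const override { return fetch(); }
  std::function<bool()> fetch;
};

// Branch: an AND or an OR over two or more children.
struct Branch : Node
{
  enum class Op { And, Or };
  explicit Branch(Op o) : op(o) {}
  bool Evaluate() const override
  {
    // A true child decides an OR; a false child decides an AND.
    const bool deciding = (op == Op::Or);
    for (const auto& child : children)
      if (child->Evaluate() == deciding)
        return deciding; // short-circuit: remaining children are skipped
    return !deciding;
  }
  Op op;
  std::vector<std::unique_ptr<Node>> children;
};

// Builds A | B | C with A true and counts how many leaves get evaluated.
inline int CountLeafEvaluationsForOr()
{
  int evaluated = 0;
  Branch root(Branch::Op::Or);
  for (bool value : {true, false, false})
    root.children.push_back(std::make_unique<Leaf>([&evaluated, value] {
      ++evaluated;
      return value;
    }));
  root.Evaluate();
  return evaluated;
}
```

In this sketch only the first leaf is evaluated, whereas a postfix evaluator that visits every operand would evaluate all three.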

Expressions are rewritten at parse time into a form which favours the
formation of groups of associative nodes. These groups are then reordered at
evaluation time such that nodes whose value renders the evaluation of the
remainder of the group unnecessary tend to be evaluated first (these are
true nodes for OR subexpressions, or false nodes for AND subexpressions).
The end effect is to minimise the number of leaf nodes that need to be
evaluated in order to determine the value of the expression. The runtime
adaptability has the advantage of not being customised for any particular skin.
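
The reordering idea can be sketched roughly as follows. This assumes a plain move-to-front heuristic, which is my simplification for brevity and not necessarily the policy the patch uses; `OrGroup` and `CountAcrossTwoPasses` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical sketch: an OR group whose children are plain callables.
struct OrGroup
{
  std::vector<std::function<bool()>> children;

  bool Evaluate()
  {
    for (auto it = children.begin(); it != children.end(); ++it)
    {
      if ((*it)())
      {
        // This child decided the result, so promote it to the front
        // (move-to-front is an assumed heuristic, chosen for brevity).
        std::rotate(children.begin(), it, it + 1);
        return true;
      }
    }
    return false;
  }
};

// Evaluates F | F | T twice and returns the leaf evaluations for each pass.
inline std::vector<int> CountAcrossTwoPasses()
{
  int evaluated = 0;
  OrGroup group;
  for (bool value : {false, false, true})
    group.children.push_back([&evaluated, value] {
      ++evaluated;
      return value;
    });

  std::vector<int> counts;
  for (int pass = 0; pass < 2; ++pass)
  {
    evaluated = 0;
    group.Evaluate();
    counts.push_back(evaluated);
  }
  return counts;
}
```

The first pass evaluates three leaves and the second only one, because the true leaf has moved to the front. An AND group would do the same with false children.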

The modifications to the expression at parse time fall into two groups:

1) Moving logical NOTs so that they are only applied to leaf nodes.
   For example, rewriting ![A+B]|C as !A|!B|C allows reordering such that
   any of the three leaves can be evaluated first.
2) Combining adjacent AND or OR operations such that each path from the root
   to a leaf encounters a strictly alternating pattern of AND and OR
   operations. So [A|B]|[C|D+[[E|F]|G]] becomes A|B|C|[D+[E|F|G]].
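
Both rewrites can be illustrated with a small sketch that works on a hand-built tree rather than during parsing. `Expr`, `MakeLeaf`, `MakeOp`, `Normalize`, `ToString` and `RewriteFirstExample` are hypothetical names for this illustration only; the patch applies the rewrites inside the parser and does not necessarily share this structure.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Leaf, Not, And, Or };

struct Expr
{
  Kind kind = Kind::Leaf;
  std::string name;                        // used by leaves only
  bool negated = false;                    // leaf-level NOT after rewriting
  std::vector<std::shared_ptr<Expr>> kids; // operands of Not/And/Or
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr MakeLeaf(const std::string& name, bool negated = false)
{
  auto e = std::make_shared<Expr>();
  e->name = name;
  e->negated = negated;
  return e;
}

ExprPtr MakeOp(Kind kind, std::vector<ExprPtr> kids)
{
  auto e = std::make_shared<Expr>();
  e->kind = kind;
  e->kids = std::move(kids);
  return e;
}

// Returns an equivalent tree with NOT applied to leaves only and with
// nested operations of the same kind merged into their parent.
ExprPtr Normalize(const ExprPtr& e, bool negate = false)
{
  if (e->kind == Kind::Leaf)
    return MakeLeaf(e->name, e->negated != negate);
  if (e->kind == Kind::Not)
    return Normalize(e->kids.front(), !negate);

  // De Morgan: a negated AND becomes an OR of negated operands, and vice versa.
  const Kind kind =
      negate ? (e->kind == Kind::And ? Kind::Or : Kind::And) : e->kind;
  std::vector<ExprPtr> merged;
  for (const auto& kid : e->kids)
  {
    ExprPtr n = Normalize(kid, negate);
    if (n->kind == kind) // same operator as the parent: splice operands in
      merged.insert(merged.end(), n->kids.begin(), n->kids.end());
    else
      merged.push_back(n);
  }
  return MakeOp(kind, merged);
}

// Renders a normalized tree in skin syntax: + is AND, | is OR, ! is NOT.
std::string ToString(const ExprPtr& e, bool top = true)
{
  if (e->kind == Kind::Leaf)
    return (e->negated ? "!" : "") + e->name;
  std::string out;
  for (const auto& kid : e->kids)
  {
    if (!out.empty())
      out += (e->kind == Kind::And ? '+' : '|');
    out += ToString(kid, false);
  }
  return top ? out : "[" + out + "]";
}

// ![A+B]|C from the description above, built by hand (no parser here).
std::string RewriteFirstExample()
{
  auto tree = MakeOp(Kind::Or,
                     {MakeOp(Kind::Not, {MakeOp(Kind::And, {MakeLeaf("A"), MakeLeaf("B")})}),
                      MakeLeaf("C")});
  return ToString(Normalize(tree));
}
```

Applied to the first example, this yields `!A|!B|C`. Because same-kind children are spliced into their parent, AND and OR alternate along every root-to-leaf path of the result.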

I measured the effect while the Videos window of the default skin was open
(but idle) on a Raspberry Pi, and this reduced the CPU usage by 2.8
percentage points, from 41.9% to 39.1%:

          Before          After
          Mean   StdDev   Mean   StdDev  Confidence  Change
IdleCPU%  41.9   0.5      39.1   0.9     100.0%      +7.0%

bavison · 2013-11-21T20:38:17Z

Rebased on top of updated PR #3676

bavison · 2013-12-06T04:18:36Z

Rebased on top of updated PR #3676 (again). Tested briefly.

bavison · 2013-12-09T14:38:12Z

Rebased on master now that PR#3676 has been merged.

jmarshallnz · 2014-03-09T20:34:39Z

Note, there's a crash here induced by this patch:

http://forum.xbmc.org/showthread.php?tid=188083&pid=1647630#pid1647630

As this is included in OpenElec, you probably want to fix it or let them know.

t-nelson · 2014-03-21T19:12:44Z

@jmarshallnz This is G+1, right?

jmarshallnz · 2014-03-21T20:51:37Z

Correct.

bavison · 2014-03-24T22:53:26Z

I've taken a look at this. Basically, it seems that what happened when XBMC came to evaluate an invalid infobool expression was previously undefined behaviour. It happened that trailing ] characters were completely ignored, but that was a quirk of the parsing process. Normally, invalid expressions caused

ERROR: Error parsing boolean expression <expression>

to be output to xbmc.log. In some cases, XBMC would struggle on and always evaluate such expressions as false, and in others it would segfault.

With my new expression evaluator, it worked out that you got a segfault every time you evaluated an invalid expression. However, invalid expressions were always reported in xbmc.log, and in fact it gives you an additional diagnostic line about what is wrong with the expression (in the case in question, "ERROR: Unmatched ]").

The additional patch here tweaks things so that if parsing fails, there is no code path which can lead to the expression tree pointer being left uninitialised. This avoids the overhead of testing for null pointers every time an expression is used. The expression tree is set to point to a node representing "false" if the expression parser failed.
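
A rough sketch of that fallback, under stated assumptions: `TryParse` is a hypothetical stand-in that only checks bracket balance, and `ConstantNode` and `Expression` are illustrative names, not the classes in the patch.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Node
{
  virtual ~Node() = default;
  virtual bool Evaluate() const = 0;
};

// A node with a fixed value; used here as the "false" fallback.
struct ConstantNode : Node
{
  explicit ConstantNode(bool v) : value(v) {}
  bool Evaluate() const override { return value; }
  bool value;
};

// Hypothetical stand-in for the real parser: it only checks that brackets
// balance and returns nullptr on failure (e.g. an unmatched ]).
std::unique_ptr<Node> TryParse(const std::string& text)
{
  int depth = 0;
  for (char c : text)
  {
    if (c == '[')
      ++depth;
    else if (c == ']' && --depth < 0)
      return nullptr;
  }
  if (depth != 0)
    return nullptr;
  // Placeholder result; a real parser would build the expression tree here.
  return std::make_unique<ConstantNode>(true);
}

class Expression
{
public:
  explicit Expression(const std::string& text) : m_tree(TryParse(text))
  {
    // On parse failure, point at a "false" node so m_tree is never null.
    if (!m_tree)
      m_tree = std::make_unique<ConstantNode>(false);
  }

  // No null check is needed on the evaluation path.
  bool Evaluate() const { return m_tree->Evaluate(); }

private:
  std::unique_ptr<Node> m_tree;
};
```

With this shape, `Expression("A|B]")` evaluates to false instead of dereferencing an unset pointer, and the cost of handling the failure is paid once at parse time.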

Is this acceptable, or are we going to formalise the rule about permitting an indefinite number of trailing ] characters with no effect?

MartijnKaijser · 2014-05-21T16:25:50Z

@bavison care to rebase?

any comments on last comment from people involved?

da-anda · 2014-06-05T07:00:55Z

can we push this PR forward? It seems to be tested quite well so far in all MillhouseVH builds for the PI because it's part of @popcornmix newclock3 branch

bavison · 2014-06-05T18:31:55Z

OK, here's a rebase. I've squashed the trailing ] bugfix into it.

MartijnKaijser · 2014-06-09T16:56:46Z

jenkins build this please

jmarshallnz · 2014-06-09T20:46:42Z

Will review in detail.

xbmc/interfaces/info/InfoExpression.cpp

+      {
+        c = *s++;
+      } while (c == ' ' || c == '\t' || c == '\r' || c == '\n');
+      s--;


jmarshallnz · 2014-06-10T08:27:49Z

Minor nits aside, very nice!

bavison · 2014-06-10T19:20:26Z

I hope I have addressed all your concerns - presented for now as a separate commit for ease of review. I'll squash if you're happy with it.

jmarshallnz · 2014-06-11T23:04:58Z

Yeah - looks better I reckon. If you fix up the (extreme) minors and squash down we'll build test and get it in. Thanks!

bavison · 2014-06-12T01:27:21Z

OK, done.

jmarshallnz · 2014-06-12T01:32:13Z

I notice you haven't changed the fact we have two stacks still. Are they always referencing the same objects? If so, I think we could clean that up by just getting the node type from the nodes stack anyway (we know the type either from rtti or by adding a get_type() to the baseclass). This can be done afterwards - i.e. not required for merge.

jenkins build this please

jmarshallnz · 2014-06-12T02:28:16Z

jenkins build this please

jmarshallnz · 2014-06-12T05:10:01Z

Built fine. Let me know what you think of the two stack -> single stack suggestion. It can be changed afterwards, but I'd like to know one way or another as to whether it's worth doing so before merging. Thanks!

bavison · 2014-06-12T20:05:01Z

Yes, that's probably a reasonable compromise - hiding the complexity inside get_type() methods to keep the tests for how to merge the subtrees simple. While calling get_type() would probably be slightly slower than just reading from a vector of enums, that's balanced against not needing to maintain that vector. And in any case, we're only talking about parse-time processing here anyway: the real speed benefit is in the subsequent evaluation of infobool expressions instead.
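
For illustration, the single-stack variant might look something like the sketch below. Only `get_type()` comes from the suggestion above; `NodeType`, the node classes and `TopIs` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <vector>

enum class NodeType { Leaf, And, Or };

// Base class exposes its own type, so no parallel vector of enums is needed.
struct Node
{
  virtual ~Node() = default;
  virtual NodeType get_type() const = 0;
};

struct LeafNode : Node
{
  NodeType get_type() const override { return NodeType::Leaf; }
};

struct AndNode : Node
{
  NodeType get_type() const override { return NodeType::And; }
};

struct OrNode : Node
{
  NodeType get_type() const override { return NodeType::Or; }
};

// The merge test asks the node on top of the single stack for its type.
inline bool TopIs(const std::vector<std::shared_ptr<Node>>& nodes, NodeType type)
{
  return !nodes.empty() && nodes.back()->get_type() == type;
}
```

The virtual call costs a little more than reading an enum from a vector, but it only happens at parse time and there is no second stack to keep in step.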

jmarshallnz · 2014-06-12T20:07:48Z

Agreed - will look to clean that up next time I have a few minutes - thanks again!

bavison mentioned this pull request Dec 5, 2013

Add selective caching of infobools in list items #3676

Closed

jmarshallnz modified the milestones: Awaiting answer from dev, Pending for inclusion, H* 14.0-alpha1 Mar 21, 2014

t-nelson added the Helix label Mar 28, 2014

jmarshallnz reviewed Jun 10, 2014

jmarshallnz added a commit that referenced this pull request Jun 12, 2014

Merge pull request #3677 from bavison/faster_infobool_expression_eval…

192223b

…uation Faster infobool expression evaluation

jmarshallnz merged commit 192223b into xbmc:master Jun 12, 2014

jmarshallnz mentioned this pull request Jun 14, 2014

Cleanup node type #4916

Merged
