-
-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster infobool expression evaluation #3677
Faster infobool expression evaluation #3677
Conversation
Rebased on top of updated PR #3676 |
Rebased on top of updated PR #3676 (again). Tested briefly. |
Rebased on master now that PR#3676 has been merged. |
Note, there's a crash here induced by this patch: http://forum.xbmc.org/showthread.php?tid=188083&pid=1647630#pid1647630 As this is included in OpenElec, you probably want to fix it or let them know. |
@jmarshallnz This is G+1, right? |
Correct. |
I've taken a look at this. Basically, it seems that previously it was undefined behaviour what happened when XBMC came to evaluate an invalid infobool expression. It happened that trailing ] characters were completely ignored, but that was a quirk of the parsing process. Normally, invalid expressions caused
to be output to xbmc.log. In some cases, XBMC would struggle on and always evaluate such expressions as false, and in others it would segfault. With my new expression evaluator, it worked out that you got a segfault every time you evaluated an invalid expression. However, there were no exceptions in which they were missed out of xbmc.log, and in fact it gives you an additional diagnostic line about what is wrong with the expression (in the case in question, "ERROR: Unmatched ]"). The additional patch here tweaks things so that if parsing fails, there is no code path which can lead to the expression tree pointer being left uninitialised. This avoids the overhead of testing for null pointers every time an expression is used. The expression tree is set to point to a node representing "false" if the expression parser failed. Is this acceptable, or are we going to formalise the rule about permitting an indefinite number of trailing ] characters with no effect? |
@bavison care to rebase? any comments on last comment from people involved? |
can we push this PR forward? It seems to be tested quite well so far in all MillhouseVH builds for the PI because it's part of @popcornmix newclock3 branch |
OK, here's a rebase. I've squashed the trailing ] bugfix into it. |
jenkins build this please |
Will review in detail. |
{ | ||
c = *s++; | ||
} while (c == ' ' || c == '\t' || c == '\r' || c == '\n'); | ||
s--; |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
Minor nits aside, very nice! |
I hope I have addressed all your concerns - presented for now as a separate commit for ease of review. I'll squash if you're happy with it. |
Yeah - looks better I reckon. If you fix up the (extreme) minors and squash down we'll build test and get it in. Thanks! |
Expession infobools are evaluated at runtime from one or more single infobools and a combination of boolean NOT, AND and OR operators. Previously, parsing produced a vector of operands (leaf nodes) and operators in postfix (reverse-Polish) form, and evaluated all leaf nodes every time the expression was evaluated. But this ignores the fact that in many cases, once one operand of an AND or OR operation has been evaluated, there is no need to evaluate the other operand because its value can have no effect on the ultimate result. It is also worth noting that AND and OR operations are associative, meaning they can be rearranged at runtime to better suit the selected skin. This patch rewrites the expression parsing and evaluation code. Now the internal repreentation is in the form of a tree where leaf nodes represent a single infobool, and branch nodes represent either an AND or an OR operation on two or more child nodes. Expressions are rewritten at parse time into a form which favours the formation of groups of associative nodes. These groups are then reordered at evaluation time such that nodes whose value renders the evaluation of the remainder of the group unnecessary tend to be evaluated first (these are true nodes for OR subexpressions, or false nodes for AND subexpressions). The end effect is to minimise the number of leaf nodes that need to be evaluated in order to determine the value of the expression. The runtime adaptability has the advantage of not being customised for any particular skin. The modifications to the expression at parse time fall into two groups: 1) Moving logical NOTs so that they are only applied to leaf nodes. For example, rewriting ![A+B]|C as !A|!B|C allows reordering such that any of the three leaves can be evaluated first. 2) Combining adjacent AND or OR operations such that each path from the root to a leaf encounters a strictly alternating pattern of AND and OR operations. So [A|B]|[C|D+[[E|F]|G] becomes A|B|C|[D+[E|F|G]]. I measured the effect while the Videos window of the default skin was open (but idle) on a Raspberry Pi, and this reduced the CPU usage by 2.8% from 41.9% to 39.1%: Before After Mean StdDev Mean StdDev Confidence Change IdleCPU% 41.9 0.5 39.1 0.9 100.0% +7.0%
OK, done. |
I notice you haven't changed the fact we have two stacks still. Are they always referencing the same objects? If so, I think we could clean that up by just getting the node type from the nodes stack anyway (we know the type either from rtti or by adding a get_type() to the baseclass) This can be done afterwards - i.e. not required for merge. jenkins build this please |
jenkins build this please |
Built fine. Let me know what you think of the two stack -> single stack suggestion. It can be changed afterwards, but I'd like to know one way or another as to whether it's worth doing so before merging. Thanks! |
Yes, that's probably a reasonable compromise - hiding the complexity inside get_type() methods to keep the tests for how to merge the subtrees simple. While calling get_type() would probably be slightly slower than just reading from a vector of enums, that's balanced against not needing to maintain that vector. And in any case, we're only talking about parse-time processing here anyway: the real speed benefit is in the subsequent evaluation of infobool expressions instead. |
Agreed - will look to clean that up next time I have a few minutes - thanks again! |
…uation Faster infobool expression evaluation
Expession infobools are evaluated at runtime from one or more single infobools
and a combination of boolean NOT, AND and OR operators. Previously, parsing
produced a vector of operands (leaf nodes) and operators in postfix
(reverse-Polish) form, and evaluated all leaf nodes every time the expression
was evaluated. But this ignores the fact that in many cases, once one operand
of an AND or OR operation has been evaluated, there is no need to evaluate the
other operand because its value can have no effect on the ultimate result. It
is also worth noting that AND and OR operations are associative, meaning they
can be rearranged at runtime to better suit the selected skin.
This patch rewrites the expression parsing and evaluation code. Now the
internal repreentation is in the form of a tree where leaf nodes represent a
single infobool, and branch nodes represent either an AND or an OR operation
on two or more child nodes.
Expressions are rewritten at parse time into a form which favours the
formation of groups of associative nodes. These groups are then reordered at
evaluation time such that nodes whose value renders the evaluation of the
remainder of the group unnecessary tend to be evaluated first (these are
true nodes for OR subexpressions, or false nodes for AND subexpressions).
The end effect is to minimise the number of leaf nodes that need to be
evaluated in order to determine the value of the expression. The runtime
adaptability has the advantage of not being customised for any particular skin.
The modifications to the expression at parse time fall into two groups:
For example, rewriting ![A+B]|C as !A|!B|C allows reordering such that
any of the three leaves can be evaluated first.
to a leaf encounters a strictly alternating pattern of AND and OR
operations. So [A|B]|[C|D+[[E|F]|G] becomes A|B|C|[D+[E|F|G]].
I measured the effect while the Videos window of the default skin was open
(but idle) on a Raspberry Pi, and this reduced the CPU usage by 2.8% from
41.9% to 39.1%: