Compiled stateful expression #491

Merged
merged 1 commit into from Apr 29, 2016

Conversation

yosiat
Contributor

@yosiat yosiat commented Apr 24, 2016

Hi,

This is one big pull request. The bottom-line changes are:

  • Performance of evaluating stateful expressions improved significantly
  • Added 11 unit tests for stateful expressions; coverage went up from 16.2% to 18.2%
  • All tests are passing: I changed all usages of tick.NewStatefulExpr to use the new implementation, and all integration tests passed.
  • There are some behaviour changes (the priority of errors has changed, etc.), but in my opinion they are not big
  • DurationNode is not supported
  • Currently, I haven't replaced the old stateful expression with the new one.

Implementation

Below is an explanation of the core algorithm. If more questions or clarifications come up, I will update this.

Basic explanation

The overall idea: instead of using a stack-based AST interpreter, compile the expressions to specialized functions.
For example, given the expression "value" > 8.0, let's make two assumptions:

  • "value" is float64
  • 8.0 is float64

The specializer will take this expression and will eventually run float64 > float64 every time, instead of doing the following on every evaluation:

  • Type checking and guessing: checking the types of the reference node and the right node
  • Walking through the whole AST
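To make the contrast concrete, here is a minimal sketch, not the code in this PR. The names Scope, evalInterpreted and specializeGreaterFloat are hypothetical and exist only for illustration. The first function checks the operand types on every call; the second returns a closure that already assumes float64 > float64 and only guards the reference type.

```go
package main

import (
	"errors"
	"fmt"
)

// Scope is a hypothetical stand-in for the tick scope: field name -> value.
type Scope map[string]interface{}

// evalInterpreted mimics the per-evaluation work of an interpreter:
// it inspects the types of both operands every time it is called.
func evalInterpreted(scope Scope, ref string, right interface{}) (bool, error) {
	left, ok := scope[ref]
	if !ok {
		return false, errors.New("reference not found: " + ref)
	}
	switch l := left.(type) {
	case float64:
		switch r := right.(type) {
		case float64:
			return l > r, nil
		case int64:
			return l > float64(r), nil
		}
	case int64:
		switch r := right.(type) {
		case int64:
			return l > r, nil
		case float64:
			return float64(l) > r, nil
		}
	}
	return false, fmt.Errorf("type mismatch: %T > %T", left, right)
}

// specializeGreaterFloat returns a function that assumes both sides are
// float64. The only remaining check is a cheap guard on the reference value.
func specializeGreaterFloat(ref string, right float64) func(Scope) (bool, error) {
	return func(scope Scope) (bool, error) {
		left, ok := scope[ref].(float64)
		if !ok {
			return false, errors.New("type guard failed: " + ref + " is not float64")
		}
		return left > right, nil
	}
}

func main() {
	scope := Scope{"value": 9.5}
	slow, _ := evalInterpreted(scope, "value", 8.0)
	fast, _ := specializeGreaterFloat("value", 8.0)(scope)
	fmt.Println(slow, fast)
}
```

The specialized closure is built once and reused for every data point, which is where the savings would come from under these assumptions.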

Deeper explanation

First, let's set up simple terminology:

  • Dynamic Node - a node whose value changes at runtime, like FunctionNode and ReferenceNode
  • Constant Node - a node whose value is constant for the whole lifetime of the tick script
  • Evaluation Function - a function that accepts three arguments: the scope, the left node and the right node (this is a simplified version)

When we get a BinaryNode, we determine whether it's dynamic or constant. Let's examine the dynamic case.

If it is a dynamic node, then in the constructor (NewStatefulExpr) we set the evaluation function to the "dynamic evaluation function"; otherwise
we fetch the matching evaluation function based on the node types and their operator.

The dynamic evaluation function performs the following steps (this is where the "specialization" happens):

  • Read the values of the left and right nodes (for example, for a reference node we access the scope and read the value)
  • Find a matching evaluation function based on the types we got and save it (in a field of the StatefulExpression struct)
  • Call EvalBool

The real meat is in EvalBool/EvalNum:

  1. If the evaluation function is nil, we have an error, one of:
    • Type mismatch: int > string
    • Not a comparison/math operator: int - int
    • Invalid operator for type: bool > bool
  2. Otherwise we have an evaluation function and call it; it returns a bool and an error.
  3. We check whether the error is our special error (ErrTypeGuardFailed), which indicates that we ran the wrong comparison function. This can happen when types change, for example when "value" started as int64 and later changed to float64.
    • If we got this error, go to dynamic evaluation to re-specialise the evaluation function.
  4. Return the results: bool and error.

It's important to note that we handle single nodes as well, for example: EvalBool(BoolNode).
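The flow above can be sketched roughly as follows. This is a simplified illustration and not the PR's implementation: GreaterThanExpr, its fields and the Specializations counter are hypothetical, the right side is fixed to a float64 constant, and a failed specialization simply returns an error instead of storing a nil function. Only ErrTypeGuardFailed and EvalBool are names taken from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTypeGuardFailed signals that the cached function was specialized
// for a type that no longer matches the incoming value.
var ErrTypeGuardFailed = errors.New("type guard failed")

// Scope is a hypothetical stand-in for the tick scope.
type Scope map[string]interface{}

type evalFn func(scope Scope) (bool, error)

// GreaterThanExpr is a hypothetical expression of the form `"<ref>" > <threshold>`.
type GreaterThanExpr struct {
	ref             string
	threshold       float64
	fn              evalFn // currently selected evaluation function
	Specializations int    // how many times dynamic evaluation ran
}

func NewGreaterThanExpr(ref string, threshold float64) *GreaterThanExpr {
	e := &GreaterThanExpr{ref: ref, threshold: threshold}
	e.fn = e.dynamic // the reference makes this a dynamic node
	return e
}

// dynamic reads the runtime type, stores the matching specialized
// function and then evaluates again through EvalBool.
func (e *GreaterThanExpr) dynamic(scope Scope) (bool, error) {
	e.Specializations++
	switch scope[e.ref].(type) {
	case float64:
		e.fn = e.greaterFloat64
	case int64:
		e.fn = e.greaterInt64
	default:
		return false, fmt.Errorf("type mismatch: %T > float64", scope[e.ref])
	}
	return e.EvalBool(scope)
}

func (e *GreaterThanExpr) greaterFloat64(scope Scope) (bool, error) {
	v, ok := scope[e.ref].(float64)
	if !ok {
		return false, ErrTypeGuardFailed
	}
	return v > e.threshold, nil
}

func (e *GreaterThanExpr) greaterInt64(scope Scope) (bool, error) {
	v, ok := scope[e.ref].(int64)
	if !ok {
		return false, ErrTypeGuardFailed
	}
	return float64(v) > e.threshold, nil
}

// EvalBool runs the cached function and falls back to dynamic
// evaluation when the type guard reports a type change.
func (e *GreaterThanExpr) EvalBool(scope Scope) (bool, error) {
	result, err := e.fn(scope)
	if err == ErrTypeGuardFailed {
		return e.dynamic(scope)
	}
	return result, err
}

func main() {
	e := NewGreaterThanExpr("value", 8)
	for _, v := range []interface{}{int64(9), int64(3), 9.5} {
		ok, err := e.EvalBool(Scope{"value": v})
		fmt.Println(ok, err, e.Specializations)
	}
}
```

In this sketch the specialization runs once for the int64 values and once more when the value switches to float64; every other call goes straight to the cached function.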

Performance

I ran the benchmarks on a MacBook Pro (13-inch, Late 2011): i5 2.4 GHz, 8 GB RAM and 128 GB SSD.
The benchmarks ran with the "--count=5" flag and were compared using benchstat.

EvalBool Benchmarks

name                                                                       old time/op    new time/op    delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                    252ns ± 2%      68ns ± 1%   -73.02%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                           540ns ± 2%      41ns ± 2%   -92.33%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                             550ns ± 3%      43ns ± 3%   -92.23%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                               539ns ± 2%      40ns ± 3%   -92.56%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                    524ns ± 3%      76ns ± 3%   -85.57%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                      526ns ± 1%      78ns ± 6%   -85.21%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4             495ns ± 3%     121ns ± 2%   -75.46%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4     534ns ± 3%      94ns ± 3%   -82.37%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4       2.98µs ± 1%    1.25µs ± 3%   -58.21%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                 503ns ± 3%     118ns ± 4%   -76.49%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4         533ns ± 1%      89ns ± 4%   -83.23%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4           3.08µs ± 4%    1.25µs ± 3%   -59.33%  (p=0.008 n=5+5)

name                                                                       old alloc/op   new alloc/op   delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                    18.0B ± 0%      8.0B ± 0%   -55.56%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                           72.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                             72.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                               72.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                    64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                      64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4             49.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4     64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4        64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                 49.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4         64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4            64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)

name                                                                       old allocs/op  new allocs/op  delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                     3.00 ± 0%      1.00 ± 0%   -66.67%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                            5.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                              5.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                                5.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                     4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                       4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4              3.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4      4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4         4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                  3.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4          4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4             4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)

AlertTask benchmarks

name                     old time/op    new time/op    delta
_T10_P500_AlertTask-4       138ms ± 5%     133ms ± 6%     ~     (p=0.421 n=5+5)
_T10_P50000_AlertTask-4     13.7s ± 6%     13.1s ± 5%     ~     (p=0.421 n=5+5)
_T1000_P500_AlertTask-4     13.7s ± 2%     13.0s ± 3%   -4.91%  (p=0.008 n=5+5)

name                     old alloc/op   new alloc/op   delta
_T10_P500_AlertTask-4      33.0MB ± 0%    32.0MB ± 0%   -2.85%  (p=0.008 n=5+5)
_T10_P50000_AlertTask-4    3.36GB ± 0%    3.26GB ± 0%   -2.86%  (p=0.008 n=5+5)
_T1000_P500_AlertTask-4    3.29GB ± 0%    3.19GB ± 0%   -2.90%  (p=0.008 n=5+5)

name                     old allocs/op  new allocs/op  delta
_T10_P500_AlertTask-4        466k ± 0%      408k ± 0%  -12.58%  (p=0.008 n=5+5)
_T10_P50000_AlertTask-4     47.5M ± 0%     41.5M ± 0%  -12.62%  (p=0.008 n=5+5)
_T1000_P500_AlertTask-4     46.1M ± 0%     40.2M ± 0%  -12.73%  (p=0.008 n=5+5)

Questions / Notes

Tests

I added more tests for the stateful expression, to make sure we cover more cases.
The coverage for the eval package is now 73.5%.
I added these tests:

  • TestStatefulExpression_EvalBool_BinaryNodeWithDurationNode
  • TestStatefulExpression_EvalNum_FunctionWithTimeValue
  • TestStatefulExpression_Eval_NotSupportedNode
  • TestStatefulExpression_Eval_NodeAndEvalTypeNotMatching
  • TestStatefulExpression_EvalBool_BinaryNodeWithBoolUnaryNode
  • TestStatefulExpression_EvalBool_BinaryNodeWithNumericUnaryNode
  • TestStatefulExpression_EvalBool_TwoLevelsDeepBinaryWithEvalNum_Int64
  • TestStatefulExpression_EvalBool_TwoLevelsDeepBinaryWithEvalNum_Float64
  • TestStatefulExpression_EvalBool_SanityCallingFunction
  • TestStatefulExpression_EvalNum_SanityCallingFunctionWithArgs
  • TestStatefulExpression_EvalBool_SanityCallingFunctionWithArgs

Important

@nathanielc / pull request reviewer, please read these very carefully and answer them!
The notes/questions are ordered by importance:

  1. I didn't test function return type changes. Is there a need to? If so, do we have a function for that, or do I need to create a new one and stub it in?
  2. DurationNode is not supported. I saw that the stateful expression did handle DurationNode, but I can't figure out where it's used: not in BinaryNode and not as a single node (e.g. EvalNum(DurationNode))
  3. In StatefulExpression we are calling "node.eval". Why? In the new one we don't call these methods and all tests are passing; are we missing tests?
  4. Creating an expression can return an error. This is new "behaviour": compiling an expression can return an error, and there is a test for it: TestStatefulExpression_Eval_NotSupportedNode. Examples:
    • passing an invalid node to compile, for example: commentnode
    • passing an invalid node inside a binarynode
  5. @nathanielc - you requested separating this into packages such as ast etc. I didn't do this in this pull request, because it would make the PR too big
  6. I can fix #490 pretty easily, do you want me to?

Nice-To-Haves

These are nice-to-haves, maybe for this pull request and maybe for another:

  • Debug logs for optimisation: add a debug log for when a type guard fails, etc.; this can be useful in performance investigations
  • Performance optimisation (not related to this PR): in mergeFieldsAndTags we put all tags and fields in the scope. I think we can traverse the node AST, get a list of the needed scope variables and fetch only them. In my opinion this can yield a great performance improvement. I will research this after this PR gets merged
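As a rough sketch of the second idea, the AST could be walked once to collect the referenced names. The node types below are simplified, hypothetical stand-ins (they are not the real tick node definitions), and collectReferences and neededVariables are illustrative names only.

```go
package main

import (
	"fmt"
	"sort"
)

// Node and the concrete types below are simplified, hypothetical
// stand-ins for the AST nodes; operators are omitted on purpose.
type Node interface{}

type ReferenceNode struct{ Name string }
type NumberNode struct{ Value float64 }
type BinaryNode struct{ Left, Right Node }
type FunctionNode struct{ Args []Node }

// collectReferences walks the tree and records every referenced name.
func collectReferences(n Node, seen map[string]struct{}) {
	switch node := n.(type) {
	case *ReferenceNode:
		seen[node.Name] = struct{}{}
	case *BinaryNode:
		collectReferences(node.Left, seen)
		collectReferences(node.Right, seen)
	case *FunctionNode:
		for _, arg := range node.Args {
			collectReferences(arg, seen)
		}
	}
}

// neededVariables returns the sorted, de-duplicated list of scope
// variables that an expression actually reads.
func neededVariables(root Node) []string {
	seen := map[string]struct{}{}
	collectReferences(root, seen)
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	// Shape of: ("value" > 8.0) AND ("value" > "limit")
	root := &BinaryNode{
		Left:  &BinaryNode{Left: &ReferenceNode{"value"}, Right: &NumberNode{8.0}},
		Right: &BinaryNode{Left: &ReferenceNode{"value"}, Right: &ReferenceNode{"limit"}},
	}
	fmt.Println(neededVariables(root))
}
```

The resulting list would be computed once per expression, so only those tags and fields would need to be copied into the scope for each point.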

Phew, I finished 👍
That was a really fun and educational experience, thanks @nathanielc for being open to changes :)

- Yosi

@yosiat changed the title from "[WIP] Compiled stateful expression" to "Compiled stateful expression" on Apr 25, 2016
@nathanielc
Contributor

nathanielc commented Apr 25, 2016

Didn't tested function return type changes - there is need to? If so, do we have function to do so? or should I need to create new one and stub it in?

No, the return type of a function is constant. In fact you could add a method to the Func interface so this is explicit and you can add functions to the constant case. https://github.com/influxdata/kapacitor/blob/master/tick/functions.go#L14

Not supported DurationNode - I saw the stateful expression did handle DurationNode, but I can't figure out where it's used - not in BinaryNode and not as single node (ex: EvalNum(DurationNode))

There are no methods or functions that use them yet, to my knowledge, but it is planned; see #169

In StatefulExpression we are calling "node.eval" - why so? in the new one we don't call this methods are all tests are passing, are we missing tests?.

Can you point to a line where node.eval is being called? Not sure what you are talking about here.

Creating expression return error - this is new "behaviour", compiling an expression can return an error, there is test for it: TestStatefulExpression_Eval_NotSupportedNode, examples:
passing invalid node to compile, example: commentnode
passing invalid node in binarynode

Seems like there are two kinds of errors:

  1. Errors that cannot be resolved. Aka passing commentnode or some other node that makes no sense and cannot be evaluated
  2. Errors that are data dependent and could be resolved if the correct data types are passed in.

For case number one the error should cause the task to fail and stop executing.
For the second case the error should be logged and the task should continue executing.
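A minimal sketch of how a caller might treat the two kinds of errors, assuming a hypothetical compile step and a hypothetical runTask loop (none of these names come from the PR): a compile error aborts the task, while an evaluation error is logged and the loop continues.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnsupportedNode is a hypothetical "cannot be resolved" error.
var ErrUnsupportedNode = errors.New("unsupported node")

// Scope is a hypothetical stand-in for the tick scope.
type Scope map[string]interface{}

// Expr is a hypothetical compiled expression that reads one bool field.
type Expr struct{ ref string }

// compile fails for node kinds that can never be evaluated.
func compile(nodeKind, ref string) (*Expr, error) {
	if nodeKind != "reference" {
		return nil, fmt.Errorf("%w: %s", ErrUnsupportedNode, nodeKind)
	}
	return &Expr{ref: ref}, nil
}

// EvalBool fails only when the data has the wrong type (data dependent).
func (e *Expr) EvalBool(scope Scope) (bool, error) {
	v, ok := scope[e.ref].(bool)
	if !ok {
		return false, fmt.Errorf("type mismatch: %q is %T, want bool", e.ref, scope[e.ref])
	}
	return v, nil
}

// runTask returns the number of matching points and the number of
// logged evaluation errors. A compile error stops the task immediately.
func runTask(nodeKind, ref string, points []Scope) (int, int, error) {
	expr, err := compile(nodeKind, ref)
	if err != nil {
		return 0, 0, err // case 1: the task fails
	}
	matched, logged := 0, 0
	for _, p := range points {
		ok, evalErr := expr.EvalBool(p)
		if evalErr != nil {
			fmt.Println("eval error (continuing):", evalErr) // case 2: log and go on
			logged++
			continue
		}
		if ok {
			matched++
		}
	}
	return matched, logged, nil
}

func main() {
	_, _, err := runTask("comment", "ok", nil)
	fmt.Println(errors.Is(err, ErrUnsupportedNode))

	points := []Scope{{"ok": true}, {"ok": "yes"}, {"ok": false}}
	fmt.Println(runTask("reference", "ok", points))
}
```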

@nathanielc - you requested to separate to packages as ast and etc, I didn't do this in this pull request, because it's too much big PR

That's fine, we can do it later or not at all if it's too much.

I can fix #490 pretty easily, do you want to?

Yes, go ahead and fix it. Just add a note to this PR that it is fixed.

Haven't looked at the code much yet but overall the process and explanations are good.

Thanks!

@yosiat
Contributor Author

yosiat commented Apr 25, 2016

  1. Are you sure about this? The return type is interface{}; what makes sure that future developers won't sometimes return int64 and other times float64?
  2. OK, so it is just confusing to have this in the stack eval code (https://github.com/influxdata/kapacitor/blob/master/tick/stateful_expr.go#L94,L95)
  3. After a second look, I got confused by the code. What is the point of this code, and when is it used? https://github.com/influxdata/kapacitor/blob/master/tick/eval.go#L93
  4. TryCompileStatefulExpression will return an error only if the expression can never be evaluated (this is the first kind of error you described), unlike a type mismatch error, which can be fixed at runtime. So I think we agree about this?
  5. This is not too much; it will be easy work, but it will make this PR even harder to CR.
  6. I will fix #490 (UnaryNode doesn't support reference node) later today; this is just making UnaryNode a dynamic node if it contains a ReferenceNode

By the way, do you want me to squash commits or something? There is a huge number of commits which are poorly written and don't describe the change very well.

@nathanielc
Contributor

Are you sure about this? the return type is interface{}, what makes sure that future developers won't return sometime int64 and others float64?

You do :) I checked the current implementations and none have variable return types. Since we are specializing the evaluation now, we can make this part of the contract. There are three raw data types we deal with now: bool, int64 and float64. So you could add CallBool, CallInt and CallFloat methods to the interface to enforce it. Then if we add more types we can easily do that later.
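One possible shape for such a contract, as a sketch only: the method names CallBool, CallInt and CallFloat come from the comment above, but the TypedFunc interface, the ReturnType method, the signatures and the absFloat example are assumptions made for illustration and are not the existing Func interface.

```go
package main

import (
	"errors"
	"fmt"
)

// ReturnType enumerates the three raw data types mentioned above.
type ReturnType int

const (
	TBool ReturnType = iota
	TInt64
	TFloat64
)

// ErrWrongReturnType is returned by the typed call methods that a
// function does not support.
var ErrWrongReturnType = errors.New("function does not return this type")

// TypedFunc is a hypothetical interface that makes the return type
// part of the contract instead of returning interface{}.
type TypedFunc interface {
	ReturnType() ReturnType
	CallBool(args ...interface{}) (bool, error)
	CallInt(args ...interface{}) (int64, error)
	CallFloat(args ...interface{}) (float64, error)
}

// absFloat is an example function that always returns float64.
type absFloat struct{}

func (absFloat) ReturnType() ReturnType { return TFloat64 }

func (absFloat) CallBool(args ...interface{}) (bool, error) {
	return false, ErrWrongReturnType
}

func (absFloat) CallInt(args ...interface{}) (int64, error) {
	return 0, ErrWrongReturnType
}

func (absFloat) CallFloat(args ...interface{}) (float64, error) {
	if len(args) != 1 {
		return 0, errors.New("abs expects exactly one argument")
	}
	v, ok := args[0].(float64)
	if !ok {
		return 0, fmt.Errorf("abs expects float64, got %T", args[0])
	}
	if v < 0 {
		return -v, nil
	}
	return v, nil
}

func main() {
	var f TypedFunc = absFloat{}
	v, err := f.CallFloat(-2.5)
	fmt.Println(v, err)
	_, err = f.CallInt(int64(1))
	fmt.Println(err)
}
```

With a constant ReturnType, a function node could be handled like a constant node when choosing the evaluation function.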

Ok, so this just confusing to put in the stack eval code (https://github.com/influxdata/kapacitor/blob/master/tick/stateful_expr.go#L94,L95)

Yes

After a second look, I got confused with the code.. so what is the point of this code? and when it's used? https://github.com/influxdata/kapacitor/blob/master/tick/eval.go#L93

That is the recursive stack-based eval function. Since not all expressions are guaranteed to be a BinaryNode, that is the entry point to evaluation. If the expression is a NumberNode, for example, it will get pushed on the stack and then popped as the final result of the expression: lambda: 1.0 is a valid expression that always evals to float64(1).
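The push-then-pop behaviour for a single node can be illustrated with a toy evaluator. This is a sketch with hypothetical, simplified node types and only two operators; it is not the code in tick/eval.go.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified, hypothetical node types.
type Node interface{}
type NumberNode struct{ Value float64 }
type BinaryNode struct {
	Op          string
	Left, Right Node
}

type stack []interface{}

func (s *stack) push(v interface{}) { *s = append(*s, v) }

func (s *stack) pop() interface{} {
	old := *s
	v := old[len(old)-1]
	*s = old[:len(old)-1]
	return v
}

// eval pushes the value of n onto the stack, recursing into children.
func eval(n Node, s *stack) error {
	switch node := n.(type) {
	case *NumberNode:
		s.push(node.Value)
	case *BinaryNode:
		if err := eval(node.Left, s); err != nil {
			return err
		}
		if err := eval(node.Right, s); err != nil {
			return err
		}
		r, rok := s.pop().(float64)
		l, lok := s.pop().(float64)
		if !lok || !rok {
			return errors.New("operands must be numbers")
		}
		switch node.Op {
		case "+":
			s.push(l + r)
		case ">":
			s.push(l > r)
		default:
			return errors.New("unsupported operator: " + node.Op)
		}
	default:
		return fmt.Errorf("unsupported node: %T", n)
	}
	return nil
}

// evalExpr is the entry point: whatever is left on the stack is the result.
func evalExpr(root Node) (interface{}, error) {
	s := &stack{}
	if err := eval(root, s); err != nil {
		return nil, err
	}
	return s.pop(), nil
}

func main() {
	single, _ := evalExpr(&NumberNode{Value: 1.0})
	cmp, _ := evalExpr(&BinaryNode{Op: ">", Left: &NumberNode{Value: 9}, Right: &NumberNode{Value: 8}})
	fmt.Println(single, cmp)
}
```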

TryCompileStatefulExpression will return an error only if the expression can never be evaluated (this is the first kind of error you described), unlike a type mismatch error, which can be fixed at runtime. So I think we agree about this?

Yes, it's fine if the compile returns an error for the first case. Can you give an example of how it can fix a type issue at runtime? Seems like there are cases when it is not possible to fix?

This is not too much, this will be an easy work but that will make this PR even harder to CR.
I will fix #490 later today, this is just making UnaryNode a dynamic node if it's contains a ReferenceNode

In that case let's focus on the PR at hand and get a fix in for this later. It will be beneficial to have a simple PR that fixes a small issue in the new code base so others can begin to see how it works.

@yosiat
Contributor Author

yosiat commented Apr 25, 2016

Just to make sure - This PR is finished and waiting for CR feedback

@nathanielc
Contributor

Just to make sure - This PR is finished and waiting for CR feedback

I have your branch checkout locally and I am looking through it....

@nathanielc
Contributor

Where this eval is called?

Here is the main entry point: https://github.com/influxdata/kapacitor/blob/master/tick/stateful_expr.go#L36

and then its called recursively for example here https://github.com/influxdata/kapacitor/blob/master/tick/stateful_expr.go#L107

@nathanielc
Contributor

nathanielc commented Apr 25, 2016

Currently, I didn't replaced the stateful expression with the new one.

Is there any case where this new code is used? I didn't see any... That is, beyond the tests and benchmarks?

@nathanielc
Contributor

nathanielc commented Apr 25, 2016

Here is a brain dump of some first thoughts after digging through the code. These are just observations or questions, not suggestions for change yet. I haven't dug in deep enough to understand it all, so I don't want to suggest any changes yet, but I do want to start a discussion.

  • The use of the terms expression vs node confused me. In my mind an expression is an AST of nodes. Yet the EvalBinaryNode uses an expression to perform the evaluation of the node. I see it now but it was confusing at first.
  • Along with the above, the caching of expressions confused me. Why can't the EvalBinaryNode keep a reference to the compiled expressionFunc itself? Why do we need the cache?
  • If we removed support for Reference Nodes changing type, how much would that simplify the code? Are ReferenceNodes the only source of dynamic types? I think the use case for a referenced value changing type is invalid and we shouldn't support it. Maybe I am wrong here but that is my first impression. And in the case that the source data has the wrong type you can always explicitly coerce it to the correct type. For example: lambda: "value" * 2.0; if value has type int, then you can do this: lambda: float("value") * 2.0.
  • Seems like the result of getConstantNodeType is deterministic based on the given node. Is it? If it is, couldn't we call it only during the compile phase and not during eval?
  • In my mind, with this change there should be a clear compile step and then a distinct evaluate step. But it seems that eval will call compile in certain conditions. Is there a reason the steps can't be separated? It would make understanding the code much easier.

Well, there's a brain dump of some of my initial thoughts. Again, no need to go change everything yet; it will take me a bit more time for all of this to sink in, and then I'll probably change my mind about half of those thoughts anyway.

@yosiat
Contributor Author

yosiat commented Apr 26, 2016

Hi,
Great questions asked here 👍 and it's important to have a conversation about this PR because it brings a big change.

Before answering your questions: I see there is some confusion about whether there are two phases in the evaluation, "compile" and "eval". The confusion might be a result of the function being named "TryCompileStatefulExpression" while there is no separate compilation step.

In compilers, generally speaking, you have two types:

  • AOT - Ahead of time
  • JIT - Just in time

In this branch, we implement the latter, a "JIT" compiler (and I am being really careful about talking about JIT/AOT compilers here). In order to implement "AOT" you must do some static analysis, and currently we don't have much power to do static analysis on dynamic nodes. Let's dissect this statement:

  • FunctionNode - in the current state its return type is interface{}, although you suggested a change, so let's ignore FunctionNode for now.
  • ReferenceNode - at stateful expression creation time we don't know the type of the referenced value; we don't know that "value" is int64 and "server_name" is string.

The only "AOT" compilation we do is with two static nodes - https://github.com/yosiat/kapacitor/blob/compiled-stateful-expression/tick/eval/specialized_binary_stateful_expr.go#L90

In conclusion, we do the compilation at evaluation time, using "evaluateDynamicNode" and "evalWithNodes", so there is no separate phase: they are fused together.
If you have any idea (and a good reason) to "unfuse" them, I am open to hearing it 👍

The use of the terms expression vs node confused me. In my mind an expression is an AST of nodes. Yet the EvalBinaryNode uses an expression to perform the evaluation of the node. I see it now but it was confusing at first.

I used "expression" in the name there because we are holding a StatefulExpression. I can rename ExpressionCache to one of:

  • StatefulExpressionCache
  • ExpressionEvaluatorCache

Along with the above the caching of expressions confused me. Why can't the EvalBinaryNode keep a reference to the compiled expressionFunc itself? Why do we need the cache?

We can't do that for this example: "value" > 12. The type of "value" might change over time, so we need the "SpecializedStatefulExpression".
Another reason is simply code clarity and safe abstractions.

If we removed support for Reference Nodes changing type, how much would that simplify the code? Are ReferenceNodes the only source of dynamic types? I think the use case for a referenced value changing type is invalid and we shouldn't support it. Maybe I am wrong here but that is my first impression. And in the case that the source data has the wrong type you can always explicitly coerce it to the correct type. For example: lambda: "value" * 2.0; if value has type int, then you can do this: lambda: float("value") * 2.0.

If you want to remove Reference Node changing types, you will need to remove the lines that handle ErrTypeGuardFailed, for example: https://github.com/yosiat/kapacitor/blob/compiled-stateful-expression/tick/eval/specialized_binary_stateful_expr.go#L144,L155

But actually you can't remove it, because you don't know the type of "value" at the "TryCompileStatefulExpression" phase. We don't have enough "static analysis" power here to determine the types of dynamic nodes beforehand.

Seems like the result of getConstantNodeType is deterministic based on the given node. Is it? If it is, couldn't we call it only during the compile phase and not during eval?

Where do we call getConstantNodeType during eval?
If you mean this section of code: https://github.com/yosiat/kapacitor/blob/compiled-stateful-expression/tick/eval/node_value_acessors.go#L233,L242

It would be a big change to move it to "compile time"; do you have an idea how to do it?

In my mind, with this change there should be a clear compile step and then a distinct evaluate step. But it seems that eval will call compile in certain conditions. Is there a reason the steps can't be separated? It would make understanding the code much easier.

I answered this above.

@nathanielc
Contributor

Things are making more sense now.

AOT vs JIT: it makes sense that we need JIT. I guess I was thinking we would cheat and have one compilation step after the first data point and then treat it as AOT from there on out. But you make a good point; let's do this right and have real JIT, as that will be less confusing in the long run. I think a simple explanation in the package doc would be helpful.

OK, at this point I am going to start making inline comments in the code. I feel I understand the design well enough to start asking more specific questions. Wave of questions incoming...

        return NodeValueAccessor(&EvalFunctionNode{Node: node})

    case *tick.UnaryNode:
        return NodeValueAccessor(&EvalUnaryNode{Node: node})
Contributor

At each of these cases you can call getConstantNodeType and store the result on the struct.

Contributor Author

By the way, this is a great proposal! We can now change Eval{X}Node to contain data like the expression cache or other info, but that is for a later change 👍

@yosiat
Contributor Author

yosiat commented Apr 26, 2016

@nathanielc OK, and to keep it short, I will flag your responses with "+1", which means I will fix it ASAP

@nathanielc
Contributor

High level question: What is the API to this package? It is not clear to me, which is a problem.

Is the API for the package essentially the StatefulExpression interface plus the TryCompileStatefulExpression method?

This comes back to my confusion between expression and node. Since StatefulExpression is used internally to the EvalBinaryNode and externally it seems like we are coupling things that do not need to be coupled.

I don't see a clear interface for evaluating the result of an AST of nodes. In some places we create a NodeValueAccessor directly from the node; in others we create a SpecializedBinaryStatefulExpression and then evaluate. Will there be a SpecializedUnaryStatefulExpression too?

For example the args to function calls are assumed to always be simple NodeValueAccessors.

I am going to write up a quick example of how I think this will work....

@yosiat
Contributor Author

yosiat commented Apr 26, 2016

The API of this package is the interface and "TryCompileStatefulExpression", which I want to rename to "NewStatefulExpr" after we remove the old code.

First, I agree, there is some confusion because we use both NodeValueAccessor and Specialized{X}StatefulExpression. I tried to think of a way to factor this out.

About the unary case, I am currently not sure how to implement it. It could be a change to SpecializedBinaryExpression to support an array of NodeValueAccessor, or maybe a change to NodeValueAccessorSpecializedExpression.

The args to function calls should be transformed into expressions.

I am waiting for your example 👍

Edit: an important thing to note is that NodeValueAccessor must stay. It might be the root cause of the confusion, but it's the key abstraction for "hiding" how we access the value of a node.

@nathanielc
Contributor

Here is my example/explanation of expression vs node

First, we have two types, StatefulExpression and NodeValueAccessor, which essentially correspond to the ideas of expression and node.

A StatefulExpression is responsible for maintaining state across different evaluations of the expression. A NodeValueAccessor is responsible for performing the actual evaluation of the tick.Node. The StatefulExpression contains NodeValueAccessors and to evaluate an expression calls the interface on the NodeValueAccessor.

Here are the proposed interfaces for each type.

// Evaluate an expression for a given scope.
type StatefulExpression interface {
    EvalFloat64(scope *tick.Scope) (float64, error)
    EvalInt64(scope *tick.Scope) (int64, error)
    EvalString(scope *tick.Scope) (string, error)
    EvalBool(scope *tick.Scope) (bool, error)
    EvalRegex(scope *tick.Scope) (*regexp.Regexp, error)
}

// Evaluate a node for a given scope and state
type NodeValueAccessor interface {
    EvalFloat64(scope *tick.Scope, state *ExecutionState) (float64, error)
    EvalInt64(scope *tick.Scope, state *ExecutionState) (int64, error)
    EvalString(scope *tick.Scope, state *ExecutionState) (string, error)
    EvalBool(scope *tick.Scope, state *ExecutionState) (bool, error)
    EvalRegex(scope *tick.Scope, state *ExecutionState) (*regexp.Regexp, error)
}

Ignore the fact for now that the StatefulExpression interface is not compatible with the current code.

The only difference between them is that a StatefulExpression maintains state, while the NodeValueAccessor is responsible for evaluating the node in the context of both scope and state. Here the state is essentially the ExecutionAux struct you already have.

The implementations of the Eval{} methods for the StatefulExpression become simple calls to nva methods but maintaining the same state across calls.

// SE keeps one ExecutionState and passes it to the NodeValueAccessor on every call.
type SE struct {
    nva   NodeValueAccessor
    state *ExecutionState
}

func (se *SE) EvalFloat64(scope *tick.Scope) (float64, error) {
    return se.nva.EvalFloat64(scope, se.state)
}

func (se *SE) EvalInt64(scope *tick.Scope) (int64, error) {
    return se.nva.EvalInt64(scope, se.state)
}

func (se *SE) EvalBool(scope *tick.Scope) (bool, error) {
    return se.nva.EvalBool(scope, se.state)
}

func (se *SE) EvalString(scope *tick.Scope) (string, error) {
    return se.nva.EvalString(scope, se.state)
}

func (se *SE) EvalRegex(scope *tick.Scope) (*regexp.Regexp, error) {
    return se.nva.EvalRegex(scope, se.state)
}

The NodeValueAccessor types for each of the nodes handle the JIT compilation and the different methods for the type. The EvalBinaryNode (NVA) will look something like the evalWithNodes method on the SpecializedBinaryStatefulExpression.

Then things start to simplify a bit, since there aren't any Specialized{}StatefulExpressions: an expression doesn't care about the type of node. It only concerns itself with maintaining state across calls.

I may have missed something critical that makes this hard to do, but let's discuss.

@yosiat
Contributor Author

yosiat commented Apr 26, 2016

@nathanielc WOW 👍 👍
I really like the proposed changes, and I was in the middle of writing up the idea I had from your inline comments.

When I moved EvalBinaryNode next to its implementation, it actually made me think: wait, now we can add more info to the EvalBinaryNode struct, like Type, and cache the constant type (as you suggested).
And after that I thought: wait, now I have a function like this:

type EvalBinaryNode struct {
    Node *tick.BinaryNode
    Type ValueType
}

func CreateEvalBinaryNode(binaryNode *tick.BinaryNode) NodeValueAccessor {
    return &EvalBinaryNode{
        Node: binaryNode,
        Type: getConstantNodeType(binaryNode),
    }
}

Why not simply return an error, for example when the node type is InvalidType, etc. (what we called a "compile error type")?
Merged with your idea, it's brilliant.

Just one question - the current StatefulExpression allows to evaluate to bool or number, do you see any reason in the future of kapacitor that you would like to return regex/string/etc?

Here are the tasks:

  • I will extract all NodeValueAccessor to be in their own package - "eval/node_value_accesors"
  • Change NodeValueAccessorStatefulExpression to be the implementation you suggested - with EvalNum and EvalBool.
  • Create an "Adapter" from {EvalNum,EvalBool} to the new interface and make sure that the tests pass for a single node
  • Merge SpecializedBinaryExpression with EvalBinaryNode

@yosiat
Contributor Author

yosiat commented Apr 26, 2016

Please give me your ok and I will start those tasks ASAP

@nathanielc
Contributor

@yosiat Looks good! I don't see a need for an eval/node_value_accesors package but if it helps keep expression vs node separate then go for it.

Just one question - the current StatefulExpression allows to evaluate to bool or number, do you see any reason in the future of kapacitor that you would like to return regex/string/etc?

InfluxDB supports four data types: float, int, bool, and string. Kapacitor should support the same types. So no on returning a regex node, but yes to the rest.

@yosiat
Contributor Author

yosiat commented Apr 26, 2016

@nathanielc I meant in Kapacitor nodes (StreamNode, FromNode, etc.): do you see any need for evaluating an expression to a string?

@nathanielc
Contributor

@yosiat Technically yes, but there isn't a lot of support for it yet (i.e. string manipulation functions). I'd add it, as it will help flesh out the rest of the interface better.

@yosiat
Contributor Author

yosiat commented Apr 26, 2016

@nathanielc I just found out that the NodeValueAccessor interface is missing a function that returns a generic value.

It's important for the "dynamic evaluation", so we can call it and detect the type if it's a dynamic node.

@yosiat
Contributor Author

yosiat commented Apr 26, 2016

@nathanielc By the way, why are you passing a reference to ExecutionState rather than passing it by value?
Instead of:

EvalFloat64(scope *tick.Scope, state *ExecutionState) (float64, error)

Do:

EvalFloat64(scope *tick.Scope, state ExecutionState) (float64, error)

Is there any special reason?
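
One practical difference between the two signatures, as a minimal sketch: a by-value parameter receives a copy, so changes to value-typed fields are lost when the call returns. The `ExecutionState` fields and the helper functions below are hypothetical; the real struct may hold maps or pointers, in which case a copy would still share them.

```go
package main

import "fmt"

// ExecutionState is a hypothetical stand-in for the struct discussed above.
type ExecutionState struct {
	Calls int // value-typed field: copied when the struct is passed by value
}

// evalByPointer mutates the caller's state through the pointer.
func evalByPointer(state *ExecutionState) {
	state.Calls++
}

// evalByValue mutates only its own local copy of the state.
func evalByValue(state ExecutionState) {
	state.Calls++
}

func main() {
	s := ExecutionState{}
	evalByPointer(&s)
	evalByValue(s)
	// Only the increment made through the pointer is visible here.
	fmt.Println(s.Calls) // prints 1
}
```

If the evaluation functions never need to replace or resize anything inside the state, either signature would behave the same.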

@yosiat
Contributor Author

yosiat commented Apr 26, 2016

@nathanielc quick status update:

  • I added support for UnaryNode{Reference Node} when the operator is TokenNot (otherwise it is a math operator, and only at runtime do we know the type for sure: int64 or float64)
  • Extracted NodeValueAccessor to multiple files; I hope it's clearer.
  • Renamed ExecutionAux to ExecutionState
  • Renamed NodeValueAccessor to "Eval{X}" as you suggested
  • Changed TypeGuardFailed to be a struct that contains RequestType and ActualType; in the future I will add ErrorMessage. This change makes sure that we can return the error immediately, for a cleaner implementation of StatefulExpression.
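
A rough sketch of what a struct-based type guard error could look like. Only the names TypeGuardFailed, RequestType and ActualType come from the list above; the `ValueType` enum, its constants and the `evalFloat64` helper are assumptions made for illustration and are not the code in this PR.

```go
package main

import "fmt"

// ValueType is a hypothetical enum of the types an expression can produce.
type ValueType int

const (
	TFloat64 ValueType = iota
	TInt64
	TBool
)

// TypeGuardFailed carries the requested and the actual type, so the caller
// can return it immediately or use it to pick another evaluation function.
type TypeGuardFailed struct {
	RequestType ValueType
	ActualType  ValueType
}

// Error makes TypeGuardFailed usable as a regular Go error.
func (e TypeGuardFailed) Error() string {
	return fmt.Sprintf("type guard failed: requested %d, actual %d", e.RequestType, e.ActualType)
}

// evalFloat64 (hypothetical) returns v when it is a float64 and a
// TypeGuardFailed error describing the mismatch otherwise.
func evalFloat64(v interface{}) (float64, error) {
	switch x := v.(type) {
	case float64:
		return x, nil
	case int64:
		return 0, TypeGuardFailed{RequestType: TFloat64, ActualType: TInt64}
	case bool:
		return 0, TypeGuardFailed{RequestType: TFloat64, ActualType: TBool}
	default:
		return 0, fmt.Errorf("unsupported value type %T", v)
	}
}

func main() {
	_, err := evalFloat64(int64(3))
	if guard, ok := err.(TypeGuardFailed); ok {
		fmt.Println(guard.ActualType == TInt64) // prints true
	}
}
```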

Now I am starting on -
Change NodeValueAccessorStatefulExpression to be the implementation you suggested - with EvalNum and EvalBool.

@nathanielc
Contributor

why you are passing reference to ExecutionState and not value reference?

No reason...

Sounds good. Let me know when you want me to take a second look.

@yosiat
Contributor Author

yosiat commented Apr 29, 2016

@nathanielc I started to convert everything to use the new stateful expression, but one of the tests is failing:

--- FAIL: TestStream_AlertSigma (0.00s)
    streamer_test.go:3707: digraph TestStream_AlertSigma {
        stream0 -> from1;
        from1 -> eval2;
        eval2 -> alert3;
        }
    streamer_test.go:3212: unexpected alert data for request: 1 unexpected series values: i: 0
        exp [[1971-01-01 00:00:07 +0000 UTC 2.469916402324427 16]]
        got [[1971-01-01 00:00:07 +0000 UTC 2.5566063655482916 16]]
    streamer_test.go:3212: unexpected alert data for request: 2 unexpected series values: i: 0
        exp [[1971-01-01 00:00:08 +0000 UTC 0.3053477916297622 93.4]]
        got [[1971-01-01 00:00:08 +0000 UTC 0.31474529935965045 93.4]]
FAIL
FAIL    github.com/influxdata/kapacitor/integrations    0.426s

Do you have any idea? (do you want me to push the commit for it?)

@yosiat yosiat force-pushed the compiled-stateful-expression branch from 29eedc0 to db583b8 Compare April 29, 2016 17:33
@nathanielc
Contributor

nathanielc commented Apr 29, 2016

@yosiat Sigma is a stateful function. If its state is somehow getting corrupted, then that would fail the test. That is my first guess anyway.

@yosiat
Contributor Author

yosiat commented Apr 29, 2016

@nathanielc It looks like the first value - 97.1 is passed twice, I am looking into it.

@yosiat
Contributor Author

yosiat commented Apr 29, 2016

@nathanielc I found the reason for the error - EvalNum calls both EvalInt and EvalFloat if the result is float..

This will be an easy fix. I am fixing it (and adding a test, of course) and pushing a commit.

@yosiat
Contributor Author

yosiat commented Apr 29, 2016

@nathanielc The problem is in here:
https://github.com/yosiat/kapacitor/blob/compiled-stateful-expression/tick/stateful/stateful_expr.go#L63,L78

Since we are evaluating twice, we are destroying the state, so I am changing it to use GetType.
But I am currently thinking about how we can handle TNumeric; it's a bit difficult.

Sorry, we won't have TNumeric.
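
To illustrate the bug described above, here is a minimal sketch, not the code from this PR. The `statefulNode` type, `evalNumBuggy`, `evalNumFixed` and `returnType` are hypothetical; `returnType` stands in for the GetType call mentioned above, whose real signature is not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// statefulNode is a hypothetical node whose evaluation has a side effect,
// similar in spirit to a stateful function such as sigma.
type statefulNode struct{ calls int }

// eval advances the internal state and returns a float64 result.
func (n *statefulNode) eval() interface{} {
	n.calls++
	return float64(n.calls)
}

// evalInt64 evaluates the node and fails when the result is not an int64.
func (n *statefulNode) evalInt64() (int64, error) {
	v, ok := n.eval().(int64)
	if !ok {
		return 0, errors.New("result is not an int64")
	}
	return v, nil
}

// evalFloat64 evaluates the node and fails when the result is not a float64.
func (n *statefulNode) evalFloat64() (float64, error) {
	v, ok := n.eval().(float64)
	if !ok {
		return 0, errors.New("result is not a float64")
	}
	return v, nil
}

// returnType reports the result type without evaluating the node.
func (n *statefulNode) returnType() string { return "float64" }

// evalNumBuggy tries int64 first and falls back to float64, so a float
// result costs two evaluations and advances the state twice.
func evalNumBuggy(n *statefulNode) (float64, error) {
	if v, err := n.evalInt64(); err == nil {
		return float64(v), nil
	}
	return n.evalFloat64()
}

// evalNumFixed checks the type first and evaluates exactly once.
func evalNumFixed(n *statefulNode) (float64, error) {
	if n.returnType() == "int64" {
		v, err := n.evalInt64()
		return float64(v), err
	}
	return n.evalFloat64()
}

func main() {
	buggy, fixed := &statefulNode{}, &statefulNode{}
	evalNumBuggy(buggy)
	evalNumFixed(fixed)
	fmt.Println(buggy.calls, fixed.calls) // prints 2 1
}
```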

@yosiat
Contributor Author

yosiat commented Apr 29, 2016

@nathanielc do you know why this is failing?
It just says -

FAIL    github.com/influxdata/kapacitor/tick [build failed]

On my computer "./build.py --build" finishes successfully.

@nathanielc
Contributor

@yosiat I am seeing this locally after running ./build.py --test

tick/stateful_expr_test.go:20: undefined: tick.NewStatefulExpr
tick/stateful_expr_test.go:47: undefined: tick.NewStatefulExpr
tick/stateful_expr_test.go:71: undefined: tick.NewStatefulExpr
tick/stateful_expr_test.go:83: undefined: tick.NewStatefulExpr
tick/stateful_expr_test.go:100: undefined: tick.NewStatefulExpr
tick/stateful_expr_test.go:119: undefined: tick.NewStatefulExpr
tick/stateful_expr_test.go:142: undefined: tick.NewStatefulExpr
tick/stateful_expr_test.go:666: undefined: tick.NewStatefulExpr
tick/stateful_expr_test.go:730: undefined: tick.NewStatefulExpr
tick/stateful_expr_test.go:768: undefined: tick.NewStatefulExpr
tick/stateful_expr_test.go:768: too many errors

@yosiat yosiat force-pushed the compiled-stateful-expression branch from f79d097 to e8656f8 Compare April 29, 2016 19:23
@yosiat
Contributor Author

yosiat commented Apr 29, 2016

@nathanielc Fixed it! Why isn't it written in the CircleCI output?

@nathanielc
Contributor

Not sure, maybe stderr got discarded or something.

Nathaniel Cook
Kapacitor Lead
https://influxdata.com

@yosiat
Contributor Author

yosiat commented Apr 29, 2016

@nathanielc OK, once the tests pass on CircleCI, I think we are ready to merge it.

By the way, here are the final benchmarks.
The stateful expression benchmarks against master don't show UnaryNode/TwoLevelDeepBinary performance because those benchmarks weren't included in the original one.

Alert (compared to master)

name                     old time/op    new time/op    delta
_T10_P500_AlertTask-4       138ms ± 5%     132ms ± 4%     ~     (p=0.151 n=5+5)
_T10_P50000_AlertTask-4     13.7s ± 6%     13.2s ± 6%     ~     (p=0.548 n=5+5)
_T1000_P500_AlertTask-4     13.7s ± 2%     12.7s ± 2%   -6.98%  (p=0.008 n=5+5)

name                     old alloc/op   new alloc/op   delta
_T10_P500_AlertTask-4      33.0MB ± 0%    32.2MB ± 0%   -2.42%  (p=0.008 n=5+5)
_T10_P50000_AlertTask-4    3.36GB ± 0%    3.26GB ± 0%   -2.86%  (p=0.008 n=5+5)
_T1000_P500_AlertTask-4    3.29GB ± 0%    3.21GB ± 0%   -2.48%  (p=0.008 n=5+5)

name                     old allocs/op  new allocs/op  delta
_T10_P500_AlertTask-4        466k ± 0%      408k ± 0%  -12.57%  (p=0.008 n=5+5)
_T10_P50000_AlertTask-4     47.5M ± 0%     41.5M ± 0%  -12.62%  (p=0.008 n=5+5)
_T1000_P500_AlertTask-4     46.1M ± 0%     40.2M ± 0%  -12.72%  (p=0.008 n=5+5)

Stateful expression benchmarks (compared to master):

name                                                                       old time/op    new time/op    delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                    252ns ± 2%      16ns ± 3%   -93.53%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                           540ns ± 2%      40ns ± 2%   -92.58%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                             550ns ± 3%      40ns ± 3%   -92.68%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                               539ns ± 2%      39ns ± 4%   -92.77%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                    524ns ± 3%      74ns ± 5%   -85.93%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                      526ns ± 1%      76ns ± 3%   -85.51%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4             495ns ± 3%     116ns ± 4%   -76.48%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4     534ns ± 3%      90ns ± 4%   -83.11%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4       2.98µs ± 1%    1.24µs ± 3%   -58.44%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                 503ns ± 3%     119ns ± 7%   -76.25%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4         533ns ± 1%      87ns ± 3%   -83.69%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4           3.08µs ± 4%    1.26µs ± 3%   -59.22%  (p=0.008 n=5+5)

name                                                                       old alloc/op   new alloc/op   delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                    18.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                           72.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                             72.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                               72.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                    64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                      64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4             49.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4     64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4        64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                 49.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4         64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4            64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)

name                                                                       old allocs/op  new allocs/op  delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                     3.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                            5.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                              5.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                                5.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                     4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                       4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4              3.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4      4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4         4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                  3.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4          4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4             4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)

Stateful expression benchmarks (compared to the time of submitting the PR):

name                                                                       old time/op    new time/op    delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                   68.1ns ± 1%    16.3ns ± 3%   -76.01%          (p=0.008 n=5+5)
_EvalBool_OneOperator_UnaryNode_ReferenceNode-4                               105ns ± 3%      52ns ± 2%   -50.15%          (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                          41.4ns ± 2%    40.1ns ± 2%    -3.24%          (p=0.032 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                            42.7ns ± 3%    40.2ns ± 3%    -5.81%          (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                              40.1ns ± 3%    39.0ns ± 4%      ~             (p=0.103 n=5+5)
_EvalBool_OneOperator_UnaryNode-4                                             136ns ± 3%      81ns ± 3%   -40.50%          (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                   75.6ns ± 3%    73.7ns ± 5%      ~             (p=0.238 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                     77.7ns ± 6%    76.1ns ± 3%      ~             (p=0.310 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4             121ns ± 2%     116ns ± 4%    -4.12%          (p=0.032 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4    94.1ns ± 3%    90.1ns ± 4%    -4.21%          (p=0.032 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4       1.25µs ± 3%    1.24µs ± 3%      ~             (p=1.000 n=5+5)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                 118ns ± 4%     119ns ± 7%      ~             (p=1.000 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4        89.4ns ± 4%    86.9ns ± 3%      ~             (p=0.206 n=5+5)
_EvalBool_TwoLevelDeep-4                                                      344ns ± 3%     176ns ± 5%   -49.01%          (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4           1.25µs ± 3%    1.26µs ± 3%      ~             (p=0.500 n=5+5)

name                                                                       old alloc/op   new alloc/op   delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                    8.00B ± 0%    0.00B ±NaN%  -100.00%          (p=0.008 n=5+5)
_EvalBool_OneOperator_UnaryNode_ReferenceNode-4                               8.00B ± 0%    0.00B ±NaN%  -100.00%          (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                          0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                            0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                              0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_UnaryNode-4                                             8.00B ± 0%    0.00B ±NaN%  -100.00%          (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                   0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                     0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4            0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4    0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4       0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4        0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_TwoLevelDeep-4                                                     0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4           0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)

name                                                                       old allocs/op  new allocs/op  delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                     1.00 ± 0%     0.00 ±NaN%  -100.00%          (p=0.008 n=5+5)
_EvalBool_OneOperator_UnaryNode_ReferenceNode-4                                1.00 ± 0%     0.00 ±NaN%  -100.00%          (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                           0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                             0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                               0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_UnaryNode-4                                              1.00 ± 0%     0.00 ±NaN%  -100.00%          (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                    0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                      0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4             0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4     0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4        0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                 0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4         0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_TwoLevelDeep-4                                                      0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4            0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)


// StatefulExpression is interface that describe expression with state and
// it's evaluation.
type StatefulExpression interface {
Contributor

Since we named the package stateful we should change this to Expression.

Contributor Author

do we want to rename the file as well? so it will be "tick/stateful/expr.go" ?

Contributor

Yes, let's rename the files to match. Thanks

@nathanielc
Contributor

@yosiat LGTM 👍 Just read through everything again. Just needs a CHANGELOG entry and to fix that one comment.

@yosiat
Contributor Author

yosiat commented Apr 29, 2016

@nathanielc Fixed the comment,

Is this changelog entry OK?

  • Compiled stateful expression #491: BREAKING: Rewriting stateful expression in order to improve performance. The only breaking change is short-circuit evaluation for booleans. For example, in lambda: "bool_value" && (count() > 100), if "bool_value" is false, we won't evaluate "count".
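
The short-circuit behaviour in that entry can be sketched as follows. This is an illustration only: `evalFn` and `and` are hypothetical names, and the closures stand in for the "bool_value" reference and the count() call.

```go
package main

import "fmt"

// evalFn is a hypothetical evaluation closure that yields a bool.
type evalFn func() bool

// and evaluates the right side only when the left side is true.
func and(left, right evalFn) bool {
	if !left() {
		return false // short circuit: right is never called
	}
	return right()
}

func main() {
	calls := 0
	boolValue := func() bool { return false }
	// countOver100 mimics a stateful function: each call changes its state.
	countOver100 := func() bool {
		calls++
		return calls > 100
	}

	result := and(boolValue, countOver100)
	fmt.Println(result, calls) // prints false 0
}
```

The point for users is that a stateful function on the right side of && no longer advances on points where the left side is false.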

By the way, aren't we at 0.14? and 0.13 is released?

@nathanielc
Contributor

@yosiat Changelog entry looks good.

0.13 is not yet released, so this will make it in.

@yosiat yosiat force-pushed the compiled-stateful-expression branch from d646d0e to 107aa66 Compare April 29, 2016 20:57
Creating specialized stateful expression in order to improve performance
and code clarity.
Specialized stateful expression eliminates all heap allocations and most
of the interface{} conversions.

The idea behind "specialized stateful expression" is to implement a (very,
very simple) "JIT compiler" for stateful expressions.

For example: given this expression ``"value" > 8.0``, at runtime we
will try to find the type of "value". Once we find the type (for this
example, let's say it's float64), we will use comparison functions which
accept float64 on both the right and left side.

Continuing with our example, let's say that at runtime, after evaluating
multiple times, "value" changed its type to int64. If that happens, we
will have a "type guard" that will raise an error and say this is not
"float64 > float64", it's "int64 > float64", and we will adjust the
evaluation function.
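
The type guard idea described above could be sketched roughly like this. It is an illustration under assumptions, not the implementation in this commit: all names are hypothetical, and values are passed as interface{} for brevity, so it does not show the allocation savings.

```go
package main

import (
	"errors"
	"fmt"
)

// errTypeGuard signals that the cached function no longer fits the operand types.
var errTypeGuard = errors.New("type guard failed")

// compareFn is a "left > right" specialized for one pair of operand types.
type compareFn func(left, right interface{}) (bool, error)

func floatGtFloat(l, r interface{}) (bool, error) {
	lf, lok := l.(float64)
	rf, rok := r.(float64)
	if !lok || !rok {
		return false, errTypeGuard
	}
	return lf > rf, nil
}

func intGtFloat(l, r interface{}) (bool, error) {
	li, lok := l.(int64)
	rf, rok := r.(float64)
	if !lok || !rok {
		return false, errTypeGuard
	}
	return float64(li) > rf, nil
}

// pick selects the specialized function for the runtime operand types.
func pick(l, r interface{}) (compareFn, error) {
	if _, ok := r.(float64); ok {
		switch l.(type) {
		case float64:
			return floatGtFloat, nil
		case int64:
			return intGtFloat, nil
		}
	}
	return nil, fmt.Errorf("no comparison for %T > %T", l, r)
}

// greaterThan caches the specialized function between evaluations.
type greaterThan struct {
	fn              compareFn
	specializations int
}

// eval runs the cached function and re-specializes when the guard fails.
func (g *greaterThan) eval(l, r interface{}) (bool, error) {
	if g.fn != nil {
		if res, err := g.fn(l, r); err != errTypeGuard {
			return res, err
		}
	}
	fn, err := pick(l, r)
	if err != nil {
		return false, err
	}
	g.fn = fn
	g.specializations++
	return fn(l, r)
}

func main() {
	g := &greaterThan{}
	a, _ := g.eval(9.5, 8.0)      // specializes to float64 > float64
	b, _ := g.eval(int64(7), 8.0) // guard fails, re-specializes to int64 > float64
	fmt.Println(a, b, g.specializations) // prints true false 2
}
```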

There are more changes; these are the bottom line:
* Tests - lots of tests added; at the time of writing this commit we have
77.8% coverage for the "stateful" package.
* Compile time errors - simple compile time errors that will stop the
tasks from starting if there are simple mistakes.

Finally, benchmarks:
* Compared against the interpreted one - "NewStatefulExpr"
* Ran with count=5 on Macbook Pro 13" Late 2011 (i5, 8GB RAM, 120GB SSD)

name                                                                       old time/op    new time/op    delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                    252ns ± 2%      16ns ± 3%   -93.53%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                           540ns ± 2%      40ns ± 2%   -92.58%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                             550ns ± 3%      40ns ± 3%   -92.68%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                               539ns ± 2%      39ns ± 4%   -92.77%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                    524ns ± 3%      74ns ± 5%   -85.93%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                      526ns ± 1%      76ns ± 3%   -85.51%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4             495ns ± 3%     116ns ± 4%   -76.48%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4     534ns ± 3%      90ns ± 4%   -83.11%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4       2.98µs ± 1%    1.24µs ± 3%   -58.44%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                 503ns ± 3%     119ns ± 7%   -76.25%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4         533ns ± 1%      87ns ± 3%   -83.69%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4           3.08µs ± 4%    1.26µs ± 3%   -59.22%  (p=0.008 n=5+5)

name                                                                       old alloc/op   new alloc/op   delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                    18.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                           72.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                             72.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                               72.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                    64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                      64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4             49.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4     64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4        64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                 49.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4         64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4            64.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.008 n=5+5)

name                                                                       old allocs/op  new allocs/op  delta
_EvalBool_OneOperator_UnaryNode_BoolNode-4                                     3.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberFloat64-4                            5.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberFloat64_NumberInt64-4                              5.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_NumberInt64_NumberInt64-4                                5.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberFloat64-4                     4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_NumberInt64-4                       4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeFloat64_ReferenceNodeFloat64-4              3.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeFloat64_NumberFloat64-4      4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeFloat64_NumberFloat64-4         4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperator_ReferenceNodeInt64_ReferenceNodeInt64-4                  3.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorWith11ScopeItem_ReferenceNodeInt64_NumberInt64-4          4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)
_EvalBool_OneOperatorValueChanges_ReferenceNodeInt64_NumberInt64-4             4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.008 n=5+5)

@yosiat yosiat force-pushed the compiled-stateful-expression branch from 107aa66 to 7e536e9 Compare April 29, 2016 20:57
@yosiat
Contributor Author

yosiat commented Apr 29, 2016

@nathanielc I am confused with the versions because I am running the nightly version =X

Finished with changes & squashed.

@nathanielc
Contributor

@yosiat Wonderful, merging on green...

@nathanielc nathanielc merged commit 1072121 into influxdata:master Apr 29, 2016
@toddboom
Contributor

@nathanielc @yosiat this is awesome - thanks for all the hard work!

@yosiat
Contributor Author

yosiat commented Apr 29, 2016

Yay 🎉

@yosiat yosiat deleted the compiled-stateful-expression branch April 29, 2016 21:39
Successfully merging this pull request may close these issues.

UnaryNode doesn't support reference node
3 participants