Implementing a syntaxicTree for version 2.0 of ExpressionEvaluator #58

codingseb · 2020-07-08T14:11:26Z

I create here an issue to follow and document (first for myself but for who is interested) the evolution of the work to version 2.0

The goal is to deeply refactor ExpressionEvaluator to internally build a syntaxic tree with different kinds of tokens.
The evaluation will be executed in 2 phases.

Parse the expression string and build the syntaxic tree.
Execute the syntaxic tree.

This will allow a lot of stuffs that are not possible with the current "on the fly" way of evaluating things.

It should allow to :

Cache expressions to speed up evaluations when executed multiple times.
Managing better the evaluation flow.
- To correct ConditionalAnd with variables #56 (Conditional and and Conditional always execute their right part) without breaking operators precedence
- To correct Sum of decimal fail #65 (Problem to select the good method when multiple method with same name defined that take lambda as arguments)
Validate more things before executing and prevent partial executions
Make better self explained exceptions.

Risks and drawdowns

I'm not sure yet that all existing functionalities can work with this new way of doing things.
Will take time
A lot of tests exists to ensure that what is working now will still working after but there are not exaustive. So there is a high risk of breaking things. More tests are needed.
As the structure will change, the way that some of the existing functionalities works will change. It means that the corresponding documentation will need to be rewrite.

I will also decide how to split things between version 1 and 2.

Fork, Branch or continue in the same repo and same release flow
How to manage documentation differences.
Split code in multiple files, keep it in one for easy copy or make a precompile merge or something like this.

TheoVC · 2021-07-21T01:57:40Z

I've been making a little investigation and found that this library make something "like" the version 2 of this library should make.
https://github.com/zzzprojects/System.Linq.Dynamic.Core
It has a parselamba method wich parse a string into an AST abtract syntactic tree. Using linq.Expressions. so you can compile at the end and get an amazing fast lambda wich make whatever you want.
This library can manage a Context (because can be strongly typed). however it doesn't have variables functionality, does not support ?. or ?? operators... and has a lot of other database functionality that is innecesary.
We should study it...
Tomorrow I'm gonna post a little demo that I did.

TheoVC · 2021-07-21T14:41:35Z

This is the context

public class Person
{
    public string Name { get; set; }
    public decimal Value { get; set; }
}

those the usings

using System.Linq.Dynamic.Core;
using System.Linq.Expressions;
using System.Diagnostics;

and this is the main test

List<Person> MyList = new List<Person>();

for(int i= 0; i< 100000; i++)
{
    MyList.Add(new Person { Name = "Theo" + i, Value = i + 0.11m });
}

var testFunc = Test3();

Stopwatch Reloj = new Stopwatch();

Reloj.Start();

foreach (var p in MyList)
{
    var rtn = testFunc(p);
    if(rtn) Console.WriteLine(rtn + p.Name);
}

Reloj.Stop();

Console.WriteLine(Reloj.ElapsedMilliseconds + " milisecs");

public static Func<Person, decimal> Test2()
{
    var arbol = DynamicExpressionParser.ParseLambda<Person, decimal>(new ParsingConfig() { }, false, "Value + 3.3m");
    return arbol.Compile();
}
public static Func<Person, bool> Test3()
{
    var tree = DynamicExpressionParser.ParseLambda<Person, bool>(new ParsingConfig() { }, false, "Name.Trim() == \"Theo45\"");
    // at this point you have the expression tree (in Linq.Expressions) in tree and walk trought it.
    // at the end we compile
    return tree.Compile();
}

In my machine, I get 3 miliseconds for the 100,000 iterations.
making something similar with codingSeb it took 24000 miliseconds

codingseb · 2021-07-23T20:50:30Z

@TheoVC Thanks for your your investigation and your tests. Yes I think syntaxic tree or compilation are a big speed enhencement for multiple execution. Speed is a big drawdown of EE.
I will investigate a bit on how it is done. The issue I see for now, is that a lot of current functionalities available in current version of EE seems to be hard to implement with these kind of systems. On the fly evaluations events, syntax customization and script evaluations for example. I need to investigate more to find solutions to implement them.

codingseb · 2021-08-25T07:10:25Z

I investigated a bit in the wonderful world of System.Linq.Expressions.
What I see for now and reflect on

The gains in performance are enormous.
Some recurring EE Pitfalls are easier to manage or handle automatically (Method override selection, && and || right operand evaluation versus syntax exception)
Everything need to be typed and so a lot of casting need to be done compared the duck typing of the current version of EE.
So variables types must be defined at parsing time and can not change for each evaluation.
On the fly events equivalent should occurs at parsing time to define the type of the value to return and define a delegate to call at evaluation time.
A lot of options and syntax customization will applied at parsing time.
After compilation a lot of stuffs are rigid.
async await asked in [suggestion] Implement async / await #76 are not supported in System.Linq.Expressions for now (refs : stackoverflow, Microsoft)

TheoVC · 2021-08-25T18:57:40Z

For investigation purposes... I have done a recursive descent parser based on the Richard Weeks series.
https://www.youtube.com/watch?v=_hYV5N_NAgk&list=PLHLYG7mk_iQloafxJ55Uqxc2n7m4NnNO7&index=12
and extended it to implement more complicated c# syntax. don't know if it would be usefull to you. I can send it to you if you wish.
as you said everything have to be strong typed, even variables, and of course the context.
so the constructor could receive the type of the context...

var evaluator = new Evaluator(typeof( PERSON ));

of maybe

var evaluator = new Evaluator<PERSONA>();

the type could be null if you don't want a context... and just use the variables feature

another feature that would be wonderfull is that the library not just evaluates an expression, but also produce an string explaining to the user what’s happening inside the expression.

Something like this:

For expression:

200.0 + variable1 + myFunction( variable2 ) + ( variable3 ? 100 : variable 4 )

produce an explanation like this:

200.0 + /* variable1 */ 200 + /*myFunction( [variable2] 5 )*/ 20 + ( /* variable3 */ true ? 100 : variable 4 )

if you remove the coments you can have something like this;

200.0 + 200 + 20 + ( true ? 100 : variable 4 )

So this string can be displayed by the application and easily the IT guy or even the user himself can deduce what it’s going wrong? or simply to know why ??? you get that result.

We already achieve this using codingseb but has a lot of manual parsing and it’s slow…
I did a lot of research and there's NO library that can do it.

PhenX · 2021-08-30T20:12:59Z

Hello, if that's of any help, what do you think about this library ? https://github.com/davideicardi/DynamicExpresso
It looks like a good starting point.

codingseb · 2021-08-31T09:37:28Z

Hello, if that's of any help, what do you think about this library ? https://github.com/davideicardi/DynamicExpresso
It looks like a good starting point.

Yes it's a really good lib.

I created a new repo where I test some partial rewriting of EE with a ExpressionTree compilation.
It work already for small operation without all variables, methods and properties access and on the fly stuffs :
CSXPression.

I made a lot of copy paste modify operations from EE so it's not really a Lexer/ASTNode but I made a parser part that build a tokens tree from which I can get an expression tree and then compile to Func<object> delegate.

codingseb · 2021-08-31T09:46:01Z

I am just a bit stuck on how to adapt the variables and on the fly events (that were for me the big cool stuff of EE with syntax customization) without doing just the same as these already existing libraries.

codingseb · 2021-08-31T09:55:15Z

For investigation purposes... I have done a recursive descent parser based on the Richard Weeks series.
https://www.youtube.com/watch?v=_hYV5N_NAgk&list=PLHLYG7mk_iQloafxJ55Uqxc2n7m4NnNO7&index=12

Also very interesting stuff here thank for sharing this serie. I am watching it. And I am asking myself about the perfs of Lexer/ASTNodes char by char parsing versus my big regex way of parsing.

lofcz · 2021-09-01T10:36:36Z

@codingseb one of the cleanest architectures of hand written parser is here - https://github.com/scriban/scriban
I would possibly use one of LL(k) parser generators not only for perf but also for ease of maintenance.
Currently I can't use EE due to perf so I put together a js-c# like language that transpiles to lua scripts and I execute them with Moonsharp (heavy clr usage makes up for slower vm compared to other lua implementations for c#).

Anyway for v2 our main goal should be performance as I think we can live without cool customization when such implementaion can't be then used in real world applications. Hence a good solution would imho be able to generate bytecode for either native c# or custom vm which would then interpret this simpler language on runtime. Lua vm has only ~50 bytecode opcodes.

Of almost the same importance is imho ability to properly sandbox the script evaluation as untrusted code will be interpreted in many use cases. This should include recursion depth limit, total instructions to interpret limit, no access to clr classes unless explicitly provided.

Last thing I would love to see is native async / await support. This is becoming more and more important with each new version of c#.

codingseb added enhancement explanation labels Jul 8, 2020

codingseb added this to the 2.0.0.0 milestone Jul 8, 2020

codingseb self-assigned this Jul 8, 2020

codingseb mentioned this issue Jul 8, 2020

ConditionalAnd with variables #56

Closed

codingseb mentioned this issue Aug 27, 2020

Sum of decimal fail #65

Closed

codingseb mentioned this issue Aug 25, 2021

&& || operator not short-circuit as C# did #114

Open

Alisis33 mentioned this issue Jun 19, 2022

Syntax error in Functions #140

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implementing a syntaxicTree for version 2.0 of ExpressionEvaluator #58

Implementing a syntaxicTree for version 2.0 of ExpressionEvaluator #58

codingseb commented Jul 8, 2020 •

edited

Loading

TheoVC commented Jul 21, 2021 •

edited

Loading

TheoVC commented Jul 21, 2021

codingseb commented Jul 23, 2021 •

edited

Loading

codingseb commented Aug 25, 2021

TheoVC commented Aug 25, 2021 •

edited

Loading

PhenX commented Aug 30, 2021

codingseb commented Aug 31, 2021 •

edited

Loading

codingseb commented Aug 31, 2021 •

edited

Loading

codingseb commented Aug 31, 2021

lofcz commented Sep 1, 2021 •

edited

Loading

Implementing a syntaxicTree for version 2.0 of ExpressionEvaluator #58

Implementing a syntaxicTree for version 2.0 of ExpressionEvaluator #58

Comments

codingseb commented Jul 8, 2020 • edited Loading

TheoVC commented Jul 21, 2021 • edited Loading

TheoVC commented Jul 21, 2021

codingseb commented Jul 23, 2021 • edited Loading

codingseb commented Aug 25, 2021

TheoVC commented Aug 25, 2021 • edited Loading

PhenX commented Aug 30, 2021

codingseb commented Aug 31, 2021 • edited Loading

codingseb commented Aug 31, 2021 • edited Loading

codingseb commented Aug 31, 2021

lofcz commented Sep 1, 2021 • edited Loading

codingseb commented Jul 8, 2020 •

edited

Loading

TheoVC commented Jul 21, 2021 •

edited

Loading

codingseb commented Jul 23, 2021 •

edited

Loading

TheoVC commented Aug 25, 2021 •

edited

Loading

codingseb commented Aug 31, 2021 •

edited

Loading

codingseb commented Aug 31, 2021 •

edited

Loading

lofcz commented Sep 1, 2021 •

edited

Loading