Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deeply nested IF statements take very long to parse #1015

Closed
tomsellek opened this issue Apr 18, 2023 · 5 comments
Closed

Deeply nested IF statements take very long to parse #1015

tomsellek opened this issue Apr 18, 2023 · 5 comments

Comments

@tomsellek
Copy link

For example, calling extract on the following formula takes more than three minutes on my machine. Using this formula in formulon.io shows similar behavior.

    IF(INCLUDES( SomeField__c  ,'S1'),'Negative',
      IF(INCLUDES(SomeField__c,'S2'),'Positive',
        IF(INCLUDES(SomeField__c,'S3'),'Negative',
          IF(INCLUDES(SomeField__c,'S4'),'Neutral',
            IF(INCLUDES(SomeField__c,'S5'),'Negative',
              IF(INCLUDES(SomeField__c,'S6'),'Neutral',
                IF(INCLUDES(SomeField__c,'S7'),'Neutral',
                  IF(INCLUDES(SomeField__c,'S8'),'Positive',
                    IF(INCLUDES(SomeField__c,'S9'),'Negative',
                      IF(INCLUDES(SomeField__c,'S10'),'Positive',""))))))))))
@leifg
Copy link
Owner

leifg commented Apr 19, 2023

That's odd. Thank you for reporting I could confirm.

I'll have a look on the weekend.

@leifg
Copy link
Owner

leifg commented Apr 23, 2023

This is a tricky one. The problem is with parsing the formula and building the AST.

I was trying to build a trace for this formula via the Peggy trace option and aborted the run after 6h and a 500GB trace file.

I used an simpler version of the formula and built a trace which resulted in 20k lines:

const formulon = require('../lib/formulon.js');

formulon.ast("IF(INCLUDES(SomeField__c  ,'S1'),'Negative',\"\")")

See trace

Any help to optimizing this would be greatly appreciated.

@tomsellek
Copy link
Author

tomsellek commented Apr 24, 2023

I messed around a bit with the tracing format and got something like:

rule.enter@1 start
rule.enter@1  PrimaryExpression
rule.enter@1   LogicalConcatinationExpression
rule.enter@1    LogicalCompareExpression
rule.enter@1     ArithmeticExpression
... [about 4K lines] ...
rule.match@1     ArithmeticExpression: 'IF(INCLUDES(SomeField__c  ,'S1'),'Negative', '')' => {"type":"callExpression","id":"if","arguments":[{"type":"callExpression","id":"includes","arguments":[{"type":"identifier","name":"SomeField__c"},{"type":"literal","value":"S1","dataType":"text","options":{"length":2}}]},{"type":"literal","value":"Negative","dataType":"text","options":{"length":8}},{"type":"literal","value":"","dataType":"text","options":{"length":0}}]}
... [about 4K lines] ...
rule.match@1     ArithmeticExpression: 'IF(INCLUDES(SomeField__c  ,'S1'),'Negative', '')' => {"type":"callExpression","id":"if","arguments":[{"type":"callExpression","id":"includes","arguments":[{"type":"identifier","name":"SomeField__c"},{"type":"literal","value":"S1","dataType":"text","options":{"length":2}}]},{"type":"literal","value":"Negative","dataType":"text","options":{"length":8}},{"type":"literal","value":"","dataType":"text","options":{"length":0}}]}
rule.match@1    LogicalCompareExpression: 'IF(INCLUDES(SomeField__c  ,'S1'),'Negative', '')' => {"type":"callExpression","id":"if","arguments":[{"type":"callExpression","id":"includes","arguments":[{"type":"identifier","name":"SomeField__c"},{"type":"literal","value":"S1","dataType":"text","options":{"length":2}}]},{"type":"literal","value":"Negative","dataType":"text","options":{"length":8}},{"type":"literal","value":"","dataType":"text","options":{"length":0}}]}
... [about 4K lines] ...
rule.match@1     ArithmeticExpression: 'IF(INCLUDES(SomeField__c  ,'S1'),'Negative', '')' => {"type":"callExpression","id":"if","arguments":[{"type":"callExpression","id":"includes","arguments":[{"type":"identifier","name":"SomeField__c"},{"type":"literal","value":"S1","dataType":"text","options":{"length":2}}]},{"type":"literal","value":"Negative","dataType":"text","options":{"length":8}},{"type":"literal","value":"","dataType":"text","options":{"length":0}}]}
... [about 4K lines] ...
rule.match@1     ArithmeticExpression: 'IF(INCLUDES(SomeField__c  ,'S1'),'Negative', '')' => {"type":"callExpression","id":"if","arguments":[{"type":"callExpression","id":"includes","arguments":[{"type":"identifier","name":"SomeField__c"},{"type":"literal","value":"S1","dataType":"text","options":{"length":2}}]},{"type":"literal","value":"Negative","dataType":"text","options":{"length":8}},{"type":"literal","value":"","dataType":"text","options":{"length":0}}]}
rule.match@1    LogicalCompareExpression: 'IF(INCLUDES(SomeField__c  ,'S1'),'Negative', '')' => {"type":"callExpression","id":"if","arguments":[{"type":"callExpression","id":"includes","arguments":[{"type":"identifier","name":"SomeField__c"},{"type":"literal","value":"S1","dataType":"text","options":{"length":2}}]},{"type":"literal","value":"Negative","dataType":"text","options":{"length":8}},{"type":"literal","value":"","dataType":"text","options":{"length":0}}]}
rule.match@1   LogicalConcatinationExpression: 'IF(INCLUDES(SomeField__c  ,'S1'),'Negative', '')' => {"type":"callExpression","id":"if","arguments":[{"type":"callExpression","id":"includes","arguments":[{"type":"identifier","name":"SomeField__c"},{"type":"literal","value":"S1","dataType":"text","options":{"length":2}}]},{"type":"literal","value":"Negative","dataType":"text","options":{"length":8}},{"type":"literal","value":"","dataType":"text","options":{"length":0}}]}
rule.match@1  PrimaryExpression: 'IF(INCLUDES(SomeField__c  ,'S1'),'Negative', '')' => {"type":"callExpression","id":"if","arguments":[{"type":"callExpression","id":"includes","arguments":[{"type":"identifier","name":"SomeField__c"},{"type":"literal","value":"S1","dataType":"text","options":{"length":2}}]},{"type":"literal","value":"Negative","dataType":"text","options":{"length":8}},{"type":"literal","value":"","dataType":"text","options":{"length":0}}]}
rule.match@1 start: 'IF(INCLUDES(SomeField__c  ,'S1'),'Negative', '')' => {"type":"callExpression","id":"if","arguments":[{"type":"callExpression","id":"includes","arguments":[{"type":"identifier","name":"SomeField__c"},{"type":"literal","value":"S1","dataType":"text","options":{"length":2}}]},{"type":"literal","value":"Negative","dataType":"text","options":{"length":8}},{"type":"literal","value":"","dataType":"text","options":{"length":0}}]}

I think it tries to match with the first alternative of LogicalConcatinationExpression, gets a match on the 'head' part (which is a LogicalCompareExpression, which see below) but fails to find an operator, so it tries to match against the second alternative, which is also a LogicalCompareExpression.
LogicalCompareExpression itself has two alternatives, and it tries to match against the first one first, successfully matches against the head part (which is an ArithmeticExpression) but fails to find an operator so it tries to match against the second alternative, which is also an ArithmeticExpression.
And every successful match against ArithmeticExpression requires a re-parse of everything inside the parentheses, which explains the exponential effect we're seeing in nested calls.

When I changed PrimaryExpression to be

PrimaryExpression
  = ArithmeticExpression
  / LogicalConcatinationExpression
  / UnaryExpression
  / CallExpression
  / Identifier
  / Literal

This specific expression parsed in ~1K trace lines. But of course this isn't a general solution.

@leifg
Copy link
Owner

leifg commented May 22, 2023

Seems like I got a fix with #1040. Essentially what I did was instead of recursively parsing logical concatenation (&& and ||) I just treat them similarly to arithmetic terms and map every element to a term and then resolve the precedence.

This way the CallExpression is quicker and the call stack decreases significantly.

Can you test on branch fix-parser?

@leifg leifg closed this as completed in 49e0127 May 24, 2023
@github-actions
Copy link

🎉 This issue has been resolved in version 6.25.1 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants