New math parser #956

adamchalmers · 2023-10-31T17:41:18Z

Performance

The math parser benchmark now runs in only 3.37% of its previous time (30x speedup). Here's the criterion data from parsing math.kcl:

// Main
time:   [54.612 ms 54.651 ms 54.699 ms]   
// This PR
time:   [1.8411 ms 1.8420 ms 1.8430 ms]
change: [-96.630% -96.626% -96.622%] (p = 0.00 < 0.05)

Algorithmic differences

The previous algorithm received as input a vector of KCL syntax tokens. This meant the previous math parser had to identify which tokens were operands and which were operators. For example, the input "a[0] + (b[0] + c)" has to be parsed to figure out that a[0] is an operand, and to convert the flat vec of tokens into a nested structure (by understanding the parentheses around b[0] + c).

This algorithm is a little different. It receives a vec of operators and operands -- not tokens. The previous parser step in Winnow already parsed the KCL syntax tokens into operators and operands, so there's no need for the math parser to redo this work. The Winnow parser also started building a tree of binary expressions, and it understands nested brackets.

So, my new parser has a lot less work to do (because it reuses previous Winnow steps). My parser really just needs to understand precedence and associativity, and convert the flat vec of operator/operands into a binary expression tree. Because some of these operands are already binary expressions, this is kind of combining expression trees into one big tree.

As a result, I use a simplified version of the "shunting yard" algorithm for converting infix to postfix. This algorithm doesn't have to worry about open/close parentheses (those parentheses have already been converted into expression trees by previous Winnow parsing step). This applies precedence and associativity.

Then I run the evaluate(rpn) function to convert the postfix tokens into a binary expression tree. You just read operands/operators off the postfix list. You push operands into a stack, and if you get an operator, you pop its left and right subtree from the stack, make a new tree, and push that tree onto the stack.

Also, my algorithm iterates over a stack rather than using recursion, which explains at least some performance improvement (though not much compared to reusing the winnow operand/operator parsing). It should reduce the risk of stack overflows.

vercel · 2023-10-31T17:41:23Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated (UTC)
modeling-app	✅ Ready (Inspect)	Visit Preview	Oct 31, 2023 7:04pm

atlantis-for-zoo · 2023-10-31T18:12:08Z

Error: This repo is not allowlisted for Atlantis.

adamchalmers · 2023-10-31T18:15:56Z

src/wasm-lib/kcl/Cargo.toml

@@ -50,6 +50,9 @@ engine = []
 panic = "abort"
 debug = true

+[profile.bench]
+debug = true


Flamegraphs of the benchmarks need this for accurately knowing the names of functions

adamchalmers · 2023-10-31T18:16:15Z

src/wasm-lib/kcl/src/ast/types.rs

@@ -2175,6 +2175,7 @@ impl BinaryExpression {
            BinaryOperator::Mul => (left * right).into(),
            BinaryOperator::Div => (left / right).into(),
            BinaryOperator::Mod => (left % right).into(),
+            BinaryOperator::Pow => (left.powf(right)).into(),


Added solely so I could test the new math parser with examples from wikipedia

Irev-Dev

Looks great, just had one question about some commented stuff, just wanted to make sure they weren't forgotten about?

src/wasm-lib/kcl/src/parser/parser_impl.rs

paultag

Looks great, this is a definite improvement

paultag · 2023-11-02T14:02:30Z

src/wasm-lib/kcl/src/ast/types.rs

@@ -2257,13 +2258,46 @@ pub enum BinaryOperator {
    #[serde(rename = "%")]
    #[display("%")]
    Mod,
+    /// Raise a number to a power.
+    #[serde(rename = "^")]


Pow being ^ does make sense from a math point of view (specifically those coming from LaTeX, etc), but Python, Rust, C, Go, etc use ** for exponentiation. We may want to document this pretty heavily and/or come out with a "We don't do bitwise maths" which may help avoid confusion since XOR wouldn't ever be present.

Yeah, I think given our audience (mech eng) and that we never manipulate network etc, we're probably not going to support bitwise operations.

paultag · 2023-11-02T14:05:28Z

src/wasm-lib/kcl/src/ast/types.rs

+    pub fn associativity(&self) -> Associativity {
+        match self {
+            Self::Add | Self::Sub | Self::Mul | Self::Div | Self::Mod => Associativity::Left,
+            Self::Pow => Associativity::Right,


Nice. I've seen a lot of naive parsers that get X^Y^Z wrong.

paultag · 2023-11-02T14:08:43Z

src/wasm-lib/kcl/src/parser/math.rs

+                }
+                operator_stack.push(o1);
+            }
+            o @ BinaryExpressionToken::Operand(_) => output.push(o),


TIL about the match @ operator, this is neat.

adamchalmers force-pushed the achalmers/new-math-parser branch from 4b1c941 to 991a28f Compare October 31, 2023 17:45

vercel bot deployed to Preview October 31, 2023 17:56 View deployment

New math parser

9a62240

adamchalmers force-pushed the achalmers/new-math-parser branch 2 times, most recently from 37c9c48 to fb347b2 Compare October 31, 2023 18:10

Remove old parser

d2bef98

adamchalmers force-pushed the achalmers/new-math-parser branch from fb347b2 to d2bef98 Compare October 31, 2023 18:11

adamchalmers marked this pull request as ready for review October 31, 2023 18:12

adamchalmers requested review from jessfraz, Irev-Dev and paultag October 31, 2023 18:12

jessfraz approved these changes Oct 31, 2023

View reviewed changes

vercel bot deployed to Preview October 31, 2023 18:20 View deployment

Comments

4e78b16

adamchalmers commented Oct 31, 2023

View reviewed changes

Move tests into parser_impl, remove dead code

f7ee75d

Irev-Dev reviewed Oct 31, 2023

View reviewed changes

src/wasm-lib/kcl/src/parser/parser_impl.rs Outdated Show resolved Hide resolved

vercel bot deployed to Preview October 31, 2023 18:53 View deployment

Backport some math tests

ecf453c

adamchalmers force-pushed the achalmers/new-math-parser branch from 9b4b079 to ecf453c Compare October 31, 2023 18:56

Irev-Dev approved these changes Oct 31, 2023

View reviewed changes

vercel bot deployed to Preview October 31, 2023 19:04 View deployment

adamchalmers merged commit 023c3cb into main Oct 31, 2023
14 checks passed

adamchalmers deleted the achalmers/new-math-parser branch October 31, 2023 19:16

paultag reviewed Nov 2, 2023

View reviewed changes

Irev-Dev mentioned this pull request Sep 17, 2024

Should we bump power precedence? #3900

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New math parser #956

New math parser #956

adamchalmers commented Oct 31, 2023 •

edited

Loading

vercel bot commented Oct 31, 2023 •

edited

Loading

atlantis-for-zoo bot commented Oct 31, 2023

adamchalmers Oct 31, 2023

adamchalmers Oct 31, 2023

Irev-Dev left a comment

paultag left a comment

paultag Nov 2, 2023

adamchalmers Nov 2, 2023

paultag Nov 2, 2023

paultag Nov 2, 2023

New math parser #956

New math parser #956

Conversation

adamchalmers commented Oct 31, 2023 • edited Loading

Performance

Algorithmic differences

vercel bot commented Oct 31, 2023 • edited Loading

atlantis-for-zoo bot commented Oct 31, 2023

adamchalmers Oct 31, 2023

Choose a reason for hiding this comment

adamchalmers Oct 31, 2023

Choose a reason for hiding this comment

Irev-Dev left a comment

Choose a reason for hiding this comment

paultag left a comment

Choose a reason for hiding this comment

paultag Nov 2, 2023

Choose a reason for hiding this comment

adamchalmers Nov 2, 2023

Choose a reason for hiding this comment

paultag Nov 2, 2023

Choose a reason for hiding this comment

paultag Nov 2, 2023

Choose a reason for hiding this comment

adamchalmers commented Oct 31, 2023 •

edited

Loading

vercel bot commented Oct 31, 2023 •

edited

Loading