Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sub expression simplification #38

Closed
2 tasks done
RelativeForce opened this issue Jun 2, 2020 · 0 comments · Fixed by #47
Closed
2 tasks done

Sub expression simplification #38

RelativeForce opened this issue Jun 2, 2020 · 0 comments · Fixed by #47
Assignees
Labels
enhancement New feature or request

Comments

@RelativeForce
Copy link
Collaborator

RelativeForce commented Jun 2, 2020

Details:

Hours Spent: 16

Background

Optimal performance is required. Can be developed in tandem with #30

Overview

Add the concept of a complex expression. When an expression has repeated subexpressions, instead of evaluating the value of that subexpression repeatedly, substitute an ExpressionVariable for that subexpression that can be evaluated along the side the overall expression.

Create a subexpression map that holds the expression that the variable is masking and the expression that it is used in ExpressionLet VariableName Expression Expression
which means let x = a in b. Where a is the subexpression and b is the parent expression.

Prefix all sub-expression variables with $ and the Expression.Simplifier module should be used before joining the common subexpressions.

Example1:
sin(sin(x)) + sin(x)
becomes
let $v1 = sin(x) in let $v2 = sin($v1) in let $v3 = $v2+$v1 in $v3

Example 2:
sin(sin (sin(x))) + sin (sin(x)) + sin(x)
becomes
let $v1 = sin(x) in let $v2 = sin($v1) in let $v3 = sin($v2) in let $v4 = $v3+$v2 in let $v5 = $v4+$v1 in $v5

Algorithm:

  1. Construct a set of sub-expressions (OR map from sub-expressions to occurrence count)
  2. Assign numbers to all these sub-expressions, ie convert the set to indexToSubExpressionMap :: Map Int Expression
  3. Invert this mapping, ie create subExpressionToIndexMap :: Map Expression Int
  4. Substitute in indexToSubExpressionMap :: Map Int Expression top-level sub-terms with variables, giving indexToSubstitutedExpressionMap :: Map Int Expression, eg:
    BinOp Add expr1 expr2 -> BinOp Add (Var "$v10") (Var "$v22")
    BinOp Add (Var "x") expr2 -> BinOp Add (Var "x") (Var "$v22")
  5. Order the variables by dependency, creating sortedIndexAndSubstitutedExpressions :: Array (Tuple Int Expression) in multiple passes, starting with indexToSubstitutedExpressionMap, removing expressions that only contain known variables until the map is empty

Acceptance Criteria

  • Add joinCommonSubExpressions function that will convert the example expressions into their simplified versions
  • This function is covered by unit tests

Closing Notes

Overview of changes can be found in PR #47

The algorithm implemented is as follows.

  1. Split the expression into splitSubExpressions
    Subexpressions are unary and binary operations. This means that the expression ExpressionLiteral 40 has no sub-expressions and is not a subexpression itself.
  2. Create the subExpressionToVariableMap
    In this map the subexpression as the key and the variable as the value. This map is biopic meaning that both the key and value in each entry are unique in the map.
  3. Substitute the variables into each of the sub-expressions
    Each subexpression is either a unary or a binary operation. This means that they are expression trees that are only at maximum one deep. In this step, each entry in the map is treated over and the nested operation(s) in the unary and binary operations are looked up in the subExpressionToVariableMap. If they are found they are replaced with their respective variable.
  4. Order the entries by their interdependencies
    This works by recursively removing variables that are not dependent on any other sub-expressions in the subExpressionToVariableMap. This results in an array of the entries in the order that they should appear in the nested ExpressionLet chain.
  5. Rebuild the expression as an ExpressionLet chain
    This simply folds the chain of sub-expressions into a single expression tree. If there are no sub-expressions in the subExpressionToVariable array then the original expression is used.

There are some small differences between what the examples expect and what is produced by this algorithm. The only real difference is that there is no subexpression variable used for the top-most subexpression.

  • sin(sin(x)) + sin(x) becomes
    let $v1 = sin(x) in let $v2 = sin($v1) in $v2+$v1
  • sin(sin(sin(x))) + sin(sin(x)) + sin(x) becomes
    let $v1 = sin(x) in let $v2 = sin($v1) in let $v3 = sin($v2) in let $v4 = $v3+$v2 in $v4+$v1

I did have to update the Expression.Simplifier as now that the definition for Expression has 5 data constructors doing nested pattern matching causes a warning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant