Skip to content

Conversation

@cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

This is a followup to #53658. Instead of recursively calling the entire actualExprTransformer to optimize new Not expressions created by De Morgan's Laws, this PR extracts a simplifyNot method that only handles Not simplification and calls itself recursively.

Why are the changes needed?

This is more efficient because:

  • We only apply the Not-specific simplifications (not the And/Or factor elimination logic)
  • We avoid the overhead of tree pattern matching and pruning checks
  • The recursion is more direct and focused

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests.

Was this patch authored or co-authored using generative AI tooling?

Yes.

Made with Cursor

…plification

This is a followup to apache#53658. Instead of recursively calling the entire `actualExprTransformer` to optimize new Not expressions created by De Morgan's Laws, this PR extracts a `simplifyNot` method that only handles Not simplification and calls itself recursively.

This is more efficient because:
- We only apply the Not-specific simplifications (not the And/Or factor elimination logic)
- We avoid the overhead of tree pattern matching and pruning checks
- The recursion is more direct and focused

Co-authored-by: Cursor <cursoragent@cursor.com>
@github-actions
Copy link

github-actions bot commented Feb 3, 2026

JIRA Issue Information

=== Improvement SPARK-54881 ===
Summary: BooleanSimplification rule using transformExpressionsUp instead of transformExpressionsDown, is inefficient in some cases resulting in delayed idempotency
Assignee: Asif
Status: Resolved
Affected: ["4.0.2","3.5.8","4.1.2"]


This comment was automatically generated by GitHub Actions

@github-actions github-actions bot added the SQL label Feb 3, 2026
@cloud-fan
Copy link
Contributor Author

cc @ahshahid @peter-toth

case _ => not
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have 2 comments:

  1. Have found recursive calls to be less efficient, as compared to getting the recursive behaviour via call on an instance, where the instance itself is changing at every step.. like it is happening in transformDown
  2. Any future modification , say adding new case etc, would require analysis whether it should go in both the places ( original transform down, and new simplifyNot function, or only in once place... atleast this is what I feel.. though I may be wrong.
    But if that identification would be straightforward and any hypothetical new case added would be needed only in one of the functions, then this distinction is good. Though recursive call on method will be a concern for complex expressions....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously it was O(n^2): actualExprTransformer is called in transformExpressionsUpWithPruning, and itself also calls transformDownWithPruning. We repeatedly transform the sub-expression tree.

I don't think recursion is a concern, as these transform APIs are all recursive under the hood.

@ahshahid
Copy link
Contributor

ahshahid commented Feb 3, 2026 via email

@cloud-fan
Copy link
Contributor Author

@ahshahid checking Not pattern also traverse the tree.

@ahshahid
Copy link
Contributor

ahshahid commented Feb 3, 2026 via email

Copy link
Contributor

@peter-toth peter-toth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very similar what I suggested to amend the original PR: ahshahid#1, especially at this point: ahshahid@ab29d7b.

I think due to @ahshahid provided ruleId to the Not handling transformDownWithPruning(), unlikely it is ever O(n^2) in practice.
But I like the clean approach of this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants