Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-37467][SQL] Consolidate whole stage and non whole stage subexpression elimination #34727

Closed

Conversation

Kimahriman
Copy link
Contributor

What changes were proposed in this pull request?

This PR consolidates the code paths for subexpression elimination in whole stage and non-whole stage codegen. Whole stage codegen seemed to be mostly a superset of the non-whole stage subexpression elimination, just with whole stage not using the codegen context to track subexpressions. Since subexpression values are replaced with empty blocks when evaluated, the context should be able to track the subexpressions across multiple operators. Not sure if there's corner cases I'm missing though.

It shouldn't result in any functionality changes, but there are slight differences in the generated code as a result of this:

  • Subexpressions in whole stage always use mutable state for results instead of inlining results to support code splitting in non-whole stage
  • Non-whole stage now supports the same inlining subexpressions if small enough as whole stage codegen
  • Subexpressions are tracked across multiple physical operators in whole stage. They are still only calculated in each operator, but if you happen to have an expression in a later operator that was a subexpression in a previous operator, it will be used in the later operator.

Why are the changes needed?

Currently, there are different code paths to handle subexpression elimination in whole stage and non-whole stage codegen. This makes it harder to add new capabilities to subexpression elimination having to deal with independent code paths.

Does this PR introduce any user-facing change?

No, just slight changes in generated code.

How was this patch tested?

Existing unit tests.

@github-actions github-actions bot added the SQL label Nov 27, 2021
@Kimahriman
Copy link
Contributor Author

Kimahriman commented Nov 27, 2021

@viirya. I've been playing around with this and I haven't thought of any breaking cases, but curious if there's anything you can think of or problems with this approach. Mostly trying to consolidate things before playing around with subexpression elimination inside of lambda functions.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch 3 times, most recently from 233de65 to 8fe5f0b Compare December 24, 2021 15:44
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from 8fe5f0b to c245738 Compare January 28, 2022 20:56
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from c245738 to 7d754a0 Compare April 25, 2022 00:49
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from 7d754a0 to 51d8657 Compare June 5, 2022 17:48
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from 51d8657 to eec318f Compare September 23, 2022 23:58
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from eec318f to a9e5151 Compare October 18, 2022 12:02
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from a9e5151 to adaa8d1 Compare November 6, 2022 13:33
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from adaa8d1 to 70b8441 Compare January 1, 2023 14:40
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from 70b8441 to e35f5bc Compare February 4, 2023 13:37
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from e35f5bc to 16d897d Compare March 3, 2023 12:54
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from 16d897d to 7560c20 Compare March 30, 2023 11:45
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from 7560c20 to 0c803ed Compare June 21, 2023 11:20
@Kimahriman Kimahriman force-pushed the consolidate-subexpr-elimination branch from 0c803ed to e6a113a Compare August 13, 2023 12:58
@Kimahriman Kimahriman closed this Oct 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants