
[GLUTEN-12413][VL] Fix Generate pre-project to use Alias in project list and AttributeRe…#12414

Open
jiangjiangtian wants to merge 2 commits into apache:main from jiangjiangtian:generate

@jiangjiangtian jiangjiangtian commented Jul 1, 2026

Contributor

Fix #12413.
After the fix, the plan is:

== Physical Plan ==
VeloxColumnarToRow
+- ^(6) BroadcastHashJoinExecTransformer [emp_id#19], [emp_id#27], Inner, BuildRight, false
   :- ^(6) HashAggregateTransformer(keys=[emp_id#19, emp_name#20], functions=[count(1)], isStreamingAgg=false)
   :  +- ^(6) InputIteratorTransformer[emp_id#19, emp_name#20, count#42L]
   :     +- ColumnarExchange hashpartitioning(emp_id#19, emp_name#20, 2000), ENSURE_REQUIREMENTS, [emp_id#19, emp_name#20, count#42L], [plan_id=2005], [shuffle_writer_type=hash], [OUTPUT] List(emp_id:IntegerType, emp_name:StringType, count:LongType)
   :        +- VeloxResizeBatches 1024, 2147483647, 10485760
   :           +- ^(2) ProjectExecTransformer [hash(emp_id#19, emp_name#20, 42) AS hash_partition_key#70, emp_id#19, emp_name#20, count#42L]
   :              +- ^(2) FlushableHashAggregateTransformer(keys=[emp_id#19, emp_name#20], functions=[partial_count(1)], isStreamingAgg=false)
   :                 +- ^(2) ProjectExecTransformer [emp_id#19, emp_name#20]
   :                    +- ^(2) HashAggregateTransformer(keys=[emp_id#19, emp_name#20, dept_names#21, sale#22], functions=[], isStreamingAgg=false)
   :                       +- ^(2) InputIteratorTransformer[emp_id#19, emp_name#20, dept_names#21, sale#22]
   :                          +- ColumnarExchange hashpartitioning(emp_id#19, emp_name#20, dept_names#21, sale#22, 2000), ENSURE_REQUIREMENTS, [emp_id#19, emp_name#20, dept_names#21, sale#22], [plan_id=1996], [shuffle_writer_type=hash], [OUTPUT] List(emp_id:IntegerType, emp_name:StringType, dept_names:StringType, sale:IntegerType)
   :                             +- VeloxResizeBatches 1024, 2147483647, 10485760
   :                                +- ^(1) ProjectExecTransformer [hash(emp_id#19, emp_name#20, dept_names#21, sale#22, 42) AS hash_partition_key#69, emp_id#19, emp_name#20, dept_names#21, sale#22]
   :                                   +- ^(1) FlushableHashAggregateTransformer(keys=[emp_id#19, emp_name#20, dept_names#21, sale#22], functions=[], isStreamingAgg=false)
   :                                      +- ^(1) ProjectExecTransformer [emp_id#19, emp_name#20, dept_names#21, sale#22]
   :                                         +- ^(1) GenerateExecTransformer explode(_pre_0#45), [emp_id#19, emp_name#20, dept_names#21, sale#22], false, [dept_name#36]
   :                                            +- ^(1) ProjectExecTransformer [emp_id#19, emp_name#20, dept_names#21, sale#22, split(dept_names#21, ,, -1) AS _pre_0#45]
   :                                               +- ^(1) InputIteratorTransformer[emp_id#19, emp_name#20, dept_names#21, sale#22]
   :                                                  +- RowToVeloxColumnar
   :                                                     +- LocalTableScan [emp_id#19, emp_name#20, dept_names#21, sale#22]
   +- ^(6) InputIteratorTransformer[emp_id#27, sum(sale)#40L]
      +- ColumnarBroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [plan_id=2116]
         +- ^(5) HashAggregateTransformer(keys=[emp_id#27], functions=[sum(sale#30)], isStreamingAgg=false)
            +- ^(5) InputIteratorTransformer[emp_id#27, sum#44L]
               +- ColumnarExchange hashpartitioning(emp_id#27, 2000), ENSURE_REQUIREMENTS, [emp_id#27, sum#44L], [plan_id=2097], [shuffle_writer_type=hash], [OUTPUT] List(emp_id:IntegerType, sum:LongType)
                  +- VeloxResizeBatches 1024, 2147483647, 10485760
                     +- ^(4) ProjectExecTransformer [hash(emp_id#27, 42) AS hash_partition_key#72, emp_id#27, sum#44L]
                        +- ^(4) FlushableHashAggregateTransformer(keys=[emp_id#27], functions=[partial_sum(sale#30)], isStreamingAgg=false)
                           +- ^(4) ProjectExecTransformer [emp_id#27, sale#30]
                              +- ^(4) HashAggregateTransformer(keys=[emp_id#27, emp_name#28, dept_names#29, sale#30], functions=[], isStreamingAgg=false)
                                 +- ^(4) InputIteratorTransformer[emp_id#27, emp_name#28, dept_names#29, sale#30]
                                    +- ReusedExchange [emp_id#27, emp_name#28, dept_names#29, sale#30], ColumnarExchange hashpartitioning(emp_id#19, emp_name#20, dept_names#21, sale#22, 2000), ENSURE_REQUIREMENTS, [emp_id#19, emp_name#20, dept_names#21, sale#22], [plan_id=1996], [shuffle_writer_type=hash], [OUTPUT] List(emp_id:IntegerType, emp_name:StringType, dept_names:StringType, sale:IntegerType)

The plan now contains a ReusedExchange, as expected.
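To illustrate why moving the Alias matters for reuse, here is a toy Python model. It is not Spark or Gluten code. It assumes that plan canonicalization renumbers only an Alias at the top level of a node's expression list, plus references to input attributes, and leaves an Alias nested inside another expression untouched. The "before" shape (Alias inside the generator) is inferred from the issue title, and the expression ID 53 for the second branch is hypothetical; the other IDs are taken from the plan above.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Attr:
    name: str
    expr_id: int


@dataclass(frozen=True)
class Alias:
    child: object
    name: str
    expr_id: int


@dataclass(frozen=True)
class Split:
    child: object
    sep: str


@dataclass(frozen=True)
class Explode:
    child: object


def normalize_refs(expr, input_ids):
    """Rewrite input attribute references to ordinals; keep nested Alias IDs."""
    if isinstance(expr, Attr):
        if expr.expr_id in input_ids:
            return Attr("", input_ids.index(expr.expr_id))
        return expr
    if isinstance(expr, Alias):
        # Assumption: a nested Alias keeps its arbitrary ID.
        return Alias(normalize_refs(expr.child, input_ids), expr.name, expr.expr_id)
    if isinstance(expr, Split):
        return Split(normalize_refs(expr.child, input_ids), expr.sep)
    if isinstance(expr, Explode):
        return Explode(normalize_refs(expr.child, input_ids))
    raise TypeError(f"unsupported expression: {expr!r}")


def canonicalize(exprs, input_ids):
    """Canonical form of one node: top-level Alias IDs are renumbered 0, 1, ..."""
    next_id = count()
    out = []
    for e in exprs:
        if isinstance(e, Alias):
            out.append(Alias(normalize_refs(e.child, input_ids), "", next(next_id)))
        else:
            out.append(normalize_refs(e, input_ids))
    return tuple(out)


in_a = [19, 20, 21, 22]  # first branch input attributes
in_b = [27, 28, 29, 30]  # second branch input attributes

# Assumed "before" shape: the Alias sits inside the generator.
before_a = canonicalize(
    [Explode(Alias(Split(Attr("dept_names", 21), ","), "_pre_0", 45))], in_a)
before_b = canonicalize(
    [Explode(Alias(Split(Attr("dept_names", 29), ","), "_pre_0", 53))], in_b)

# "After" shape: Alias in the project list, plain attribute in the generator.
project_a = canonicalize([Alias(Split(Attr("dept_names", 21), ","), "_pre_0", 45)], in_a)
project_b = canonicalize([Alias(Split(Attr("dept_names", 29), ","), "_pre_0", 53)], in_b)
generate_a = canonicalize([Explode(Attr("_pre_0", 45))], in_a + [45])
generate_b = canonicalize([Explode(Attr("_pre_0", 53))], in_b + [53])

print(before_a == before_b)                                  # False
print(project_a == project_b and generate_a == generate_b)  # True
```

Under these assumptions, the two branches only compare equal once the Alias is a top-level project expression, which is consistent with the ReusedExchange appearing in the plan above.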

Side effect: after this PR, a GenerateExec whose generator is offloadable will no longer fall back, because the child of the generator is an Alias, and validation of an Alias always succeeds.

github-actions bot added the VELOX label Jul 1, 2026
Development

Successfully merging this pull request may close these issues.

[VL] Exchange reuse not applied due to unnormalized Alias in GenerateExecTransformer's generator
