[GLUTEN-11550][UT] Fix PlanStability test suites for Velox backend #11799

Merged
baibaichen merged 1 commit into apache:main from baibaichen:feature/plan
Mar 20, 2026

baibaichen commented Mar 20, 2026

What changes were proposed in this pull request?

#11512 introduced PlanStability test suites for Spark 4.0 and 4.1 by extending Spark's original suites with GlutenTestsCommonTrait. While the tests appeared to pass, they were not actually loading the Gluten plugin — they were effectively running vanilla Spark, which trivially passes golden file comparison against Spark's own approved plans.
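A cheap guard against this kind of silent fallback is to assert that the plugin is configured before trusting a green test run. The sketch below is dependency-free and only illustrative: `PluginGuard` and `glutenConfigured` are hypothetical names, and a plain `Map` stands in for the session configuration. The key `spark.plugins` and the class name `org.apache.gluten.GlutenPlugin` are the ones cited in this description. Note that a configured plugin is a necessary condition, not proof that the native engine executed anything.

```scala
// Hypothetical helper, not part of Gluten or Spark.
object PluginGuard {
  val PluginsKey = "spark.plugins"
  val GlutenPlugin = "org.apache.gluten.GlutenPlugin"

  // spark.plugins is a comma-separated list of plugin class names;
  // return true when the Gluten plugin class is one of the entries.
  def glutenConfigured(conf: Map[String, String]): Boolean =
    conf.get(PluginsKey).exists(_.split(",").map(_.trim).contains(GlutenPlugin))
}

object PluginGuardDemo extends App {
  // A vanilla configuration: no plugins key at all.
  val vanilla = Map("spark.sql.adaptive.enabled" -> "false")
  // The same configuration with the Gluten plugin added.
  val withGluten = vanilla + (PluginGuard.PluginsKey -> PluginGuard.GlutenPlugin)

  assert(!PluginGuard.glutenConfigured(vanilla))
  assert(PluginGuard.glutenConfigured(withGluten))
}
```

In a real suite the map could presumably be built from the active configuration (for example `SparkConf.getAll.toMap`), with the assertion placed in suite setup.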

This PR fixes the suites properly by:

  1. Switching from GlutenTestsCommonTrait to GlutenSQLTestsBaseTrait, which configures spark.plugins=org.apache.gluten.GlutenPlugin and other Gluten-specific settings (off-heap memory, columnar shuffle, etc.), ensuring the Gluten native engine is actually loaded.

  2. Fixing BroadcastHashJoinExecTransformerBase.requiredChildDistribution — it used raw buildKeyExprs (AttributeReference) to construct HashedRelationBroadcastMode, while Spark's EnsureRequirements creates BroadcastExchange with bound+rewritten keys (BoundReference). This mode mismatch caused ValidateRequirements.validate(plan) to fail for every query containing a BroadcastHashJoin when AQE is disabled. Fixed by aligning with Spark's BroadcastHashJoinExec using BindReferences.bindReferences(HashJoin.rewriteKeyExpr(keys), output).

  3. Adding GlutenPlanStabilityTestTrait that overrides testQuery() to validate plans via ValidateRequirements instead of golden file comparison, since Gluten intentionally produces different physical plans with native Transformer operators (e.g., BroadcastHashJoinExecTransformer, ColumnarExchange).
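The mode mismatch in item 2 can be illustrated with a small toy model. Every type below is a hypothetical stand-in rather than a Spark or Gluten class: `AttrRef` plays the role of `AttributeReference`, `BoundRef` of `BoundReference`, `BroadcastMode` of `HashedRelationBroadcastMode`, and `validate` reduces the requirement check to plain equality between the mode a join demands and the mode its build-side exchange was created with. The key-rewriting step (`HashJoin.rewriteKeyExpr`) is left out of the model for brevity.

```scala
// Toy model only: none of these types exist in Spark or Gluten.
sealed trait Expr
final case class AttrRef(name: String, exprId: Long) extends Expr // named attribute
final case class BoundRef(ordinal: Int) extends Expr              // position in child output

final case class BroadcastMode(keys: Seq[Expr])

// The exchange advertises the mode it was built with.
final case class Exchange(mode: BroadcastMode)
// The join requires a specific mode from its build-side child.
final case class Join(requiredMode: BroadcastMode, build: Exchange)

object BroadcastModeModel {
  // Replace each attribute with its ordinal in the child's output.
  def bind(keys: Seq[Expr], output: Seq[AttrRef]): Seq[Expr] = keys.map {
    case a: AttrRef => BoundRef(output.indexOf(a))
    case other      => other
  }

  // The requirement holds only when both sides describe the same mode.
  def validate(join: Join): Boolean = join.requiredMode == join.build.mode
}

object BroadcastModeDemo extends App {
  import BroadcastModeModel._

  val k = AttrRef("k", 1L)
  val buildOutput = Seq(k)

  // Planner side: the exchange is created with bound keys.
  val exchange = Exchange(BroadcastMode(bind(Seq(k), buildOutput)))

  // Before the fix: the join asks for a mode built from raw attributes.
  val before = Join(BroadcastMode(Seq(k)), exchange)
  // After the fix: the join binds its keys the same way the planner does.
  val after = Join(BroadcastMode(bind(Seq(k), buildOutput)), exchange)

  assert(!validate(before)) // raw keys never equal bound keys
  assert(validate(after))   // both sides agree once the keys are bound
}
```

The point of the model is that the two modes describe the same join key but compare unequal because one side names the attribute and the other refers to it by position, so the fix is to make both sides use the same representation.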

How was this patch tested?

All 7 test suites pass on both Spark 4.0 and 4.1 with Gluten plugin loaded:

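| Suite | Tests | Spark 4.0 | Spark 4.1 |
|---|---|---|---|
| GlutenTPCDSV1_4_PlanStabilitySuite | 97 | pass | pass |
| GlutenTPCDSV1_4_PlanStabilityWithStatsSuite | 97 | pass | pass |
| GlutenTPCDSV2_7_PlanStabilitySuite | 32 | pass | pass |
| GlutenTPCDSV2_7_PlanStabilityWithStatsSuite | 32 | pass | pass |
| GlutenTPCDSModifiedPlanStabilitySuite | 21 | pass | pass |
| GlutenTPCDSModifiedPlanStabilityWithStatsSuite | 21 | pass | pass |
| GlutenTPCHPlanStabilitySuite | 22 | pass | pass |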

TODO

  • Verify requiredChildDistribution change has no broader impact (only affects AQE-disabled path)
  • Implement Gluten-specific golden file comparison for plan stability regression detection

Related issue: #11550

Fix BroadcastHashJoinExecTransformerBase.requiredChildDistribution to use
bound+rewritten keys (matching Spark's BroadcastHashJoinExec), so that
ValidateRequirements.validate(plan) passes when AQE is disabled.

Add GlutenPlanStabilityTestTrait to override testQuery() - validates plans
via ValidateRequirements instead of golden file comparison, since Gluten
produces different physical plans with native Transformer operators.

Apply fixes to all PlanStability test suites for both Spark 4.0 and 4.1:
- GlutenTPCDSV1_4_PlanStabilitySuite
- GlutenTPCDSV1_4_PlanStabilityWithStatsSuite
- GlutenTPCDSV2_7_PlanStabilitySuite
- GlutenTPCDSV2_7_PlanStabilityWithStatsSuite
- GlutenTPCDSModifiedPlanStabilitySuite
- GlutenTPCDSModifiedPlanStabilityWithStatsSuite
- GlutenTPCHPlanStabilitySuite

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions bot added the CORE works for Gluten Core label Mar 20, 2026
@github-actions

Run Gluten Clickhouse CI on x86

Copilot AI left a comment

Pull request overview

This PR fixes the Spark 4.0 / 4.1 PlanStability test suites so they actually run with the Gluten plugin loaded on the Velox backend, and switches the verification from Spark-plan golden file comparison to validating Spark distribution/ordering requirements on Gluten-transformed physical plans.

Changes:

  • Update Spark 4.0/4.1 PlanStability wrappers to use GlutenSQLTestsBaseTrait and a new GlutenPlanStabilityTestTrait that validates plans via ValidateRequirements.
  • Fix BroadcastHashJoinExecTransformerBase.requiredChildDistribution to use bound + rewritten build keys so its BroadcastDistribution mode matches the BroadcastExchange mode created by Spark’s EnsureRequirements.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| gluten-ut/spark41/src/test/scala/org/apache/spark/sql/GlutenPlanStabilitySuite.scala | Introduces GlutenPlanStabilityTestTrait and updates PlanStability suite wrappers to load the Gluten plugin and validate requirements instead of golden files. |
| gluten-ut/spark40/src/test/scala/org/apache/spark/sql/GlutenPlanStabilitySuite.scala | Same as Spark 4.1 version: ensures plugin loading and validates distribution/ordering requirements for Gluten plans. |
| gluten-substrait/src/main/scala/org/apache/gluten/execution/JoinExecTransformer.scala | Aligns BHJ requiredChildDistribution with Spark’s bound+rewritten key behavior to avoid broadcast mode mismatches. |

baibaichen marked this pull request as ready for review March 20, 2026 12:16
baibaichen merged commit 3f463a3 into apache:main Mar 20, 2026
71 of 72 checks passed
baibaichen deleted the feature/plan branch March 20, 2026 12:16