Skip to content

[SPARK-56483][SQL][TESTS] Use abstract SparkSession/DataFrame in SQLTestUtilsBase#55346

Closed
zhengruifeng wants to merge 12 commits into
apache:masterfrom
zhengruifeng:sql_test_utils_abstract_session
Closed

[SPARK-56483][SQL][TESTS] Use abstract SparkSession/DataFrame in SQLTestUtilsBase#55346
zhengruifeng wants to merge 12 commits into
apache:masterfrom
zhengruifeng:sql_test_utils_abstract_session

Conversation

@zhengruifeng
Copy link
Copy Markdown
Contributor

@zhengruifeng zhengruifeng commented Apr 15, 2026

What changes were proposed in this pull request?

Decouple SQLTestUtilsBase from classic SparkSession and DataFrame types, switching to the abstract SQL API types instead. This is a preparatory step toward merging SQLTestUtilsBase/SQLTestUtils with QueryTestBase/QueryTest.

Core change in SQLTestUtils.scala:

  • Remove override protected def spark: classic.SparkSession from SQLTestUtilsBase, letting it inherit the abstract SparkSession from SparkSessionProvider
  • Move DataFrame and SparkSession imports from o.a.s.sql.classic to o.a.s.sql (abstract API)
  • Cast to classic.SparkSession locally in methods that require classic-only APIs (testImplicits, stripSparkFilter, logicalPlanToSparkQuery, uncacheTable)
  • logicalPlanToSparkQuery returns classic.DataFrame (not abstract) since it satisfies both classic and abstract targets via subtyping, avoiding the need for implicit chaining

API visibility fix:

  • Widen Dataset.withColumnsRenamed(Seq, Seq) from protected to protected[spark] in the abstract API, with override in classic and connect — unifying the visibility across all three

Downstream test fixes (using import testImplicits._ to provide ClassicConversions):

  • ReplaceIntegerLiteralsWithOrdinalsDataframeSuite — added import testImplicits._
  • CollationExpressionWalkerSuite — added import testImplicits._
  • AlterTableRenameSuiteBase — added import testImplicits._
  • DropTableSuiteBase — added import testImplicits._
  • OuterJoinSuite — widened import testImplicits.toRichColumn to import testImplicits._
  • SchemaPruningSuite — added import testImplicits.castToImpl (class-level testImplicits._ conflicts with catalyst DSL $"col")
  • HiveQuerySuite — replaced import TestHive.sparkSession.implicits._ with import testImplicits._
  • HiveUDFSuite — added import testImplicits.castToImpl
  • MaterializeTablesSuite — added class-level import testImplicits._, removed per-test import session.implicits._ and unused val session = spark
  • PipelineTest, BaseCoreExecutionTest — added import testImplicits._

Other fixes:

  • DDLSuitespark.artifactManagerspark.sessionState.artifactManager
  • CachedTableSuitegetNumInMemoryRelations accepts abstract Dataset[_]
  • ParquetFilterSuite — removed pre-existing unused ClassicConversions._ import

Why are the changes needed?

SQLTestUtilsBase currently requires a classic.SparkSession, which tightly couples it to the classic implementation. This makes it difficult to merge SQLTestUtilsBase with QueryTestBase (which uses the abstract SparkSession via SparkSessionProvider). Decoupling the base types is a necessary first step toward consolidating the test utility class hierarchy.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Compilation verified for sql/Test/compile, hive/Test/compile, connect-common/compile, pipelines/Test/compile, avro/Test/compile, and kafka-0-10-sql/Test/compile.
  • Affected test suites run locally and all passed.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-6)

@zhengruifeng zhengruifeng requested a review from cloud-fan April 15, 2026 08:02
@zhengruifeng zhengruifeng force-pushed the sql_test_utils_abstract_session branch from d5e389b to a497d90 Compare April 15, 2026 08:37
Copy link
Copy Markdown
Contributor

@cloud-fan cloud-fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean decoupling of SQLTestUtilsBase from classic.SparkSession. The type-narrowing override removal is well-scoped, the asInstanceOf casts are localized to the few sites that genuinely need classic APIs, and the withColumnsRenamed visibility fix correctly unifies the override hierarchy across abstract/classic/connect.

No design, correctness, or coverage concerns.

@zhengruifeng
Copy link
Copy Markdown
Contributor Author

merged to master

@zhengruifeng zhengruifeng deleted the sql_test_utils_abstract_session branch April 15, 2026 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants