fix(spark): register Spark SQLAlchemy dialect so spark:// URIs resolve to SparkEngineSpec #38299

Khrol wants to merge 1 commit into apache:master
Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
##           master   #38299       +/-   ##
===========================================
+ Coverage    64.10%   64.87%    +0.77%
===========================================
  Files         1810     2483      +673
  Lines        71288   123136    +51848
  Branches     22694    28567     +5873
===========================================
+ Hits         45696    79882    +34186
- Misses       25592    41857    +16265
- Partials         0     1397     +1397
Code Review Agent Run #e30f65

Actionable Suggestions - 1
- tests/unit_tests/sql/test_spark_dialect.py - 1
  - Parametrize argument type incorrect · Line 46-48

Review Details
- Files reviewed - 2 · Commit Range: 2f6cc89..805fd18
  - superset/db_engine_specs/spark.py
  - tests/unit_tests/sql/test_spark_dialect.py
- Files skipped - 0
- Tools
  - Whispers (Secret Scanner) - ✔︎ Successful
  - Detect-secrets (Secret Scanner) - ✔︎ Successful
  - MyPy (Static Code Analysis) - ✔︎ Successful
  - Astral Ruff (Static Code Analysis) - ✔︎ Successful
…e to SparkEngineSpec
- Set `engine = "spark"` on `SparkEngineSpec` so `get_engine_spec` correctly maps `spark://` URIs
- Register `HiveDialect` under the `"spark"` name via `sqlalchemy.dialects.registry`
- Preserve Spark-native SQL functions like `BOOL_OR` instead of rewriting them to `LOGICAL_OR` via the Hive dialect
- Update the example connection string to use the `spark://` scheme
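The resolution step the first bullet relies on can be sketched as follows. This is a simplified, stdlib-only stand-in for Superset's real `get_engine_spec` (which does considerably more); the toy spec classes and `ENGINE_SPECS` mapping are illustrative, not Superset's API:

```python
from urllib.parse import urlparse

# Simplified stand-ins for Superset engine specs; the real classes live in
# superset/db_engine_specs and carry much more behavior.
class HiveEngineSpec:
    engine = "hive"

class SparkEngineSpec:
    engine = "spark"  # the key this PR changes from "hive" to "spark"

# Resolution matches the URI's backend name against each spec's `engine` key.
ENGINE_SPECS = {spec.engine: spec for spec in (HiveEngineSpec, SparkEngineSpec)}

def get_engine_spec(sqlalchemy_uri: str):
    # "spark://host:10000/db" -> backend "spark"; "hive+https://..." -> "hive"
    backend = urlparse(sqlalchemy_uri).scheme.split("+")[0]
    return ENGINE_SPECS[backend]
```

With both specs keyed as `"hive"` (the pre-PR state), the dict comprehension would silently keep only one of them, which is why `spark://` URIs could never reach `SparkEngineSpec` before this change.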
Force-pushed from d901aab to 4c83948
Code Review Agent Run #593878

Actionable Suggestions - 0
Pull request overview
This PR makes Spark connections resolvable via spark:// SQLAlchemy URIs by mapping them to SparkEngineSpec, ensuring Superset uses the sqlglot Spark dialect (so Spark-native functions like BOOL_OR are preserved instead of being rewritten by the Hive dialect).
Changes:
- Set `SparkEngineSpec.engine = "spark"` and register a SQLAlchemy dialect handler under the `spark` scheme.
- Update Spark engine spec metadata to use a `spark://...` connection string template.
- Add unit tests validating `spark://` resolution and confirming `BOOL_OR` preservation in Spark (and Hive contrast behavior).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| superset/db_engine_specs/spark.py | Switches the Spark engine key to spark, registers a spark SQLAlchemy dialect, and updates the connection string template. |
| tests/unit_tests/sql/test_spark_dialect.py | Adds unit tests covering engine spec resolution and sqlglot formatting behavior for Spark vs Hive. |
engine = "spark"
registry.register("spark", "pyhive.sqlalchemy_hive", "HiveDialect")
Changing SparkEngineSpec.engine to "spark" means this engine spec will only be considered “available” if get_available_engine_specs() can detect an installed SQLAlchemy dialect backend named spark. Today that detection only scans SQLAlchemy built-in dialects plus installed entry points; a runtime registry.register("spark", ...) doesn’t feed into that scan, so drivers["spark"] will likely stay empty and /api/v1/database/available will filter Spark out entirely (it skips engine specs with no drivers). Consider updating driver discovery to also check sqlalchemy.dialects.registry.load("spark") (or similar) and populate drivers["spark"], or add an alias/fallback mechanism that doesn’t cause get_engine_spec("hive") to resolve to SparkEngineSpec.
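The fallback the reviewer suggests could look roughly like this. It is a hypothetical, stdlib-only sketch: `entry_point_dialects` stands in for the installed-entry-point scan and `runtime_registry` for the accumulated `registry.register(...)` calls; neither name is Superset or SQLAlchemy API:

```python
from collections import defaultdict

# What the installed-package/entry-point scan finds today (illustrative data).
entry_point_dialects = {"hive": {"pyhive"}}
# Backends registered at runtime via sqlalchemy.dialects.registry.register(...).
runtime_registry = {"spark": "pyhive.sqlalchemy_hive:HiveDialect"}

def discover_drivers() -> dict[str, set[str]]:
    drivers: dict[str, set[str]] = defaultdict(set)
    for backend, names in entry_point_dialects.items():
        drivers[backend].update(names)
    # Fallback: a backend registered at runtime also counts as having a driver,
    # so /api/v1/database/available would not filter its engine spec out.
    for backend, target in runtime_registry.items():
        drivers[backend].add(target.split(":")[-1])
    return dict(drivers)
```

The point of the sketch is only that `drivers["spark"]` ends up non-empty; how Superset would actually merge the two sources is up to the maintainers.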
Suggested change:
- engine = "spark"
- registry.register("spark", "pyhive.sqlalchemy_hive", "HiveDialect")
+ engine = "hive"
This suggestion would effectively revert the change.
https://github.com/apache/superset/blob/master/superset/db_engine_specs/ascend.py#L27-L28 - the same pattern is used here.
SUMMARY

`SparkEngineSpec` wasn't even usable before, because `hive://...` URLs were resolved to `HiveEngineSpec`, and `spark://` connection strings were not resolving to `SparkEngineSpec` because the Spark dialect was not registered with SQLAlchemy. This PR:
- Sets `engine = "spark"` on `SparkEngineSpec` so `get_engine_spec` correctly maps `spark://` URIs
- Registers `HiveDialect` under the `"spark"` name via `sqlalchemy.dialects.registry`
- Preserves Spark-native SQL functions like `BOOL_OR` instead of rewriting them to `LOGICAL_OR` via the Hive dialect

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Before:



After:


TESTING INSTRUCTIONS

- `spark://` URIs resolve to `SparkEngineSpec`
- `SparkEngineSpec.engine` is `"spark"`
- `BOOL_OR` is preserved (not rewritten to `LOGICAL_OR`) when using the Spark engine
- `BOOL_OR` is preserved after applying a `LIMIT` (the SQL Lab flow)
- The Hive dialect still rewrites `BOOL_OR` to `LOGICAL_OR` (contrast test)

ADDITIONAL INFORMATION
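The `BOOL_OR` behavior the tests above pin down can be illustrated with a toy model. This is NOT sqlglot; it is a minimal, stdlib-only stand-in showing why the engine-to-dialect mapping matters, with the rewrite table reflecting the behavior described in this PR (Hive rewrites `BOOL_OR` to `LOGICAL_OR`, Spark keeps it):

```python
# Per-dialect aggregate-function rewrites, as described in this PR's summary.
FUNCTION_REWRITES = {
    "hive": {"BOOL_OR": "LOGICAL_OR"},  # the Hive dialect rewrites BOOL_OR
    "spark": {},                        # Spark keeps BOOL_OR as-is
}

def render(sql: str, dialect: str) -> str:
    """Apply the dialect's function-name rewrites (toy model of transpiling)."""
    for src, dst in FUNCTION_REWRITES[dialect].items():
        sql = sql.replace(src, dst)
    return sql

query = "SELECT BOOL_OR(flag) FROM t LIMIT 100"
assert render(query, "spark") == "SELECT BOOL_OR(flag) FROM t LIMIT 100"
assert render(query, "hive") == "SELECT LOGICAL_OR(flag) FROM t LIMIT 100"
```

Before this PR, a `spark://` database fell through to the Hive path (the first case above never applied), which is exactly the contrast the unit tests check.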