feat: don't force db connect if using serverless #3781
Merged
Conversation
treysp
approved these changes
Feb 4, 2025
- SQLMesh's Databricks Connect implementation supports Databricks Runtime 13.0 or higher. If SQLMesh detects that you have Databricks Connect installed, then it will use it for all Python models (both Pandas and PySpark DataFrames).
+ If SQLMesh detects that you have Databricks Connect installed, then it will automatically configure the connection and use it for all Python models that return a Pandas or PySpark DataFrame.
+ To have databricks-connect installed but ignored by SQLMesh, set `disable_databricks_connect` to `true` in the connection configuration.
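For context, the option described in the added doc line might look like this in a SQLMesh `config.yaml` gateway definition. This is an illustrative sketch: `disable_databricks_connect` comes from the diff above, while the hostname, HTTP path, and token fields follow the usual Databricks connection settings and should be verified against the current SQLMesh docs.

```yaml
gateways:
  databricks:
    connection:
      type: databricks
      server_hostname: dbc-XXXXXXXX.cloud.databricks.com
      http_path: /sql/1.0/warehouses/XXXXXXXXXXXX
      access_token: ${DATABRICKS_TOKEN}
      # Keep databricks-connect installed in the environment,
      # but have SQLMesh ignore it:
      disable_databricks_connect: true
```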
Contributor
When/why would someone want to have it installed but ignored?
Just needing it installed in the env for a non-SQLMesh reason, or would people switch back and forth between using it in SQLMesh and not?
Contributor
Author
Just needing it installed in the env for a non-SQLMesh reason
Yeah, this is what I'm thinking of. They use a single Python environment for their work, so they don't want to uninstall it just to make SQLMesh behave as they expect.
izeigerman added a commit that referenced this pull request on Feb 4, 2025: This reverts commit 591645c.
Prior to this PR, if a user said they wanted to use Serverless for databricks-connect, SQLMesh forced the use of databricks-connect, and therefore the Python SQL connector could not be used. In addition, the documentation stated that the SQL connector did not support Databricks Serverless Compute, which was misleading: although it doesn't support workspace-side Serverless (typically used by Notebooks and Jobs), it does support SQL Warehouse Serverless compute.
A user could therefore want to use serverless across their stack: Serverless compute for jobs that require a PySpark DataFrame, and SQL Warehouse Serverless for their SQL queries. This PR enables that pattern.
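As a sketch of that mixed-serverless pattern, a gateway definition might combine both connections. Treat this as an assumption-laden illustration: the option names are based on SQLMesh's Databricks connection settings and should be checked against the current docs.

```yaml
gateways:
  databricks:
    connection:
      type: databricks
      # SQL queries run on a serverless SQL Warehouse via the SQL connector:
      server_hostname: dbc-XXXXXXXX.cloud.databricks.com
      http_path: /sql/1.0/warehouses/XXXXXXXXXXXX
      access_token: ${DATABRICKS_TOKEN}
      # PySpark DataFrame models run on workspace-side Serverless compute
      # through databricks-connect:
      databricks_connect_use_serverless: true
```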
One key limitation it works around is temporary objects. Serverless doesn't support global temporary objects and instead requires session temporary objects, so mixing databricks-connect and the Python SQL connector across the serverless products was a problem: the two sessions couldn't share this state. This PR resolves that by recording in the session's connection metadata whether a temporary object was created in a databricks-connect session; if so, databricks-connect is forced for the remainder of the session.
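The session-affinity rule described above can be sketched roughly as follows. This is a minimal illustration with hypothetical class and method names, not SQLMesh's actual internals.

```python
# Minimal sketch of the session-affinity rule described above.
# Names are hypothetical, not SQLMesh's real API.
class ServerlessSessionRouter:
    """Routes statements to databricks-connect or the Python SQL connector."""

    def __init__(self) -> None:
        # Session metadata flag: set once a session temp object is
        # created through databricks-connect.
        self.temp_object_in_connect = False

    def record_temp_object(self, engine: str) -> None:
        """Record that a session temporary object was created on `engine`."""
        if engine == "databricks-connect":
            self.temp_object_in_connect = True

    def choose_engine(self, preferred: str) -> str:
        # Session temp objects created in databricks-connect are not
        # visible to the SQL connector, so once one exists we must stay
        # on databricks-connect for the rest of the session.
        if self.temp_object_in_connect:
            return "databricks-connect"
        return preferred
```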
This PR also improves the documentation, removes excess log output in the console, and improves the error message shown when the user has different default catalogs across their SQL and databricks-connect sessions.
Initial PR that added serverless support for context: #3001