Enhance SQLAlchemy engine with safe_execute and inspector methods for SnowFlake by mauro-cerzosimo · Pull Request #8920 · marimo-team/marimo

mauro-cerzosimo · 2026-03-30T01:05:34Z

📝 Summary

This pull request adds multi-database engine support, starting with Snowflake, and improves type safety and exception handling in the SQLAlchemy engine integration.

Continues the work from #8824.
Closes

🔍 Description of Changes

Multi-Database Support

Introduced _get_inspector context manager that yields an appropriate SQLAlchemy Inspector for a given database. For dialects that require a USE DATABASE command (e.g., Snowflake), it opens a connection, executes the command, and yields an inspector bound to that connection. For all other dialects, it falls back to the existing self.inspector.
Added _get_snowflake_database_names to fetch all available databases when the target database is not specified or not found.
Refactored existing methods (_get_schema_names, _get_table_names, _get_columns, _fetch_primary_keys, _fetch_indexes) to accept a database parameter and use _get_inspector, enabling per-database inspection across all operations.
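
The per-database inspector described above can be pictured as a context manager that switches databases only for dialects that need it. The following is a minimal sketch under stated assumptions, not the PR's implementation: `EngineSketch`, `FakeConnection`, and `FakeInspector` are hypothetical stand-ins for the engine, a SQLAlchemy connection, and an Inspector, and the statement is built without any identifier quoting.

```python
from contextlib import contextmanager
from typing import Iterator, Optional

# Dialects assumed (for this sketch) to need a USE DATABASE statement.
USE_DATABASE_DIALECTS = {"snowflake"}


class FakeConnection:
    """Hypothetical connection that records what was executed."""

    def __init__(self, log: list[str]) -> None:
        self.log = log

    def execute(self, statement: str) -> None:
        self.log.append(statement)

    def close(self) -> None:
        self.log.append("CLOSE")


class FakeInspector:
    """Hypothetical inspector that only remembers what it is bound to."""

    def __init__(self, bind: Optional[FakeConnection]) -> None:
        self.bind = bind


class EngineSketch:
    def __init__(self, dialect: str) -> None:
        self.dialect = dialect
        self.log: list[str] = []
        self.inspector = FakeInspector(bind=None)  # the shared inspector

    @contextmanager
    def get_inspector(self, database: str) -> Iterator[FakeInspector]:
        if self.dialect not in USE_DATABASE_DIALECTS:
            # Other dialects fall back to the existing inspector.
            yield self.inspector
            return
        conn = FakeConnection(self.log)
        try:
            conn.execute(f"USE DATABASE {database}")
            # The inspector is bound to the connection that was switched.
            yield FakeInspector(bind=conn)
        finally:
            conn.close()


engine = EngineSketch("snowflake")
with engine.get_inspector("db_a") as inspector:
    print(inspector.bind is not None, engine.log)
```

Binding the inspector to the same connection that ran the switch matters in this sketch because the database change is a property of that connection's session.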

Type Safety & Code Quality

Corrected type annotations for inspector-related methods, using SQLAlchemy's official ReflectedColumn and ReflectedIndex types to resolve mypy errors.
Improved exception handling in _get_snowflake_database_names by chaining exceptions with raise ... from err (B904 compliance).
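
As an illustration of the `raise ... from err` pattern that B904 asks for, here is a small sketch. `DatabaseDiscoveryError` and `find_name_column` are hypothetical names chosen for this example and are not taken from the PR.

```python
class DatabaseDiscoveryError(RuntimeError):
    """Hypothetical error type used only in this sketch."""


def find_name_column(columns: list[str]) -> int:
    """Return the index of the 'name' column in a result header."""
    try:
        return columns.index("name")
    except ValueError as err:
        # `from err` stores the original exception in __cause__, so the
        # traceback shows both the low-level and the high-level failure.
        raise DatabaseDiscoveryError(
            "result has no 'name' column"
        ) from err


try:
    find_name_column(["created_on", "owner"])
except DatabaseDiscoveryError as exc:
    print(type(exc.__cause__).__name__)
```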

📋 Checklist

I have read the contributor guidelines.
For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
Tests have been added for the changes made.
Documentation has been updated where applicable, including docstrings for API changes.
Pull request title is a good summary of the changes — it will be used in the release notes.


vercel · 2026-03-30T01:05:41Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
marimo-docs	Ready	Preview, Comment	Mar 31, 2026 2:50pm

Light2Dark

this is great, thanks. Couple of comments

Light2Dark · 2026-03-30T08:15:22Z

marimo/_sql/engines/sqlalchemy.py

+        from sqlalchemy import inspect, text
+
+        _use_database_dialect_command: dict[str, str] = {
+            "snowflake": f"""USE DATABASE {f'"{database}"' if re.search(r"[^A-Za-z0-9_]", database) else database}""",


[^A-Za-z0-9_] looks like this is a common constant, let's extract it out and put at the top

Moved it, and improved the identifier quoting for Snowflake.
feat: implement identifier quoting for Snowflake in SQLAlchemyEngine
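
A rough sketch of what extracting the pattern into a module-level constant could look like. The names `_NEEDS_QUOTING_RE`, `quote_if_needed`, and `use_database_command` are assumptions made for this example; it also escapes embedded double quotes, and it deliberately does not handle case-sensitive names without special characters.

```python
import re

# Compiled once at import time: matches any character outside the
# plain ASCII identifier set used in the original inline check.
_NEEDS_QUOTING_RE = re.compile(r"[^A-Za-z0-9_]")


def quote_if_needed(identifier: str) -> str:
    """Wrap the identifier in double quotes when it has special characters."""
    if _NEEDS_QUOTING_RE.search(identifier):
        # Double any embedded quote so the identifier stays well-formed.
        escaped = identifier.replace('"', '""')
        return f'"{escaped}"'
    return identifier


def use_database_command(database: str) -> str:
    return f"USE DATABASE {quote_if_needed(database)}"


print(use_database_command("analytics"))
print(use_database_command("my-db"))
```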

Light2Dark · 2026-03-30T08:19:44Z

marimo/_sql/engines/sqlalchemy.py

+    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
+        @functools.wraps(func)
+        def wrapper(*args: Any, **kwargs: Any) -> Any:
+            try:
+                return func(*args, **kwargs)
+            except silent_exceptions:
+                return fallback
+            except Exception:
+                getattr(LOGGER, log_level)(message, exc_info=True)
+                return fallback
+
+        return wrapper


Could we make this strongly typed, maybe with Generics.

Else, we are overriding the function return types. I get this typecheck error

Function decorator obscures type of function because its type is Any basedpyright[reportAny]

Thanks @Light2Dark, I improved it following your suggestion.
refactor: update safe_execute signature to use TypeVar and ParamSpec for better type safety

Copilot

Pull request overview

Adds multi-database catalog discovery to the SQLAlchemy engine (starting with Snowflake) and introduces a safe_execute helper to standardize exception handling/logging during introspection.

Changes:

Added safe_execute decorator and applied it across schema/table/detail discovery helpers.
Implemented per-database inspection via _get_inspector, including Snowflake USE DATABASE switching and SHOW DATABASES discovery.
Expanded unit tests to cover safe_execute, inspector helpers, and Snowflake database listing behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File	Description
`marimo/_sql/engines/sqlalchemy.py`	Adds `safe_execute`, Snowflake database discovery, and per-database inspector context manager used across introspection.
`tests/_sql/test_sqlalchemy.py`	Adds tests for `safe_execute`, `_get_inspector`, per-database schema/table/detail helpers, and Snowflake database-name discovery.

Copilot · 2026-03-30T14:08:07Z

tests/_sql/test_sqlalchemy.py

+
+
+# ------------------------
+# Decoracotors


Typo in section header: "Decoracotors" should be "Decorators".

Suggested change

-# Decoracotors
+# Decorators

Copilot · 2026-03-30T14:08:08Z

marimo/_sql/engines/sqlalchemy.py

+            database_names = [
+                str(row[name_col_index])
+                if re.search(r"[^A-Za-z0-9_]", str(row[name_col_index]))
+                else str(row[name_col_index]).lower()
+                for row in result.fetchall()
+            ]


This normalization lowercases any database name that doesn’t contain “special” characters, but Snowflake quoted identifiers can be mixed-case (e.g. MyDb) without special characters. Lowercasing those will change the identifier and can make subsequent USE DATABASE / inspection fail. Consider preserving the exact name returned by SHOW DATABASES and only applying case-insensitive matching for display/default selection, or detect mixed-case and avoid lowercasing/ensure quoting.

Good point, but I think the current behavior is actually correct for Snowflake's semantics.

Snowflake stores unquoted identifiers internally as UPPERCASE. So when SHOW DATABASES returns DB_A, it means the database was created with an unquoted identifier (e.g., CREATE DATABASE db_a or CREATE DATABASE DB_A) — Snowflake normalized it to DB_A. Lowercasing it to db_a is safe because Snowflake resolves unquoted identifiers case-insensitively, so USE DATABASE db_a and USE DATABASE DB_A are equivalent.
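
Both sides of this exchange rest on how Snowflake resolves identifiers: unquoted ones are folded to uppercase, quoted ones are matched exactly. The toy resolver below models only that rule. It is not Snowflake code, and `resolve_identifier` is a hypothetical name. It shows why lowercasing `DB_A` is harmless while lowercasing a name that was created quoted as `MyDb` is not.

```python
def resolve_identifier(sql_identifier: str) -> str:
    """Return the stored name a Snowflake-style lookup would target."""
    is_quoted = (
        len(sql_identifier) >= 2
        and sql_identifier.startswith('"')
        and sql_identifier.endswith('"')
    )
    if is_quoted:
        # Quoted identifiers keep their exact case; "" unescapes to ".
        return sql_identifier[1:-1].replace('""', '"')
    # Unquoted identifiers are folded to uppercase before lookup.
    return sql_identifier.upper()


# A database created unquoted is stored as DB_A; any casing finds it.
print(resolve_identifier("db_a") == resolve_identifier("DB_A"))

# A database created as "MyDb" is stored as MyDb; a lowercased,
# unquoted reference resolves to MYDB instead.
print(resolve_identifier("MyDb".lower()), resolve_identifier('"MyDb"'))
```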

Copilot · 2026-03-30T14:08:08Z

marimo/_sql/engines/sqlalchemy.py

+        import re
+
+        from sqlalchemy import inspect, text
+
+        _use_database_dialect_command: dict[str, str] = {
+            "snowflake": f"""USE DATABASE {f'"{database}"' if re.search(r"[^A-Za-z0-9_]", database) else database}""",


Snowflake identifier quoting here is not robust: it doesn’t escape embedded double-quotes and it also treats mixed-case identifiers (e.g. a quoted Snowflake database like "MyDb") as safe to use unquoted, which can resolve to a different database (Snowflake uppercases unquoted identifiers). Prefer using SQLAlchemy’s identifier preparer (or a dedicated quoting helper) to quote/escape identifiers correctly rather than interpolating into the SQL string.

Suggested change

-        import re
-        from sqlalchemy import inspect, text
-        _use_database_dialect_command: dict[str, str] = {
-            "snowflake": f"""USE DATABASE {f'"{database}"' if re.search(r"[^A-Za-z0-9_]", database) else database}""",
+        from sqlalchemy import inspect, text
+        from sqlalchemy.sql.compiler import IdentifierPreparer
+        # Robustly quote the database identifier for Snowflake, preserving case
+        # and correctly escaping embedded double-quotes.
+        if self.dialect == "snowflake":
+            # Prefer SQLAlchemy's identifier preparer when available.
+            try:
+                preparer: IdentifierPreparer = self._connection.dialect.identifier_preparer  # type: ignore[attr-defined]
+                quoted_database = preparer.quote(database)
+            except Exception:
+                # Fallback: minimal, but correct, SQL identifier quoting.
+                escaped_database = database.replace('"', '""')
+                quoted_database = f'"{escaped_database}"'
+        else:
+            quoted_database = database
+        _use_database_dialect_command: dict[str, str] = {
+            "snowflake": f"USE DATABASE {quoted_database}",

Same as before.

Copilot · 2026-03-30T14:08:08Z

marimo/_sql/engines/sqlalchemy.py

+        schema_names = self._get_schema_names(
+            database if database is not None else ""
+        )


get_schemas(database=None) (used by get_default_database() for connection validation) currently calls _get_schema_names(""), which for Snowflake will try to execute USE DATABASE with an empty identifier via _get_inspector and fail. This can cause Snowflake connections without a URL database to be treated as invalid. Consider letting _get_inspector accept Optional[str] and short-circuit to self.inspector (or skip USE DATABASE) when database is falsy, or adjust get_default_database()’s validity check for Snowflake (e.g., SHOW DATABASES).

Suggested change

-        schema_names = self._get_schema_names(
-            database if database is not None else ""
-        )
+        if database is None:
+            # When no database is specified, avoid issuing a "USE DATABASE"
+            # statement with an empty identifier (e.g., on Snowflake).
+            # In this context (e.g., connection validation), we treat the
+            # absence of an explicit database as having no enumerated schemas.
+            schema_names: list[str] = []
+        else:
+            schema_names = self._get_schema_names(database)

fix: handle None database case in SQLAlchemyEngine schema retrieval

Copilot · 2026-03-30T14:08:09Z

marimo/_sql/engines/sqlalchemy.py

+            default_lower = self.default_database.lower()
+            for db in database_names:
+                if db.lower() == default_lower:
+                    return [str(db).lower()]


When the default database is present, this returns [str(db).lower()] even if the matched database name was preserved as case-sensitive (e.g., quoted/mixed-case or containing special characters). That lowercasing can make subsequent USE DATABASE fail for case-sensitive identifiers and is inconsistent with the earlier normalization rules. Return the matched db value as-is (or apply the same normalization logic used when building database_names).

Suggested change

-                    return [str(db).lower()]
+                    return [db]

feat: implement identifier quoting for Snowflake in SQLAlchemyEngine


mauro-cerzosimo · 2026-03-31T14:46:21Z

@Light2Dark I’ve refactored and improved the code in the latest commit, focusing on identifier quoting, normalization, and handling edge cases for database/schema retrieval in SQLAlchemyEngine. All recent changes are now ready for a new review.

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

tests/_sql/test_sqlalchemy.py

Copilot · 2026-03-31T15:04:08Z

marimo/_sql/engines/sqlalchemy.py

+            except silent_exceptions:
+                return fallback
+            except Exception:
+                getattr(LOGGER, log_level)(message, exc_info=True)


safe_execute can raise unexpectedly if log_level is not a valid logger attribute: getattr(LOGGER, log_level)(...) will raise AttributeError, which bypasses the intended fallback behavior. Consider validating log_level (e.g., map allowed strings to bound methods, or default to LOGGER.warning) so the decorator remains safe even with unexpected input.

Suggested change

-                getattr(LOGGER, log_level)(message, exc_info=True)
+                log_method = getattr(LOGGER, log_level, LOGGER.warning)
+                log_method(message, exc_info=True)
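
The comment also mentions mapping allowed strings to bound methods. A sketch of that variant follows; `_LOG_METHODS` and `resolve_log_method` are hypothetical names. One reason to consider it: a `getattr` default only covers missing attributes, so a value such as `"name"`, which exists on a logger but is not callable, would still fail.

```python
import logging
from typing import Callable

LOGGER = logging.getLogger("safe_execute_sketch")

# Only these level names are accepted; anything else falls back to warning.
_LOG_METHODS: dict[str, Callable[..., None]] = {
    "debug": LOGGER.debug,
    "info": LOGGER.info,
    "warning": LOGGER.warning,
    "error": LOGGER.error,
}


def resolve_log_method(log_level: str) -> Callable[..., None]:
    """Look up the logging method, defaulting to LOGGER.warning."""
    return _LOG_METHODS.get(log_level, LOGGER.warning)


# "name" is a real Logger attribute (a string), so getattr would return
# it and the call would fail; the mapping falls back instead.
print(callable(getattr(LOGGER, "name", LOGGER.warning)))
print(resolve_log_method("name") == LOGGER.warning)
```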

Copilot · 2026-03-31T15:04:09Z

marimo/_sql/engines/sqlalchemy.py

+    def _quote_identifier(self, identifier: str) -> str:
+        """Quote an identifier based on the SQL dialect's quoting rules."""
+        dialect_quoting: dict[str, tuple[re.Pattern[str], str, str]] = {
+            "snowflake": (_SNOWFLAKE_NEEDS_QUOTING_RE, '"', '"'),
+        }
+
+        if self.dialect not in dialect_quoting:
+            return identifier
+
+        pattern, open_quote, close_quote = dialect_quoting[self.dialect]
+        if pattern.search(identifier) or identifier != identifier.lower():
+            escaped = identifier.replace(
+                close_quote, close_quote + close_quote
+            )
+            return f"{open_quote}{escaped}{close_quote}"
+        return identifier


Snowflake quoting logic is inconsistent with _get_snowflake_database_names(): that method preserves case-sensitive identifiers as returned by SHOW DATABASES, including identifiers that may be all-lowercase (created via quoted identifiers). For such names, _quote_identifier currently returns the identifier unquoted (because it only quotes when identifier != identifier.lower()), which can make USE DATABASE <name> resolve to a different (uppercased) identifier or fail. Consider adjusting the normalization/quoting strategy so any case-sensitive Snowflake database name (including all-lowercase) is quoted when issuing USE DATABASE (e.g., keep raw SHOW names for execution, or track whether a name should be quoted).
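
One way to read the "keep raw SHOW names for execution" suggestion is to carry the raw name around and derive the display form and the statement from it. The sketch below assumes that approach. `DatabaseName` and its members are hypothetical, and the pattern for names that are safe unquoted is a simplification covering uppercase letters, digits, and underscores.

```python
import re
from dataclasses import dataclass

# Names matching this are treated as safe to reference unquoted,
# because unquoted identifiers are folded to uppercase anyway.
_SAFE_UNQUOTED_RE = re.compile(r"[A-Z_][A-Z0-9_]*")


@dataclass(frozen=True)
class DatabaseName:
    raw: str  # exactly as returned by SHOW DATABASES

    @property
    def needs_quoting(self) -> bool:
        return _SAFE_UNQUOTED_RE.fullmatch(self.raw) is None

    @property
    def display(self) -> str:
        # Lowercase only names that are case-insensitive in practice.
        return self.raw if self.needs_quoting else self.raw.lower()

    def use_statement(self) -> str:
        if self.needs_quoting:
            escaped = self.raw.replace('"', '""')
            return f'USE DATABASE "{escaped}"'
        return f"USE DATABASE {self.raw}"


for raw in ["DB_A", "mydb", "MyDb"]:
    name = DatabaseName(raw)
    print(name.display, "->", name.use_statement())
```

With the raw name retained, an all-lowercase name such as `mydb` is quoted because it does not match the uppercase pattern, which is the case the comment above is concerned about.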

Copilot · 2026-03-31T15:04:09Z

marimo/_sql/engines/sqlalchemy.py

+    def _get_database_names(self) -> list[str]:
+        """Get database names using dialect-specific queries.
+
+        Returns a single-element list with the default database when
+        the dialect has no dedicated discovery mechanism.
+        """
+        if self.dialect.lower() == "snowflake":
+            return self._get_snowflake_database_names()
+
+        return [self.default_database] if self.default_database else []


_get_database_names() treats self.default_database == "" as absent because it uses a truthiness check. get_default_database() explicitly uses empty-string to represent a detached-but-valid connection, so this change can cause get_databases() to return an empty list for valid connections (e.g., sqlite:///). Consider checking is not None instead of truthiness so the empty-string sentinel still yields a single database entry.
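
The difference between the two checks is easy to see in isolation. In this sketch the two function names are hypothetical, and the empty string plays the role of the "detached but valid" sentinel described above.

```python
from typing import Optional


def names_with_truthiness(default_database: Optional[str]) -> list[str]:
    # Treats both None and "" as "no database".
    return [default_database] if default_database else []


def names_with_none_check(default_database: Optional[str]) -> list[str]:
    # Only None means "no database"; "" is kept as a valid entry.
    return [default_database] if default_database is not None else []


print(names_with_truthiness(""), names_with_none_check(""))
```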

github-actions · 2026-03-31T15:18:32Z

🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.21.2-dev100

feat: enhance SQLAlchemy engine with safe_execute decorator and inspector methods

5c6af1e

mauro-cerzosimo requested a review from Light2Dark as a code owner March 30, 2026 01:05

vercel bot deployed to Preview March 30, 2026 01:06 View deployment

refactor: add section header for SQLAlchemyEngine class in sqlalchemy.py

47da79b

vercel bot deployed to Preview March 30, 2026 01:40 View deployment

Light2Dark added the enhancement New feature or request label Mar 30, 2026

Light2Dark reviewed Mar 30, 2026

mscolnick requested a review from Copilot March 30, 2026 14:02

Copilot started reviewing on behalf of mscolnick March 30, 2026 14:03 View session

Copilot AI reviewed Mar 30, 2026

Light2Dark mentioned this pull request Mar 31, 2026

Feat: Add starrocks catalog support #8856

Open

5 tasks

mauro-cerzosimo added 4 commits March 31, 2026 22:13

refactor: update safe_execute signature to use TypeVar and ParamSpec for better type safety

42b44f5

feat: implement identifier quoting for Snowflake in SQLAlchemyEngine

5d5cb03

fix: handle None database case in SQLAlchemyEngine schema retrieval

1d8557c

refactor: normalize unquoted Snowflake identifiers to lowercase for consistency

2c00f5e

vercel bot deployed to Preview March 31, 2026 14:19 View deployment

fix: ensure database identifiers are correctly quoted and normalized in SQLAlchemyEngine

8a27f4f

vercel bot deployed to Preview March 31, 2026 14:50 View deployment

mscolnick requested review from Light2Dark and Copilot March 31, 2026 14:59

Copilot started reviewing on behalf of mscolnick March 31, 2026 15:00 View session

Copilot AI reviewed Mar 31, 2026

mscolnick approved these changes Mar 31, 2026

mscolnick merged commit 4635204 into marimo-team:main Mar 31, 2026
44 checks passed

