Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sqlalchemy] [Feature Request] Support default values #324

Open
dhirschfeld opened this issue Jan 22, 2024 · 6 comments · May be fixed by #343
Open

[sqlalchemy] [Feature Request] Support default values #324

dhirschfeld opened this issue Jan 22, 2024 · 6 comments · May be fixed by #343
Labels
enhancement New feature or request sqlalchemy

Comments

@dhirschfeld
Copy link

dhirschfeld commented Jan 22, 2024

It appears that default values aren't supported?

supports_default_values: bool = False

Trying to create a schema with the below server_default gives a spark error:

    insert_timestamp: Mapped[datetime] = mapped_column(
        sa.DateTime(timezone=True),
        init=False,
        nullable=False,
        server_default=sa.func.current_timestamp(),
    )
DatabaseError: (databricks.sql.exc.ServerOperationError) [INTERNAL_ERROR] The Spark SQL phase planning failed with an internal error. You hit a bug in Spark or the Spark plugins you use. Please, report this bug to the corresponding communities or vendors, and provide the full stack trace.
[SQL: 
CREATE TABLE test.`ModelMetadata` (
	pkid BIGINT GENERATED ALWAYS AS IDENTITY, 
	name STRING NOT NULL, 
	version STRING NOT NULL, 
	insert_timestamp TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP NOT NULL, 
	PRIMARY KEY (pkid)
) USING DELTA

]
(Background on this error at: https://sqlalche.me/e/20/4xp6)

Edit:

It looks like the internal error is created by having a TIMESTAMP_NTZ type with the CURRENT_TIMESTAMP default. Using the correct TIMESTAMP type for the column then gives the table feature was not enabled error.

To work around the latter error requires setting the table properties, which AFAICS can't be done from sqlalchemy, at least not directly.

@wibbico
Copy link

wibbico commented Jan 22, 2024

try to use

sa.func.now()

@dhirschfeld
Copy link
Author

dhirschfeld commented Jan 22, 2024

Thanks to this thread I ws able to find out the magic syntax:

CREATE TABLE test.`ModelMetadata` (
	pkid BIGINT GENERATED ALWAYS AS IDENTITY, 
	name STRING NOT NULL, 
	version STRING NOT NULL, 
	insert_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL, 
	PRIMARY KEY (pkid)
) USING DELTA
TBLPROPERTIES(
  'delta.minReaderVersion' = '1',
  'delta.minWriterVersion'='7',
  'delta.feature.allowColumnDefaults' = 'enabled'
);

...but I don't really know how to integrate that with sqlalchemy.

If that is required for defaults then I guess this projects needs to implement a custom compiler to add the required TBLPROPERTIES?

@dhirschfeld
Copy link
Author

With sa.func.now() sqlalchemy emits the below SQL:

CREATE TABLE test.`ModelMetadata` (
	pkid BIGINT GENERATED ALWAYS AS IDENTITY, 
	name STRING NOT NULL, 
	version STRING NOT NULL, 
	insert_timestamp TIMESTAMP DEFAULT now() NOT NULL, 
	PRIMARY KEY (pkid)
) USING DELTA;

...which gives the below error:

[WRONG_COLUMN_DEFAULTS_FOR_DELTA_FEATURE_NOT_ENABLED]
Failed to execute CREATE TABLE command because it assigned a column DEFAULT value, but the corresponding table feature was not enabled.
Please retry the command again after executing ALTER TABLE tableName SET TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported').

@dhirschfeld
Copy link
Author

If that is required for defaults then I guess this projects needs to implement a custom compiler to add the required TBLPROPERTIES?

Or, even better, just enable allowColumnDefaults by default so it Just Works without having to touch the sqlalchemy compiler.

@dhirschfeld
Copy link
Author

In my specific case it appears I can hack around it with:

@event.listens_for(Base.metadata, "after_create")
def set_default_insert_timestamp(target, connection, **kw):
    for table in target.tables:
        connection.execute(DDL(f"""
        ALTER TABLE {table} SET TBLPROPERTIES(
            'delta.feature.allowColumnDefaults' = 'enabled'
        );
        """))
        connection.execute(DDL(f"""
        ALTER TABLE {table}
        ALTER COLUMN insert_timestamp
        SET DEFAULT CURRENT_TIMESTAMP;
        """))

...but that's a bit gross 🤢

It would be nice for the standard sqlalchemy server_default to Just Work with no hacks required (by the end-user!)

@susodapop
Copy link
Contributor

Hey folks thanks for the write-up and debugging! I'm pleased to see that you worked out how to hack this together using event.listens_for. The sqlalchemy events system is a powerful tool, since nearly every behaviour of a given dialect can be patched in this way. I agree that this belongs in the dialect itself.

@dhirschfeld would you shoot an email to databricks-sql-connector-maintainers@databricks.com referencing this GitHub issue so we can track it and add it to the pipeline for enhancements to the dialect?

@susodapop susodapop changed the title [Feature Request] Support default values [sqlalchemy] [Feature Request] Support default values Jan 22, 2024
@kravets-levko kravets-levko added enhancement New feature or request sqlalchemy labels Apr 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request sqlalchemy
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants