bulk insert w/ SQL Server and no-pk table in offline mode should not render SET IDENTITY INSERT #495

sqlalchemy-bot · 2018-05-21T02:16:53Z

Migrated issue, originally created by Wil Tan

In offline mode, Alembic emits IDENTITY INSERT mode control statements. However, in online mode, it does not do so.

There are a few shortcomings that trips up my use cases:

in offline mode, sometimes I bulk_insert into a table with no auto-increment key. The SET IDENTITY INSERT <table> ON statement will result in an error in MSSQL.
in online mode, sometimes I do want to insert fixed primary key values for auto-increment fields, but because Alembic does not turn on IDENTITY INSERT mode, MSSQL throws an error.

Would it make sense to change the current behaviour in alembic/ddl/mssql.py such that it emits SET IDENTITY INSERT statements if the rows data contains the table._autoincrement_column, regardless of offline/online mode?

I'm happy to work on a pull request if you agree with the approach.

The text was updated successfully, but these errors were encountered:

sqlalchemy-bot · 2018-05-21T02:18:24Z

Changes by Wil Tan:

edited description

sqlalchemy-bot · 2018-05-21T03:16:38Z

Michael Bayer (@zzzeek) wrote:

Not able to reproduce the second case:

migration:

def upgrade():
    # ### commands auto generated by Alembic - please adjust! ###

    has_autoinc = sa.Table('has_autoinc', sa.MetaData(),
        sa.Column('id', sa.Integer, primary_key=True),
        sa.Column('x', sa.Integer()),
        sa.Column('y', sa.Integer())
    )

    op.bulk_insert(has_autoinc,
        [
            {"id": 1, "x": 5, "y": 4},
            {"id": 2, "x": 7, "y": 3},
        ]
    )
    # ### end Alembic commands ###

output running in online mode:

INFO  [alembic.runtime.migration] Running upgrade 355e309c5732 -> 3c9004fc9946, rev2
INFO  [sqlalchemy.engine.base.Engine] SET IDENTITY_INSERT has_autoinc ON
INFO  [sqlalchemy.engine.base.Engine] ()
INFO  [sqlalchemy.engine.base.Engine] INSERT INTO has_autoinc (id, x, y) VALUES (?, ?, ?)
INFO  [sqlalchemy.engine.base.Engine] ((1, 5, 4), (2, 7, 3))
INFO  [sqlalchemy.engine.base.Engine] SET IDENTITY_INSERT has_autoinc OFF
INFO  [sqlalchemy.engine.base.Engine] ()
INFO  [sqlalchemy.engine.base.Engine] UPDATE alembic_version SET version_num='3c9004fc9946' WHERE alembic_version.version_num = '355e309c5732'
INFO  [sqlalchemy.engine.base.Engine] ()
INFO  [sqlalchemy.engine.base.Engine] COMMIT

in online mode, the SQLAlchemy dialect is doing the work here and it knows to turn on IDENTITY INSERT if the table has an autoincrement column and the given values contain that column.

OTOH, if you are doing it like the documentation implies:

has_autoinc = sa.table('has_autoinc',
    sa.column('id', sa.Integer),
    sa.column('x', sa.Integer()),
    sa.column('y', sa.Integer())
)

unfortunately that can't work for SQL Server because the lower-case "column()" / "table()" system has no primary key concept (or autoincrement column), thus the dialect doesn't know about a primary key and correctly doesn't turn on IDENTITY INSERT (the way you want the first case to work).

sqlalchemy-bot · 2018-05-21T04:10:57Z

Wil Tan wrote:

Thanks for the tip! I was using lower case sa.table with upper case sa.Column; changing it to sa.Table makes all the difference.
Would the first case be something you'd like to fix?

sqlalchemy-bot · 2018-05-21T14:42:38Z

Michael Bayer (@zzzeek) wrote:

sure, the first one sounds like it should be fixed as I noticed this is just a hardcoded rule in the Alembic impl in any case.

sqlalchemy-bot · 2018-05-21T14:45:17Z

Michael Bayer (@zzzeek) wrote:

I'm also not totally happy that the lowercase table(.., column()) system which is much more convenient for quickies can't be used in this kind of case. options include, getting lowercase table/column to support some rudimentary level of autoinc concepts, or maybe some kind of hint to bulk_insert() that gets passed to the SQL Server dialect. Perhaps:

op.bulk_insert(..., execution_options={"mssql_identity_insert": True})

e.g. we add the ability to send execution options to bulk_insert and we get the SQL Server dialect to allow for this hint to be explicit.

sqlalchemy-bot · 2018-05-22T00:13:58Z

Wil Tan wrote:

If this is the only use case for scrapping / changing the lowercase table / column system, it seems like overkill. After all, it's not too much to ask the developer to switch to upper case Table(.., Column()) in order to make this work, provided it is in the documentation.
Adding execution_options is useful, and I would have used it if it had been there. However, it was much better to discover the proper solution which is to use Table with metadata support.

One way that could be useful is to warn user about the mixed use of uppercase / lowercase table/column. For example, op.bulk_insert could raise an error if a TableClause (and not schema.Table) was given and it contained a schema.Column as opposed to a simple ColumnClause. That was the mistake I made.

sa.table('foo', sa.Column('abc', sa.Integer(), primary_key=True))

sqlalchemy-bot · 2018-05-28T16:34:46Z

Michael Bayer (@zzzeek) wrote:

well we can always fix one thing at a time, I'm waiting on you for a PR right? there is absolutely no hurry.

sqlalchemy-bot · 2018-05-28T23:10:01Z

Wil Tan wrote:

Sounds good, I'll submit a PR for the offline case. Do you prefer a PR in Github rather than Bitbucket?

sqlalchemy-bot · 2018-05-29T00:01:27Z

Michael Bayer (@zzzeek) wrote:

github is better. I'd ike to get off bitbucket completely but I'm waiting on github to assist in that.

sqlalchemy-bot · 2018-08-13T15:11:16Z

Michael Bayer (@zzzeek) wrote:

updated the title to the actual thing we want to fix here.

sqlalchemy-bot · 2018-08-13T15:11:16Z

Changes by Michael Bayer (@zzzeek):

changed title from "Use IDENTITY INSERT for MS SQL Server for online m" to "bulk insert w/ SQL Server and no-pk table in offli"

jlgrock · 2020-06-13T13:25:04Z

I'm not great at writing unit tests right now in python, so I haven't created a merge request, but I was able to manually test the following code replacement with and without a database connection and it properly identifies when the column has an identity column and set the SET IDENTITY_INSERT [TABLE_NAME] ON; sections. I do understand the _autoincrement_column is a private member, so there's probably a better way to go about that as well. Since I desperately needed this, my workaround was the following.

from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import mssql
from alembic.ddl.mssql import MSSQLImpl
from sqlalchemy.schema import Sequence

def bulk_insert(self, table, rows, **kw):
    id_column = table._autoincrement_column
    
    insert_has_identity = (id_column is not None) and (
        not isinstance(id_column.default, Sequence)
    )
    if self.as_sql and insert_has_identity:
        self._exec(
            "SET IDENTITY_INSERT %s ON"
            % self.dialect.identifier_preparer.format_table(table)
        )
        super(MSSQLImpl, self).bulk_insert(table, rows, **kw)
        self._exec(
            "SET IDENTITY_INSERT %s OFF"
            % self.dialect.identifier_preparer.format_table(table)
        )
    else:
        super(MSSQLImpl, self).bulk_insert(table, rows, **kw)

MSSQLImpl.bulk_insert = bulk_insert

I tested it with the following:
a) an Identity column

table_name = 'MY_EXAMPLE'
    my_example_table = op.create_table(
        table_name,
       sa.Column('my_id', sa.INTEGER, nullable=False),
        sa.Column('my_value', sa.VARCHAR(length=4), nullable=False),
        sa.PrimaryKeyConstraint('my_id')
    )
    op.bulk_insert(
        my_example_table,
        [
              {'my_value': '800A'},
        ]
    )

b) an integer that is not an identity

table_name = 'MY_EXAMPLE'
    my_example_table = op.create_table(
        table_name,
       sa.Column('my_id', sa.INTEGER, nullable=False, autoincrement=False),
        sa.Column('my_value', sa.VARCHAR(length=4), nullable=False),
        sa.PrimaryKeyConstraint('my_id')
    )
    op.bulk_insert(
        my_example_table,
        [
              {'my_id': 2, 'my_value': '800A'},
        ]
    )

c) something that is not an integer (and thus cannot be an identity)

table_name = 'MY_EXAMPLE'
    my_example_table = op.create_table(
        table_name,
       sa.Column('my_id', mssql.UNIQUEIDENTIFIER, server_default=sa.func.newid(), nullable=False),
        sa.Column('my_value', sa.VARCHAR(length=4), nullable=False),
        sa.PrimaryKeyConstraint('my_id')
    )
    op.bulk_insert(
        my_example_table,
        [
              {'my_value': '800A'},
        ]
    )

sqlalchemy-bot added feature low priority execution model labels Nov 27, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bulk insert w/ SQL Server and no-pk table in offline mode should not render SET IDENTITY INSERT #495

bulk insert w/ SQL Server and no-pk table in offline mode should not render SET IDENTITY INSERT #495

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 22, 2018

sqlalchemy-bot commented May 28, 2018

sqlalchemy-bot commented May 28, 2018

sqlalchemy-bot commented May 29, 2018

sqlalchemy-bot commented Aug 13, 2018

sqlalchemy-bot commented Aug 13, 2018

jlgrock commented Jun 13, 2020 •

edited

bulk insert w/ SQL Server and no-pk table in offline mode should not render SET IDENTITY INSERT #495

bulk insert w/ SQL Server and no-pk table in offline mode should not render SET IDENTITY INSERT #495

Comments

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 21, 2018

sqlalchemy-bot commented May 22, 2018

sqlalchemy-bot commented May 28, 2018

sqlalchemy-bot commented May 28, 2018

sqlalchemy-bot commented May 29, 2018

sqlalchemy-bot commented Aug 13, 2018

sqlalchemy-bot commented Aug 13, 2018

jlgrock commented Jun 13, 2020 • edited

jlgrock commented Jun 13, 2020 •

edited