Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLite on_conflict not working with ORM #4567

Open
typicalTYLER opened this issue Mar 25, 2019 · 5 comments
Open

SQLite on_conflict not working with ORM #4567

typicalTYLER opened this issue Mar 25, 2019 · 5 comments

Comments

@typicalTYLER
Copy link

I've tried several different versions of this code to try to get primary key conflicts to be ignored on the table slippy_tiles but nothing seems to take when creating the schema from ORM, even though it is supposedly fixed in #4360

I've tried renaming the kwarg in the column definitions to sqlite_on_conflict_primary_key, sqlite_on_conflict, on_conflict, editing the primary key, making a new primary key constraint, nothing seems to get it to actually put the clause in the SQL or create a schema accordingly.

import time

from sqlalchemy import Column, Integer, String, ForeignKey, Float, Boolean, PrimaryKeyConstraint
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy.schema import CreateTable

Base = declarative_base()


class SearchPolygon(Base):
    __tablename__ = 'search_polygons'

    name = Column(String, primary_key=True)
    centroid_row = Column(Float, nullable=False)
    centroid_column = Column(Float, nullable=False)
    centroid_zoom = Column(Integer, nullable=False)
    inner_coords_calculated = Column(Boolean, nullable=False, default=False)


class SlippyTile(Base):
    __tablename__ = 'slippy_tiles'

    row = Column(Integer, nullable=False, primary_key=True, on_conflict_primary_key='IGNORE')
    column = Column(Integer, nullable=False, primary_key=True, on_conflict_primary_key='IGNORE')
    zoom = Column(Integer, nullable=False, primary_key=True, on_conflict_primary_key='IGNORE')
    centroid_distance = Column(Float)
    polygon_name = Column(String, ForeignKey('search_polygons.name'))


engine = create_engine('sqlite:///data/solar.db')
SlippyTile.__table__.primary_key = PrimaryKeyConstraint(SlippyTile.row.name, SlippyTile.column.name,
                                                        SlippyTile.zoom.name, on_conflict_primary_key='IGNORE')
from sqlalchemy.schema import CreateTable
print(CreateTable(SlippyTile.__table__).compile(engine))
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

Prints this schema:

CREATE TABLE slippy_tiles (
	"row" INTEGER NOT NULL, 
	"column" INTEGER NOT NULL, 
	zoom INTEGER NOT NULL, 
	centroid_distance FLOAT, 
	polygon_name VARCHAR, 
	PRIMARY KEY ("row", "column", zoom), 
	FOREIGN KEY(polygon_name) REFERENCES search_polygons (name)
)

Which contains no on conflict clauses

@typicalTYLER
Copy link
Author

I had thought maybe my sqlite version didn't have this feature, but it seems like it does according to the link below.

sqlite3.sqlite_version
'3.11.0'
sqlite3.version
'2.6.0'
The ON CONFLICT clause described here has been a part of SQLite since before version 3.0.0 (2004-06-18). The phrase "ON CONFLICT" is also part of UPSERT, which is an extension to INSERT added in version 3.24.0 (2018-06-04). Do not confuse these two separate uses of the "ON CONFLICT" phrase.

quoted from here:
https://sqlite.org/lang_conflict.html

@typicalTYLER
Copy link
Author

The main reason this is a problem is because I'm inserting a lot of data and checking to see if the data already exists is causing a 10x slowdown for a process that will take on the order of days already.

@zzzeek
Copy link
Member

zzzeek commented Mar 25, 2019

Hiya .

OK. So this is a hard one to go through, since the example you gave has found so many different, intuitive, and perfectly reasonable ways of achieving the behavior you are looking for, and each one of them is subtly incorrect. It's as though I asked you to run through a carwash blindfolded and you didn't get wet, if that makes sense.

So there is definitely a bug here in that first of all the documentation is not telling it like it is, but additionally, there can very easily be warnings in the DDL renderer that also indicate that it looks like you're trying to do something and we can't do it the way you are asking.

So the first thing to note is, for DDL things that are requested, SQLAlchemy is not going to look at the SQlite version in use. If you asked it to render an "ON CONFLICT" (and it's the way SQLAlchemy expected it to be asked), it's not going to silently pass on that for an older version of SQLite, because this means your program isn't going to work. This isn't an optimization, that is, it changes the behavior of the program.

Next thing. You can't affect the primary key constraint of a Table object like this:

my_table.primary_key = PrimaryKeyConstraint(...)

if you want to add the constraint after the fact, it has to be this:

my_table.append_constraint(PrimaryKeyConstraint(...))

However, you'll note in the examples, the primary key constraint is given inline. This is the most straighforward way to do it here, which when using declarative is done with the __table_args__ argument, which is documented probably a bit too much off the beaten path within the declarative docs, and probably should show that you can use the Column objects directly if that's easier, but it's here: https://docs.sqlalchemy.org/en/latest/orm/extensions/declarative/table_config.html?highlight=__table_args__

So at this point we can illustrate how your program does do what you want:

class SlippyTile(Base):
    __tablename__ = 'slippy_tiles'

    row = Column(Integer)
    column = Column(Integer)
    zoom = Column(Integer)
    centroid_distance = Column(Float)
    polygon_name = Column(String, ForeignKey('search_polygons.name'))

    __table_args__ = (
        PrimaryKeyConstraint(row, column, zoom, sqlite_on_conflict='IGNORE'),
    )

So next thing, all of these dialect specific arguments are always prefixed with the dialect name, so in the Column it would have to look like:

row = Column(Integer, nullable=False, primary_key=True, sqlite_on_conflict_primary_key='IGNORE')

that's in the docs. But what the docs need a BIG RED WARNING to say here is that, the column level sqlite_on_conflict_primary_key argument only applies to a single-column primary key. if you have a composite primary key, you have to use an explicit PrimaryKeyConstraint. This can be in the docs and the dialect can definitely emit a warning when it sees the column -level argument in a composite primary key.

So the TODO list here is:

  1. Try to add some more clues to the mapping documentation about __table_args__ since I really should have seen it at https://docs.sqlalchemy.org/en/latest/orm/mapper_config.html

  2. Add context about append_constraint() to the constraint docs at https://docs.sqlalchemy.org/en/latest/core/constraints.html?highlight=primarykeyconstraint#sqlalchemy.schema.PrimaryKeyConstraint,

  3. add "seealso" for PrimaryKeyConstraint, UniqueConstraint, CheckConstraint that points back to the canonical constraint documentation https://docs.sqlalchemy.org/en/latest/core/constraints.html?highlight=checkconstraint#unique-constraint

  4. In those sections, put "seealsos" to the ORM-level docs on how to set up constraints with declarative

  5. from finish docs for advanced mapper stuff #4 we'd get to https://docs.sqlalchemy.org/en/latest/orm/extensions/declarative/table_config.html, this section has to be way bigger and include examples that have columns in them and use all three kinds of constraints as well as Index objects, in all different ways

  6. put a big red warning at https://docs.sqlalchemy.org/en/latest/dialects/sqlite.html#on-conflict-support-for-constraints within the "primary key" part of it that the column-level version only applies to non-composite PK. For composite PK the ON CONFLICT can only be specified once so for explicitness we require use of PrimaryKeyConstraiant

  7. finally the only code change, ensure SQLite dialect warns if there is a composite PK and we detect the column-level parameters were used.

and then we're done!

@zzzeek zzzeek added bug Something isn't working documentation sqlite sql labels Mar 25, 2019
@zzzeek zzzeek added this to the 1.3.xx milestone Mar 25, 2019
@zzzeek
Copy link
Member

zzzeek commented Mar 25, 2019

for consistency, all the paragraph level docs that are inline with PrimaryKeyConstraint should move out to https://docs.sqlalchemy.org/en/latest/core/constraints.html?highlight=primarykeyconstraint#primary-key-constraint. Or the other two constraints should be reversed.

@typicalTYLER
Copy link
Author

typicalTYLER commented Mar 25, 2019

Thanks so much for the detailed response! It did feel like going through a car-wash blindfolded and not getting wet haha. I had been looking for something like __table_args__ to pass in a constraint in the documentation but wasn't able to find it.

@zzzeek zzzeek added PRs (with tests!) welcome a fix or feature which is appropriate to be implemented by volunteers and removed PRs (with tests!) welcome a fix or feature which is appropriate to be implemented by volunteers labels Jun 17, 2019
@zzzeek zzzeek modified the milestones: 1.3.xx, 1.3.x Dec 18, 2019
@zzzeek zzzeek modified the milestones: 1.3.x, 1.4.x May 26, 2021
@zzzeek zzzeek removed the bug Something isn't working label Mar 28, 2022
@zzzeek zzzeek modified the milestones: 1.4.x, 2.0.x Jan 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants