Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The ".upsert()" method is misnamed #66

Closed
simonw opened this issue Nov 12, 2019 · 15 comments
Closed

The ".upsert()" method is misnamed #66

simonw opened this issue Nov 12, 2019 · 15 comments
Labels

Comments

@simonw
Copy link
Owner

@simonw simonw commented Nov 12, 2019

This thread here is illuminating: https://stackoverflow.com/questions/3634984/insert-if-not-exists-else-update

The term UPSERT in SQLite has a specific meaning as-of 3.24.0 (2018-06-04): https://www.sqlite.org/lang_UPSERT.html

It means "behave as an UPDATE or a no-op if the INSERT would violate a uniqueness constraint". The syntax in 3.24.0+ looks like this (confusingly it does not use the term "upsert"):

INSERT INTO phonebook(name,phonenumber) VALUES('Alice','704-555-1212')
  ON CONFLICT(name) DO UPDATE SET phonenumber=excluded.phonenumber

Here's the problem: the sqlite-utils .upsert() and .upsert_all() methods don't do this. They use the following SQL:

INSERT OR REPLACE INTO [{table}] ({columns}) VALUES {rows};

If the record already exists, it will be entirely replaced by a new record - as opposed to updating any specified fields but leaving existing fields as they are (the behaviour of "upsert" in SQLite itself).

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 12, 2019

This relates to this bug: dogsheep/github-to-sqlite#8 (comment)

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 12, 2019

Fixing this is going to be a real pain. There's lots of code out there that uses sqlite-utils with the expectation that upsert() behaves as it currently does.

Maybe I need to introduce new terms for both of these different patterns and deprecate the existing .upsert() and .upsert_all() since their behaviour can't be changed?

Or maybe I fix this and ship sqlite-utils 2.0 with a breaking change?

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 12, 2019

If I do implement the correct definition of .upsert() I think I'll use this pattern, since it works in versions of SQLite prior to 3.24:

INSERT OR IGNORE INTO book(id) VALUES(1001);
UPDATE book SET name = 'Programming' WHERE id = 1001;
@simonw simonw changed the title I think ".upsert()" may be mis-named! The ".upsert()" method is mis-named! Nov 13, 2019
@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 13, 2019

This warrants making a backwards compatible change, which means I'll need to bump the major version number and release 2.0.

I'm going to rename the existing upsert() and upsert_all() methods to replace() and replace_all() - then write new upsert() and upsert_all() methods that implement the correct behavior.

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 13, 2019

Is replace() a good name here? It doesn't really convey the idea that a brand new record will be created if there isn't an existing one to replace.

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 13, 2019

Maybe inplace() (combining "insert" and "replace")?

It could be an alias for .insert(..., replace=True)

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 13, 2019

This is going to affect the design of the CLI subcommands as well.

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 13, 2019

Maybe instead of inventing a new term I should tell people to use .insert(..., replace=True) directly. That matches ignore=True.

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 13, 2019

First step: add a replace=True argument to insert() and insert_all() that does the same thing as the current upsert=True

def insert_all(
self,
records,
pk=DEFAULT,
foreign_keys=DEFAULT,
column_order=DEFAULT,
not_null=DEFAULT,
defaults=DEFAULT,
upsert=DEFAULT,

@simonw simonw changed the title The ".upsert()" method is mis-named! The ".upsert()" method is misnamed Nov 15, 2019
@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 15, 2019

Urgh this is going to be quite a bit of work, especially in the CLI module which shares an implementation for upsert and insert in a way that looks like it will have to be unwrapped.

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Nov 19, 2019

Thinking about this further: I believe every time I've personally used upsert in the past (either with the Python library or the CLI tool) I've actually wanted the new behaviour, where "upsert" means "update existing record with these changes, or insert a new record if one does not exist".

So I'm happy with upsert doing that, and insert --replace being added as an option that does what upsert does ta the moment.

I'll still ship it as version 2.0 since it's technically a breaking change.

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Dec 26, 2019

Don't forget to update the documentation. This will be quite an involved task.

@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Dec 27, 2019

I'm going to start by ignoring the existing upsert entirely and implementing .insert(..., replace=True) and $ sqlite-utils insert --replace. Including updating the tests.

Then I'll figure out how to implement the new .upsert() / $ sqlite-utils upsert.

Then I'll update the documentation, and ship sqlite-utils 2.0.

simonw added a commit that referenced this issue Dec 27, 2019
simonw added a commit that referenced this issue Dec 27, 2019
Refs #66
simonw added a commit that referenced this issue Dec 27, 2019
simonw added a commit that referenced this issue Dec 27, 2019
Refs #66
simonw added a commit that referenced this issue Dec 30, 2019
simonw added a commit that referenced this issue Dec 30, 2019
simonw added a commit that referenced this issue Dec 30, 2019
simonw added a commit that referenced this issue Dec 30, 2019
simonw added a commit that referenced this issue Dec 30, 2019
simonw added a commit that referenced this issue Dec 30, 2019
@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Dec 30, 2019

Last step: update changelog and ship 2.0. Then I can close this issue.

simonw added a commit that referenced this issue Dec 30, 2019
@simonw

This comment has been minimized.

Copy link
Owner Author

@simonw simonw commented Dec 31, 2019

@simonw simonw closed this Dec 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant
You can’t perform that action at this time.