Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IndexError when doing .insert(..., pk='id') after insert_all #554

Open
xavdid opened this issue May 22, 2023 · 1 comment
Open

IndexError when doing .insert(..., pk='id') after insert_all #554

xavdid opened this issue May 22, 2023 · 1 comment

Comments

@xavdid
Copy link

xavdid commented May 22, 2023

I believe this is related to #98.

When pk is specified by table A's insert call, it throws an index error if a different table has written a row with a higher rowid than exists in the first table. Here's a basic example:

from sqlite_utils import Database


def test_pk_for_insert(fresh_db):
    user = {"id": "abc", "name": "david"}

    fresh_db["users"].insert(user, pk="id")

    fresh_db["comments"].insert_all(
        [
            {"id": "def", "text": "ok"},
            {"id": "ghi", "text": "great"},
        ],
    )

    fresh_db["users"].insert(
        user,
        ignore=True,
        # BUG: when specifying pk on the second insert call 
        # db.py goes into a block it doesn't expect and we get the error
        pk="id",
    )


if __name__ == "__main__":
    db = Database("bug.db")
    if db["users"].exists():
        raise ValueError(
            "bug only shows on a new database - remove bug.db before running the script"
        )
    test_pk_for_insert(db)

The error is:

  File "/Users/david/projects/reddit-to-sqlite/.venv/lib/python3.11/site-packages/sqlite_utils/db.py", line 2960, in insert_chunk
    row = list(self.rows_where("rowid = ?", [self.last_rowid]))[0]
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

The issue is in this block:

self.last_rowid = result.lastrowid
self.last_pk = self.last_rowid
# self.last_rowid will be 0 if a "INSERT OR IGNORE" happened
if (hash_id or pk) and self.last_rowid:
row = list(self.rows_where("rowid = ?", [self.last_rowid]))[0]

relevant locals are:

  • pk: 'id'
  • result.lastrowid: 2

What's most interesting is the comment # self.last_rowid will be 0 if a "INSERT OR IGNORE" happened, which doesn't seem to be the case here.

@xavdid
Copy link
Author

xavdid commented May 22, 2023

Oh and for context - this goes away if I use .upsert instead of insert(..., ignore=True), but I don't want to update the value if it's written, just do an insert if it's new. The code is basically:

def save_items(table, items):
    db["users"].insert(build_user(items[0]), pk="id",ignore=True)
    db[table].insert_all(items)

if comments := fetch_comments():
    save_items('comments', comments)

if posts := fetch_posts():
    save_items('posts', posts)

So either comments or post could create the relevant user if those items exist. In cases where they both exist, I get this error. I need the pk because either call could create the table.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant