Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqlite-utils insert errors should show SQL and parameters, if possible #309

Closed
scaleoutsean opened this issue Aug 9, 2021 · 6 comments
Closed
Labels
cli-tool enhancement New feature or request

Comments

@scaleoutsean
Copy link

I've tried several approaches, but this is the current one:

echo $json-line | sqlite-utils insert json.db jsontable --truncate --alter --detect-types -

In all cases, I get this error:

OverflowError: Python int too large to convert to SQLite INTEGER
Traceback (most recent call last):
  File "/home/sean/.local/bin/sqlite-utils", line 8, in <module>
    sys.exit(cli())
  File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/sean/.local/lib/python3.8/site-packages/sqlite_utils/cli.py", line 841, in insert
    insert_upsert_implementation(
  File "/home/sean/.local/lib/python3.8/site-packages/sqlite_utils/cli.py", line 780, in insert_upsert_implementation
    db[table].insert_all(
  File "/home/sean/.local/lib/python3.8/site-packages/sqlite_utils/db.py", line 2145, in insert_all
    self.insert_chunk(
  File "/home/sean/.local/lib/python3.8/site-packages/sqlite_utils/db.py", line 1957, in insert_chunk
    result = self.db.execute(query, params)
  File "/home/sean/.local/lib/python3.8/site-packages/sqlite_utils/db.py", line 257, in execute
    return self.conn.execute(sql, parameters)

I googled the error and checked SO answers and advice, all good. I changed my JSON file to not use integers so I no longer get this error. Of course, that makes using the database a bit harder, so I also tried to solve the problem by modifying DB structure (while using integers in JSON).

If change all INTEGER Data Types to something else (STRING, TEXT) and try to import again using --truncate, I still get this error. I suppose I should tell sqlite-utils which columns should use non-INTEGER Data Type rather than rely on it to check my SQL table configuration.

If that is the case, can this error be a bit more specific for easier troubleshooting - maybe tell us which which record caused the problem when that error is thrown?

My table has 60+ columns, many of which use 64-bit integers (not all records are large or known in advance), so while I can modify JSON to use strings instead of integers, it decreases usability and finding out which records have values for which SQLite integers aren't sufficient requires some work (I'm thinking about parsing all integers with jq and sorting output by length to identify those columns, but I'd prefer if sqlite-utils could tell me that).

My environment:

  • Python 3.8.10
    • sqlite-utils 3.14
    • pandas 1.3.1
    • numpy 1.21.1
    • sqlite-fts4 1.0.1
  • sqlite 3.31.1-4ubuntu0.2
@simonw
Copy link
Owner

simonw commented Aug 9, 2021

Yeah this error message could certainly be more helpful.

I thought OverflowError might be one of the SQLite exceptions: https://docs.python.org/3/library/sqlite3.html#exceptions - but it turns out it's actually reusing the Python built-in OverflowError class:

import sqlite3
db = sqlite3.connect(":memory:")
caught = []
try:
    db.execute("create table foo (number integer)")
    db.execute("insert into foo (number) values (?)", [34223049823094832094802398430298048240])
except Exception as e:
    print(e)
    caught.append(e)
isinstance(caught[0], OverflowError)

Here's where that happens in the Python sqlite3 module code: https://github.com/python/cpython/blob/058fb35b57ca8c5063d16ec818e668b3babfea65/Modules/_sqlite/util.c#L123-L124

@simonw simonw changed the title Identify columns in SQLite INTEGER overflow error sqlite-utils insert errors should show SQL and parameters, if possible Aug 9, 2021
@simonw simonw added cli-tool enhancement New feature or request labels Aug 9, 2021
@simonw
Copy link
Owner

simonw commented Aug 9, 2021

Steps to reproduce:

echo '{"v": 34223049823094832094802398430298048240}' | sqlite-utils insert /tmp/blah.db row -

@simonw
Copy link
Owner

simonw commented Aug 9, 2021

I'm going to use a bit of a dirty trick for this one: I'm going to recursively inspect the stack on an error and try to find the sql and params variables.

That way I can handle this all at the CLI layer without changing the exceptions that are being raised by the Python library.

@simonw
Copy link
Owner

simonw commented Aug 9, 2021

OverflowError: Python int too large to convert to SQLite INTEGER
>>> import sys
>>> def find_variables(tb, vars):
        to_find = list(vars)
        found = {}
        for var in to_find:
            if var in tb.tb_frame.f_locals:
                vars.remove(var)
                found[var] = tb.tb_frame.f_locals[var]
        if vars and tb.tb_next:
            found.update(find_variables(tb.tb_next, vars))
        return found
... 
>>> find_variables(sys.last_traceback, ["sql", "params"])
{'params': [34223049823094832094802398430298048240], 'sql': 'INSERT INTO [row] ([v]) VALUES (?);'}

@simonw simonw closed this as completed in 14f643d Aug 9, 2021
@simonw
Copy link
Owner

simonw commented Aug 9, 2021

Demo:

$ echo '{"v": 34223049823094832094802398430298048240}' | sqlite-utils insert /tmp/blah.db row -                   
Error: Python int too large to convert to SQLite INTEGER

sql = INSERT INTO [row] ([v]) VALUES (?);
parameters = [34223049823094832094802398430298048240]

simonw added a commit that referenced this issue Aug 9, 2021
@simonw
Copy link
Owner

simonw commented Aug 9, 2021

TIL about how the stack inspection works: https://til.simonwillison.net/python/find-local-variables-in-exception-traceback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cli-tool enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants