If a row has a primary key of null various things break #2145
The big challenge here is what the URL to that row page should look like. How can I encode a |
Oh wow, null primary keys are bad news... SQLite lets you insert multiple rows with the same
>>> import sqlite_utils
>>> db = sqlite_utils.Database(memory=True)
>>> db["foo"].insert({"id": None, "name": "No ID"}, pk="id")
<Table foo (id, name)>
>>> db.schema
'CREATE TABLE [foo] (\n [id] TEXT PRIMARY KEY,\n [name] TEXT\n);'
>>> db["foo"].insert({"id": None, "name": "No ID"}, pk="id")
<Table foo (id, name)>
>>> db.schema
'CREATE TABLE [foo] (\n [id] TEXT PRIMARY KEY,\n [name] TEXT\n);'
>>> list(db["foo"].rows)
[{'id': None, 'name': 'No ID'}, {'id': None, 'name': 'No ID'}]
>>> list(db.query('select * from foo where id = null'))
[]
>>> list(db.query('select * from foo where id is null'))
[{'id': None, 'name': 'No ID'}, {'id': None, 'name': 'No ID'}] |
https://www.sqlite.org/lang_createtable.html#the_primary_key says:
|
So it sounds like SQLite does ensure that a So one solution here would be to detect a null primary key and switch that table over to using https://latest.datasette.io/fixtures/infinity/1 But when would we run that check? And does every row in the table get a new |
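A minimal sketch of the rowid idea, using Python's built-in `sqlite3` module and the same `foo` table as the session above. It only illustrates that two rows sharing a null primary key still get distinct `rowid` values; it is not how Datasette currently builds row URLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same shape as the table sqlite-utils created earlier in this thread
conn.execute("CREATE TABLE foo (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO foo (id, name) VALUES (NULL, 'No ID')")
conn.execute("INSERT INTO foo (id, name) VALUES (NULL, 'No ID')")

# The declared primary key is null for both rows, but rowid still differs
rows = conn.execute("SELECT rowid, id, name FROM foo ORDER BY rowid").fetchall()
for rowid, pk, name in rows:
    print(rowid, pk, name)
```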
Here's a potential solution: make it so ALL Then teach the code that outputs the URL to a row page to spot if there are |
The most interesting row URL in the fixtures database right now is this one: https://latest.datasette.io/fixtures/compound_primary_key/a~2Fb,~2Ec-d |
Looking at the way these URLs work: because the components themselves in |
I just found this and panicked, thinking maybe tilde encoding is a bad idea after all! https://jkorpela.fi/tilde.html But... "Date of last update: 1999-08-27" - I think I'm OK. |
From reviewing https://simonwillison.net/2022/Mar/19/weeknotes/
That's how I chose the tilde character - but it also suggests that I could use So maybe No, that doesn't work:
>>> from datasette.utils import tilde_encode
>>> tilde_encode("_")
'_'
I need a character which tilde-encoding does indeed encode. |
>>> tilde_encode("~")
'~7E'
>>> tilde_encode(".")
'~2E'
>>> tilde_encode("-")
'-'
I think
But... I worry about that colliding with my URL routing code that spots the difference between these:
etc. |
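To make the reasoning about "a character which tilde-encoding does indeed encode" concrete, here is a rough, self-contained sketch that reproduces the outputs shown in this thread. `tilde_encode_sketch` is a hypothetical stand-in written for illustration, not the actual `datasette.utils.tilde_encode` source, and the safe character set is an assumption inferred from the examples above.

```python
# Assumed safe set: ASCII letters, digits, "_" and "-" pass through unchanged
SAFE = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789_-"
)


def tilde_encode_sketch(s: str) -> str:
    """Illustrative re-implementation: escape unsafe bytes as ~XX."""
    out = []
    for byte in s.encode("utf-8"):
        if byte in SAFE:
            out.append(chr(byte))
        elif byte == ord(" "):
            # Assumption: a space becomes "+", as in form encoding
            out.append("+")
        else:
            out.append("~{:02X}".format(byte))
    return "".join(out)


# A raw "." or "~" can never appear in an encoded component,
# so a marker built from one of them cannot collide with real data
for value in ["_", "~", ".", "-", "a/b", ".c-d"]:
    print(repr(value), "->", repr(tilde_encode_sketch(value)))
```

Because `_` and `-` pass through unchanged, a marker built from them could be confused with a real primary key value, which is why only an always-escaped character is a candidate.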
I could set a rule that extensions (including custom render extensions set by plugins) must not be valid integers, and teach Datasette that |
Here's the regex in question at the moment: Lines 1387 to 1390 in 943df09
|
|
Also relevant: datasette/datasette/utils/__init__.py Lines 1147 to 1153 in 943df09
|
Creating a quick test database:
sqlite-utils create-table nulls.db nasty id text --pk id
sqlite-utils nulls.db 'insert into nasty (id) values (null)' |
This is hard. I tried this:
def path_from_row_pks(row, pks, use_rowid, quote=True):
    """Generate an optionally tilde-encoded unique identifier
    for a row from its primary keys."""
    if use_rowid or any(row[pk] is None for pk in pks):
        bits = [row["rowid"]]
    else:
        bits = [
            row[pk]["value"] if isinstance(row[pk], dict) else row[pk] for pk in pks
        ]
    if quote:
        bits = [tilde_encode(str(bit)) for bit in bits]
    else:
        bits = [str(bit) for bit in bits]
    return ",".join(bits)
The But I got this error on http://127.0.0.1:8003/nulls/nasty :
Because the SQL query I ran to populate the page didn't know that it would need to select |
How expensive is it to detect if a SQLite table contains at least one |
Ran a quick benchmark on ChatGPT Code Interpreter: https://chat.openai.com/share/8357dc01-a97e-48ae-b35a-f06249935124 Conclusion from there is that this query returns fast no matter how much the table grows:
SELECT EXISTS(SELECT 1 FROM "nasty" WHERE "id" IS NULL)
So detecting if a table contains any null primary keys is definitely feasible without a performance hit. |
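A small sketch of how that check could be wrapped in Python with the standard `sqlite3` module. The helper name `has_null_primary_key` is hypothetical (it is not part of Datasette), and it assumes a single-column primary key whose name is already known.

```python
import sqlite3


def quote_identifier(name: str) -> str:
    # Identifiers cannot be bound as parameters, so double-quote them by hand
    return '"' + name.replace('"', '""') + '"'


def has_null_primary_key(conn, table: str, pk: str) -> bool:
    """Return True if at least one row has NULL in the given pk column."""
    sql = "SELECT EXISTS(SELECT 1 FROM {} WHERE {} IS NULL)".format(
        quote_identifier(table), quote_identifier(pk)
    )
    return bool(conn.execute(sql).fetchone()[0])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nasty (id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO nasty (id) VALUES ('ok')")
before = has_null_primary_key(conn, "nasty", "id")
conn.execute("INSERT INTO nasty (id) VALUES (NULL)")
after = has_null_primary_key(conn, "nasty", "id")
print(before, after)
```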
@simonw, since you're referencing "rowid" column by name, I just want to note that there may be an existing rowid column with completely different semantics (https://www.sqlite.org/lang_createtable.html#rowid), which is likely to break this logic. I don't see a good way to detect a proper "rowid" name short of checking if there is a field with that name and using the alternative ( In terms of the original issue, maybe a way to deal with it is to use rowid by default and then use primary key for WITHOUT ROWID tables (as they are guaranteed to be not null), but I suspect it may require significant changes to the API (and doesn't fully address the issue of what value to pass to indicate NULL when editing records). Would it make sense to generate a random string to indicate NULL values when editing? |
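The shadowing concern can be reproduced with the standard `sqlite3` module. In SQLite a user-declared column named `rowid` takes over that name, while the other built-in aliases (`_rowid_` and `oid`) still reach the real integer rowid unless they are shadowed too. The helper `real_rowid_alias` below is a hypothetical sketch of the "check for a field with that name" approach, not existing Datasette code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A user column that happens to be called "rowid" and is not the real rowid
conn.execute("CREATE TABLE shadowed (rowid TEXT, name TEXT)")
conn.execute("INSERT INTO shadowed (rowid, name) VALUES ('not-a-number', 'x')")

row = conn.execute("SELECT rowid, _rowid_, oid FROM shadowed").fetchone()
print(row)  # the first value is the user column, the others the real rowid


def real_rowid_alias(conn, table: str):
    """Pick a rowid alias that no declared column shadows, or None."""
    quoted = '"' + table.replace('"', '""') + '"'
    columns = {
        info[1].lower() for info in conn.execute("PRAGMA table_info({})".format(quoted))
    }
    for alias in ("rowid", "_rowid_", "oid"):
        if alias not in columns:
            return alias
    return None


alias = real_rowid_alias(conn, "shadowed")
print(alias)
```

This sketch does not handle `WITHOUT ROWID` tables, which have no rowid at all and would need a separate check.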
Suggestion from @asg017 is that we say that if your row has a null primary key you don't get a link to a row page for that row. Which has some precedent, because our SQL view display doesn't link to row pages at all (since they don't make sense for views): https://latest.datasette.io/fixtures/simple_view |
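A sketch of what that suggestion might look like in code. `row_path_or_none` is a hypothetical helper, not a Datasette function, and it leaves out tilde-encoding to keep the idea visible: return no path at all when any primary key value is null, so the template can skip the link.

```python
def row_path_or_none(row, pks):
    """Return a comma-joined row path, or None if the row cannot be linked."""
    values = [row[pk] for pk in pks]
    if not values or any(value is None for value in values):
        # No usable identifier: the caller should render the row without a link
        return None
    return ",".join(str(value) for value in values)


linked = row_path_or_none({"id": 5, "name": "Has ID"}, ["id"])
unlinked = row_path_or_none({"id": None, "name": "No ID"}, ["id"])
print(linked, unlinked)
```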
Another point: The new Datasette write API should refuse to insert a row with a NULL primary key. That will likely decrease the likelihood that someone finds themselves with NULLs in their primary keys, at least among Datasette users. Especially buggy code that uses the write API, like our |
Stumbled across this while experimenting with datasette-write-ui. The error I got was a 500 on the /db page:
Tracked it down to this code, which assembles the URL for a row page:
datasette/datasette/utils/__init__.py
Lines 120 to 134 in 943df09
That's because tilde_encode can't handle None:
datasette/datasette/utils/__init__.py
Lines 1175 to 1178 in 943df09