Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keyset pagination doesn't work correctly for compound primary keys #190

Closed
simonw opened this issue Mar 28, 2018 · 7 comments
Closed

Keyset pagination doesn't work correctly for compound primary keys #190

simonw opened this issue Mar 28, 2018 · 7 comments
Labels

Comments

@simonw
Copy link
Owner

simonw commented Mar 28, 2018

Consider https://datasette-issue-190-compound-pks.now.sh/compound-pks-9aafe8f/compound_primary_key

2018-03-28 at 3 47 pm

The next= link is to d,v:

https://datasette-issue-190-compound-pks.now.sh/compound-pks-9aafe8f/compound_primary_key?_next=d%2Cv

But that page starts with:

2018-03-28 at 3 48 pm

The next key in the sequence should be d,w. Also we should return the full a-z of the ones that start with the letter e - in this example we only return e-w, e-x, e-y and e-z

@simonw
Copy link
Owner Author

simonw commented Mar 28, 2018

This is because the SQL we are using here is:

select * from compound_primary_key where "pk1" > "d" and "pk2" > "v" order by pk1, pk2 limit 101

This is incorrect. The correct SQL syntax (according to the example on https://www.sqlite.org/rowvalue.html#scrolling_window_queries ) is:

select * from compound_primary_key where ("pk1", "pk2") > ("d", "v") order by pk1, pk2 limit 101

BUT... this uses "row values" syntax which was only added to SQLite in version 3.15.0 in October 2016: https://sqlite.org/changes.html#version_3_15_0

The version on https://datasette-issue-190-compound-pks.now.sh/compound-pks-9aafe8f?sql=select+sqlite_version%28%29%3B is 3.8.7.1 from October 2014.

@simonw
Copy link
Owner Author

simonw commented Mar 28, 2018

Without row values syntax, the necessary SQL to retrieve the next page after d, v gets a bit gnarly:

select * from compound_primary_key
where pk1 >= "d" and not (pk1 = "d" and pk2 <= "v")
order by pk1, pk2

See https://datasette-issue-190-compound-pks.now.sh/compound-pks-9aafe8f?sql=select+*+from+compound_primary_key+where+pk1+%3E%3D+%22d%22+and+not+%28pk1+%3D+%22d%22+and+pk2+%3C%3D+%22v%22%29+order+by+pk1%2C+pk2

This article was useful for figuring this out: https://use-the-index-luke.com/sql/partial-results/fetch-next-page

@simonw
Copy link
Owner Author

simonw commented Mar 28, 2018

Here's how I generated the table for testing this with 3 compound primary keys:

CREATE_SQL = '''
CREATE TABLE compound_three_primary_keys (
    pk1 varchar(30),
    pk2 varchar(30),
    pk3 varchar(30),
    content text,
    PRIMARY KEY (pk1, pk2, pk3)
);'''
alphabet = 'abcdefghijklmnopqrstuvwxyz'
for a in alphabet:
    for b in alphabet:
        for c in alphabet:
            print('''
INSERT INTO compound_three_primary_keys VALUES ('{}', '{}', '{}', '{}');
        '''.strip().format(a, b, c, '{}-{}-{}-{}-{}-{}'.format(a,b,c,a,b,c)))

@simonw
Copy link
Owner Author

simonw commented Mar 28, 2018

Here's the SQL for a next page with three compound primary keys:

https://datasette-issue-190-compound-pks.now.sh/compound-pks-8e99805?sql=select+*+from+compound_three_primary_keys%0D%0Awhere%0D%0A++%28pk1+%3E+%3Apk1%29%0D%0A++++or%0D%0A++%28pk1+%3D+%3Apk1+and+pk2+%3E+%3Apk2%29%0D%0A++++or%0D%0A++%28pk1+%3D+%3Apk1+and+pk2+%3D+%3Apk2+and+pk3+%3E+%3Apk3%29%0D%0Aorder+by+pk1%2C+pk2%2C+pk3%3B%0D%0A%0D%0A%0D%0A&pk1=a&pk2=d&pk3=v

select * from compound_three_primary_keys
where
  (pk1 > :pk1)
    or
  (pk1 = :pk1 and pk2 > :pk2)
    or
  (pk1 = :pk1 and pk2 = :pk2 and pk3 > :pk3)
order by pk1, pk2, pk3;

@simonw simonw added the bug label Mar 30, 2018
@simonw simonw closed this as completed in 31f63d1 Mar 30, 2018
@simonw
Copy link
Owner Author

simonw commented Mar 30, 2018

Re-opening this issue: my fix doesn't play nicely with extra filter arguments.

Consider this page: https://datasette-issue-190-compound-pks-not-quite-fixed.now.sh/compound-pks-8e99805/compound_three_primary_keys?content__contains=d

The next link is to ?_next=f%2Cz%2Ct&content__contains=z (that's next of f,z,t) but that gives us https://datasette-issue-190-compound-pks-not-quite-fixed.now.sh/compound-pks-8e99805/compound_three_primary_keys?_next=b%2Cx%2Cd&content__contains=d which shows a,a,d at the top.

Sure enough, the generated SQL looks like this: https://datasette-issue-190-compound-pks-not-quite-fixed.now.sh/compound-pks-8e99805?sql=select+%2A+from+compound_three_primary_keys+where+%22content%22+like+%3Ap0+and+%28%5Bpk1%5D+%3E+%3Ap0%29%0A++or%0A%28%5Bpk1%5D+%3D+%3Ap0+and+%5Bpk2%5D+%3E+%3Ap1%29%0A++or%0A%28%5Bpk1%5D+%3D+%3Ap0+and+%5Bpk2%5D+%3D+%3Ap1+and+%5Bpk3%5D+%3E+%3Ap2%29+order+by+pk1%2C+pk2%2C+pk3+limit+101&p0=%25d%25&p1=b&p2=x&p3=d

select * from compound_three_primary_keys where "content" like :p0 and ([pk1] > :p0)
  or
([pk1] = :p0 and [pk2] > :p1)
  or
([pk1] = :p0 and [pk2] = :p1 and [pk3] > :p2) order by pk1, pk2, pk3 limit 101

The parameters here are confused. The :p0 should be reserved just for the like clause - the other parameters should be p1, p2 and p3 (not p0, p1 and p2).

@simonw simonw reopened this Mar 30, 2018
@simonw simonw closed this as completed in 7365c3f Mar 30, 2018
@simonw
Copy link
Owner Author

simonw commented Mar 30, 2018

Interestingly, in deploying a copy of the database to demonstrate this final bug fix I had to use the --force argument like so:

datasette publish now --branch=master compound-pks.db --force

This is because now had already deployed a Dockerfile referencing --branch=master once already, so it thought nothing had changed and it could re-use that last deployment.

@simonw
Copy link
Owner Author

simonw commented Mar 30, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant