Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support creating descending order indexes #260

Closed
simonw opened this issue May 29, 2021 · 12 comments
Closed

Support creating descending order indexes #260

simonw opened this issue May 29, 2021 · 12 comments
Labels
documentation enhancement New feature or request

Comments

@simonw
Copy link
Owner

simonw commented May 29, 2021

SQLite lets you create indexes in reverse order, which can have a surprisingly big impact on performance, see simonw/covid-19-datasette#27

I tried doing this using sqlite-utils like so, but it's didn't work:

db["ny_times_us_counties"].create_index(["date desc"])
@simonw simonw added enhancement New feature or request documentation labels May 29, 2021
@simonw simonw changed the title Document that "date desc" works with .create_index() Support creating descending order indexes May 29, 2021
simonw added a commit to simonw/covid-19-datasette that referenced this issue May 29, 2021
@simonw
Copy link
Owner Author

simonw commented May 29, 2021

The problem here is differentiating between a column with the name date desc and wanting to create a descending index on a column called date.

This won't work:

db["ny_times_us_counties"].create_index(["date desc"], desc=True)

Because we need to be able to create compound indexes with columns with different directions - for example:

create index idx_age_desc_name on dogs (age desc, name)

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

A few options:

db["dogs"].create_index([("age", "desc"), "name"])

db["dogs"].create_index([desc("age"), "name"])

db["dogs"].create_index([db.desc("age"), "name"])

The first option uses an optional tuple. The second two use a desc() function - the question is where should that live?

sqlite_utils.desc(column) or db.desc(column) are both options.

I don't like using the term desc() for "descending index" though - it feels like it should mean something more broad.

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

Maybe:

db["dogs"].create_index([db.descending_index("age"), "name"])

It's a little verbose but it's for a relatively rare activity and it does make it very clear what is going on.

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

I could use db.desc_index("age") to match SQLite SQL syntax, which uses desc and not descending.

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

Need to solve this for the CLI tool too. Currently that works like this: https://sqlite-utils.datasette.io/en/stable/cli.html#creating-indexes

sqlite-utils create-index mydb.db mytable col1 [col2...]

Even harder to decide how to add a descending option to this. Maybe like this?

sqlite-utils create-index mydb.db mytable --direction col1 asc --direction col2 desc

It's a bit gross though! We're saying here that if a single one of the columns you are creating an index for is in reverse direction you have to use --direction to specify each end every other index.

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

For the CLI version I could say that you can use a - prefix to specify reverse direction:

sqlite-utils create-index mydb.db dogs -age name

No, that doesn't work - it could get confused with a command-line flag. I guess you could do this:

sqlite-utils create-index mydb.db dogs "-age" name

This does mean that if any of your column names begin with a hyphen you can't use the CLI to add indexes to them. Is that an acceptable limitation? Users can always use sqlite-utils mydb.db "create index ..." in that case.

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

Decisions: for the Python API I'm going with db.DescIndex("column") as the way to do this.

For the CLI I'm going to do the "-age" thing.

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

Annoyingly the table.indexes property won't indicate if an index is in regular or reverse order - because the SQLite PRAGMA index_info(table) statement doesn't indicate that either. You have to look at the sqlite_master index definition to tell if any of the columns are in reverse order:

(Pdb) fresh_db.execute("select * from sqlite_master where type = 'index'").fetchall()
[('index', 'idx_dogs_age_name', 'dogs', 3, 'CREATE INDEX [idx_dogs_age_name]\n    ON [dogs] ([age] desc, [name])')]
(Pdb) fresh_db.execute("PRAGMA index_info('idx_dogs_age_name')").fetchall()
[(0, 2, 'age'), (1, 0, 'name')]
(Pdb) fresh_db.execute("PRAGMA index_info('idx_dogs_age_name')").description
(('seqno', None, None, None, None, None, None), ('cid', None, None, None, None, None, None), ('name', None, None, None, None, None, None))
(Pdb) dogs.indexes
[Index(seq=0, name='idx_dogs_age_name', unique=0, origin='c', partial=0, columns=['age', 'name'])]

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

PRAGMA index_xinfo(table) DOES return that data:

(Pdb) [c[0] for c in fresh_db.execute("PRAGMA index_xinfo('idx_dogs_age_name')").description]
['seqno', 'cid', 'name', 'desc', 'coll', 'key']
(Pdb) fresh_db.execute("PRAGMA index_xinfo('idx_dogs_age_name')").fetchall()
[(0, 2, 'age', 1, 'BINARY', 1), (1, 0, 'name', 0, 'BINARY', 1), (2, -1, None, 0, 'BINARY', 0)]

See https://sqlite.org/pragma.html#pragma_index_xinfo

Example output: https://covid-19.datasettes.com/covid?sql=select+*+from+pragma_index_xinfo%28%27idx_ny_times_us_counties_date%27%29

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

It's weird having to use Database.DescIndex - I'm going to put DescIndex in sqlite_utils.db directly and let people import it.

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

This doesn't work:

sqlite-utils create-index mydb.db dogs "-age" name

But this does:

sqlite-utils create-index mydb.db dogs -- -age name

@simonw
Copy link
Owner Author

simonw commented May 29, 2021

I confirmed and it's possible to have a SQLite column with a hyphen at the start, confirmed using:

% sqlite-utils create-table demo.db demo -- id integer name text -blah integer
% sqlite-utils tables --schema demo.db -t
table    schema
-------  ---------------------
demo     CREATE TABLE [demo] (
            [id] INTEGER,
            [name] TEXT,
            [-blah] INTEGER
         )

@simonw simonw closed this as completed in 51d01da May 29, 2021
simonw added a commit that referenced this issue May 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant