
Create index where the included columns would be maintained by CRDB as all the columns needed to delete records from the base table. #80244

Open
jhatcher9999 opened this issue Apr 20, 2022 · 2 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@jhatcher9999
Collaborator

jhatcher9999 commented Apr 20, 2022

Is your feature request related to a problem? Please describe.
When you delete records from a table (e.g., DELETE FROM tbl WHERE id BETWEEN 10 AND 20), not only do you need an index that helps resolve the predicate, but the SQL optimizer also needs to retrieve all of the key columns of the primary index and of every secondary index, so it can remove the row's entries from each of them. The query plan will show an index scan/seek, then an INDEX JOIN back to the primary table, and then a DELETE.

This index join can be expensive. The need for this index join can be removed by creating a covering index that includes all the fields that are keys for the primary/secondary indexes. However, this is not very intuitive. Nor is it easy to maintain. For instance, if you created a covering index to help with deletes and later created another index on the table but failed to alter your "delete index" to include the fields used in your new index, then your DELETE queries would revert to having an INDEX JOIN in their execution plan.

Example:

CREATE TABLE customer ( id uuid DEFAULT gen_random_uuid(), name varchar, dob timestamp );
INSERT INTO customer ( id, name, dob ) VALUES ( DEFAULT, 'jim', '1776-07-04' );
CREATE INDEX ix_customer_name ON customer ( name );
CREATE INDEX ix_customer_dob ON customer ( dob );
root@localhost:26257/defaultdb> SELECT * FROM customer;
                   id                  | name |         dob
---------------------------------------+------+----------------------
  feb962b7-0be8-461f-9fba-1d4a5acfb510 | jim  | 1776-07-04 00:00:00
root@localhost:26257/defaultdb> EXPLAIN DELETE FROM customer WHERE id = 'feb962b7-0be8-461f-9fba-1d4a5acfb510';
                                          info
----------------------------------------------------------------------------------------
  distribution: local
  vectorized: true

  • delete
  │ from: customer
  │ auto commit
  │
  └── • filter <-  At larger data volumes, this will turn into an INDEX JOIN -- but same general idea
      │ estimated row count: 1
      │ filter: id = 'feb962b7-0be8-461f-9fba-1d4a5acfb510'
      │
      └── • scan
            estimated row count: 1 (100% of the table; stats collected 21 seconds ago)
            table: customer@primary
            spans: FULL SCAN
CREATE INDEX ix_customer_id_for_delete ON customer ( id ) INCLUDE ( name, dob );
root@localhost:26257/defaultdb> EXPLAIN DELETE FROM customer WHERE id = 'feb962b7-0be8-461f-9fba-1d4a5acfb510';
                                                info
----------------------------------------------------------------------------------------------------
  distribution: local
  vectorized: true

  • delete
  │ from: customer
  │ auto commit
  │
  └── • scan
        estimated row count: 1 (100% of the table; stats collected 38 seconds ago)
        table: customer@ix_customer_id_for_delete
        spans: [/'feb962b7-0be8-461f-9fba-1d4a5acfb510' - /'feb962b7-0be8-461f-9fba-1d4a5acfb510']

Describe the solution you'd like
I would like to suggest an index that would work like:

CREATE INDEX id_delete ON tbl STORING NEEDED DELETE COLUMNS;

where this index would handle updating itself whenever schema events occurred that necessitated adding a field to the index to support more performant DELETEs.
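To make the intent concrete: against the customer table from the example above, the proposed syntax would effectively expand to the hand-written covering index below, and CRDB would keep the stored column list in sync as indexes come and go. (This is a sketch of the intended semantics, not existing syntax; STORING, COVERING, and INCLUDE are interchangeable in CRDB today.)

```sql
-- Hypothetical expansion of:
--   CREATE INDEX id_delete ON customer STORING NEEDED DELETE COLUMNS;
-- 'id' is the primary key column; name and dob are stored because they
-- are the key columns of ix_customer_name and ix_customer_dob.
CREATE INDEX id_delete
    ON customer ( id )
    STORING ( name, dob );
```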

Describe alternatives you've considered
Maintaining these indexes manually.
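For example (a sketch continuing the customer schema above, with a hypothetical email column), every new secondary index forces a manual rebuild of the covering index, since the stored column list of an existing index cannot be altered in place:

```sql
-- Add a new secondary index; its key column (email) must now also be
-- stored in the delete-covering index.
ALTER TABLE customer ADD COLUMN email varchar;
CREATE INDEX ix_customer_email ON customer ( email );

-- The maintenance step that is easy to forget: drop and recreate the
-- covering index so DELETEs stay free of the INDEX JOIN.
DROP INDEX ix_customer_id_for_delete;
CREATE INDEX ix_customer_id_for_delete
    ON customer ( id )
    STORING ( name, dob, email );
```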

Additional context
I'm testing this on CRDB 21.2.9.

Jira issue: CRDB-15794

@jhatcher9999 jhatcher9999 added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Apr 20, 2022
@rytaft rytaft added this to Triage in SQL Queries via automation Apr 20, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Apr 20, 2022
@michae2 michae2 added this to Triage in SQL Foundations via automation Apr 26, 2022
@blathers-crl blathers-crl bot added the T-sql-schema-deprecated Use T-sql-foundations instead label Apr 26, 2022
@michae2 michae2 removed this from Triage in SQL Queries Apr 26, 2022
@michae2
Collaborator

michae2 commented Apr 26, 2022

We think most of the work here would be related to schema changes.

@ajwerner
Contributor

ajwerner commented May 3, 2022

This is some rather bespoke syntactic sugar. I'm putting it in the backlog. Feel free to lobby a PM about it.

@ajwerner ajwerner moved this from Triage to Backlog in SQL Foundations May 3, 2022
@jlinder jlinder added sync-me and removed sync-me labels May 20, 2022
@postamar postamar moved this from Backlog to Cold storage in SQL Foundations Nov 10, 2022
@healthy-pod healthy-pod added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 17, 2023
@exalate-issue-sync exalate-issue-sync bot removed the T-sql-queries SQL Queries Team label May 23, 2023