Skip to content

[Postgres Storage] Inefficient index usage when compacting #400

@rkistner

Description

@rkistner

The query used to compact Postgres bucket storage appears to be inefficient:

Limit  (cost=0.69..23167.11 rows=10000 width=195)
  ->  Index Scan Backward using unique_id on bucket_data  (cost=0.69..10097363.36 rows=4358620 width=195)
        Index Cond: ((group_id = 3) AND (bucket_name >= 'profile_data['::text))
        Filter: (((bucket_name = 'profile_data[~'::text) AND (op_id < '9223372036854775807'::bigint)) OR (bucket_name < 'profile_data[~'::text))

const batch = await this.db.sql`
SELECT
op,
op_id,
source_table,
table_name,
row_id,
source_key,
bucket_name
FROM
bucket_data
WHERE
group_id = ${{ type: 'int4', value: this.group_id }}
AND bucket_name >= ${{ type: 'varchar', value: bucketLower }}
AND (
(
bucket_name = ${{ type: 'varchar', value: bucketUpper }}
AND op_id < ${{ type: 'int8', value: upperOpIdLimit }}
)
OR bucket_name < ${{ type: 'varchar', value: bucketUpper }} COLLATE "C" -- Use binary comparison
)
ORDER BY
bucket_name DESC,
op_id DESC
LIMIT
${{ type: 'int4', value: this.moveBatchQueryLimit }}
`

We should properly investigate and optimize the query here.

See this Discord thread for details.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions