sql: revisit index scoring based on constraints #5589

@RaduBerinde

Description

I have been playing around with various examples and found a lot of cases where the index scoring does not work well at choosing the index with the most restrictive spans.

The scoring is based on how many columns we can constrain, but this is not a very good criterion. If the range on the first column is not restricted to exact values, the constraints on the other columns don't have much value; e.g. /1/4/4-/100/4/5 is only marginally better than /1-/100 (they both contain the full /2-/99 range), whereas an index where we get just /4/4-/4/5 would be a much better choice, even though we are constraining only two columns.
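To make the containment argument concrete, here is a small sketch that models index keys as Python tuples of integers (an assumption for illustration; real keys are encoded byte strings) and treats each span as a half-open range `[start, end)` under lexicographic ordering. The sample `DOMAIN` and the helper names are hypothetical.

```python
from itertools import product


def in_span(key, start, end):
    """True if start <= key < end under lexicographic tuple ordering."""
    return start <= key < end


# Hypothetical sample of column values, used only to enumerate keys.
DOMAIN = (1, 2, 4, 5, 50, 99, 100)


def keys_in_span(num_cols, start, end):
    """List every sample key of num_cols columns that falls inside [start, end)."""
    return [k for k in product(DOMAIN, repeat=num_cols) if in_span(k, start, end)]


# Spans from the example above, both evaluated over three-column keys.
three_col = set(keys_in_span(3, (1, 4, 4), (100, 4, 5)))
one_col = set(keys_in_span(3, (1,), (100,)))

# Keys whose first column lies in 2..99, whatever the other columns hold.
core = {k for k in product(DOMAIN, repeat=3) if 2 <= k[0] <= 99}

print(core <= three_col, core <= one_col)    # True True
print(keys_in_span(2, (4, 4), (4, 5)))       # [(4, 4)]
```

Both wide spans cover every key in the /2-/99 core, so the extra constraints on the second and third columns only trim the edges, while the two-column span /4/4-/4/5 matches a single sample key.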

I think a better strategy would be to generate the actual spans and score based on them. We can estimate a cost for each span based on the "spread" between the start and end keys, in particular on how many columns they "agree" on (fixing those columns to a single value).
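One possible reading of this proposal, as a rough sketch rather than the actual planner code: count the leading columns on which the start and end keys agree, and let each agreeing column shrink the estimated cost by a fixed factor. The names `parse_span`, `agreeing_columns`, `span_cost`, and the factor `ASSUMED_SELECTIVITY` are all hypothetical, and the parser only handles non-negative integer keys written in the `/a/b-/c/d` notation used above.

```python
from typing import Tuple

Key = Tuple[int, ...]

# Assumption for illustration: each fixed leading column cuts the scan by 10x.
ASSUMED_SELECTIVITY = 10.0


def parse_span(text: str) -> Tuple[Key, Key]:
    """Parse '/1/4/4-/100/4/5' into ((1, 4, 4), (100, 4, 5))."""
    start, end = text.split("-")

    def to_key(part: str) -> Key:
        return tuple(int(p) for p in part.split("/") if p)

    return to_key(start), to_key(end)


def agreeing_columns(start: Key, end: Key) -> int:
    """Count the leading columns on which start and end hold the same value."""
    count = 0
    for a, b in zip(start, end):
        if a != b:
            break
        count += 1
    return count


def span_cost(text: str) -> float:
    """Hypothetical cost: lower when more leading columns are fixed."""
    start, end = parse_span(text)
    return 1.0 / ASSUMED_SELECTIVITY ** agreeing_columns(start, end)


for span in ("/1/4/4-/100/4/5", "/1-/100", "/4/4-/4/5"):
    print(span, agreeing_columns(*parse_span(span)), span_cost(span))
```

Under this sketch the first two spans get the same cost because they agree on no columns, and /4/4-/4/5 scores better because its first column is fixed. How to combine the costs of several spans for one index (for example by summing them) is left open here.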

Labels

A-sql-optimizer: SQL logical planning and optimizations.
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
