Skip to content

[Feature Request] Improve index performance in scans, order by, group queries #191

@visridha

Description

@visridha

Purpose of the feature.
Current scan pushdown to the index are limited by their ability to support index scans and efficient range queries.
Improve the overall performance of these scenarios either in the current index or with alternate index implementations.

Describe the solution you'd like

Scenarios to cover:

  1. Composite Index queries: Currently Gin/Rum indexes use N trees to create composite indexes. This means queries that do a = 1 b > 5 will scan all entries for b > 5 instead of just those limited by a = 1. The changes here and here add support for more native composite functionality but is not enabled by default (and needs more thorough correctness testing).

  2. Index Scans: When doing a scan like a > 5 LIMIT 5 we should only need to enumerate 5 entries in the index. Instead we fetch all possible entries in the index and run a bitmap scan on the 5 entries. Even when an index scan is used the default RUM implementation fetches all entries to sort by CTID. There is a GUC for enable_rum_index_scan but this is off by default, and requires changes in RUM to make more efficient, but these aren't available in the RUM implementation used. These need to be considered.

  3. Order by: The index scan should support order by pushdown to the index. The planner pieces are implemented here and is plumbed to the composite op-class called out in (1) but off by default and require changes in RUM that are not available in the implementation. Order by must support asc/desc as well as composite scans.

  4. Group By: Group by expressions need to be pushed to the index as well. There is currently no support for this since the Group by is an expression and extracting an expression that can be pushed to hte index is not done by the planner.

Note that any implementation chosen must also have support for wildcard indexes by default, and handle multi-key values (cannot be 1:1 for document to terms), and handle sort/type comparison semantics. Since index terms can be larger than a PG page, we need to handle truncation and lossy indexes as well.

Describe alternatives you've considered

  • BTree & GIST indexes as-is could help with 1, 2, & 3 but they violate the multi-key concerns for order by and wildcard. However, if there are other index implementations that can be considered, they are welcome.
  • Current in progress attempts are highlighted above. This issue will track the improvements overall as they're discussed and being made.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    Status

    TBD

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions