[Feature Request] Improve index performance in scans, order by, group queries

**Purpose of the feature.**
Current scan pushdown to the index are limited by their ability to support index scans and efficient range queries.
Improve the overall performance of these scenarios either in the current index or with alternate index implementations.

**Describe the solution you'd like**

Scenarios to cover:
1) Composite Index queries: Currently Gin/Rum indexes use N trees to create composite indexes. This means queries that do `a = 1 b > 5` will scan all entries for `b > 5` instead of just those limited by `a = 1`.  The changes [here](https://github.com/microsoft/documentdb/commit/db6a3de7fd05e3e8ba37b38d567ce89e12d19b40) and [here](https://github.com/microsoft/documentdb/commit/efa2b802f68b4057611a8a85be0e3ce241f3c19c) add support for more native composite functionality but is not enabled by default (and needs more thorough correctness testing). 

2) Index Scans: When doing a scan like `a > 5 LIMIT 5` we should only need to enumerate 5 entries in the index. Instead we fetch all possible entries in the index and run a bitmap scan on the 5 entries. Even when an index scan is used the default RUM implementation fetches all entries to sort by CTID. There is a GUC for `enable_rum_index_scan` but this is off by default, and requires changes in RUM to make more efficient, but these aren't available in the RUM implementation used. These need to be considered.

3) Order by: The index scan should support order by pushdown to the index. The planner pieces are implemented [here](https://github.com/microsoft/documentdb/commit/a72e6563d63bb49fb5202d3be0a61b2f40346cb8) and is plumbed to the composite op-class called out in (1) but off by default and require changes in RUM that are not available in the implementation. Order by must support asc/desc as well as composite scans.

4) Group By: Group by expressions need to be pushed to the index as well. There is currently no support for this since the Group by is an expression and extracting an expression that can be pushed to hte index is not done by the planner.

Note that any implementation chosen must also have support for wildcard indexes by default, and handle multi-key values (cannot be 1:1 for document to terms), and handle sort/type comparison semantics. Since index terms can be larger than a PG page, we need to handle truncation and lossy indexes as well.

**Describe alternatives you've considered**
- BTree & GIST indexes as-is could help with 1, 2, & 3 but they violate the multi-key concerns for order by and wildcard. However, if there are other index implementations that can be considered, they are welcome.
- Current in progress attempts are highlighted above. This issue will track the improvements overall as they're discussed and being made.




Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Feature Request] Improve index performance in scans, order by, group queries #191

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Feature Request] Improve index performance in scans, order by, group queries #191

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions