
Commit 0ddb094

visridha authored and shuaitian-git committed
Merged PR 1690832: [Composite] Remove array term generation for composite indexes
### Does this PR have any customer impact?
Not yet

### Type (Feature, Refactoring, Bugfix, DevOps, Testing, Perf, etc)
Feature/POC

### Does it involve schema level changes? (Table, Column, Index, UDF, etc level changes)
Yes

### Are you introducing any new config? If yes, do you have tests with and without them being set?
No

### ChangeLog (Refer [Template](../oss/CHANGELOG.md))

### Description
This adds more changes for orderby support in the index:

1) Fixes the correctness issue arising from indexing arrays for order by: we remove the top level array term from indexing. To handle this, we move the query operators to support querying the index correctly when the query is on an array.
2) For `$elemMatch`, this temporarily makes elemMatch just a full scan (a subsequent change will tackle updating the logic).
3) Adds orderby to the top level schema for the internal extension.
4) For exists false/true, we were pushing a lot more terms to the runtime recheck. Ensure that we're hitting the right index terms and evaluate correctly against the index.

----

#### AI description (iteration 1)

#### PR Classification
This pull request is a code cleanup and enhancement that removes array term generation for composite indexes and updates related index term handling.

#### PR Summary
The changes streamline composite index generation by eliminating dedicated array terms, and they adjust boundary, metadata, and recheck logic to ensure consistent behavior. Key updates include:

- **`oss/pg_documentdb/src/opclass/bson_gin_composite_core.c`**: Removed redundant array term generation and introduced functions like `SetArrayEqualityBound` to properly handle equality bounds for arrays.
- **`oss/pg_documentdb/src/opclass/gin_index_term.c`**: Updated index term serialization and metadata management (including handling of undefined values via `GenerateValueUndefinedTerm`) to align with the new composite index behavior.
- **Test and expected output files**: Revised expected composite index term outputs and SQL tests to reflect the removal of array-specific terms.
- **Documentation and configuration**: Updated design docs and removed the configuration flag for skipping array term generation, consolidating the new composite index strategy.
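As an illustration of point 1 in the description, here is a minimal Python sketch of one way a top-level array term could disturb index-driven ordering. It is not the extension's C code. It assumes a simplified type ordering in which every array sorts after every number, and that a descending sort on an array field should use the array's largest element. `TYPE_RANK`, `sort_key`, `index_terms`, and `scan_order` are hypothetical names introduced only for this example.

```python
# Hypothetical, simplified type ordering: every array sorts after every number.
TYPE_RANK = {"number": 0, "array": 1}


def sort_key(term):
    """Order terms by type rank first, then by value within the type."""
    if isinstance(term, list):
        return (TYPE_RANK["array"], [sort_key(t) for t in term])
    return (TYPE_RANK["number"], term)


def index_terms(value, include_array_term):
    """Terms stored for one field value: one per element, plus optionally the array itself."""
    if not isinstance(value, list):
        return [value]
    terms = list(value)
    if include_array_term:
        terms.append(value)  # the extra top-level array term
    return terms


def scan_order(docs, include_array_term, descending=False):
    """Walk terms in index order and emit each document at its first term."""
    entries = [
        (sort_key(term), doc_id)
        for doc_id, value in docs.items()
        for term in index_terms(value, include_array_term)
    ]
    order = []
    for _, doc_id in sorted(entries, reverse=descending):
        if doc_id not in order:
            order.append(doc_id)
    return order


docs = {"A": 5, "B": [1, 2]}
print(scan_order(docs, include_array_term=False, descending=True))  # ['A', 'B']
print(scan_order(docs, include_array_term=True, descending=True))   # ['B', 'A']
```

Under these assumptions, a descending sort should return A (value 5) before B (largest element 2). With the extra array term, B surfaces first because its array term outranks every number.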
1 parent 4b419d9 commit 0ddb094

35 files changed, +2178 -1124 lines changed

internal/pg_documentdb_distributed/src/test/regress/expected/bson_aggregation_pipeline_tests_geonear.out

Lines changed: 27 additions & 0 deletions
@@ -1183,6 +1183,33 @@ SELECT * FROM documentdb_api_catalog.bson_aggregation_pipeline('db',
 { "_id" : { "$numberInt" : "17" }, "a" : { "c" : [ { "x" : { "$numberInt" : "10" }, "y" : { "$numberInt" : "10" } }, { "x" : { "$numberInt" : "20" }, "y" : { "$numberInt" : "20" } }, { "x" : { "$numberInt" : "30" }, "y" : { "$numberInt" : "30" } }, { "x" : { "$numberInt" : "40" }, "y" : { "$numberInt" : "40" } } ] }, "dist" : { "calculated" : { "$numberDouble" : "0.0" } } }
 (2 rows)
 
+ROLLBACK;
+BEGIN;
+set citus.enable_local_execution TO OFF;
+SELECT documentdb_api.insert_one('db','agg_geonear','{ "_id": 16, "a": { "c": [[10, 10], [20, 20], [30, 30], [40, 40]]} }', NULL);
+insert_one
+---------------------------------------------------------------------
+{ "n" : { "$numberInt" : "1" }, "ok" : { "$numberDouble" : "1.0" } }
+(1 row)
+
+SELECT documentdb_api.insert_one('db','agg_geonear','{ "_id": 17, "a": { "c": [{"x": 10, "y": 10}, {"x": 20, "y": 20}, {"x": 30, "y": 30}, {"x": 40, "y": 40}]} }', NULL);
+insert_one
+---------------------------------------------------------------------
+{ "n" : { "$numberInt" : "1" }, "ok" : { "$numberDouble" : "1.0" } }
+(1 row)
+
+SELECT documentdb_api.insert_one('db','agg_geonear','{ "_id": 18, "a": { "c": [[]]} }', NULL);
+insert_one
+---------------------------------------------------------------------
+{ "n" : { "$numberInt" : "1" }, "ok" : { "$numberDouble" : "1.0" } }
+(1 row)
+
+SELECT documentdb_api.insert_one('db','agg_geonear','{ "_id": 19, "a": { "c": [{}]} }', NULL);
+insert_one
+---------------------------------------------------------------------
+{ "n" : { "$numberInt" : "1" }, "ok" : { "$numberDouble" : "1.0" } }
+(1 row)
+
 EXPLAIN VERBOSE SELECT * FROM documentdb_api_catalog.bson_aggregation_pipeline('db',
 '{ "aggregate": "agg_geonear", "pipeline": [ { "$geoNear": { "near": [10, 10], "distanceField": "dist.calculated", "key": "a.c" } } , { "$addFields": { "dist.calculated": {"$round":[ { "$multiply": ["$dist.calculated", 100000] }] } } } ]}');
 QUERY PLAN

internal/pg_documentdb_distributed/src/test/regress/expected/bson_query_operator_in_opt.out

Lines changed: 26 additions & 30 deletions
Large diffs are not rendered by default.

internal/pg_documentdb_distributed/src/test/regress/expected/bson_query_operator_tests_explain_index.out

Lines changed: 8 additions & 10 deletions
@@ -1552,7 +1552,7 @@ EXPLAIN (COSTS OFF, ANALYZE ON, SUMMARY OFF, TIMING OFF) SELECT object_id, docum
 (15 rows)
 
 EXPLAIN (COSTS OFF, ANALYZE ON, SUMMARY OFF, TIMING OFF) SELECT object_id, document FROM documentdb_api.collection('db', 'queryoperator') WHERE document @@ '{ "a" : { "$in": [ ]} }' ORDER BY object_id;
-QUERY PLAN
+QUERY PLAN
 ---------------------------------------------------------------------
 Custom Scan (Citus Adaptive) (actual rows=0 loops=1)
 Task Count: 1
@@ -1564,11 +1564,9 @@ EXPLAIN (COSTS OFF, ANALYZE ON, SUMMARY OFF, TIMING OFF) SELECT object_id, docum
 -> Sort (actual rows=0 loops=1)
 Sort Key: object_id
 Sort Method: quicksort Memory: 25kB
--> Bitmap Heap Scan on documents_1200_1200005 collection (actual rows=0 loops=1)
-Recheck Cond: (document OPERATOR(documentdb_api_catalog.@*=) '{ "a" : [ ] }'::documentdb_core.bson)
--> Bitmap Index Scan on queryoperator_b_a (actual rows=0 loops=1)
-Index Cond: (document OPERATOR(documentdb_api_catalog.@*=) '{ "a" : [ ] }'::documentdb_core.bson)
-(14 rows)
+-> Result (actual rows=0 loops=1)
+One-Time Filter: false
+(12 rows)
 
 EXPLAIN (COSTS OFF, ANALYZE ON, SUMMARY OFF, TIMING OFF) SELECT object_id, document FROM documentdb_api.collection('db', 'queryoperator') WHERE document @@ '{ "a" : { "$in": [ 1, 2 ]} }' ORDER BY object_id;
 QUERY PLAN
@@ -1611,7 +1609,7 @@ EXPLAIN (COSTS OFF, ANALYZE ON, SUMMARY OFF, TIMING OFF) SELECT object_id, docum
 (15 rows)
 
 EXPLAIN (COSTS OFF, ANALYZE ON, SUMMARY OFF, TIMING OFF) SELECT object_id, document FROM documentdb_api.collection('db', 'queryoperator') WHERE document @@ '{ "a" : { "$nin": [ ]} }' ORDER BY object_id;
-QUERY PLAN
+QUERY PLAN
 ---------------------------------------------------------------------
 Custom Scan (Citus Adaptive) (actual rows=27 loops=1)
 Task Count: 1
@@ -1624,10 +1622,10 @@ EXPLAIN (COSTS OFF, ANALYZE ON, SUMMARY OFF, TIMING OFF) SELECT object_id, docum
 Sort Key: object_id
 Sort Method: quicksort Memory: 28kB
 -> Bitmap Heap Scan on documents_1200_1200005 collection (actual rows=27 loops=1)
-Recheck Cond: (document OPERATOR(documentdb_api_catalog.@!*=) '{ "a" : [ ] }'::documentdb_core.bson)
+Recheck Cond: (shard_key_value = '1200'::bigint)
 Heap Blocks: exact=1
--> Bitmap Index Scan on queryoperator_b_a (actual rows=27 loops=1)
-Index Cond: (document OPERATOR(documentdb_api_catalog.@!*=) '{ "a" : [ ] }'::documentdb_core.bson)
+-> Bitmap Index Scan on _id_ (actual rows=27 loops=1)
+Index Cond: (shard_key_value = '1200'::bigint)
 (15 rows)
 
 EXPLAIN (COSTS OFF, ANALYZE ON, SUMMARY OFF, TIMING OFF) SELECT object_id, document FROM documentdb_api.collection('db', 'queryoperator') WHERE document @@ '{ "a" : { "$nin": [ ]} }' ORDER BY object_id;
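The two empty-list plans above stop consulting `queryoperator_b_a`: `$in: []` becomes a `One-Time Filter: false`, and `$nin: []` keeps only the shard key condition. A plausible reading is that an empty candidate list makes the predicate constant. The Python sketch below models that matching rule under simplified assumptions (scalar equality plus array-element matching). `matches_in` and `matches_nin` are hypothetical helpers, not functions from the extension.

```python
def matches_in(value, candidates):
    """$in-style match: the value itself, or any element of an array value, is a candidate."""
    elements = value if isinstance(value, list) else [value]
    return value in candidates or any(e in candidates for e in elements)


def matches_nin(value, candidates):
    """$nin-style match: the negation of the $in-style match."""
    return not matches_in(value, candidates)


# Sample field values; None stands in for a missing field in this sketch.
values = [1, [1, 2], None, "x"]

# With no candidates, $in can never match and $nin always matches.
print([matches_in(v, []) for v in values])   # [False, False, False, False]
print([matches_nin(v, []) for v in values])  # [True, True, True, True]
```

Because neither result depends on the document, a planner can skip the index for both predicates, which is consistent with the 0-row `Result` node and the 27-row scan on `_id_` in the expected output.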
