Skip to content

feat: Eliminate ORDER BY when index scan provides ascending order#3950

Draft
imilinovic wants to merge 16 commits intomasterfrom
feat/index-order-by
Draft

feat: Eliminate ORDER BY when index scan provides ascending order#3950
imilinovic wants to merge 16 commits intomasterfrom
feat/index-order-by

Conversation

@imilinovic
Copy link
Copy Markdown
Contributor

Git commit description, explain the changes you made here

@imilinovic
Copy link
Copy Markdown
Contributor Author

Tracking

  • [Link to Epic/Issue]

Standard development

CI Testing Labels

  • Select the appropriate CI test labels (CI -build=build-name -test=test-suite)

Documentation checklist

  • Add the documentation label
  • Add the bug / feature label
  • Add the milestone for which this feature is intended
    • If not known, set for a later milestone
  • Write a release note, including added/changed clauses
    • What has changed? What does it mean for a user? What should a user do with it? [#{{PR_number}}]({{link to the PR}})
  • [ Documentation PR link memgraph/documentation#XXXX ]
    • Is back linked to this development PR

@imilinovic imilinovic self-assigned this Mar 23, 2026
@imilinovic imilinovic added feature feature Docs - changelog only Docs - changelog only CI -build=release -test=e2e Run release build and e2e tests on push CI -build=debug -test=core Run debug build and core tests on push benchmarking mgBench and benchmarking Memgraph CI -build=debug -test=integration Run debug build and integration tests on push CI -build=coverage -test=clang_tidy and removed CI -build=debug -test=core Run debug build and core tests on push benchmarking mgBench and benchmarking Memgraph labels Mar 23, 2026
@imilinovic imilinovic added this to the mg-v3.10.0 milestone Mar 24, 2026
Remove the redundant OrderBy operator when a ScanAllByLabelProperties
index scan already returns data in the required ascending order. The
index lookup rewriter now detects when ORDER BY properties match the
index properties (with equality-skip logic for composite indices) and
all operators between OrderBy and the scan are order-preserving.

ASC only — SkipList has no reverse iterator so DESC is not supported.
Clarify why 1:N operators (Expand, Unwind, etc.) are safe for ORDER BY
elimination — they preserve input-symbol ordering and the optimization
verifies ORDER BY references the scan symbol. Add three new test cases
covering equality+range on second column, full composite with range on
first, and equality+range with ORDER BY on both columns.
Use Symbol::operator== instead of name-only comparison to avoid
incorrect ORDER BY elimination when WITH remaps variable names
(e.g. WITH a AS b, b AS a). Add TODO for future DESC support.
Produce sits between OrderBy and ScanAll (planner builds Produce
first, then wraps with OrderBy). Keep Produce in the order-preserving
walk but verify it doesn't remap our target variable name, which
WITH clauses like `WITH a AS b, b AS a` can do. Revert to name-based
symbol comparison since operator== is too strict (different symbol
positions across scopes).
Fix stale comment about Produce exclusion from order-preserving
whitelist (it's actually handled via ProduceRenamesVariable). Add
lifetime documentation for OrderByContext::op pointer. Add tests
for equality-only elimination and multi-MATCH symbol mismatch.
- Add Test 18 (WithRenamingBlocks): verifies that ORDER BY elimination
  is blocked when a WITH clause renames the scan variable (e.g.,
  WITH n AS m), exercising the ProduceRenamesVariable guard.
- Rename Test 10 from ExpandBlocks to ExpandPreservesOrder to
  accurately reflect that Expand is order-preserving and elimination
  succeeds.
Replace single pending_order_by_ optional + bool with a stack so nested
OrderBy operators (e.g. WITH ... ORDER BY ... RETURN ... ORDER BY) don't
clobber outer context. Add unit tests for label-only index and rebound
symbol edge cases. Add EXPLAIN-based e2e tests for ORDER BY elimination.
Add unit and e2e test verifying ORDER BY n.b, n.a is not eliminated
when index is (a, b). Rewrite e2e tests to match full EXPLAIN plans.
Register e2e test file in CMakeLists.txt for build directory copy.
Remove default member initializers from OrderByContext and use full
designated initializers at both construction sites for clarity.
Simplify prev_ops_ walk loop condition. Replace unreachable fallback
in ProduceRenamesVariable with DMG_ASSERT. Restructure e2e tests into
plan smoke tests and data correctness tests that verify result ordering
after ORDER BY elimination.
Revert DMG_ASSERT — the fallback is reachable when RETURN projects a
property expression (e.g. RETURN n.b AS b ORDER BY n.b) so the scan
symbol is not a direct named expression in Produce output. Fix e2e
correctness tests to use RETURN n ORDER BY n.prop so elimination fires.
…location

- Add comments explaining equality-skip logic works regardless of filter
  ordering in expression_ranges_ (filters are ordered by index column
  position, not by type)
- Add comment that ScanAllByLabelProperties handles all label+property
  scan variants (no separate Value/Range operators)
- Replace misleading ReboundSymbolSameNameViaUnwind test with
  WithoutRenamingAllowsElimination (WITH identity mapping still eliminates)
- Build PropertyPath directly from vector<PropertyId> instead of
  reserve/insert loop
- Add ExpandOrderByExpandedSymbol: ORDER BY m.prop with scan on n
- Add CompositeRangeOnFirstOrderBySecond: range on a, ORDER BY b
- Add CompositeIndexPartialCoverage: index (a,b), ORDER BY a,b,c
- Replace misleading ReboundSymbolSameNameViaUnwind with
  WithoutRenamingAllowsElimination (WITH identity mapping)
- Fix test 14 comment: "Partial composite" -> "Full composite ORDER BY"
- Fix test 17 comment: remove contradictory first sentence
Replace equality-skip logic with simple prefix matching: ORDER BY
columns must exactly match a leading prefix of index columns (e.g.,
for index (a,b,c) only ORDER BY a, a+b, or a+b+c qualify). Cases
like WHERE a=5 ORDER BY b no longer eliminate the OrderBy operator.
Replace ProduceRenamesVariable with ResolveProduceMapping that
propagates the scan symbol name upward through Produce operators,
tracking renames (e.g., WITH n AS m). This allows ORDER BY
elimination to fire through variable renames.

Handle Cypher's dual-scope semantics where ORDER BY can reference
both the input and output scope of a RETURN/WITH clause (e.g.,
RETURN n AS m ORDER BY n.prop and ORDER BY m.prop are both valid).
@imilinovic imilinovic force-pushed the feat/index-order-by branch from ed20c69 to 0c4a3a8 Compare March 25, 2026 11:39
@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI -build=coverage -test=clang_tidy CI -build=debug -test=core Run debug build and core tests on push CI -build=debug -test=integration Run debug build and integration tests on push CI -build=release -test=e2e Run release build and e2e tests on push Docs - changelog only Docs - changelog only feature feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant