perf(sql): apply parallel top-K to queries with SELECT projections#6993
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughPeels projection wrappers to reach the base page-frame leaf, optionally steals compiled filter/bind state to build an Async TopK path, translates ORDER BY keys to base coordinates, and rewraps TopK under projection wrappers when applicable. Adds explain-plan and runtime tests covering these cases. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
c6c1548 to
436e431
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (1)
6697-6715: Rename the boolean to use the required prefix.
allBaseColumnsshould use anis.../has...boolean name, e.g.hasOnlyBaseColumns.♻️ Proposed rename
- boolean allBaseColumns = true; + boolean hasOnlyBaseColumns = true; final IntList orderByBaseIndices = new IntList(listColumnFilterA.size()); for (int i = 0, n = listColumnFilterA.size(); i < n; i++) { int signed = listColumnFilterA.getQuick(i); int wrappedIdx = (signed > 0 ? signed : -signed) - 1; Function fn = wrappedFunctions.getQuick(wrappedIdx); if (!(fn instanceof ColumnFunction columnFn)) { - allBaseColumns = false; + hasOnlyBaseColumns = false; break; } int baseIdx = priorityMetadata.getBaseColumnIndex(columnFn.getColumnIndex()); if (baseIdx < 0) { - allBaseColumns = false; + hasOnlyBaseColumns = false; break; } orderByBaseIndices.add(baseIdx); } - if (allBaseColumns) { + if (hasOnlyBaseColumns) {As per coding guidelines, "When choosing a name for a boolean variable, field or method, always use the is... or has... prefix, as appropriate."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java` around lines 6697 - 6715, Rename the boolean local variable allBaseColumns to a prefixed boolean name (e.g., hasOnlyBaseColumns or isOnlyBaseColumns) and update all local references in this block accordingly: the declaration and the two assignments where it is set to false, plus the if check "if (allBaseColumns)". This change should be made in the loop that iterates over listColumnFilterA, uses wrappedFunctions and ColumnFunction, computes baseIdx via priorityMetadata.getBaseColumnIndex(columnFn.getColumnIndex()), and later checks orderByBaseIndices; keep all logic identical, only rename the symbol consistently within this scope.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java`:
- Around line 6743-6749: The virtual projection rewrap is using
virtualFactory.getMetadata() which loses the adjusted ordered metadata used for
ORDER BY on a passthrough timestamp column; update the
VirtualRecordCursorFactory construction to pass the preserved orderedMetadata
(the same metadata used on the SelectedRecord path) instead of
virtualFactory.getMetadata(), so the timestamp-index adjustment remains
available to downstream/nested operators.
In `@core/src/test/java/io/questdb/test/griffin/LimitTest.java`:
- Around line 1405-1407: The tests in LimitTest are using assertSql(...) which
only validates rows and misses cursor factory/leak properties; replace the two
assertSql(...) calls (and the similar ones at lines referencing the same test
block) with assertQueryNoLeakCheck(...) calls so the assertions also verify
cursor factory/TopK optimization behavior — specifically change the assertSql
invocations in LimitTest (the calls supplying the SQL like "select x, * from tab
where ts2 in '1970' order by ts2 desc limit 5" and "select x, * from tab where
ts2 in '2099' order by ts2 desc limit 5", plus the related group at 1428-1433)
to use assertQueryNoLeakCheck(...) with the same expected output and SQL
arguments.
---
Nitpick comments:
In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java`:
- Around line 6697-6715: Rename the boolean local variable allBaseColumns to a
prefixed boolean name (e.g., hasOnlyBaseColumns or isOnlyBaseColumns) and update
all local references in this block accordingly: the declaration and the two
assignments where it is set to false, plus the if check "if (allBaseColumns)".
This change should be made in the loop that iterates over listColumnFilterA,
uses wrappedFunctions and ColumnFunction, computes baseIdx via
priorityMetadata.getBaseColumnIndex(columnFn.getColumnIndex()), and later checks
orderByBaseIndices; keep all logic identical, only rename the symbol
consistently within this scope.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 04db8ff0-71c6-4c29-a902-c462c80a905b
📒 Files selected for processing (3)
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.javacore/src/test/java/io/questdb/test/griffin/ExplainPlanTest.javacore/src/test/java/io/questdb/test/griffin/LimitTest.java
d56d79f to
afcf38e
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@puzpuzpuz @bluestreak01 would you mind taking a look when you get a chance? Thanks! |
|
There seems to be a slightly simpler approach. Could you take a look? PR #6993 — Simpler approachA proposal to collapse the three near-identical filter-stealing branches in Problem with the current PRThe PR adds two new disjuncts (SelectedRecord peel, VirtualRecord peel) next to the existing direct filter-stealing branch. All three do the same high-level work:
The only thing that varies between the three branches is how an ORDER BY column index expressed in the outer metadata is translated to the column index in the base (page-frame) metadata:
Because that one varying piece is inlined into each branch, each branch carries its own copy of the steal/build/re-wrap scaffold. The author's own tradeoffs section acknowledges this:
The result: the VirtualRecord rewrap initially passed Proposed shapePush the translation onto the wrapper factory, and have one top-K branch consume it. The default on 1. A translation contract on the factory// RecordCursorFactory.java
/**
* Translates an ORDER BY column index expressed in this factory's output
* metadata to the corresponding column index in the base (page-frame) metadata.
*
* Returns the input unchanged for factories that do not re-arrange or hide
* base columns. Returns a negative value if the projected column cannot be
* resolved to a base column (e.g. a computed VirtualRecord column); callers
* must fall back to the generic sort path in that case.
*/
default int translateOrderByColumnToBase(int projectedIndex) {
return projectedIndex;
}2. Wrapper overrides// SelectedRecordCursorFactory.java
@Override
public int translateOrderByColumnToBase(int projectedIndex) {
return columnCrossIndex.getQuick(projectedIndex);
}// VirtualRecordCursorFactory.java
@Override
public int translateOrderByColumnToBase(int projectedIndex) {
Function fn = functions.getQuick(projectedIndex);
if (!(fn instanceof ColumnFunction columnFn)) {
return -1;
}
return priorityMetadata.getBaseColumnIndex(columnFn.getColumnIndex());
}3. One top-K branch in
|
afcf38e to
c84eea5
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (1)
1008-1008: Useis.../has...prefixes for new boolean names.
canReachPageFrameLeafForTopKandallKeysResolvedare new boolean names but do not follow the repository convention.Proposed rename
- private static boolean canReachPageFrameLeafForTopK(RecordCursorFactory f) { + private static boolean hasReachablePageFrameLeafForTopK(RecordCursorFactory f) { @@ - if (parallelTopKEnabled && canReachPageFrameLeafForTopK(recordCursorFactory)) { + if (parallelTopKEnabled && hasReachablePageFrameLeafForTopK(recordCursorFactory)) { @@ - boolean allKeysResolved = true; + boolean hasResolvedAllKeys = true; @@ - allKeysResolved = false; + hasResolvedAllKeys = false; @@ - if (allKeysResolved) { + if (hasResolvedAllKeys) {As per coding guidelines, "When choosing a name for a boolean variable, field or method, always use the is... or has... prefix, as appropriate."
Also applies to: 6625-6654
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java` at line 1008, Rename boolean identifiers to use is/has prefixes: change the method canReachPageFrameLeafForTopK to isPageFrameLeafReachableForTopK (or similar is-prefixed name) and rename the boolean allKeysResolved to areAllKeysResolved or hasAllKeysResolved; update all call sites, tests, JavaDoc/comments and imports to use the new names, preserve method signature/visibility, and run a compile to catch any missed references (also apply the same renaming pattern for the related occurrences in the 6625-6654 region).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/src/main/java/io/questdb/cairo/sql/RecordCursorFactory.java`:
- Line 27: Change the rewrapOverTopK API to accept the metadata interface and
preserve existing wrapper chains: remove the GenericRecordMetadata import,
change RecordCursorFactory.rewrapOverTopK parameter type from
GenericRecordMetadata to RecordMetadata, and modify
SelectedRecordCursorFactory.rewrapOverTopK() and
VirtualRecordCursorFactory.rewrapOverTopK() so they first call
base.rewrapOverTopK(topK, base.getMetadata()) (i.e., recursively rewrap the
base) before applying their own wrapping logic to retain inner wrapper column
mappings/functions.
In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java`:
- Around line 1450-1484: Wrap the block that calls
compileWorkerFiltersConditionally(...) and constructs
AsyncTopKRecordCursorFactory in a try/catch/finally so that if an exception
occurs while evaluating constructor arguments you explicitly release the stolen
state you took ownership of after filterFactory.halfClose(): if
stolenCompiledFilter, stolenBindVarMemory, stolenBindVarFunctions, stolenFilter
or stolenFilterUsedColumnIndexes are non-null, call their appropriate cleanup
methods (close()/free()/clear() as applicable), set them to null, and also
release baseOrderedMetadata (call its release/close method) before rethrowing;
ensure the final created AsyncTopKRecordCursorFactory still receives valid
references when no exception occurs.
---
Nitpick comments:
In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java`:
- Line 1008: Rename boolean identifiers to use is/has prefixes: change the
method canReachPageFrameLeafForTopK to isPageFrameLeafReachableForTopK (or
similar is-prefixed name) and rename the boolean allKeysResolved to
areAllKeysResolved or hasAllKeysResolved; update all call sites, tests,
JavaDoc/comments and imports to use the new names, preserve method
signature/visibility, and run a compile to catch any missed references (also
apply the same renaming pattern for the related occurrences in the 6625-6654
region).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c4dd53df-f910-4c00-a8d7-97e3227a0629
📒 Files selected for processing (6)
core/src/main/java/io/questdb/cairo/sql/RecordCursorFactory.javacore/src/main/java/io/questdb/griffin/SqlCodeGenerator.javacore/src/main/java/io/questdb/griffin/engine/table/SelectedRecordCursorFactory.javacore/src/main/java/io/questdb/griffin/engine/table/VirtualRecordCursorFactory.javacore/src/test/java/io/questdb/test/griffin/ExplainPlanTest.javacore/src/test/java/io/questdb/test/griffin/LimitTest.java
🚧 Files skipped from review as they are similar to previous changes (2)
- core/src/test/java/io/questdb/test/griffin/LimitTest.java
- core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java
e00b8b1 to
34ca865
Compare
|
@coderabbitai review this PR critically |
🧩 Analysis chain🏁 Script executed: Length of output: 18508 🏁 Script executed: Length of output: 17664 🏁 Script executed: Length of output: 7165 🏁 Script executed: Length of output: 10125 🏁 Script executed: Length of output: 5325 🏁 Script executed: Length of output: 8225
Critical Review: PR
|
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (1)
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (1)
1461-1495:⚠️ Potential issue | 🟠 MajorClean up helper-owned resources until
AsyncTopKRecordCursorFactoryowns them.The catch only covers worker-filter compilation. If the
AsyncTopKRecordCursorFactoryconstructor throws,baseOrderedMetadata,perWorkerFilters, and stolen filter state remain owned by this helper and are leaked.Proposed cleanup guard
- final ObjList<Function> perWorkerFilters; + ObjList<Function> perWorkerFilters = null; try { perWorkerFilters = compileWorkerFiltersConditionally( executionContext, stolenFilter, executionContext.getSharedQueryWorkerCount(), stolenFilterExpr, baseMetadata ); + + return new AsyncTopKRecordCursorFactory( + executionContext.getCairoEngine(), + configuration, + executionContext.getMessageBus(), + baseOrderedMetadata, + pageFrameLeaf, + stolenFilter, + stolenFilterUsedColumnIndexes, + perWorkerFilters, + stolenCompiledFilter, + stolenBindVarMemory, + stolenBindVarFunctions, + recordComparatorCompiler, + baseOrderByFilter, + baseMetadata, + lo, + executionContext.getSharedQueryWorkerCount() + ); } catch (Throwable th) { + Misc.free(baseOrderedMetadata); + Misc.freeObjList(perWorkerFilters); Misc.free(stolenCompiledFilter); Misc.free(stolenBindVarMemory); Misc.freeObjList(stolenBindVarFunctions); Misc.free(stolenFilter); throw th; } - - return new AsyncTopKRecordCursorFactory( - executionContext.getCairoEngine(), - configuration, - executionContext.getMessageBus(), - baseOrderedMetadata, - pageFrameLeaf, - stolenFilter, - stolenFilterUsedColumnIndexes, - perWorkerFilters, - stolenCompiledFilter, - stolenBindVarMemory, - stolenBindVarFunctions, - recordComparatorCompiler, - baseOrderByFilter, - baseMetadata, - lo, - executionContext.getSharedQueryWorkerCount() - );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java` around lines 1461 - 1495, The AsyncTopKRecordCursorFactory construction can throw and currently leaks helper-owned resources (baseOrderedMetadata, perWorkerFilters, stolenFilter, stolenCompiledFilter, stolenBindVarMemory, stolenBindVarFunctions); wrap the new AsyncTopKRecordCursorFactory(...) call in a try block and in the corresponding catch/finally free those resources if the constructor throws (use Misc.free / Misc.freeObjList / Misc.free as used elsewhere), or move ownership transfer into a method that only relinquishes ownership after successful constructor return, ensuring symbols AsyncTopKRecordCursorFactory, perWorkerFilters, baseOrderedMetadata, stolenFilter, stolenCompiledFilter, stolenBindVarMemory, and stolenBindVarFunctions are properly released on exceptions.
🧹 Nitpick comments (1)
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java (1)
1008-1008: Useis.../has...names for new booleans.
canReachPageFrameLeafForTopK,parallelTopKEnabled, andallKeysResolvedviolate the repo boolean naming rule.Suggested rename
- private static boolean canReachPageFrameLeafForTopK(RecordCursorFactory f) { + private static boolean hasReachablePageFrameLeafForTopK(RecordCursorFactory f) {- final boolean parallelTopKEnabled = executionContext.isParallelTopKEnabled(); - if (parallelTopKEnabled && canReachPageFrameLeafForTopK(recordCursorFactory)) { + final boolean isParallelTopKEnabled = executionContext.isParallelTopKEnabled(); + if (isParallelTopKEnabled && hasReachablePageFrameLeafForTopK(recordCursorFactory)) {- boolean allKeysResolved = true; + boolean hasResolvedAllKeys = true;As per coding guidelines, "When choosing a name for a boolean variable, field or method, always use the is... or has... prefix, as appropriate."
Also applies to: 6635-6650
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java` at line 1008, Rename boolean method and variables to follow is/has convention: change method canReachPageFrameLeafForTopK to isPageFrameLeafReachableForTopK (or similar is... name), rename field/variable parallelTopKEnabled to isParallelTopKEnabled, and rename allKeysResolved to areAllKeysResolved (or hasAllKeysResolved if semantically better). Update all call sites and tests that reference canReachPageFrameLeafForTopK, parallelTopKEnabled, and allKeysResolved (and any related occurrences around 6635-6650) to use the new identifiers to keep compilation consistent and adhere to the boolean naming guideline.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@core/src/main/java/io/questdb/griffin/engine/table/SelectedRecordCursorFactory.java`:
- Around line 246-249: The current
SelectedRecordCursorFactory.translateOrderByColumnToBase calls
columnCrossIndex.getQuick(projectedIndex) without bounds checking and can throw
IndexOutOfBoundsException; change it to first check projectedIndex against
columnCrossIndex.size() and if out of range return a negative value per the
contract, otherwise retrieve the cross-index and call
base.translateOrderByColumnToBase with that value; ensure you reference
columnCrossIndex.getQuick(projectedIndex), columnCrossIndex.size() and
base.translateOrderByColumnToBase(...) in the updated implementation.
In
`@core/src/main/java/io/questdb/griffin/engine/table/VirtualRecordCursorFactory.java`:
- Around line 187-198: The method translateOrderByColumnToBase should guard
against out-of-range projectedIndex when calling
functions.getQuick(projectedIndex); add a cheap bounds check using
functions.size() (or equivalent) before calling getQuick and return -1 if
projectedIndex is negative or >= functions.size(), then proceed to cast to
ColumnFunction, call
priorityMetadata.getBaseColumnIndex(columnFn.getColumnIndex()), and return -1
for negative baseIdx or the result of base.translateOrderByColumnToBase(baseIdx)
otherwise; update the method translateOrderByColumnToBase and references to
functions.getQuick to follow this defensive behavior.
In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java`:
- Around line 6680-6682: The return path in SqlCodeGenerator where it does
"return projectionWrapper == null ? topK :
projectionWrapper.rewrapOverTopK(topK, orderedMetadata);" leaks the earlier
allocated orderedMetadata when projectionWrapper is null; adjust this branch so
that before returning topK you free or release orderedMetadata (the ORDER BY
metadata created earlier, not baseOrderedMetadata created by the topK helper).
Specifically, in the block handling TopK/ORDER BY, detect the
projectionWrapper==null case and call the appropriate release/dispose method on
orderedMetadata (or transfer ownership to baseOrderedMetadata if intended) so
orderedMetadata is not leaked; keep the rewrapOverTopK call unchanged for the
projectionWrapper!=null path.
In `@core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java`:
- Around line 8771-8877: Add a new test in ExplainPlanTest that mirrors the
existing wrapper-peeling cases but uses a multi-key ORDER BY (e.g., "order by
str desc, x asc limit 10") to ensure translation of all keys; create a test
method (e.g., testSelectWhereOrderByLimit_multiKeyWrapper) that calls assertPlan
with a table like "create table xx ( x long, str varchar )" and a query such as
"select x + 1 as xp, x from xx where str is not null order by str desc, x asc
limit 10" and assert the expected plan shows keys: [str desc, x asc] flowing
through SelectedRecord/VirtualRecord to the Async JIT Top K (or Sort light as
appropriate) to validate full-key translation end-to-end.
In `@core/src/test/java/io/questdb/test/griffin/LimitTest.java`:
- Around line 1391-1408: The test creates both ts and ts2 as x::timestamp so
ORDER BY ts2 vs ts isn’t exercised; change the table creation SQL in LimitTest
(the execute("create table tab as (...") that defines ts and ts2) to produce a
different ts2 (e.g. ts as x::timestamp and ts2 as (1001 - x)::timestamp or
otherwise reversed/distinct sequence) so ordering on ts2 yields different rows,
and update the corresponding expected result strings; apply the same change to
the other identical fixture occurrence that currently defines ts and ts2 the
same.
---
Duplicate comments:
In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java`:
- Around line 1461-1495: The AsyncTopKRecordCursorFactory construction can throw
and currently leaks helper-owned resources (baseOrderedMetadata,
perWorkerFilters, stolenFilter, stolenCompiledFilter, stolenBindVarMemory,
stolenBindVarFunctions); wrap the new AsyncTopKRecordCursorFactory(...) call in
a try block and in the corresponding catch/finally free those resources if the
constructor throws (use Misc.free / Misc.freeObjList / Misc.free as used
elsewhere), or move ownership transfer into a method that only relinquishes
ownership after successful constructor return, ensuring symbols
AsyncTopKRecordCursorFactory, perWorkerFilters, baseOrderedMetadata,
stolenFilter, stolenCompiledFilter, stolenBindVarMemory, and
stolenBindVarFunctions are properly released on exceptions.
---
Nitpick comments:
In `@core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java`:
- Line 1008: Rename boolean method and variables to follow is/has convention:
change method canReachPageFrameLeafForTopK to isPageFrameLeafReachableForTopK
(or similar is... name), rename field/variable parallelTopKEnabled to
isParallelTopKEnabled, and rename allKeysResolved to areAllKeysResolved (or
hasAllKeysResolved if semantically better). Update all call sites and tests that
reference canReachPageFrameLeafForTopK, parallelTopKEnabled, and allKeysResolved
(and any related occurrences around 6635-6650) to use the new identifiers to
keep compilation consistent and adhere to the boolean naming guideline.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 95117598-7562-44ab-942c-54ee25bcf2d9
📒 Files selected for processing (6)
core/src/main/java/io/questdb/cairo/sql/RecordCursorFactory.javacore/src/main/java/io/questdb/griffin/SqlCodeGenerator.javacore/src/main/java/io/questdb/griffin/engine/table/SelectedRecordCursorFactory.javacore/src/main/java/io/questdb/griffin/engine/table/VirtualRecordCursorFactory.javacore/src/test/java/io/questdb/test/griffin/ExplainPlanTest.javacore/src/test/java/io/questdb/test/griffin/LimitTest.java
94d846c to
5481b45
Compare
9e05093 to
e74ba04
Compare
|
@puzpuzpuz thanks for the careful review. All three items addressed in 1. Critical —
|
|
Hello @puzpuzpuz addressed all the concerns and issues , Please take a look |
198b25d to
d23cbd4
Compare
|
Hello @puzpuzpuz Gentle Reminder to please review the PR. |
|
@DHRUV6029 sorry for the delay. Your PR is on my radar. Going to check the updates today. |
|
Minor review comments only. Would be great to address item 1 while we're on it. Others are nit. Minor1.
|
58a9c18 to
746678e
Compare
|
@puzpuzpuz all 5 minor items addressed in
Local regression run after the fixes: |
|
|
||
| @Override | ||
| public boolean isProjection() { | ||
| return true; |
There was a problem hiding this comment.
When isProjection() returns true, it means the following:
Returns true if the factory stands for nothing more but a projection, so that
the above factory (e.g. a parallel GROUP BY one) can steal the projection.
Projection consist of cross-indexes and metadata columns.
VirtualRecordCursorFactory can return arbitrary SQL functions, not just projected base factory's columns.
|
Critical — ReproducerCREATE TABLE trades (ts TIMESTAMP, sym SYMBOL, price DOUBLE) TIMESTAMP(ts) PARTITION BY DAY;
CREATE TABLE prices (ts TIMESTAMP, sym SYMBOL, price DOUBLE) TIMESTAMP(ts) PARTITION BY DAY;
INSERT INTO trades VALUES
('2023-01-01T09:10:00.000000Z', 'AAA', 100.0),
('2023-01-01T09:11:00.000000Z', 'BBB', 200.0),
('2023-01-01T09:12:00.000000Z', 'CCC', 300.0),
('2023-01-01T09:13:00.000000Z', 'DDD', 400.0);
INSERT INTO prices VALUES ('2023-01-01T09:00:00.000000Z', 'AAA', 1.0);
SELECT t.sym, t.price, t.ts, sum(p.price) AS window_price
FROM trades t
WINDOW JOIN prices p ON (0 = 1)
RANGE BETWEEN 1 MINUTE PRECEDING AND 1 MINUTE FOLLOWING
ORDER BY t.sym DESC LIMIT 3;
A second variant — Root cause
Pre-PR the gate wrapped Tie-in with r3180067265Same disease. The Javadoc on FixThree options. The cleanest, IMO, is the first:
Either way, please add a regression test using the SQL above (both the "ORDER BY real column" silent-rows variant and the "ORDER BY null column" assertion variant — the second catches the runtime path with |
|
Smaller items Moderate — missing regression test for
|
87d3520 to
d2c3e84
Compare
puzpuzpuz
left a comment
There was a problem hiding this comment.
The PR is almost good to go, thanks for handling all of the review feedback. Could you address the following moderate level items? The fixes are pretty straight-forward, so it shouldn't take long.
Moderate
M1 — Latent: recursive rewrapOverTopK passes pre-ORDER-BY metadata down the chain
Files: core/src/main/java/io/questdb/griffin/engine/table/SelectedRecordCursorFactory.java:218-224, core/src/main/java/io/questdb/griffin/engine/table/VirtualRecordCursorFactory.java:169-181 (in-diff)
Both overrides do:
RecordCursorFactory rewrappedBase = base.rewrapOverTopK(topK, base.getMetadata());base.getMetadata() is the wrapper's pre-ORDER-BY metadata, with the original timestamp index. Today the predicate canReachPageFrameLeafForTopK peels at most one wrapper, so the recursive call always lands on a non-projection base whose default rewrapOverTopK ignores the metadata argument — the bug is dormant. But the file comment at SqlCodeGenerator.java:1052-1056 explicitly contemplates widening the predicate. The moment a future change lets the chain run two layers deep (e.g. SR(VR(filter(pageFrame)))), the inner wrapper's getMetadata().getTimestampIndex() will lie about ordering whenever AsyncTopK's baseOrderedMetadata set the timestamp index to -1 (ORDER BY a non-timestamp column).
Fix: either thread the gate's orderedMetadata down each layer with the appropriate timestamp index, or assert !base.isProjection(); at the recursive call site so the trap is loud if the predicate is ever widened.
M2 — Comment in the keyed AsOf guard describes the wrong factory and wrong failure mode
File: core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java:4420-4424 (in-diff)
The comment in the keyed AsOf branch reads:
Without it, FilteredAsOfJoinNoKeyFastRecordCursorFactory silently produces wrong results …
But this branch builds FilteredAsOfJoinFastRecordCursorFactory (line 4396), not the NoKey variant. The PR description itself notes that the keyed branch trips assert stolenCrossIndex != null — i.e. it asserts, it does not silently produce wrong results. The non-keyed branch at line 4556 correctly references the NoKey class and the silent-wrong-results path.
Fix: rewrite the keyed comment to reference FilteredAsOfJoinFastRecordCursorFactory and the assertion failure, so future readers don't conflate the two failure modes.
M3 — isProjection() Javadoc is now inconsistent with the widened semantics
File: core/src/main/java/io/questdb/cairo/sql/RecordCursorFactory.java:270-282 (out-of-diff but directly relevant)
The Javadoc still reads:
Returns true if the factory stands for nothing more but a projection… Projection consist of cross-indexes and metadata columns.
VirtualRecordCursorFactory.isProjection() now returns true even though the wrapper holds arbitrary SQL functions (not pure column references). The PR's defence is the new && slave.getColumnCrossIndex() != null guard at the two AsOf join sites, which correctly excludes VirtualRecord. Without a doc fix, future callers of isProjection() will keep assuming the old narrower contract (cross-index always non-null).
Fix: update the Javadoc to document the widened contract — something like "the wrapper hides a steal-able base; cross-index/base-column-name are guaranteed only when getColumnCrossIndex() != null".
M4 — ExtraNullColumnCursorFactory is the third isProjection()=true class and silently relies on never reaching the new gate
Files: core/src/main/java/io/questdb/griffin/engine/table/ExtraNullColumnCursorFactory.java:145, 169-171 (out-of-diff)
ExtraNullColumnCursorFactory.isProjection() returns true (pre-existing), getColumnCrossIndex() is the default null, and supportsPageFrameCursor() forwards to base. It does not override translateOrderByColumnToBase or rewrapOverTopK. Today it only wraps a window-join master (which doesn't support page-frame cursors), so canReachPageFrameLeafForTopK rejects it and the gate doesn't fire.
The arrangement is fragile: any future change that makes ExtraNull wrap a page-frame-capable base, or any base whose supportsFilterStealing() is true, will silently flow into the new gate and produce wrong results because:
- the default identity
translateOrderByColumnToBaseignores the null-padding split, and - the default
rewrapOverTopKreturns the baretopKand drops the null-padding wrapper entirely (output column count would shrink).
Fix: either gate the new path on getColumnCrossIndex() != null (mirroring the AsOf fix at lines 4420 and 4556) or have ExtraNull explicitly override both methods even with no-op semantics that fall back to Sort light.
d2c3e84 to
038d422
Compare
|
@puzpuzpuz Thanks for the deep review. Applied option 1 (your preferred): introduced a dedicated
Regression tests (LimitTest):
Both new tests fail on the prior version of this branch and pass after the fix. Full local sweep: The earlier smaller items from your 2026-05-04T08:24:13Z comment (memoizer test, Ready for another look. |
038d422 to
2b830ee
Compare
Queries of the form SELECT <projection> FROM t WHERE ... ORDER BY ...
LIMIT N previously fell through to Sort light whenever the SELECT
list contained a non-literal expression, because the projection
wrapper (SelectedRecord or VirtualRecord) hid the steal-able filter
from the parallel top-K gate in SqlCodeGenerator#generateOrderBy.
Collapse the three filter-stealing disjuncts (direct, SelectedRecord
peel, VirtualRecord peel) into one. The projection-specific pieces
move onto the wrapper factories:
- translateOrderByColumnToBase maps an ORDER BY column index from the
outer metadata to the base page-frame metadata. The default on
RecordCursorFactory returns the input unchanged. Wrapper overrides
chain through the base factory, so nested wrappers (e.g.
SelectedRecord over VirtualRecord) are handled by delegation rather
than a dedicated disjunct.
- rewrapOverTopK re-applies the projection on top of the freshly
built AsyncTopK factory. Each wrapper implements it with a single
constructor call and receives the same orderedMetadata argument,
so the two paths cannot drift.
- canPeelForTopK is a dedicated predicate that the top-K gate uses
in place of isProjection. Selected and Virtual override it to true;
every other factory (including ExtraNullColumn) inherits the false
default. This keeps isProjection at its original narrow meaning
("cross-indexes and metadata columns") for the GROUP BY / AsOf peel
consumers, and lets the top-K gate name exactly the invariant it
needs: "I can rewrite my ORDER BY into a base column and rebuild
myself over a new top-K." ExtraNullColumn cannot do either (its
null-column splice has no base counterpart), so it falls through
the gate as projectionWrapper=null and top-K iterates the splice
directly via its forwarded page-frame cursor, preserving the
spliced columns in the output.
The unified branch calls translateOrderByColumnToBase on every ORDER
BY key in a single pass: if any key cannot be resolved to a base
column (e.g. a computed VirtualRecord column), the branch bails and
Sort light handles the query. Otherwise it builds AsyncTopK on the
page-frame leaf via buildAsyncTopKOverStolenFilter and re-wraps
through rewrapOverTopK when a projection wrapper is present.
The two SqlOptimiserTest.testJoinAndUnionQuery... plan assertions are
updated to reflect the new top-K shape: SelectedRecord now sits above
the Async Top K (not below) and keys reference the base column ts
(not the projected alias LAST), matching the unified branch's design.
Tests:
- ExplainPlanTest: testSelectWhereOrderByLimit1..9 cover the bug fix,
the direct page-frame case, the SelectedRecord peel, the
VirtualRecord peel, the no-filter projection path, and the
rejection guards for ORDER BY on a computed column and two-bound
LIMIT. testSelectWhereOrderByLimit_virtualRecordPeel pins the
single-layer VirtualRecord peel; testSelectWhereOrderByLimit_memoizedPassthrough
pins the ColumnFunction.unwrap path that peels MemoizerFunction
wrappers around passthrough columns.
- LimitTest: testTopKThroughProjection and
testTopKThroughVirtualProjection exercise row-level correctness
and repeat the virtual-projection query to catch any leak in
filter, function, page-frame, or comparator state across the
steal / halfClose / transfer boundary.
testTopKDoesNotPeelExtraNullColumnSplice and the
...OrderByNullColumn variant pin the WINDOW JOIN ON (0=1) shape
that constructs ExtraNullColumnCursorFactory: the splice must
survive the gate (output keeps all four columns), and ORDER BY a
spliced null column must not assertion-fail or drop rows.
- AsOfJoinTest.testAsOfJoinWithTopDownFilteredDateaddTimestampProjection
and testAsOfJoinNoKeyWithTopDownFilteredDateaddTimestampProjection
pin the keyed and non-keyed virtual-slave-projection paths against
isProjection regressions.
Fixes questdb#6528
4ec73b4 to
13bf78d
Compare
|
Thanks for the contribution! |
Fixes #6528
Summary
Extend the parallel top-K gate in
SqlCodeGenerator#generateOrderByso it also fires when a column-projection wrapper (SelectedRecordCursorFactoryorVirtualRecordCursorFactory) sits between theORDER BY ... LIMIT Nand the filtered page-frame scan. Before this change, any non-literal SELECT list combined with aWHEREclause forced the query onto the genericSort lightpath, because the projection wrapper hid the steal-able filter from the gate.Problem
Reproduction from the issue:
SELECT * FROM tab WHERE ts2 IN '2025' ORDER BY ts2 DESC LIMIT 10SELECT x FROM tab WHERE ts2 IN '2025' ORDER BY ts2 DESC LIMIT 10SELECT x, * FROM tab WHERE ts2 IN '2025' ORDER BY ts2 DESC LIMIT 10SELECT x + 1 AS xp, ts2 FROM tab WHERE ts2 IN '2025' ORDER BY ts2 DESC LIMIT 10The first two cases already hit the top-K gate because the generated plan is
AsyncJitFilter -> PageFramewith no wrapper in between. The third and fourth cases generate an extraSelectedRecord(duplicate columns) orVirtualRecord(computed column) above the filter, and the gate'ssupportsFilterStealing()/supportsPageFrameCursor()checks look only at the immediate child, so they return false and fall through toSort light. The result set is correct, but the query materializes and sorts every matching row instead of keeping a bounded heap of N.Approach
The codebase already has a projection-peel precedent in
generateAsOfJoinandgenerateLatestBy: when a projection wrapper hides a useful optimization target, peel the wrapper, apply the optimization to the inner factory, and re-wrap.This PR collapses the direct page-frame, direct filter-stealing, and projection-peel shapes into a single unified branch in
generateOrderBy, driven by two default methods onRecordCursorFactory:translateOrderByColumnToBase(int projectedIndex)— returns the corresponding column index in the base metadata, or a negative value when the projected column cannot be resolved (e.g., a computedVirtualRecordcolumn). The default is identity;SelectedRecordCursorFactoryandVirtualRecordCursorFactoryoverride it to chain through their base, so nested wrappers resolve transparently.rewrapOverTopK(RecordCursorFactory topK, RecordMetadata orderedMetadata)— re-wraps a freshly-built top-K factory so the output shape is preserved. The default is a pass-through. Projection wrappers override to re-create themselves over the new base.A cheap structural predicate
canReachPageFrameLeafForTopKdecides whether the gate should attempt the peel at all. When it holds, a shared helperbuildAsyncTopKOverStolenFiltersteals the filter state from the inner filter factory viahalfClose(), buildsAsyncTopKRecordCursorFactoryon the page-frame leaf, and the call site re-wraps via the projection wrapper's override. Ownership of the half-closed inner base transfers to the new top-K factory, matching the AsOf/LatestBy peel precedent.The predicate and the call site peel a single projection layer. Two or more stacked projection wrappers above the filter (rare today) fall through to
Sort lightrather than recursing.Incidental fix: AsOf join peel guards (keyed and non-keyed)
Widening
isProjection()to also returntrueonVirtualRecordCursorFactoryis required by the unified branch above. That widening incidentally made two existing branches inSqlCodeGenerator#generateJoinAsofreachable forVirtualRecordslaves: the keyed peel at line 4369 and the non-keyed peel at line 4502. Both peel a slave projection into aFilteredAsOfJoin*FastRecordCursorFactoryand require a non-null column cross index — onlySelectedRecordprovides one. Before this PR,isProjection()was effectively equivalent to "isSelectedRecord", so the branches relied on that implicit invariant.The keyed branch trips
assert stolenCrossIndex != null. The non-keyed branch is worse: with assertions disabled,FilteredAsOfJoinNoKeyFastRecordCursorFactorytreats a null cross index as "no projection remap" and would silently drop theVirtualRecordlayer's column translation, producing wrong results.This PR adds an explicit
getColumnCrossIndex() != nullguard at both call sites soVirtualRecordslaves fall through to whichever later branch catches them — the same path they took before the widening. The fix is a single extra condition per branch; the AsOf branches' logic is otherwise untouched and the deferred AsOf refactor remains a follow-up (as noted during review). Two regression tests pin the fix:AsOfJoinTest.testAsOfJoinWithTopDownFilteredDateaddTimestampProjectioncovers the keyed path, andAsOfJoinTest.testAsOfJoinNoKeyWithTopDownFilteredDateaddTimestampProjectioncovers the non-keyed path with row-level assertions, so a regression surfaces both with and without-ea.Alternatives considered
AsyncTopKRecordCursorFactoryto accept a projection wrapper directly. Rejected: it would spread projection concerns into the execution engine and duplicate the column-translation logic thatSelectedRecordCursorFactoryalready owns.isProjection()consumers elsewhere in the generator.ORDER BYon computed (non-passthrough) columns. Rejected for this PR:RecordComparatorCompileroperates on the base metadata, and lifting computed keys into it is a larger, separate change. The passthrough-only constraint keeps row-level semantics identical to the non-peeled plan.VirtualRecordCursorFactory.isProjection()tofalseand keep the AsOf branch untouched. Rejected: it would force the top-K gate back toinstanceof VirtualRecordCursorFactorychecks, diverging from the unified-branch design.Changes
core/src/main/java/io/questdb/cairo/sql/RecordCursorFactory.java— two new default methods:translateOrderByColumnToBaseandrewrapOverTopK.core/src/main/java/io/questdb/griffin/engine/table/SelectedRecordCursorFactory.java— overrides both defaults;translateOrderByColumnToBaseguards bounds and chains through base;rewrapOverTopKrecursively rewraps base first, then wraps over the result.core/src/main/java/io/questdb/griffin/engine/table/VirtualRecordCursorFactory.java— overridesisProjection()to return true; overrides both defaults.translateOrderByColumnToBasereturns a negative value when the projected column is a computed expression (ColumnFunction+PriorityMetadata.getBaseColumnIndex()check), making the gate fall back toSort lighttransparently.core/src/main/java/io/questdb/griffin/SqlCodeGenerator.java— unified top-K branch ingenerateOrderBywith single-pass key translation (collapses the two former passes overlistColumnFilterA); acanReachPageFrameLeafForTopKpredicate; a sharedbuildAsyncTopKOverStolenFilterhelper with a leak guard aroundcompileWorkerFiltersConditionally. Also narrows the slave-projection peel guards ingenerateJoinAsof(lines 4369 and 4502) with an explicitgetColumnCrossIndex() != nullcheck (see Incidental fix above).core/src/test/java/io/questdb/test/griffin/ExplainPlanTest.java— plan-shape tests for single-key and multi-keyORDER BYthrough both wrappers, plus negative cases that pin theSort lightfallback when guards don't hold.core/src/test/java/io/questdb/test/griffin/LimitTest.java— row-level correctness tests through the full WAL lifecycle.ts2uses a reversed sequence ((1_001 - x)::timestamp) so a regression mistranslatingORDER BY ts2to the designated timestamptssurfaces as a row-order diff. The virtual-projection test runs five iterations to catch resource-cleanup regressions.core/src/test/java/io/questdb/test/griffin/SqlOptimiserTest.java— updates the twotestJoinAndUnionQueryWithJoinOnDesignatedTimestampColumnWithLastFunctionplan assertions to the new top-K shape:SelectedRecordnow sits aboveAsync Top K, and the key references the base column (ts) rather than the projected alias (LAST). Behavior is unchanged; only the plan arrangement reflects the unified-branch design.core/src/test/java/io/questdb/test/griffin/engine/join/AsOfJoinTest.java— addstestAsOfJoinNoKeyWithTopDownFilteredDateaddTimestampProjection, the non-keyed analog of the existing keyed regression test. Asserts row-level results so a regression in the line-4502 guard surfaces both with-ea(assertion error) and without (silent wrong results).Tradeoffs and limitations
Sort light(small result set after filtering) pay the parallel top-K worker-queue setup cost. The break-even point is dataset-dependent.ORDER BYon a computed column still usesSort light. Lifting computed keys intoRecordComparatorCompileris a larger follow-up.Sort light.isProjection()to coverVirtualRecordrequired guards ingenerateJoinAsof's slave-projection peels at lines 4369 and 4502 (see Incidental fix). No functional divergence for pre-PR inputs: factories that used to enter those branches still enter them; factories that used to skip them still skip them.Test plan
mvn -Dtest=ExplainPlanTest testpasses, including new plan-shape tests for single-key, multi-key, chained-wrapper, and negative-fallback cases.mvn -Dtest=LimitTest testpasses, including the top-K projection tests that exercisedrainWalQueue()and assert row ordering with the reversedts2sequence.mvn -Dtest=ParallelTopKFuzzTest testpasses.mvn -Dtest=OrderByAdviceTest,OrderByWithFilterTest,OrderByWithAsyncFilterTest,OrderByExpressionTest,OrderByWithIntervalFilterTest testpasses unchanged.mvn -Dtest=SqlOptimiserTest#testJoinAndUnionQueryWithJoinOnDesignatedTimestampColumnWithLastFunction,SqlOptimiserTest#testQueryPlanForJoinAndUnionQueryWithJoinOnDesignatedTimestampColumnWithLastFunction testpasses with the updated plan assertions.mvn -Dtest=AsOfJoinTest#testAsOfJoinWithTopDownFilteredDateaddTimestampProjection testpasses with the keyed slave-peel guard (line 4369).mvn -Dtest=AsOfJoinTest#testAsOfJoinNoKeyWithTopDownFilteredDateaddTimestampProjection testpasses with the non-keyed slave-peel guard (line 4502); row-level assertions catch both the-eaassertion path and the production silent-wrong-result path.Async JIT Top Kplan or move fromSort lighttoAsync JIT Top K.