perf(sql): optimize CASE WHEN on symbol columns to compare by int key instead of string#6788
perf(sql): optimize CASE WHEN on symbol columns to compare by int key instead of string#6788bluestreak01 merged 5 commits intomasterfrom
Conversation
WalkthroughThe pull request adds symbol-based switch case handling to SwitchFunctionFactory through new internal picker classes that map string keys to integer keys at runtime, enabling fast symbol comparisons. Corresponding PIVOT test expectations are updated to include symbol-specific switch expressions in value computations. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
[PR Coverage check]😍 pass : 118 / 119 (99.16%) file detail
|
|
@CodeRabbit review critically |
|
I'll examine the implementation thoroughly and challenge assumptions where warranted. 🧠 Learnings used✅ Actions performedReview triggered.
|
|
🧠 Learnings used✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
🧹 Nitpick comments (2)
core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java (2)
605-635:SymbolPickerextendsIntFunctionsolely for lifecycle — document the contract.
getInt()throwsUnsupportedOperationException, which means the picker must never be used as a regular function in an expression tree. This is an intentional design choice to piggyback onMultiArgFunction.init()lifecycle, but if a future refactor accidentally exposes this as a column reference or aggregation argument, it would fail at runtime with an opaque error.The existing Javadoc (lines 599–604) explains the rationale, which is good. Consider also adding
@OverrideongetIntexplicitly to ensure the UOE is immediately visible (it's already annotated, just confirming it's intentional).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java` around lines 605 - 635, Add a clear Javadoc to SymbolPicker explaining that it extends IntFunction only to reuse the lifecycle via MultiArgFunction.init() and must never be evaluated as a real function in expressions (it intentionally throws UnsupportedOperationException in getInt); keep the existing getInt() override that throws UOE and ensure the `@Override` annotation remains on getInt(), so future maintainers see the contract immediately when inspecting the SymbolPicker class and its getInt method.
670-683: Single and dual pickers should handle theresolvedKey == VALUE_NOT_FOUNDcase consistently with the multi-branch picker.In the multi-branch picker, keys not found in the symbol table are omitted from the map (line 721), so they never match. In single/dual pickers,
resolvedKeyis set toVALUE_NOT_FOUNDand the comparison proceeds normally. Per the PR description, this works becauseVALUE_NOT_FOUNDnever matches a real symbol key fromgetInt().However, this creates a subtle contract difference. If a future QuestDB version changes the symbol key encoding such that
VALUE_NOT_FOUNDcould overlap with a valid key, the single/dual pickers would silently produce wrong results while the multi-branch picker would remain correct.Consider adding a defensive comment or an
assert resolvedKey != SymbolTable.VALUE_NOT_FOUND || ...inpick()to make the invariant explicit.Also applies to: 780-790
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java` around lines 670 - 683, The single/dual pickers (pick method in SwitchFunctionFactory using resolvedKey1/resolvedKey2) must explicitly handle the case when resolvedKey == SymbolTable.VALUE_NOT_FOUND the same way the multi-branch picker does (i.e., treat it as "not found" and never match) or assert the invariant; update pick() to either add an explicit check (skip matching when resolvedKey1/2 == SymbolTable.VALUE_NOT_FOUND) or add a defensive assertion such as assert resolvedKey1 != SymbolTable.VALUE_NOT_FOUND || resolvedKey1 == SymbolTable.VALUE_NOT_FOUND_EXPLAINER (and the same for resolvedKey2) so the intended invariant is documented and enforced; refer to resolvedKey1, resolvedKey2, pick(), and SymbolTable.VALUE_NOT_FOUND when making the change to ensure single/dual logic matches multi-branch behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java`:
- Around line 605-635: Add a clear Javadoc to SymbolPicker explaining that it
extends IntFunction only to reuse the lifecycle via MultiArgFunction.init() and
must never be evaluated as a real function in expressions (it intentionally
throws UnsupportedOperationException in getInt); keep the existing getInt()
override that throws UOE and ensure the `@Override` annotation remains on
getInt(), so future maintainers see the contract immediately when inspecting the
SymbolPicker class and its getInt method.
- Around line 670-683: The single/dual pickers (pick method in
SwitchFunctionFactory using resolvedKey1/resolvedKey2) must explicitly handle
the case when resolvedKey == SymbolTable.VALUE_NOT_FOUND the same way the
multi-branch picker does (i.e., treat it as "not found" and never match) or
assert the invariant; update pick() to either add an explicit check (skip
matching when resolvedKey1/2 == SymbolTable.VALUE_NOT_FOUND) or add a defensive
assertion such as assert resolvedKey1 != SymbolTable.VALUE_NOT_FOUND ||
resolvedKey1 == SymbolTable.VALUE_NOT_FOUND_EXPLAINER (and the same for
resolvedKey2) so the intended invariant is documented and enforced; refer to
resolvedKey1, resolvedKey2, pick(), and SymbolTable.VALUE_NOT_FOUND when making
the change to ensure single/dual logic matches multi-branch behavior.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java (2)
166-171: Critical bug: missingelsecauses symbol-keyed function to be silently discarded for static symbol tables.When
isSymbolTableStatic()is true,getSymbolKeyedFunctionis called and its result is yielded. However, the code structure without an explicitelseis misleading and actually correct at runtime sinceyieldexits the block — but only ifgetSymbolKeyedFunctiondoesn't throw. If it were to complete normally without yielding (which it can't here), the secondyieldwould execute.Actually, re-reading:
yielddoes exit the block, so the logic is functionally correct. But the lack ofelsemakes this easy to misread. Consider making the fallback explicit:♻️ Clearer control flow with explicit else
case ColumnType.SYMBOL -> { if (keyFunction instanceof SymbolFunction symFunc && symFunc.isSymbolTableStatic()) { yield getSymbolKeyedFunction(args, argPositions, position, n, symFunc, returnType, elseBranch); - } - yield getCharSequenceKeyedFunction(args, argPositions, position, n, keyFunction, returnType, elseBranch); + } else { + yield getCharSequenceKeyedFunction(args, argPositions, position, n, keyFunction, returnType, elseBranch); + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java` around lines 166 - 171, The case handling for ColumnType.SYMBOL in SwitchFunctionFactory currently uses a conditional that yields a symbol-keyed function but then falls through to yielding a char-sequence keyed function without an explicit else, which is hard to read and risks confusion; change the block so the symbol-table-static branch explicitly yields the result in an if/else (i.e., if (keyFunction instanceof SymbolFunction symFunc && symFunc.isSymbolTableStatic()) { yield getSymbolKeyedFunction(...); } else { yield getCharSequenceKeyedFunction(...); }) to make the control flow between getSymbolKeyedFunction and getCharSequenceKeyedFunction unambiguous.
662-668: Addsuper.init()calls to picker implementations for defensive correctness.All three picker classes override
init(SymbolTableSource, SqlExecutionContext)without callingsuper.init(...). While the baseFunction.init()is currently a no-op, omitting the super call silently breaks the contract if the base class ever adds init logic. This is particularly important given that similar multi-argument functions in the codebase (e.g.,ConcatFunctionFactory, window functions) explicitly callsuper.init(). Add the call in each of the three pickers for consistency and defensive correctness.♻️ Example fix for SymbolSwitchSinglePicker (apply similarly to DualPicker and SymbolSwitchPicker)
`@Override` public void init(SymbolTableSource symbolTableSource, SqlExecutionContext executionContext) { + super.init(symbolTableSource, executionContext); final StaticSymbolTable symbolTable = keyFunction.getStaticSymbolTable(); assert symbolTable != null; resolvedKey = symbolTable.keyOf(strKey); }Also applies to: 714–725, 773–778
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java` around lines 662 - 668, The init methods in the three picker classes (SymbolSwitchSinglePicker, SymbolSwitchDualPicker, and SymbolSwitchPicker) should call super.init(symbolTableSource, executionContext) to preserve the Function.init contract; update each init(...) implementation (e.g., the method that currently obtains StaticSymbolTable via keyFunction.getStaticSymbolTable() and sets resolvedKey1/resolvedKey2) to first invoke super.init(symbolTableSource, executionContext) and then perform the existing symbol resolution logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java`:
- Around line 166-171: The case handling for ColumnType.SYMBOL in
SwitchFunctionFactory currently uses a conditional that yields a symbol-keyed
function but then falls through to yielding a char-sequence keyed function
without an explicit else, which is hard to read and risks confusion; change the
block so the symbol-table-static branch explicitly yields the result in an
if/else (i.e., if (keyFunction instanceof SymbolFunction symFunc &&
symFunc.isSymbolTableStatic()) { yield getSymbolKeyedFunction(...); } else {
yield getCharSequenceKeyedFunction(...); }) to make the control flow between
getSymbolKeyedFunction and getCharSequenceKeyedFunction unambiguous.
- Around line 662-668: The init methods in the three picker classes
(SymbolSwitchSinglePicker, SymbolSwitchDualPicker, and SymbolSwitchPicker)
should call super.init(symbolTableSource, executionContext) to preserve the
Function.init contract; update each init(...) implementation (e.g., the method
that currently obtains StaticSymbolTable via keyFunction.getStaticSymbolTable()
and sets resolvedKey1/resolvedKey2) to first invoke
super.init(symbolTableSource, executionContext) and then perform the existing
symbol resolution logic.
… instead of string (questdb#6788)
Summary
string constants to integer symbol keys at
init()time and compare byint at runtime, avoiding per-row string comparisons.
==),dual-branch (two
==comparisons), and multi-branch (IntObjHashMaplookup).skipped (single/dual pickers get
VALUE_NOT_FOUNDas resolved key, whichnever matches any record key; multi-branch picker omits it from the map).
CharSequence-keyed comparison.ColumnType.UNDEFINEDguard innewInstance()— thesame error is already thrown by
resolvePreferredVariadicType()before thefactory is called.
toPlan()producingswitch(key,'val1',branch1,...,else)format.multi-branch, with/without null, with missing keys), non-static symbol
fallback, bind variable as key, type mismatch, and duplicate branch detection
for float, timestamp, and symbol types.
Test plan
SwitchFunctionFactoryTest— all 56 tests passVALUE_NOT_FOUNDhandling, non-static symbol fallback,toPlan()outputEXPLAIN SELECTassertions in symbol tests🤖 Generated with Claude Code