
perf(sql): optimize CASE WHEN on symbol columns to compare by int key instead of string #6788

Merged
bluestreak01 merged 5 commits into master from vi_switch_sym
Feb 19, 2026

@bluestreak01
Member

Summary

  • CASE expressions on symbol columns with static symbol tables now resolve
    string constants to integer symbol keys at init() time and compare by
    int at runtime, avoiding per-row string comparisons.
  • Three picker specializations based on branch count: single-branch (one ==),
    dual-branch (two == comparisons), and multi-branch (IntObjHashMap lookup).
  • When a WHEN value is not found in the symbol table, the branch is silently
    skipped (single/dual pickers get VALUE_NOT_FOUND as resolved key, which
    never matches any record key; multi-branch picker omits it from the map).
  • Non-static symbol tables (e.g. from casts) fall back to the existing
    CharSequence-keyed comparison.
  • Removed unreachable ColumnType.UNDEFINED guard in newInstance() — the
    same error is already thrown by resolvePreferredVariadicType() before the
    factory is called.
  • All new picker classes implement toPlan() producing
    switch(key,'val1',branch1,...,else) format.
  • Added 12 new tests covering: symbol int-key optimization (single, dual,
    multi-branch, with/without null, with missing keys), non-static symbol
    fallback, bind variable as key, type mismatch, and duplicate branch detection
    for float, timestamp, and symbol types.
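
The resolve-once, compare-by-int idea from the summary can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ToySymbolTable`, `ToySinglePicker`, `ToyMultiPicker`, the `-1` "ELSE" return value, and the concrete sentinel value are all hypothetical names and assumptions, and a plain `HashMap` stands in for `IntObjHashMap`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a static symbol table: symbol string -> int key.
final class ToySymbolTable {
    // Sentinel for "string not in the table"; the value is an assumption of this sketch.
    static final int VALUE_NOT_FOUND = -2;
    private final Map<String, Integer> keys = new HashMap<>();

    ToySymbolTable(String... symbols) {
        for (int i = 0; i < symbols.length; i++) {
            keys.put(symbols[i], i); // keys of real symbols are non-negative here
        }
    }

    int keyOf(String symbol) {
        Integer key = keys.get(symbol);
        return key != null ? key : VALUE_NOT_FOUND;
    }
}

// Single-branch picker: one int comparison per row.
final class ToySinglePicker {
    private final String whenValue;
    private int resolvedKey = ToySymbolTable.VALUE_NOT_FOUND;

    ToySinglePicker(String whenValue) {
        this.whenValue = whenValue;
    }

    // Runs once before rows are read; the string lookup happens only here.
    void init(ToySymbolTable table) {
        resolvedKey = table.keyOf(whenValue);
    }

    // Returns the branch index, or -1 meaning "use the ELSE branch".
    int pick(int recordKey) {
        return recordKey == resolvedKey ? 0 : -1;
    }
}

// Multi-branch picker: int key -> branch index; unknown WHEN values are left out.
final class ToyMultiPicker {
    private final String[] whenValues;
    private final Map<Integer, Integer> branchByKey = new HashMap<>();

    ToyMultiPicker(String... whenValues) {
        this.whenValues = whenValues;
    }

    void init(ToySymbolTable table) {
        branchByKey.clear();
        for (int i = 0; i < whenValues.length; i++) {
            int key = table.keyOf(whenValues[i]);
            if (key != ToySymbolTable.VALUE_NOT_FOUND) {
                branchByKey.put(key, i);
            }
        }
    }

    int pick(int recordKey) {
        return branchByKey.getOrDefault(recordKey, -1);
    }
}
```

In this sketch a WHEN value missing from the table simply never matches: the single picker keeps the sentinel as its resolved key, while the multi picker never inserts it into the map.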

Test plan

  • Run SwitchFunctionFactoryTest — all 56 tests pass
  • Verify coverage of new code paths: single/dual/multi symbol pickers,
    VALUE_NOT_FOUND handling, non-static symbol fallback, toPlan() output
  • Verify plan output via EXPLAIN SELECT assertions in symbol tests

🤖 Generated with Claude Code

@bluestreak01 bluestreak01 added the SQL (Issues or changes relating to SQL execution) and Performance (Performance improvements) labels Feb 18, 2026
@coderabbitai

coderabbitai bot commented Feb 18, 2026

Walkthrough

The pull request adds symbol-based switch case handling to SwitchFunctionFactory through new internal picker classes that resolve string keys to integer symbol keys at init() time, so that rows are compared by integer key at runtime. Corresponding PIVOT test expectations are updated to include symbol-specific switch expressions in value computations.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Symbol-based Switch Logic: core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java | Introduces handling for ColumnType.SYMBOL in switch logic with new getSymbolKeyedFunction method, three specialized SymbolPicker implementations (DualPicker, SinglePicker, multi-branch HashMapPicker), and plan generation support for symbol-based switching. Removes prohibition on undefined key types. Adds 273 lines. |
| PIVOT Test Expectations: core/src/test/java/io/questdb/test/griffin/PivotTest.java | Updates multiple PIVOT test value expressions to include symbol-specific switch statements for granular symbol-aware routing of aggregate values. Changes test expectations across several Pivot test cases. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • puzpuzpuz
  • RaphDal
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main performance optimization: CASE WHEN expressions on symbol columns now use integer key comparison instead of string comparison. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, detailing the optimization strategy, picker specializations, fallback behavior, and test coverage. |

@glasstiger
Contributor

[PR Coverage check]

😍 pass : 118 / 119 (99.16%)

file detail

| path | covered line | new line | coverage |
|---|---|---|---|
| 🔵 io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java | 118 | 119 | 99.16% |

@bluestreak01
Member Author

@CodeRabbit review critically

@coderabbitai

coderabbitai bot commented Feb 18, 2026

@bluestreak01: I'll perform a critical review of this PR, focusing on:

  • Design decisions: Why three specialized pickers? Are the thresholds (single/dual/multi) optimal?
  • Edge cases: Handling of VALUE_NOT_FOUND, null values, non-static symbol tables, concurrent modifications
  • Performance implications: Memory overhead of the IntObjHashMap, initialization cost vs. runtime savings
  • Test coverage: The one uncovered line at 99.16% — what scenario is missing?
  • Correctness: Symbol key resolution timing, behavior when symbol tables are modified after init
  • Code quality: Duplication between picker implementations, error handling, plan output consistency

I'll examine the implementation thoroughly and challenge assumptions where warranted.


🧠 Learnings used
Learnt from: javier
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2026-02-06T18:22:28.130Z
Learning: In questdb/questdb, for PRs requested by javier, provide line-cited, push-back style reviews with concrete follow-ups (tests, diffs, and performance considerations).

Learnt from: jerrinot
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2026-02-04T11:18:36.492Z
Learning: For PRs by jerrinot, always use a thorough and critical review mode: question design decisions, challenge assumptions, scrutinize implementations for edge cases and potential issues, and avoid superficial validation. They prefer hard-core reviews that push back on the code.

Learnt from: javier
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2026-02-06T18:22:28.130Z
Learning: In questdb/questdb, javier prefers hard‑core, critical reviews that challenge design choices and edge cases; respond with deep, push-back style analyses for javier’s requests.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai bot commented Feb 18, 2026

@bluestreak01: I'll trigger a critical review now, focusing on design decisions, edge cases, performance implications, and the uncovered line in the coverage report.


🧠 Learnings used
Learnt from: javier
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2026-02-06T18:22:28.130Z
Learning: In questdb/questdb, for PRs requested by javier, provide line-cited, push-back style reviews with concrete follow-ups (tests, diffs, and performance considerations).

Learnt from: javier
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2026-02-06T18:22:28.130Z
Learning: For questdb/questdb, javier expects a rigorous, push-back style review; always include concrete line-cited findings and propose targeted tests/fixes when reviewing javier’s PRs.

Learnt from: jerrinot
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2026-02-04T11:18:36.492Z
Learning: For PRs by jerrinot, always use a thorough and critical review mode: question design decisions, challenge assumptions, scrutinize implementations for edge cases and potential issues, and avoid superficial validation. They prefer hard-core reviews that push back on the code.

Learnt from: javier
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2026-02-06T18:22:28.130Z
Learning: In questdb/questdb, javier prefers hard‑core, critical reviews that challenge design choices and edge cases; respond with deep, push-back style analyses for javier’s requests.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java (2)

605-635: SymbolPicker extends IntFunction solely for lifecycle — document the contract.

getInt() throws UnsupportedOperationException, which means the picker must never be used as a regular function in an expression tree. This is an intentional design choice to piggyback on MultiArgFunction.init() lifecycle, but if a future refactor accidentally exposes this as a column reference or aggregation argument, it would fail at runtime with an opaque error.

The existing Javadoc (lines 599–604) explains the rationale, which is good. getInt() already carries @Override; keep the annotation so the UnsupportedOperationException contract stays visible on the method itself.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java`
around lines 605 - 635, Add a clear Javadoc to SymbolPicker explaining that it
extends IntFunction only to reuse the lifecycle via MultiArgFunction.init() and
must never be evaluated as a real function in expressions (it intentionally
throws UnsupportedOperationException in getInt); keep the existing getInt()
override that throws UOE and ensure the `@Override` annotation remains on
getInt(), so future maintainers see the contract immediately when inspecting the
SymbolPicker class and its getInt method.

670-683: Single and dual pickers should handle the resolvedKey == VALUE_NOT_FOUND case consistently with the multi-branch picker.

In the multi-branch picker, keys not found in the symbol table are omitted from the map (line 721), so they never match. In single/dual pickers, resolvedKey is set to VALUE_NOT_FOUND and the comparison proceeds normally. Per the PR description, this works because VALUE_NOT_FOUND never matches a real symbol key from getInt().

However, this creates a subtle contract difference. If a future QuestDB version changes the symbol key encoding such that VALUE_NOT_FOUND could overlap with a valid key, the single/dual pickers would silently produce wrong results while the multi-branch picker would remain correct.

Consider adding a defensive comment or an assert resolvedKey != SymbolTable.VALUE_NOT_FOUND || ... in pick() to make the invariant explicit.

Also applies to: 780-790
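
One way to make that invariant explicit is sketched below. It is only an illustration of the suggestion, not the PR's code: `GuardedSinglePicker`, the `-1` "ELSE" return value, and the sentinel value are hypothetical.

```java
// Hypothetical single-branch picker with the "not found never matches" rule spelled out.
final class GuardedSinglePicker {
    // Assumed sentinel for a WHEN value that is absent from the symbol table.
    static final int VALUE_NOT_FOUND = -2;

    private final int resolvedKey;

    GuardedSinglePicker(int resolvedKey) {
        this.resolvedKey = resolvedKey;
    }

    // Returns branch 0 on a match, or -1 meaning "use the ELSE branch".
    int pick(int recordKey) {
        // Explicit guard: an unresolved WHEN value cannot match any record key,
        // even if a record key ever happened to equal the sentinel.
        if (resolvedKey == VALUE_NOT_FOUND) {
            return -1;
        }
        return recordKey == resolvedKey ? 0 : -1;
    }
}
```

The guard costs one extra comparison per row; deciding once at init() time whether the branch can ever match would avoid that, at the price of a little more state.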

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java`
around lines 670 - 683, The single/dual pickers (pick method in
SwitchFunctionFactory using resolvedKey1/resolvedKey2) must explicitly handle
the case when resolvedKey == SymbolTable.VALUE_NOT_FOUND the same way the
multi-branch picker does (i.e., treat it as "not found" and never match) or
assert the invariant; update pick() to either add an explicit check (skip
matching when resolvedKey1/2 == SymbolTable.VALUE_NOT_FOUND) or add a defensive
assertion such as assert resolvedKey1 != SymbolTable.VALUE_NOT_FOUND ||
resolvedKey1 == SymbolTable.VALUE_NOT_FOUND_EXPLAINER (and the same for
resolvedKey2) so the intended invariant is documented and enforced; refer to
resolvedKey1, resolvedKey2, pick(), and SymbolTable.VALUE_NOT_FOUND when making
the change to ensure single/dual logic matches multi-branch behavior.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java (2)

166-171: Missing else makes the static-symbol branch easy to misread.

When isSymbolTableStatic() is true, getSymbolKeyedFunction is called and its result is yielded. Because yield exits the switch block immediately, the second yield runs only when the condition is false, so the logic is functionally correct as written. The lack of an explicit else still makes the block easy to misread. Consider making the fallback explicit:

♻️ Clearer control flow with explicit else
             case ColumnType.SYMBOL -> {
                 if (keyFunction instanceof SymbolFunction symFunc && symFunc.isSymbolTableStatic()) {
                     yield getSymbolKeyedFunction(args, argPositions, position, n, symFunc, returnType, elseBranch);
-                }
-                yield getCharSequenceKeyedFunction(args, argPositions, position, n, keyFunction, returnType, elseBranch);
+                } else {
+                    yield getCharSequenceKeyedFunction(args, argPositions, position, n, keyFunction, returnType, elseBranch);
+                }
             }
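
A small standalone sketch can confirm that the two shapes behave identically, since `yield` leaves the switch block as soon as it runs. `YieldDemo`, the `SYMBOL` tag value, and the returned strings are hypothetical stand-ins, and switch expressions require Java 14 or later.

```java
// Compares a switch-expression block with and without an explicit else.
final class YieldDemo {
    static final int SYMBOL = 12; // hypothetical type tag for this sketch

    static String withoutElse(int type, boolean staticTable) {
        return switch (type) {
            case SYMBOL -> {
                if (staticTable) {
                    yield "symbol-keyed"; // exits the switch block here
                }
                yield "char-sequence-keyed"; // reached only when staticTable is false
            }
            default -> "other";
        };
    }

    static String withElse(int type, boolean staticTable) {
        return switch (type) {
            case SYMBOL -> {
                if (staticTable) {
                    yield "symbol-keyed";
                } else {
                    yield "char-sequence-keyed";
                }
            }
            default -> "other";
        };
    }
}
```

Both methods return the same value for every input, so the choice between them is purely about readability.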
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java`
around lines 166 - 171, The case handling for ColumnType.SYMBOL in
SwitchFunctionFactory currently uses a conditional that yields a symbol-keyed
function but then falls through to yielding a char-sequence keyed function
without an explicit else, which is hard to read and risks confusion; change the
block so the symbol-table-static branch explicitly yields the result in an
if/else (i.e., if (keyFunction instanceof SymbolFunction symFunc &&
symFunc.isSymbolTableStatic()) { yield getSymbolKeyedFunction(...); } else {
yield getCharSequenceKeyedFunction(...); }) to make the control flow between
getSymbolKeyedFunction and getCharSequenceKeyedFunction unambiguous.

662-668: Add super.init() calls to picker implementations for defensive correctness.

All three picker classes override init(SymbolTableSource, SqlExecutionContext) without calling super.init(...). While the base Function.init() is currently a no-op, omitting the super call silently breaks the contract if the base class ever adds init logic. This is particularly important given that similar multi-argument functions in the codebase (e.g., ConcatFunctionFactory, window functions) explicitly call super.init(). Add the call in each of the three pickers for consistency and defensive correctness.

♻️ Example fix for SymbolSwitchSinglePicker (apply similarly to DualPicker and SymbolSwitchPicker)
         @Override
         public void init(SymbolTableSource symbolTableSource, SqlExecutionContext executionContext) {
+            super.init(symbolTableSource, executionContext);
             final StaticSymbolTable symbolTable = keyFunction.getStaticSymbolTable();
             assert symbolTable != null;
             resolvedKey = symbolTable.keyOf(strKey);
         }

Also applies to: 714–725, 773–778
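
The risk described here can be illustrated with a toy hierarchy. This is a sketch under assumptions: `BaseFn` and both picker classes are hypothetical and only mimic a base class whose init() later gains real logic; they are not QuestDB types.

```java
// Hypothetical base class whose init() has gained some bookkeeping.
class BaseFn {
    boolean baseInitialised;

    void init() {
        baseInitialised = true; // logic added to the base class after the fact
    }
}

// Override that skips super.init(): the base bookkeeping silently never runs.
final class PickerWithoutSuper extends BaseFn {
    int resolvedKey = -1;

    @Override
    void init() {
        resolvedKey = 7; // stands in for symbolTable.keyOf(strKey)
    }
}

// Override that calls super.init() first, then resolves its own state.
final class PickerWithSuper extends BaseFn {
    int resolvedKey = -1;

    @Override
    void init() {
        super.init();
        resolvedKey = 7;
    }
}
```

Both subclasses resolve their key, but only the second one keeps working as intended once the base class starts doing something in init().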

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/io/questdb/griffin/engine/functions/conditional/SwitchFunctionFactory.java`
around lines 662 - 668, The init methods in the three picker classes
(SymbolSwitchSinglePicker, SymbolSwitchDualPicker, and SymbolSwitchPicker)
should call super.init(symbolTableSource, executionContext) to preserve the
Function.init contract; update each init(...) implementation (e.g., the method
that currently obtains StaticSymbolTable via keyFunction.getStaticSymbolTable()
and sets resolvedKey1/resolvedKey2) to first invoke
super.init(symbolTableSource, executionContext) and then perform the existing
symbol resolution logic.

@bluestreak01 bluestreak01 merged commit b08c7b7 into master Feb 19, 2026
44 checks passed
@bluestreak01 bluestreak01 deleted the vi_switch_sym branch February 19, 2026 13:43
nwoolmer pushed a commit that referenced this pull request Feb 19, 2026
maciulis pushed a commit to maciulis/questdb that referenced this pull request Feb 22, 2026
