[VL][MINOR] Tighten bounds check in SubstraitParser::checkWindowFunction#12086

Closed
luis4a0 wants to merge 1 commit into apache:main from luis4a0:lpenaranda/fix-check-window-function-bounds


luis4a0 commented May 13, 2026

What changes are proposed in this pull request?

The bounds check in SubstraitParser::checkWindowFunction
(cpp/velox/substrait/SubstraitParser.cc:300) reads:

if ((pos != std::string::npos) &&
    (msg.value().size() >= targetFunction.size()) &&
    (msg.value().substr(pos + config.size(), targetFunction.size()) == targetFunction))

The size comparison msg.value().size() >= targetFunction.size() is
not enough to guarantee that the subsequent substr() call stays
within the buffer — the substring starts at offset
pos + config.size(), so the correct bound is

msg.value().size() >= pos + config.size() + targetFunction.size()

This PR fixes the bound and adds a small cpp/velox/tests/SubstraitParserTest.cc
to lock in the desired behavior.

Is the bug observable today?

No. I want to be honest about this up front. std::string::substr
clamps the requested length to the actual remaining string length,
so the buggy bound + the substr call together produce the right
answer (false) for truncated inputs by accident. None of the current
callers ("row_number", "rank", "dense_rank") exercise the
failure path in any way that produces a wrong result today.
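
To make the clamping concrete, here is a minimal sketch of what the substr() call hands to the equality comparison. `payloadAfterSelector` is a hypothetical helper, not code from SubstraitParser.cc, and passing "window_function=" as the selector is an assumption based on the test descriptions in this PR.

```cpp
#include <string>

// Hypothetical helper: returns the characters the existing condition would
// compare against the target, i.e. up to target.size() characters that
// follow the selector. Precondition: `selector` occurs in `msg`.
std::string payloadAfterSelector(
    const std::string& msg,
    const std::string& selector,
    const std::string& target) {
  const auto pos = msg.find(selector);
  // Because find() succeeded, pos + selector.size() <= msg.size(), so
  // substr() does not throw here; it only shortens the count when fewer
  // than target.size() characters remain.
  return msg.substr(pos + selector.size(), target.size());
}
```

For a truncated message such as "window_function=ra" and target "rank", the helper returns "ra", which is shorter than the target and therefore compares unequal.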

I'm sending the fix anyway because:

  1. It's visibly wrong on inspection — reviewers later have to spend
    cycles confirming whether it's a real bug. Cleaner to fix it now.
  2. std::string::substr's clamping behaviour is the only thing
    keeping this from being a runtime fault. A future refactor to
    std::string_view::substr (different contract — throws
    std::out_of_range on bad bounds) would turn this into a real
    exception with no logical change to the surrounding code.
  3. The fix is a one-line tightening of the bound, with a
    clear correctness argument.

How was this patch tested?

Adds cpp/velox/tests/SubstraitParserTest.cc with five cases:

  • checkWindowFunctionWellFormedMatch — happy path.
  • checkWindowFunctionWellFormedNoMatch — well-formed payload, target doesn't match.
  • checkWindowFunctionTruncatedPayload — the regression case: payload bytes
    after window_function= are shorter than the target. Must return false
    without overrunning the buffer.
  • checkWindowFunctionEmptyPayload — empty payload, no window_function= selector at all.
  • checkWindowFunctionNoOptimization — extension with no optimization payload.

All five tests pass under both the buggy and the fixed version, because
the bug is dormant (see "Is the bug observable today?" above). I validated
this with a standalone harness that compiles the same logic against both
versions of the bounds check (gated by -DUSE_BUGGY=1):

Test                                    Fixed (this PR)   Buggy (current main)
checkWindowFunctionWellFormedMatch      PASS ✅           PASS ✅
checkWindowFunctionWellFormedNoMatch    PASS ✅           PASS ✅
checkWindowFunctionTruncatedPayload     PASS ✅           PASS ✅
checkWindowFunctionEmptyPayload         PASS ✅           PASS ✅
checkWindowFunctionNoOptimization       PASS ✅           PASS ✅

So the test is defensive only — its purpose is to lock in the
desired behavior so any future refactor that breaks the bound check
fails loudly, rather than to demonstrate the buggy version producing
wrong output today.
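
A standalone harness of the kind described above might look roughly like the following sketch. It is not the harness used for this PR: `checkWindowFunctionModel` is a hypothetical stand-in for the real function, the `std::optional` models an extension with no optimization payload, the selector string is assumed to be "window_function=", and returning false for a missing payload is an assumption about the desired behavior.

```cpp
#include <optional>
#include <string>

#ifndef USE_BUGGY
#define USE_BUGGY 0  // build with -DUSE_BUGGY=1 to select the original bound
#endif

// Hypothetical model of the condition under test, not the real
// SubstraitParser::checkWindowFunction.
bool checkWindowFunctionModel(
    const std::optional<std::string>& msg,
    const std::string& targetFunction) {
  static const std::string config = "window_function=";  // assumed selector
  if (!msg.has_value()) {
    return false;  // assumed result when there is no payload at all
  }
  const auto pos = msg->find(config);
  if (pos == std::string::npos) {
    return false;
  }
#if USE_BUGGY
  // Original bound: compares only the total sizes.
  const bool inBounds = msg->size() >= targetFunction.size();
#else
  // Tightened bound: accounts for the offset where the substring starts.
  const bool inBounds =
      msg->size() >= pos + config.size() + targetFunction.size();
#endif
  return inBounds &&
      msg->substr(pos + config.size(), targetFunction.size()) == targetFunction;
}
```

Calling the model with a full match, a non-matching function name, a truncated name, an empty string, and `std::nullopt` corresponds to the five cases listed above; both macro settings should give the same results.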

If reviewers prefer not to add a test for a dormant-bug case, I'm
happy to drop it and ship the 1-line fix on its own.

Marked as draft for one round of review before flipping to ready.

The bounds check at SubstraitParser.cc:300 reads:

  if ((pos != std::string::npos) &&
      (msg.value().size() >= targetFunction.size()) &&
      (msg.value().substr(pos + config.size(), targetFunction.size()) == targetFunction))

The size comparison (msg.value().size() >= targetFunction.size()) is
not enough to guarantee that the subsequent substr() call stays within
the buffer — the substr starts at offset pos + config.size(), so the
correct bound is

  msg.value().size() >= pos + config.size() + targetFunction.size()

Today the bug is dormant: std::string::substr clamps to the actual
length and the resulting string compares unequal to the (longer)
target, so the function returns the right answer (false) for truncated
inputs by accident. There is no observable wrong behavior with any of
the current callers ("row_number", "rank", "dense_rank").

I'm sending this anyway because:

1. It's visibly wrong on inspection — reviewers later have to spend
   cycles confirming whether it's a real bug.
2. std::string::substr's clamping behavior is the only thing keeping
   this from being a runtime fault. A future refactor to
   std::string_view::substr (different contract — throws
   std::out_of_range on bad bounds) would turn this into a real
   exception with no logical change to the surrounding code.

Adds cpp/velox/tests/SubstraitParserTest.cc with five cases: a
well-formed match, a well-formed non-match, the truncated-payload
regression case, an empty payload, and an extension with no
optimization. The test passes under both the buggy and the fixed
version (because the bug is dormant) — its purpose is to lock in
the desired behavior so any future refactor that breaks the bound
check fails loudly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions bot added the VELOX label May 13, 2026

luis4a0 commented May 13, 2026

Closing this — on second look (prompted by reviewer feedback to find a real failing case), the original bounds check is functionally correct, just stylistically odd.

std::string::substr(pos, count) clamps count to size() - pos. Since find() guarantees pos + config.size() <= size() whenever it returns non-npos, the subsequent substr(pos + config.size(), targetFunction.size()) is always safe and always returns at most the requested count of characters. The equality check against targetFunction then either matches the full intended substring or returns false on a clamped (shorter) result — never a wrong answer.

Verified empirically with an exhaustive test across 90 (message, target) pairs including various truncations: zero behavioral differences between the two bound-check variants. std::string_view::substr would also be safe (only throws when pos > size(), which can't happen given the find() contract).
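
A comparison along these lines can be sketched as follows. This is not the 90-pair test mentioned above: the function names, the selector value, and the way truncations are generated (every prefix of each message) are all assumptions made for illustration. It uses std::string_view throughout, so it also exercises the string_view::substr contract.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

constexpr std::string_view kConfig = "window_function=";  // assumed selector

// Original bound: only compares total sizes.
bool looseCheck(std::string_view msg, std::string_view target) {
  const auto pos = msg.find(kConfig);
  return pos != std::string_view::npos && msg.size() >= target.size() &&
      msg.substr(pos + kConfig.size(), target.size()) == target;
}

// Tightened bound from the PR description.
bool tightCheck(std::string_view msg, std::string_view target) {
  const auto pos = msg.find(kConfig);
  return pos != std::string_view::npos &&
      msg.size() >= pos + kConfig.size() + target.size() &&
      msg.substr(pos + kConfig.size(), target.size()) == target;
}

// Counts (message, target) pairs on which the two variants disagree,
// trying every prefix of every message as a truncated input.
std::size_t countDisagreements(
    const std::vector<std::string>& messages,
    const std::vector<std::string>& targets) {
  std::size_t diffs = 0;
  for (const auto& full : messages) {
    for (std::size_t len = 0; len <= full.size(); ++len) {
      const std::string_view msg = std::string_view(full).substr(0, len);
      for (const auto& target : targets) {
        if (looseCheck(msg, target) != tightCheck(msg, target)) {
          ++diffs;
        }
      }
    }
  }
  return diffs;
}
```

The two variants cannot disagree: if the clamped substring equals the target, it has the target's full length, which already implies the tightened bound, and the tightened bound in turn implies the loose one.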

So there's no bug to fix here, dormant or otherwise. Sorry for the noise; thanks for taking a look.

luis4a0 closed this May 13, 2026
luis4a0 deleted the lpenaranda/fix-check-window-function-bounds branch May 13, 2026 09:43