
branch-4.1: [fix](function) fix tokenize function incorrect result when first argument is const #62699 #63735

Open: github-actions[bot] wants to merge 1 commit into branch-4.1 from auto-pick-62699-branch-4.1


@github-actions
Contributor

Cherry-picked from #62699

…ument is const (#62699)

## Proposed changes

Fix a bug in the `tokenize` function where `unpack_if_const` unwraps a
`ColumnConst` to its inner data column (which has only 1 row), but
`_do_tokenize` and `_do_tokenize_none` iterate based on the source
column's row count. This causes only 1 output row to be produced instead
of `input_rows_count` rows when the first argument is a constant.

For example, `SELECT tokenize('hello world', 'parser=english') FROM
table_with_many_rows` would previously return only 1 row instead of the
expected number of rows matching the table.

The fix wraps the result in `ColumnConst` when the source column was
const, which is the standard pattern used throughout the Doris codebase
for handling const columns in function execution.
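The row-count mismatch described above can be reproduced with a small standalone model. This is only a sketch under stated assumptions: `ToyColumn`, `ToyResult`, `split_ws`, `tokenize_buggy`, and `tokenize_fixed` are hypothetical stand-ins invented for illustration, not the real Doris classes or the code changed in this PR, and whitespace splitting stands in for the actual parser.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical toy types; they only mimic the idea of a const column
// (one physical row that logically repeats for every input row).
using Tokens = std::vector<std::string>;

struct ToyColumn {
    std::vector<std::string> data;  // physical rows
    bool is_const = false;          // true: data holds exactly one row
    std::size_t logical_rows = 0;   // only meaningful when is_const is true
};

struct ToyResult {
    std::vector<Tokens> data;       // physical result rows
    bool is_const = false;
    std::size_t logical_rows = 0;

    // Number of rows a consumer of this result would see.
    std::size_t size() const { return is_const ? logical_rows : data.size(); }

    // A const result answers every row index from its single physical row.
    const Tokens& at(std::size_t row) const {
        assert(row < size());
        return data[is_const ? 0 : row];
    }
};

// Stand-in tokenizer: splits on whitespace.
Tokens split_ws(const std::string& s) {
    std::istringstream in(s);
    Tokens out;
    for (std::string tok; in >> tok;) out.push_back(tok);
    return out;
}

// Buggy shape: the loop follows the unwrapped column's physical row count,
// so a const input yields a single result row regardless of the block size.
ToyResult tokenize_buggy(const ToyColumn& src, std::size_t /*input_rows_count*/) {
    ToyResult res;
    for (const auto& s : src.data) res.data.push_back(split_ws(s));
    return res;
}

// Fixed shape: tokenize the physical rows once, then mark the result as const
// so that it logically spans input_rows_count rows.
ToyResult tokenize_fixed(const ToyColumn& src, std::size_t input_rows_count) {
    ToyResult res;
    for (const auto& s : src.data) res.data.push_back(split_ws(s));
    if (src.is_const) {
        res.is_const = true;
        res.logical_rows = input_rows_count;
    }
    return res;
}
```

With a const input of `'hello world'` and a block of 5 rows, `tokenize_buggy` reports 1 row while `tokenize_fixed` reports 5, each resolving to the same token list. Keeping the single physical row and marking it const also avoids materializing the tokens once per input row.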

## Further comments

Related Jira: DORIS-25296

## Checklist(Required)

1. Does it affect the results of the existing test cases (Yes/No): No
2. Does it need to update the document (Yes/No): No
3. Is there a risk of compatibility changes (Yes/No): No

github-actions[bot] requested a review from yiguolei as a code owner May 27, 2026 08:19

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen
Contributor

run buildall
