[C++] Fix remaining overflow and negative length handling issues in Gandiva string functions #49973

@abtom87

Describe the bug, including details regarding any error messages, version, and platform.

Description:
Two issues remain from PR #49813 review:

1. Overflow check happens after the potential overflow: In quote_utf8 and to_hex_binary, the code computes (2 * in_len) or (2 * text_len) before passing the result to AddWithOverflow. When the input length exceeds INT32_MAX/2, the signed multiplication overflows before the overflow check runs, which is undefined behavior. The code should call MultiplyWithOverflow first, then AddWithOverflow for the additional bytes.
2. Negative length validation gap in concat_ws: The safe_accumulate_word() function returns false for negative lengths, but concat_ws_impl() only checks state.overflow in the loop. Negative lengths on valid inputs can slip through to concat_word(), where they are passed to memcpy() and converted to a huge size_t, causing out-of-bounds reads/writes. Explicit negative length checks with proper error handling are needed.
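
To make the two fixes concrete, here is a minimal, self-contained sketch of the intended ordering of checks. It is not the Gandiva code: checked_mul, checked_add, doubled_len_plus and append_word are hypothetical names invented for illustration, and the helpers use 64-bit widening as a stand-in for the MultiplyWithOverflow and AddWithOverflow helpers named above. The assumption is that a helper returns true when the operation would overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <optional>

// Hypothetical stand-ins for overflow-checked arithmetic helpers.
// Both return true on overflow and leave *out untouched in that case.
inline bool checked_mul(int32_t a, int32_t b, int32_t* out) {
  const int64_t wide = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  if (wide > std::numeric_limits<int32_t>::max() ||
      wide < std::numeric_limits<int32_t>::min()) {
    return true;
  }
  *out = static_cast<int32_t>(wide);
  return false;
}

inline bool checked_add(int32_t a, int32_t b, int32_t* out) {
  const int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  if (wide > std::numeric_limits<int32_t>::max() ||
      wide < std::numeric_limits<int32_t>::min()) {
    return true;
  }
  *out = static_cast<int32_t>(wide);
  return false;
}

// Issue 1: compute 2 * in_len + extra without ever evaluating an unchecked
// signed expression. The multiplication is checked first, then the addition.
// Returns std::nullopt on negative input or overflow.
std::optional<int32_t> doubled_len_plus(int32_t in_len, int32_t extra) {
  if (in_len < 0 || extra < 0) return std::nullopt;
  int32_t doubled = 0;
  if (checked_mul(2, in_len, &doubled)) return std::nullopt;
  int32_t total = 0;
  if (checked_add(doubled, extra, &total)) return std::nullopt;
  return total;
}

// Issue 2: reject a negative length before it can reach memcpy(), where the
// conversion to size_t would turn it into a huge copy size.
// Returns false (and copies nothing) on invalid length or insufficient space.
bool append_word(char* dst, int32_t dst_capacity, int32_t* offset,
                 const char* word, int32_t word_len) {
  assert(dst != nullptr && offset != nullptr);
  if (word_len < 0 || *offset < 0 || *offset > dst_capacity) return false;
  if (word_len > dst_capacity - *offset) return false;
  if (word_len > 0) {
    std::memcpy(dst + *offset, word, static_cast<std::size_t>(word_len));
  }
  *offset += word_len;
  return true;
}
```

In this sketch the caller treats std::nullopt or false as an error and stops, rather than continuing with a partially validated length. How the error is reported in the real functions (for example through the execution context) is outside what this example assumes.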

References:

#49813 (comment) (quote_utf8 overflow)
#49813 (comment) (concat_ws negative lengths)

Component(s)

C++
