
[fix](columns) Add bounds checking in ColumnStr::insert_indices_from …#60776

Open
aryash45 wants to merge 3 commits into apache:master from aryash45:fix/issue-60713-insert-indices-from-sigsegv


@aryash45

…to prevent SIGSEGV (#60713)

The insert_indices_from function in ColumnStr accesses src_offset_data[src_offset] without validating that src_offset is within the valid range of the source column. If an out-of-bounds index is passed, this causes a SIGSEGV crash.

Changes:

  • Add early return for empty indices list
  • Add validation that source column is not empty when indices are provided
  • Add bounds checking loop to verify all indices are within [0, src_size)
  • Strengthen DCHECK in MutableBlock::add_rows to a runtime check
  • Add unit test for out-of-bounds indices scenarios
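The first three changes can be illustrated with a minimal sketch. It does not reproduce the Doris code: `SimpleStringColumn` is a hypothetical, simplified stand-in for a string column (a flat byte buffer plus end offsets), and throwing `std::out_of_range` is an assumption chosen for the example, not necessarily how the patch reports the error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a string column: all bytes are stored
// back to back in `chars`, and offsets[i] is the end position of row i.
struct SimpleStringColumn {
    std::vector<char> chars;
    std::vector<uint32_t> offsets;

    size_t size() const { return offsets.size(); }

    void push_back(const std::string& s) {
        chars.insert(chars.end(), s.begin(), s.end());
        offsets.push_back(static_cast<uint32_t>(chars.size()));
    }

    // Returns a copy of row `row`; the caller must pass a valid row number.
    std::string at(size_t row) const {
        const uint32_t begin = row == 0 ? 0 : offsets[row - 1];
        return std::string(chars.begin() + begin, chars.begin() + offsets[row]);
    }

    // Appends the rows of `src` selected by [indices_begin, indices_end).
    // Every index is validated before src.offsets is read.
    void insert_indices_from(const SimpleStringColumn& src, const uint32_t* indices_begin,
                             const uint32_t* indices_end) {
        if (indices_begin == indices_end) {
            return; // empty index list: nothing to copy
        }
        const size_t src_size = src.size();
        // Validate first so a bad index leaves this column unmodified.
        // An empty source is covered too: every index fails `*it >= 0`.
        for (const uint32_t* it = indices_begin; it != indices_end; ++it) {
            if (*it >= src_size) {
                throw std::out_of_range("index " + std::to_string(*it) +
                                        " out of range for source column of size " +
                                        std::to_string(src_size));
            }
        }
        for (const uint32_t* it = indices_begin; it != indices_end; ++it) {
            push_back(src.at(*it));
        }
    }
};
```

Checking all indices before copying costs an extra pass over the index list, but it means a rejected call has no partial effect on the destination column.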

What problem does this PR solve?

Issue Number: close #60713

Problem Summary:
BE nodes crash with SIGSEGV in ColumnStr<unsigned int>::insert_indices_from during INSERT...SELECT operations. The crash occurs because indices are accessed without bounds validation, leading to out-of-bounds memory access when invalid indices are passed during data exchange between BE nodes.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer )

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Feb 16, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@aryash45 aryash45 force-pushed the fix/issue-60713-insert-indices-from-sigsegv branch from 6e83725 to 878c24e Compare February 16, 2026 06:38
@aryash45
Author

aryash45 commented Feb 16, 2026

Hey @morningman @zclllyybb, please review my PR and let me know if any changes are needed.

@zclllyybb
Contributor

hey @morningman @zclllyybb please review my pr and tell me if any change is needed

Thanks for your PR.

Actually, I think this may not be the proper patch. For an out-of-bounds issue, we should figure out why an invalid index was passed and who passed it. That is where the fix belongs.
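One way to locate the source of a bad index is to validate it where it is produced rather than where it is consumed. The sketch below is an illustration only: `RowSelector` and the `origin` label are hypothetical names, not part of Doris, and the real call path that builds the index list is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: collects row indices for one source block and rejects
// an invalid index at the point where it is produced.
class RowSelector {
public:
    explicit RowSelector(size_t source_rows) : _source_rows(source_rows) {}

    // `origin` names the caller so the error identifies who produced the index.
    void add(uint32_t row, const std::string& origin) {
        if (row >= _source_rows) {
            throw std::out_of_range(origin + " produced row " + std::to_string(row) +
                                    " for a block of " + std::to_string(_source_rows) +
                                    " rows");
        }
        _rows.push_back(row);
    }

    const std::vector<uint32_t>& rows() const { return _rows; }

private:
    size_t _source_rows;         // number of rows in the block being indexed
    std::vector<uint32_t> _rows; // indices accepted so far
};
```

With a check like this, the failure message names the component that generated the index, which is the information needed to fix the root cause.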

@aryash45
Author

Thanks for your reply, @zclllyybb. I will look into it and push a commit.

void ColumnStr<T>::insert_indices_from(const IColumn& src, const uint32_t* indices_begin,
                                       const uint32_t* indices_end) {
    if (indices_begin == indices_end) {
        return;
Contributor

In normal cases, this should not happen. Did you find the real reason why the indices are invalid?

Development

Successfully merging this pull request may close these issues.

[Bug] doris::vectorized::ColumnStr<unsigned int>::insert_indices_from

4 participants