GH-50311: [C++] KeyValueMetadata::Delete returns IndexError instead of crashing due to seg fault #50322

Merged
pitrou merged 3 commits into apache:main from OmBiradar:KeyValueMetadata-catch-out-of-bounds on Jul 2, 2026

Conversation

@OmBiradar OmBiradar commented Jul 1, 2026

Contributor
  • The Delete function now detects an out-of-bounds index instead of crashing with a segmentation fault.

Rationale for this change

KeyValueMetadata::Delete(int64_t index) never checked whether index was within bounds. If index was out of bounds, the program hit a segmentation fault and aborted.

What changes are included in this PR?

KeyValueMetadata::Delete(int64_t index) now returns an IndexError for out-of-bounds values of index.
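
A minimal sketch of the kind of guard described here, not the actual Arrow source. `SketchStatus` and `SketchMetadata` are hypothetical stand-ins for `arrow::Status` and `arrow::KeyValueMetadata`, and the error text is invented for illustration:

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for arrow::Status; only what this sketch needs.
struct SketchStatus {
  bool ok;
  std::string message;
  static SketchStatus OK() { return {true, ""}; }
  static SketchStatus IndexError(std::string msg) { return {false, std::move(msg)}; }
};

// Hypothetical simplified container: parallel key and value vectors.
class SketchMetadata {
 public:
  void Append(std::string key, std::string value) {
    keys_.push_back(std::move(key));
    values_.push_back(std::move(value));
  }

  // Rejects an invalid index before touching either vector.
  SketchStatus Delete(std::int64_t index) {
    if (index < 0 || index >= static_cast<std::int64_t>(values_.size())) {
      return SketchStatus::IndexError("index " + std::to_string(index) +
                                      " is out of bounds");
    }
    keys_.erase(keys_.begin() + index);
    values_.erase(values_.begin() + index);
    return SketchStatus::OK();
  }

  std::int64_t size() const { return static_cast<std::int64_t>(keys_.size()); }

 private:
  std::vector<std::string> keys_;
  std::vector<std::string> values_;
};
```

With two entries stored, `Delete(-1)` and `Delete(2)` return a failed status and leave the container unchanged, while `Delete(0)` removes the first pair.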

Are these changes tested?

Yes, CI tests pass on my fork

Are there any user-facing changes?

No API changes

Copilot AI review requested due to automatic review settings July 1, 2026 15:54
@OmBiradar OmBiradar requested a review from pitrou as a code owner July 1, 2026 15:54

github-actions Bot commented Jul 1, 2026

⚠️ GitHub issue #50311 has been automatically assigned in GitHub to PR creator.

Copilot AI left a comment

Contributor

Pull request overview

This PR fixes a crash in the C++ arrow::KeyValueMetadata::Delete(int64_t index) implementation by adding bounds checking so out-of-range indices return Status::IndexError instead of triggering undefined behavior.

Changes:

  • Added an out-of-bounds guard to KeyValueMetadata::Delete(int64_t) using ARROW_PREDICT_FALSE and std::cmp_greater_equal.
  • Returned Status::IndexError when index < 0 or index >= size.
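
For readers unfamiliar with the macro: `ARROW_PREDICT_FALSE` marks a condition as unlikely so the compiler can favour the common path. The sketch below shows the general technique with a hypothetical `SKETCH_PREDICT_FALSE` macro built on the GCC/Clang builtin `__builtin_expect`; this is an assumption for illustration, and Arrow's real definition may differ.

```cpp
#include <cstdint>

// Hypothetical branch-prediction hint, modelled on macros such as
// ARROW_PREDICT_FALSE. On other compilers it degrades to the plain condition.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_PREDICT_FALSE(x) (__builtin_expect(!!(x), 0))
#else
#define SKETCH_PREDICT_FALSE(x) (x)
#endif

// The hint changes code layout only, never the result of the condition.
inline bool IndexOutOfRange(std::int64_t index, std::int64_t size) {
  if (SKETCH_PREDICT_FALSE(index < 0 || index >= size)) {
    return true;  // rare path: caller passed an invalid index
  }
  return false;  // common path: index is valid
}
```

The hint is purely an optimization aid: the function returns the same value with or without it.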

Comment thread cpp/src/arrow/util/key_value_metadata.cc Outdated
Comment thread cpp/src/arrow/util/key_value_metadata.cc
@OmBiradar OmBiradar force-pushed the KeyValueMetadata-catch-out-of-bounds branch 2 times, most recently from c2efac9 to 046bb2e Compare July 2, 2026 06:07
Copilot AI review requested due to automatic review settings July 2, 2026 06:07

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread cpp/src/arrow/util/key_value_metadata.cc Outdated
* The Delete function now catches out of bound index
and does not throw a segmentation fault.

Signed-off-by: OmBiradar <ombiradar04@gmail.com>
@OmBiradar OmBiradar force-pushed the KeyValueMetadata-catch-out-of-bounds branch from 046bb2e to 15fb3bf Compare July 2, 2026 06:16

OmBiradar commented Jul 2, 2026

Contributor Author

This has been a messy PR: I amended the commit several times, mostly because I was unsure about the exact error message.

}

Status KeyValueMetadata::Delete(int64_t index) {
  if (ARROW_PREDICT_FALSE(index < 0 || std::cmp_greater_equal(index, values_.size()))) {

Member

Why std::cmp_greater_equal? Just use the standard comparison operator here.

@OmBiradar OmBiradar Jul 2, 2026

Contributor Author

The types are int64_t and size_t, so a plain comparison operator would convert the int64_t to size_t, which turns negative numbers into extremely large values (2^63 or more with a 64-bit size_t). Even then, the logic would still work for negative values, because the index < 0 check already catches them. As far as I know, std::cmp_greater_equal is the general practice for this kind of comparison?
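
To make the conversion issue concrete, here is a small sketch with two hypothetical helpers. `NaiveLess` spells out the conversion that a mixed `int64_t` / 64-bit unsigned comparison performs implicitly, and `ValueAwareLess` handles the sign first, which is the idea behind the C++20 `std::cmp_*` functions. A 64-bit unsigned size is assumed.

```cpp
#include <cstdint>

// Mimics a mixed signed/unsigned comparison: the signed operand is
// converted to unsigned first, so -1 becomes 2^64 - 1.
inline bool NaiveLess(std::int64_t index, std::uint64_t size) {
  return static_cast<std::uint64_t>(index) < size;
}

// Compares by mathematical value: deal with negatives before converting.
inline bool ValueAwareLess(std::int64_t index, std::uint64_t size) {
  if (index < 0) return true;  // every negative number is below any size
  return static_cast<std::uint64_t>(index) < size;
}
```

For `index = -1` and `size = 3`, `NaiveLess` reports that -1 is not below 3, while `ValueAwareLess` gives the mathematically expected answer. In a guard that already tests `index < 0` first, the negative case never reaches the mixed comparison.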

Member

The Arrow codebase generally casts explicitly, i.e. static_cast<int64_t>(values_.size())

@OmBiradar OmBiradar Jul 2, 2026

Contributor Author

For values_.size() > 2^63 - 1 the cast would overflow and yield a negative int64_t, though it is practically impossible to have that many elements in the first place.

I was just trying to make it secure 😅
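
As a rough illustration of where the explicit cast stops being value-preserving: converting an unsigned 64-bit size to `int64_t` keeps the value as long as it does not exceed `INT64_MAX` (2^63 - 1). `FitsInInt64` is a hypothetical helper written for this sketch, and a 64-bit size is assumed.

```cpp
#include <cstdint>
#include <limits>

// True when an unsigned 64-bit size can be converted to int64_t
// without changing its value.
inline bool FitsInInt64(std::uint64_t size) {
  constexpr auto kMax =
      static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
  return size <= kMax;
}
```

Sizes around 2^31 are far inside the safe range; only sizes of 2^63 and above fall outside it.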

@OmBiradar OmBiradar Jul 2, 2026

Contributor Author

Should I use static_cast<int64_t> for consistency with the project?
@pitrou

Member

Yes, please use that :)

Contributor Author

I changed it to static_cast

Comment thread cpp/src/arrow/util/key_value_metadata.cc Outdated

pitrou commented Jul 2, 2026

Member

@OmBiradar This is not really a "critical fix", as this is mostly an API usage question (the caller should have checked the index bounds). Can you edit the PR description?

@github-actions github-actions Bot added the awaiting committer review label and removed the awaiting review label Jul 2, 2026

OmBiradar commented Jul 2, 2026

Contributor Author

@pitrou Yes, sure, done. I was unsure what "critical fix" means; my rationale for adding it was that a seg fault would be triggered and crash any program.

Copilot AI review requested due to automatic review settings July 2, 2026 08:41

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread cpp/src/arrow/util/key_value_metadata_test.cc
Comment thread cpp/src/arrow/util/key_value_metadata_test.cc
@OmBiradar

Contributor Author

Forgot to update the tests

@OmBiradar OmBiradar force-pushed the KeyValueMetadata-catch-out-of-bounds branch from 5ecddee to 7acf912 Compare July 2, 2026 08:49
@OmBiradar OmBiradar requested review from Copilot and pitrou July 2, 2026 08:50
@OmBiradar OmBiradar force-pushed the KeyValueMetadata-catch-out-of-bounds branch from 7acf912 to 3eee41b Compare July 2, 2026 09:39

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread cpp/src/arrow/util/key_value_metadata.cc Outdated
Comment thread cpp/src/arrow/util/key_value_metadata_test.cc
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
@OmBiradar OmBiradar force-pushed the KeyValueMetadata-catch-out-of-bounds branch from 3eee41b to 926c6c4 Compare July 2, 2026 09:44
Signed-off-by: OmBiradar <ombiradar04@gmail.com>
Copilot AI review requested due to automatic review settings July 2, 2026 11:03

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@pitrou pitrou merged commit 98347d2 into apache:main Jul 2, 2026
61 checks passed
@pitrou pitrou removed the awaiting committer review label Jul 2, 2026
3 participants