GH-47252: [C++][Compute] Fix sort_indices for temporal types in arrow::Table #50270

Merged
pitrou merged 3 commits into apache:main from nfrmtk:fix-sort_indices-of-timestamp-keys-in-table on Jun 29, 2026

Conversation

@nfrmtk nfrmtk commented Jun 26, 2026

Contributor

Rationale for this change

I was unable to use compute::SortIndices with timestamp sort keys because it crashed.

What changes are included in this PR?

Fix the crash. The issue was that the comparator for merging record batches did not convert the timestamp type to its physical variant, so it crashed with a null pointer dereference on the checked_cast result.
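
To illustrate the failure mode, here is a toy model rather than Arrow's actual classes: `ToyArray`, `ToyInt64Array`, `ToyTimestampArray`, `AsPhysical` and `ToPhysical` are all hypothetical names. It assumes the comparator downcasts every sort-key chunk to the physical (int64) array class, so a chunk that still carries the logical timestamp class yields a null pointer unless it is converted first.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Arrow classes.
struct ToyArray {
  virtual ~ToyArray() = default;
};

// Physical representation: plain 64-bit integers.
struct ToyInt64Array : ToyArray {
  std::vector<std::int64_t> values;
  explicit ToyInt64Array(std::vector<std::int64_t> v) : values(std::move(v)) {}
};

// Logical timestamp type: same storage, but a different class.
struct ToyTimestampArray : ToyArray {
  std::vector<std::int64_t> values;
  explicit ToyTimestampArray(std::vector<std::int64_t> v) : values(std::move(v)) {}
};

// What a comparator written against the physical type would do.
// Returns nullptr when handed an array that is still in its logical form.
const ToyInt64Array* AsPhysical(const ToyArray& arr) {
  return dynamic_cast<const ToyInt64Array*>(&arr);
}

// Convert a logical array to its physical form before comparing.
// The values are copied here for simplicity.
std::shared_ptr<ToyArray> ToPhysical(const std::shared_ptr<ToyArray>& arr) {
  if (const auto* ts = dynamic_cast<const ToyTimestampArray*>(arr.get())) {
    return std::make_shared<ToyInt64Array>(ts->values);
  }
  return arr;  // already physical
}
```

In this sketch, calling `AsPhysical` on a `ToyTimestampArray` returns a null pointer, while calling it on the result of `ToPhysical` succeeds, which mirrors the idea of the fix described above.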

Are these changes tested?

Yes.

Are there any user-facing changes?

This PR contains a "Critical Fix": it fixes a crash that occurred even when the API contract was upheld, as described in the rationale above.

@nfrmtk nfrmtk requested a review from pitrou as a code owner June 26, 2026 14:51
Copilot AI review requested due to automatic review settings June 26, 2026 14:51
@github-actions


⚠️ GitHub issue #47252 has been automatically assigned in GitHub to PR creator.

Copilot AI left a comment

Contributor


Pull request overview

Fixes a crash in compute::SortIndices / Table::sort_by when sorting arrow::Table columns with temporal logical types (e.g., timestamp), by ensuring per-batch sort-key arrays are converted to their physical representation before comparator-based merges.

Changes:

  • Convert flattened per-record-batch sort-key arrays to the physical type (GetPhysicalType + GetPhysicalArray) when constructing ResolvedTableSortKey chunks.
  • Add a regression test covering Table multi-key sorting on timestamp columns across multiple JSON chunks (record batches).
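
As a rough illustration of why every chunk must be in physical form before the merge, the following sketch sorts row indices across several chunks with a single comparator. It is a simplified stand-in, not Arrow's implementation: `Chunk`, `Location`, `Resolve` and `SortIndicesChunked` are hypothetical names, and the chunks are assumed to already hold physical int64 values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical: one sort-key chunk, already converted to physical int64 values.
using Chunk = std::vector<std::int64_t>;

struct Location {
  std::size_t chunk;
  std::size_t offset;
};

// Map a global row index to the chunk that holds it and the offset within it.
Location Resolve(const std::vector<Chunk>& chunks, std::size_t row) {
  std::size_t c = 0;
  while (row >= chunks[c].size()) {
    row -= chunks[c].size();
    ++c;
  }
  return {c, row};
}

// Sort each chunk's slice of indices, then merge it into the sorted prefix.
std::vector<std::size_t> SortIndicesChunked(const std::vector<Chunk>& chunks) {
  std::size_t total = 0;
  for (const auto& c : chunks) total += c.size();

  std::vector<std::size_t> indices(total);
  std::iota(indices.begin(), indices.end(), std::size_t{0});

  // The comparator reads every chunk as int64, so all chunks must be physical.
  auto less = [&chunks](std::size_t a, std::size_t b) {
    const Location la = Resolve(chunks, a);
    const Location lb = Resolve(chunks, b);
    return chunks[la.chunk][la.offset] < chunks[lb.chunk][lb.offset];
  };

  std::size_t begin = 0;
  for (const auto& c : chunks) {
    auto first = indices.begin() + static_cast<std::ptrdiff_t>(begin);
    auto last = first + static_cast<std::ptrdiff_t>(c.size());
    std::stable_sort(first, last, less);
    std::inplace_merge(indices.begin(), first, last, less);
    begin += c.size();
  }
  return indices;
}
```

For example, with the two chunks `{3, 1}` and `{2, 0}`, this sketch returns the global row indices ordered by ascending value. Null handling and multiple sort keys, which the regression test exercises, are left out here.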

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| cpp/src/arrow/compute/kernels/vector_sort_internal.h | Ensure table sort-key chunks use physical array types to prevent comparator downcast crashes for temporal logical types. |
| cpp/src/arrow/compute/kernels/vector_sort_test.cc | Add regression coverage for Table sort indices on timestamp keys across multiple chunks and null placement variants. |

Comment thread cpp/src/arrow/compute/kernels/vector_sort_internal.h Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 26, 2026 15:34

Copilot AI left a comment

Contributor


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@pitrou pitrou left a comment

Member


Thanks for diagnosing and fixing this, @nfrmtk. I'll wait for CI and then merge.

@github-actions github-actions Bot added the "awaiting committer review" label and removed the "awaiting review" label Jun 29, 2026

@pitrou pitrou left a comment

Member


(just another review to add a suggestion)

Comment thread cpp/src/arrow/compute/kernels/vector_sort_test.cc Outdated
Copilot AI review requested due to automatic review settings June 29, 2026 13:56

Copilot AI left a comment

Contributor


Copilot was unable to review this pull request because the user who requested the review is ineligible. To be eligible to request a review, you need a paid Copilot license, or your organization must enable Copilot code review.

@pitrou pitrou changed the title GH-47252: [C++][Compute] Fix sort_indices for temporal types in arrow::Table. GH-47252: [C++][Compute] Fix sort_indices for temporal types in arrow::Table Jun 29, 2026
pitrou commented Jun 29, 2026

Member

CI failures are unrelated, I'll merge.

@github-actions


⚠️ GitHub issue #47252 has been automatically assigned in GitHub to PR creator.

@pitrou pitrou merged commit bf108d4 into apache:main Jun 29, 2026
54 of 56 checks passed
@pitrou pitrou removed the "awaiting committer review" label Jun 29, 2026
@conbench-apache-arrow


After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit bf108d4.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

nfrmtk commented Jun 30, 2026

Contributor Author

> Thanks for diagnosing and fixing this @nfrmtk

You're welcome.
