Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-35450: [C++] Return error when RecordBatch::ToStructArray called with mismatched column lengths #36654

Merged

Conversation

benibus
Copy link
Collaborator

@benibus benibus commented Jul 13, 2023

Rationale for this change

If a RecordBatch is created with column lengths that don't match the provided num_rows (technically invalid), then there are some circumstances where ToStructArray will successfully return an array whose length doesn't match num_rows. Instead, we should return an error.

What changes are included in this PR?

  • Add a small validation check to ToStructArray before constructing the output array
  • Add a test

Are these changes tested?

Yes (tests are included)

Are there any user-facing changes?

No

@@ -301,12 +318,8 @@ namespace {

Status ValidateBatch(const RecordBatch& batch, bool full_validation) {
for (int i = 0; i < batch.num_columns(); ++i) {
RETURN_NOT_OK(ValidateColumnLength(batch, i));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: passing array created in line 304 to ValidateColumnLength to avoid creating shared_ptr twice.

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Jul 13, 2023
@pitrou pitrou merged commit 9b77edd into apache:main Aug 10, 2023
36 checks passed
@pitrou pitrou removed the awaiting committer review Awaiting committer review label Aug 10, 2023
@github-actions github-actions bot added the awaiting committer review Awaiting committer review label Aug 10, 2023
@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 9b77edd.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…called with mismatched column lengths (apache#36654)

### Rationale for this change

If a `RecordBatch` is created with column lengths that don't match the provided `num_rows` (technically invalid), then there are some circumstances where `ToStructArray` will successfully return an array whose length doesn't match `num_rows`. Instead, we should return an error.

### What changes are included in this PR?

* Add a small validation check to `ToStructArray` before constructing the output array
* Add a test

### Are these changes tested?

Yes (tests are included)

### Are there any user-facing changes?

No

* Closes: apache#35450

Authored-by: benibus <bpharks@gmx.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants