Skip to content

Fix rare crash when transforming sliced nested arrays #3171

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 25, 2023

Conversation

dominiklohmann
Copy link
Member

@dominiklohmann dominiklohmann commented May 25, 2023

This fixes a rather complicated-to-find bug that would cause transformation operators accessing specific nested arrays to produce oversized arrays when used after a slicing operation. This most commonly happened after head and tail, but could also occur after where or the hidden rebatch operator.

When we slice an array, we don't actually modify the array itself. Instead, we just modify the validity bitmap and adjust the internal offset and length values of the array. Now there were two related problems, one being a misunderstanding and the other a bug in Apache Arrow:

  1. The misunderstanding: To combine null bitmaps, nested arrays must be explicitly accessed via arrow::StructArray::Flatten() instead of arrow::StructArray::fields(). The former performs a logical and operation on the null bitmaps of the struct array and the nested array.
  2. The bug: The arrow::FieldPath::Get(...) function returns an array pointing into the passed array or batch. The function recreates the array from the underlying child array data, thus returning the original nested array, as if the outer array were not sliced at all.

This PR fixes these two issues, and also as a bonus a bug that caused tail to return too many events when its input were lots of really small slices.

This fixes a rather complicated-to-find bug that would cause
transformation operators accessing specific nested arrays to produce
oversized arrays when used after a slicing operation. This most commonly
happened after `head` and `tail`, but could also occur after `where` or
the hidden `rebatch` operator.

When we slice an array, we don't actually modify the array itself.
Instead, we just modify the validity bitmap and adjust the internl
offset and length values of the array. Now there were two related
problems, one being an understanding and the other a bug in Apache
Arrow:
1. The misunderstanding: To combine null bitmaps, nested arrays must be
   explicitly accessed via `arrow::StructArray::Flatten()` instead of
   `arrow::StructArray::fields()`. The former performs a logical and
   operation on the null bitmaps of the struct array and the nested
   array.
2. The bug: The `arrow::FieldPath::Get(...)` function returns an array
   pointing into the passed array or batch. The function recreates the
   array from the underlying child array data, thus returning the
   original nested array, as if the outer array were not sliced at all.

This commit fixes both issues.
@dominiklohmann dominiklohmann added the bug Incorrect behavior label May 25, 2023
@dominiklohmann dominiklohmann requested a review from tobim May 25, 2023 11:20
Copy link
Member

@tobim tobim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pair reviewed in a call.

@dominiklohmann dominiklohmann enabled auto-merge May 25, 2023 11:48
This feature will soon be superseded anyways by `vast exec --timeout
'export | …'`, so let's just remove the flaky test.
@dominiklohmann dominiklohmann merged commit f171ca6 into main May 25, 2023
@dominiklohmann dominiklohmann deleted the topic/head-crash branch May 25, 2023 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Incorrect behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants