[C++] ArraySpan::FillFromScalar is unsafe #35581
ArraySpan span;
span.FillFromScalar(scalar);
UseSpan(span); // Fine; the original span is still alive.
return span; // Undefined Behavior; the returned copy views
             // a stack variable whose lifetime has ended.
Due to RVO, the returned Copy and Move constructors haven't been [...]
Copying ArraySpan is not without issues (see apache#35581), and this change seems good by itself.
This is not biting us currently because AFAICT the only place we use FillFromScalar is arrow/cpp/src/arrow/compute/exec.cc (lines 319 to 330 in cd6e2a4)
[...] std::vector from which they are never moved. This is subtle too: as long as the original ArraySpan remains valid, copies will be valid too, so
ArraySpan Thing::get(int i) { return vector_of_spans_[i]; }
Might not segfault if the [...]
Scary. And as I thought more about my proposal of deleting the copy ctor of [...]
If we really care about small buffers being cheap, we should instead consider changing the [...]
Given that [...] We should probably make this clearer in the docstrings, though.
It's more serious than that, since the ArraySpan does own the scratch space, which means that copies are viewing data in a scalar and also data in the original span (which may not exist anymore).
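The hazard described in this comment can be reproduced with a toy type. The following is only a minimal sketch: `ToySpan`, `FillFromValue`, and `ViewsOwnScratch` are hypothetical names made up for illustration and are not part of Arrow; the only assumption carried over from the thread is that the span stores a pointer into its own scratch space.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for ArraySpan (not the Arrow type): `data` may end up
// pointing into the span's own `scratch` member.
struct ToySpan {
  const std::uint8_t* data = nullptr;
  std::uint8_t scratch[16] = {};

  // Mimics a self-referential fill: the value is written into the span's own
  // scratch space and `data` is pointed at it.
  void FillFromValue(std::int32_t value) {
    std::memcpy(scratch, &value, sizeof(value));
    data = scratch;
  }
};

// True when the span views its own scratch space rather than another object's.
inline bool ViewsOwnScratch(const ToySpan& span) {
  return span.data == span.scratch;
}
```

With the compiler-generated copy constructor, `ToySpan copy = original;` copies the pointer member as-is, so `copy.data` still equals `original.scratch`. Reading through the copy once `original` has been destroyed would be undefined behavior, which is the situation the comment warns about.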
### Rationale for this change

Copying ArraySpan is not without issues (see #35581), and this change seems good by itself as it makes it easier for the compiler to SROA [1] the whole `RunEndEncodedArraySpan` class.

[1] https://www.llvm.org/docs/Passes.html#sroa-scalar-replacement-of-aggregates

### What changes are included in this PR?

- Make `array_span` private in `ree_util::RunEndEncodedArraySpan`
- Allow construction of `ree_util::RunEndEncodedArraySpan` with separate `offset` and `length`
- Change `RunEndEncodedBuilder` to avoid a copy of the `ArraySpan` being iterated over
- Prevent instantiation based on implicit conversion from `ArrayData` to `ArraySpan`

### Are these changes tested?

Yes, by the existing unit tests of the various uses of `RunEndEncodedArraySpan`.

### Are there any user-facing changes?

No meaningful changes other than some small tweaks in the interface of a recently added class.

* Closes: #35675

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
ArraySpan contained scratch space inside itself for storing offsets when viewing a scalar as a length=1 array. This could lead to dangling pointers in copies of the ArraySpan since copies' pointers will always refer to the original's scratch space, which may have been destroyed. This patch moves that scratch space into Scalar itself, which is more conformant to the contract of a view since the scratch space will be valid as long as the Scalar is alive, regardless of how ArraySpans are copied.

* Closes: #35581

Authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
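The idea in the commit message above (scratch space owned by the scalar, not by the view) can be illustrated with a toy model. This is a sketch under stated assumptions, not the code from the patch: `ToyScalar`, `ToySpan`, `ReadValue`, and the `mutable` scratch member are all hypothetical choices made for this example.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a Scalar that owns the scratch space itself.
// `mutable` is an assumption of this sketch so that a const scalar can be viewed.
struct ToyScalar {
  std::int32_t value = 0;
  mutable std::uint8_t scratch[16] = {};
};

// Hypothetical stand-in for ArraySpan: a pure view with no storage of its own.
struct ToySpan {
  const std::uint8_t* data = nullptr;

  // Writes the value into the scalar's scratch space and points `data` at it.
  void FillFromScalar(const ToyScalar& scalar) {
    std::memcpy(scalar.scratch, &scalar.value, sizeof(scalar.value));
    data = scalar.scratch;
  }
};

// Reads the viewed value back; only valid while the viewed scalar is alive.
inline std::int32_t ReadValue(const ToySpan& span) {
  std::int32_t out = 0;
  std::memcpy(&out, span.data, sizeof(out));
  return out;
}
```

In this model a copy of the span stays usable after the original span goes out of scope, because every copy points at storage whose lifetime is tied to the scalar rather than to any particular span.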
I am moving this to 14.0.0 as it is not marked as a blocker for the 13.0.0 release, let me know if it should be part of the release though.
Describe the bug, including details regarding any error messages, version, and platform.
`ArraySpan::FillFromScalar` can store pointers into the structure itself (specifically, into `ArraySpan::scratch_space`), which produces an ArraySpan that can easily be moved and copied unsafely.

The capability to view a Scalar as an ArraySpan can be preserved and made safer by restricting access to the span to an explicitly delineated scope.
This has the pleasant side effect of reducing the size of ArraySpan by 16 bytes.
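One way to picture the scoped-access proposal described above is a helper that builds the span, hands it to a callback by reference, and lets the scratch space die when the callback returns. The sketch below is an assumption about what such an interface could look like, not the API proposed in the issue: `ToyScalar`, `ToySpan`, and `VisitScalarAsSpan` are hypothetical names, and deleting the copy operations is an extra choice made here.

```cpp
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical scalar holding a single 32-bit value.
struct ToyScalar {
  std::int32_t value = 0;
};

// Hypothetical view type with no scratch space of its own. Copying is deleted
// in this sketch to discourage the span from escaping the visiting scope.
struct ToySpan {
  const std::uint8_t* data = nullptr;

  ToySpan() = default;
  ToySpan(const ToySpan&) = delete;
  ToySpan& operator=(const ToySpan&) = delete;
};

// Builds a span over scratch space that lives on this function's stack and
// exposes it only for the duration of the `visit` call.
template <typename Visitor>
auto VisitScalarAsSpan(const ToyScalar& scalar, Visitor&& visit) {
  std::uint8_t scratch[16] = {};
  std::memcpy(scratch, &scalar.value, sizeof(scalar.value));

  ToySpan span;
  span.data = scratch;

  const ToySpan& view = span;
  return std::forward<Visitor>(visit)(view);
}
```

A caller would pass a lambda taking `const ToySpan&` and return whatever it computes from the viewed data. Because the scratch buffer lives in the helper's frame instead of inside the span, the span type itself carries no scratch storage in this model.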
Component(s)
C++