GH-50063: [C++] Validate buffer size for row-major tensors#50064
Conversation
|
|
|
|
There was a problem hiding this comment.
Pull request overview
This PR fixes a safety gap in C++ tensor validation where row-major tensors created with implicit strides could bypass the existing buffer-overrun guard, allowing tensors whose shape exceeds the backing buffer to be constructed (notably reachable via IPC ReadTensor where shape comes from metadata).
Changes:
- Run
CheckTensorStridesValidityeven when strides are implicit (row-major), using the computed row-major strides. - Add a regression test ensuring
Tensor::Makerejects row-major tensors whose shape requires more bytes than the buffer provides.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| cpp/src/arrow/tensor.cc | Ensures implicit row-major strides also trigger the buffer-size / overrun validation. |
| cpp/src/arrow/tensor_test.cc | Adds a failure-case regression test for undersized buffers with implicit row-major strides. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Thanks contributing @metsw24-max |
|
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit ca8a194. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 8 possible false positives for unstable benchmarks that are known to sometimes produce them. |
Rationale for this change
ValidateTensorParametersin cpp/src/arrow/tensor.cc only runs theCheckTensorStridesValiditybuffer-overrun guard when strides are passed explicitly. With implicit (row-major) strides it computes strides for overflow but never checks the data buffer is large enough for the shape, so a tensor whose shape exceeds its buffer is accepted and later read out of bounds. This is reachable from IPCReadTensor, where the shape comes from the flatbuffer and the body size is independent of it.What changes are included in this PR?
Run
CheckTensorStridesValidityon the computed row-major strides too.Are these changes tested?
Added a case to
TestTensor.MakeFailureCases.Are there any user-facing changes?
No.
This PR contains a "Critical Fix". Crafted IPC tensor metadata (or any caller building a row-major tensor over an undersized buffer) bypassed the bounds check, enabling an out-of-bounds read.