Skip to content

Commit

Permalink
GH-39582: [C++][Acero] Increase size of Acero TempStack (#40007)
Browse files Browse the repository at this point in the history
We have had problems for a long time with a specific batch job that combines data from different sources. There is something in the data causing an Acero execution plan to hang or crash at random. The problem has been reproduced since Arrow 11.0.0, originally in Ruby, but it has also in Python. There is unfortunately no test case that reliably reproduces the issue in a release build.

However, in a debug build we can see that the batch job causes an overflow on the temp stack in arrow/cpp/src/arrow/compute/util.cc:38. Increasing the size of the stack created in the Acero QueryContext works around the issue, but a real fix should be investigated separately.

**This PR contains a "Critical Fix".**
* Closes: #39582

Lead-authored-by: Sten Larsson <sten@burtcorp.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
  • Loading branch information
stenlarsson and pitrou committed Feb 26, 2024
1 parent a7ac7e0 commit 9a7662b
Show file tree
Hide file tree
Showing 2 changed files with 9 additions and 8 deletions.
2 changes: 1 addition & 1 deletion cpp/src/arrow/acero/query_context.cc
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ size_t QueryContext::max_concurrency() const { return thread_indexer_.Capacity()
Result<util::TempVectorStack*> QueryContext::GetTempStack(size_t thread_index) {
if (!tld_[thread_index].is_init) {
RETURN_NOT_OK(tld_[thread_index].stack.Init(
memory_pool(), 8 * util::MiniBatch::kMiniBatchLength * sizeof(uint64_t)));
memory_pool(), 32 * util::MiniBatch::kMiniBatchLength * sizeof(uint64_t)));
tld_[thread_index].is_init = true;
}
return &tld_[thread_index].stack;
Expand Down
15 changes: 8 additions & 7 deletions cpp/src/arrow/compute/util.cc
Original file line number Diff line number Diff line change
Expand Up @@ -32,17 +32,18 @@ using internal::CpuInfo;
namespace util {

void TempVectorStack::alloc(uint32_t num_bytes, uint8_t** data, int* id) {
int64_t old_top = top_;
top_ += PaddedAllocationSize(num_bytes) + 2 * sizeof(uint64_t);
// Stack overflow check
ARROW_DCHECK(top_ <= buffer_size_);
*data = buffer_->mutable_data() + old_top + sizeof(uint64_t);
int64_t new_top = top_ + PaddedAllocationSize(num_bytes) + 2 * sizeof(uint64_t);
// Stack overflow check (see GH-39582).
// XXX cannot return a regular Status because most consumers do not either.
ARROW_CHECK_LE(new_top, buffer_size_) << "TempVectorStack::alloc overflow";
*data = buffer_->mutable_data() + top_ + sizeof(uint64_t);
// We set 8 bytes before the beginning of the allocated range and
// 8 bytes after the end to check for stack overflow (which would
// result in those known bytes being corrupted).
reinterpret_cast<uint64_t*>(buffer_->mutable_data() + old_top)[0] = kGuard1;
reinterpret_cast<uint64_t*>(buffer_->mutable_data() + top_)[-1] = kGuard2;
reinterpret_cast<uint64_t*>(buffer_->mutable_data() + top_)[0] = kGuard1;
reinterpret_cast<uint64_t*>(buffer_->mutable_data() + new_top)[-1] = kGuard2;
*id = num_vectors_++;
top_ = new_top;
}

void TempVectorStack::release(int id, uint32_t num_bytes) {
Expand Down

0 comments on commit 9a7662b

Please sign in to comment.