NestedLoopJoinExec spill path: untracked allocation overshoots memory pool

### Describe the bug

`NestedLoopJoinExec`'s memory-limited (spill) path allocates more memory than it reserves through the `MemoryPool`. The accounting-pool framework added in #22626 surfaces this if `HEADROOM_FACTOR` in the SLT pool is tightened — see #22721, which lowers it from 8.0 → 5.0 and trips this test.

### To Reproduce

In `datafusion/sqllogictest/src/accounting_pool.rs`, set:

\`\`\`rust
const HEADROOM_FACTOR: f64 = 5.0;
\`\`\`

Then run `nested_loop_join_spill.slt`. First query at line 33:

\`\`\`sql
SET datafusion.execution.target_partitions = 1;
SET datafusion.runtime.memory_limit = '150K';

SELECT count(*) as cnt, min(v1) as mn, max(v1) as mx
FROM generate_series(1, 100000) AS t1(v1)
INNER JOIN generate_series(1, 1) AS t2(v2)
  ON (t1.v1 + t2.v2) > 0;
\`\`\`

### Expected behavior

Query succeeds with allocator usage bounded by the configured `memory_limit` (×~10% slop for legitimate untracked overhead like Tokio/Rayon thread state).

### Actual behavior

\`\`\`
External error: 1 errors in file datafusion/sqllogictest/test_files/nested_loop_join_spill.slt
1. query failed: Other Error: allocator overdraft: account balance at panic = -20245 bytes
   at nested_loop_join_spill.slt:33
\`\`\`

Peak allocator usage reaches ~770KB against a declared 150KB pool — 5.13× the budget. The first ~750KB (5×) is absorbed by the headroom factor; the additional 20245 bytes is allocator usage that was never reserved through `MemoryPool::try_grow`.

### Likely root cause

Spill path in `NestedLoopJoinExec` has at least these untracked allocation sites (need verification with targeted instrumentation):

- `generate_next_batch` buffering — `RecordBatch`es accumulated for the build side before the spill decision
- `concat_batches` at the spill boundary — copies into a coalesced batch before the IPC writer sees it
- `take_native` during the probe phase — gather kernels allocate output buffers
- IPC reader path on spill re-read — the `StreamReader`/`FileReader` decoder owns buffers that aren't accounted

The first query in the test stays small (100K × i32 = ~400KB build side), so the overshoot is whatever lives off-pool in the spill setup, not the bulk data itself.

### Additional context

- Surfaced by the allocator-vs-pool tracker added in #22626.
- Demonstration PR: #22721 (test-only: lowers `HEADROOM_FACTOR`; expected to fail until this is fixed).
- Fix should reserve memory through `MemoryReservation` before each of the sites above, not bump the headroom constant — the constant is a leak detector, not a budget.

### Component

- [x] Physical Plan

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NestedLoopJoinExec spill path: untracked allocation overshoots memory pool #22723

Describe the bug

To Reproduce

Expected behavior

Actual behavior

Likely root cause

Additional context

Component

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

NestedLoopJoinExec spill path: untracked allocation overshoots memory pool #22723

Description

Describe the bug

To Reproduce

Expected behavior

Actual behavior

Likely root cause

Additional context

Component

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions