BufferExec and AnalyzeExec should report eager evaluation #22708

### Describe the bug

[`EvaluationType::Eager`](https://github.com/apache/datafusion/blob/5c92390921d8d667aa7cb7d56276a59ba36926f4/datafusion/physical-plan/src/execution_plan.rs#L973-L986) is documented as an operator stream that eagerly generates `RecordBatch` values in one or more spawned Tokio tasks. `BufferExec` and `AnalyzeExec` both appear to match that behavior, but their `PlanProperties` do not report eager evaluation.

In apache/datafusion@5c92390921d8d667aa7cb7d56276a59ba36926f4:

- [`BufferExec::new(...)`](https://github.com/apache/datafusion/blob/5c92390921d8d667aa7cb7d56276a59ba36926f4/datafusion/physical-plan/src/buffer.rs#L102-L105) clones the input properties and only changes `SchedulingType` to `Cooperative`, so it keeps the input evaluation type.
- [`BufferExec::execute(...)`](https://github.com/apache/datafusion/blob/5c92390921d8d667aa7cb7d56276a59ba36926f4/datafusion/physical-plan/src/buffer.rs#L181-L223) wraps the input stream in `MemoryBufferedStream::new(...)`.
- [`MemoryBufferedStream::new(...)`](https://github.com/apache/datafusion/blob/5c92390921d8d667aa7cb7d56276a59ba36926f4/datafusion/physical-plan/src/buffer.rs#L320-L331) immediately creates a `SpawnedTask` that polls the input stream into an internal queue.

Similarly:

- [`AnalyzeExec::compute_properties(...)`](https://github.com/apache/datafusion/blob/5c92390921d8d667aa7cb7d56276a59ba36926f4/datafusion/physical-plan/src/analyze.rs#L165-L174) calls `PlanProperties::new(...)`, leaving `evaluation_type` at the default `Lazy`.
- [`AnalyzeExec::execute(...)`](https://github.com/apache/datafusion/blob/5c92390921d8d667aa7cb7d56276a59ba36926f4/datafusion/physical-plan/src/analyze.rs#L232-L253) uses `RecordBatchReceiverStream::builder(...)` and calls `builder.run_input(...)` for each input partition.
- The comments in `AnalyzeExec::execute(...)` describe those futures as running input partitions in parallel on separate Tokio tasks.
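
To make "eager" concrete, here is a minimal, self-contained sketch of the pattern both operators follow: a background worker starts draining the input as soon as the stream is constructed, independent of whether the consumer has polled anything yet. This is not DataFusion code. It swaps the Tokio task for a `std::thread` and the bounded memory queue for an unbounded `mpsc` channel, and `buffer_eagerly` is a hypothetical name used only for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Illustrative only: starts a background worker that drains `input`
/// into a queue immediately, before the caller reads a single item.
/// Returns the worker handle (yielding the number of items produced)
/// and the receiving end of the queue.
pub fn buffer_eagerly<I>(input: I) -> (thread::JoinHandle<usize>, mpsc::Receiver<I::Item>)
where
    I: Iterator + Send + 'static,
    I::Item: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut produced = 0;
        for item in input {
            // Stop early if the consumer has dropped the receiver.
            if tx.send(item).is_err() {
                break;
            }
            produced += 1;
        }
        produced
    });
    (handle, rx)
}
```

In this sketch the whole input can be consumed before the caller reads from the receiver, which is the behavior an `Eager` label is meant to advertise to plan analysis.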

This makes `EvaluationType` less reliable for physical-plan analysis. DataFusion already exposes [`need_data_exchange(plan)`](https://github.com/apache/datafusion/blob/5c92390921d8d667aa7cb7d56276a59ba36926f4/datafusion/physical-plan/src/execution_plan.rs#L1219-L1220), implemented as:

```rust
plan.properties().evaluation_type == EvaluationType::Eager
```

so any operator that polls its children from spawned tasks but still reports `Lazy` is missed by callers relying on this check.
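
The effect on plan analysis can be sketched with a toy model. The types below are simplified stand-ins, not DataFusion's real `ExecutionPlan` or `PlanProperties`; `PlanNode`, `spawns_task`, `missed_eager_operators`, `example_plan`, and the `SomeEagerOp` and `Scan` nodes are hypothetical names introduced for illustration. Only the comparison inside `need_data_exchange` mirrors the expression quoted above.

```rust
// Toy stand-ins for illustration only; NOT DataFusion's real types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EvaluationType {
    Lazy,
    Eager,
}

/// Hypothetical, trimmed-down plan node with only the fields needed here.
#[derive(Debug)]
pub struct PlanNode {
    pub name: &'static str,
    /// What the operator's properties claim.
    pub reported: EvaluationType,
    /// Whether `execute` actually polls its input from a spawned task.
    pub spawns_task: bool,
    pub children: Vec<PlanNode>,
}

/// Mirrors the shape of the check quoted above.
pub fn need_data_exchange(node: &PlanNode) -> bool {
    node.reported == EvaluationType::Eager
}

/// Walks the plan and collects operators that spawn tasks
/// but are not flagged by `need_data_exchange`.
pub fn missed_eager_operators(root: &PlanNode) -> Vec<&'static str> {
    let mut missed = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        if node.spawns_task && !need_data_exchange(node) {
            missed.push(node.name);
        }
        stack.extend(node.children.iter());
    }
    missed
}

/// A single-chain plan shaped like the situation described in this issue.
pub fn example_plan() -> PlanNode {
    let scan = PlanNode {
        name: "Scan",
        reported: EvaluationType::Lazy,
        spawns_task: false,
        children: vec![],
    };
    let eager = PlanNode {
        name: "SomeEagerOp",
        reported: EvaluationType::Eager,
        spawns_task: true,
        children: vec![scan],
    };
    let buffer = PlanNode {
        name: "BufferExec",
        reported: EvaluationType::Lazy,
        spawns_task: true,
        children: vec![eager],
    };
    PlanNode {
        name: "AnalyzeExec",
        reported: EvaluationType::Lazy,
        spawns_task: true,
        children: vec![buffer],
    }
}
```

Under these assumptions, a walker that trusts the reported value flags only `SomeEagerOp` and skips both `AnalyzeExec` and `BufferExec`.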

### To Reproduce

Inspect the physical properties of these operators:

1. Construct a `BufferExec` over a child whose `evaluation_type` is `Lazy`.
2. Check `buffer_exec.properties().evaluation_type`.
3. Construct an `AnalyzeExec` with a normal input plan.
4. Check `analyze_exec.properties().evaluation_type`.

Both report `EvaluationType::Lazy` in these cases, even though their `execute(...)` paths drive input polling from spawned tasks.

### Expected behavior

If the documented contract for `EvaluationType::Eager` is intended to identify operators that drive child stream polling from spawned Tokio tasks, then `BufferExec` and `AnalyzeExec` should build their `PlanProperties` with `with_evaluation_type(EvaluationType::Eager)`.

`BufferExec` should probably always report eager evaluation because its buffering stream creates a background task to poll the input stream.

`AnalyzeExec` should probably report eager evaluation because it runs input partitions through `RecordBatchReceiverStream::builder(...).run_input(...)`.

If this is not the intended meaning of `EvaluationType::Eager`, then the docs for `EvaluationType` and/or `need_data_exchange(...)` should be clarified so callers know which eager child-polling operators are intentionally excluded.
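
If the first interpretation is the intended one, the change for `BufferExec` could look roughly like the sketch below. It uses a toy `PlanProperties` with only two fields so that it compiles on its own; the real struct carries much more state. The `NonCooperative` default for the scheduling type is an assumption of this model, and `buffer_properties_current` and `buffer_properties_proposed` are hypothetical helper names, not functions in DataFusion.

```rust
// Toy stand-ins for illustration only; NOT DataFusion's real types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EvaluationType {
    Lazy,
    Eager,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SchedulingType {
    NonCooperative,
    Cooperative,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct PlanProperties {
    pub evaluation_type: EvaluationType,
    pub scheduling_type: SchedulingType,
}

impl PlanProperties {
    /// Defaults assumed by this toy model: lazy and non-cooperative.
    pub fn new() -> Self {
        Self {
            evaluation_type: EvaluationType::Lazy,
            scheduling_type: SchedulingType::NonCooperative,
        }
    }

    pub fn with_evaluation_type(mut self, evaluation_type: EvaluationType) -> Self {
        self.evaluation_type = evaluation_type;
        self
    }

    pub fn with_scheduling_type(mut self, scheduling_type: SchedulingType) -> Self {
        self.scheduling_type = scheduling_type;
        self
    }
}

/// Behavior as described in this issue: clone the input properties
/// and change only the scheduling type.
pub fn buffer_properties_current(input: &PlanProperties) -> PlanProperties {
    input
        .clone()
        .with_scheduling_type(SchedulingType::Cooperative)
}

/// Proposed behavior: additionally mark the operator as eager,
/// regardless of what the input reports.
pub fn buffer_properties_proposed(input: &PlanProperties) -> PlanProperties {
    input
        .clone()
        .with_scheduling_type(SchedulingType::Cooperative)
        .with_evaluation_type(EvaluationType::Eager)
}
```

The analogous change for `AnalyzeExec` would be to chain the same override onto the properties it constructs in `compute_properties(...)`.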

### Additional context

This does not appear to be a query-result correctness issue. It is a physical-plan metadata consistency issue for optimizers and integrations that use DataFusion metadata to reason about execution topology.

