Open
Description
The warning message in datafusion/physical-plan/src/spill/mod.rs (lines 157-162) is being triggered very aggressively in production environments, causing log noise.
warn!(
"Record batch memory usage ({actual_size} bytes) exceeds the expected limit ({max_record_batch_memory} bytes) \n\
by more than the allowed tolerance ({SPILL_BATCH_MEMORY_MARGIN} bytes).\n\
This likely indicates a bug in memory accounting during spilling.\n\
Please report this issue in https://github.com/apache/datafusion/issues/17340."
);

Since this is a known issue (tracked in #17340) and doesn't affect functional correctness, we should consider one of the following approaches:
Option 1: Downgrade to debug level
Change warn! to debug! to reduce production noise while keeping the diagnostic information available for development.
Option 2: Increase tolerance margin
Adjust SPILL_BATCH_MEMORY_MARGIN from 4096 bytes to a more realistic value that accounts for expected Arrow IPC overhead. However, a realistic value likely varies case by case and would take some investigation to determine.
I lean towards Option 1, since it's the easiest way to cut the log noise.
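To make the two options concrete, here is a minimal std-only sketch of how each one changes what gets logged. It is not the code in spill/mod.rs: `Level` is a stand-in for the `log` crate's levels, `oversize_diagnostic` is a hypothetical helper, and the comparison `actual_size > max_record_batch_memory + margin` is an assumption inferred from the wording of the warning. The 8192-byte margin is an arbitrary illustrative number, not a proposed value.

```rust
/// Stand-in for a logging level (the real code uses `log` crate macros).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Debug,
    Warn,
}

/// Current margin quoted in this issue, in bytes.
const SPILL_BATCH_MEMORY_MARGIN: usize = 4096;

/// Hypothetical helper: returns the diagnostic to emit, if any.
/// ASSUMPTION: the check fires when `actual_size` exceeds limit + margin.
fn oversize_diagnostic(
    actual_size: usize,
    max_record_batch_memory: usize,
    margin: usize,
    level: Level,
) -> Option<(Level, String)> {
    let threshold = max_record_batch_memory.saturating_add(margin);
    if actual_size > threshold {
        let msg = format!(
            "Record batch memory usage ({actual_size} bytes) exceeds the expected limit \
             ({max_record_batch_memory} bytes) by more than the allowed tolerance ({margin} bytes)."
        );
        Some((level, msg))
    } else {
        None
    }
}

fn main() {
    let limit: usize = 1 << 20; // illustrative 1 MiB limit
    let actual = limit + 6000; // overshoots the current 4096-byte margin

    // Option 1: same check, but the message is emitted at debug level.
    let opt1 = oversize_diagnostic(actual, limit, SPILL_BATCH_MEMORY_MARGIN, Level::Debug);
    println!("option 1 level: {:?}", opt1.map(|(level, _)| level));

    // Option 2: keep warn level, but widen the margin (8192 is arbitrary).
    let opt2 = oversize_diagnostic(actual, limit, 8192, Level::Warn);
    println!("option 2 emits a message: {}", opt2.is_some());
}
```

Under these assumptions, Option 1 still produces the diagnostic for every oversized batch but hides it unless debug logging is enabled, while Option 2 suppresses it only for batches that fall inside the wider margin.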
alamb