Is your feature request related to a problem or challenge?
issue: apache/datafusion-comet#3873
Currently, DataFusion Comet cannot trigger DataFusion native operators to spill in response to memory pressure from Spark’s task memory manager. When the Spark task memory manager is under pressure, it may ask one consumer to spill so that another consumer in the same task can make progress. Comet can route that request into native code, but DataFusion currently has no interface for asking a spill-capable operator to reclaim memory in response to such an external request.
Describe the solution you'd like
What we are considering right now:
- Add an interface that lets spill-capable operators expose their reclaim/spill behavior, so that an embedder such as Comet can ask them to free memory on request.
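
To make the idea concrete, here is a rough sketch of what such an interface could look like. Everything in it is hypothetical and for discussion only: `MemoryReclaimer`, `ReclaimRegistry`, and `BufferingOperator` are made-up names, not existing DataFusion APIs, and the "spill" is simulated by decrementing a counter rather than writing to disk.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical trait (not part of DataFusion today): implemented by
/// operators that can free memory on request, e.g. by spilling to disk.
pub trait MemoryReclaimer: Send + Sync {
    /// Try to free up to `target_bytes`; return the bytes actually freed.
    fn reclaim(&self, target_bytes: usize) -> usize;
}

/// Toy stand-in for a spill-capable operator that buffers data in memory.
pub struct BufferingOperator {
    buffered_bytes: Mutex<usize>,
}

impl BufferingOperator {
    pub fn new(buffered_bytes: usize) -> Self {
        Self { buffered_bytes: Mutex::new(buffered_bytes) }
    }

    pub fn buffered_bytes(&self) -> usize {
        *self.buffered_bytes.lock().unwrap()
    }
}

impl MemoryReclaimer for BufferingOperator {
    fn reclaim(&self, target_bytes: usize) -> usize {
        // A real operator would write buffered batches to a spill file here;
        // this sketch only adjusts the accounting.
        let mut buffered = self.buffered_bytes.lock().unwrap();
        let freed = target_bytes.min(*buffered);
        *buffered -= freed;
        freed
    }
}

/// Hypothetical registry that an embedder (such as Comet) would call when
/// the external memory manager asks the native side to spill.
#[derive(Default)]
pub struct ReclaimRegistry {
    reclaimers: Mutex<Vec<Arc<dyn MemoryReclaimer>>>,
}

impl ReclaimRegistry {
    pub fn register(&self, reclaimer: Arc<dyn MemoryReclaimer>) {
        self.reclaimers.lock().unwrap().push(reclaimer);
    }

    /// Ask registered operators, in registration order, to free memory
    /// until `target_bytes` is reached or no operator has more to give.
    pub fn request_reclaim(&self, target_bytes: usize) -> usize {
        let reclaimers = self.reclaimers.lock().unwrap();
        let mut freed = 0;
        for reclaimer in reclaimers.iter() {
            if freed >= target_bytes {
                break;
            }
            freed += reclaimer.reclaim(target_bytes - freed);
        }
        freed
    }
}
```

In this sketch, the embedder would call `request_reclaim` from the code path that handles the external spill request and report the returned byte count back to the caller. Where the registry would live (for example, alongside the memory pool) and whether reclaim should be synchronous are open design questions.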
Describe alternatives you've considered
- Keep the current behavior: DataFusion native code only performs local spills, and Spark makes its own consumers spill as usual.
- Do not trigger a spill in DataFusion directly; instead, send DataFusion native code a different signal.
Additional context
- This may be generally useful for any Spark accelerator that embeds DataFusion in a similar way to Comet.