Stage level resource management to handle offheap/onheap memory conflict #4392

@FelixYBW

Description

As Gluten currently can't fully cover Spark functions, we have to fall back some operators to vanilla Spark, which leads to a conflict between offheap and onheap memory. The currently suggested solution is that, for queries that have both fallback and native operators, we configure a large offheap memory and a large onheap memory; the number of executors then has to be decreased because of the memory size constraint.
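To make the cost of that workaround concrete, here is a rough back-of-the-envelope sketch. All numbers (node size, per-executor onheap/offheap sizes, overhead) are hypothetical and only illustrate why sizing both pools for the worst case reduces the executor count.

```python
def executors_per_node(node_memory_gb, onheap_gb, offheap_gb, overhead_gb=0):
    """Number of executors that fit on one node when each executor
    reserves an onheap pool, an offheap pool, and some overhead."""
    per_executor_gb = onheap_gb + offheap_gb + overhead_gb
    return node_memory_gb // per_executor_gb


# Hypothetical 128 GB worker node.
NODE_GB = 128

# Fully native query: small onheap, large offheap (16 GB per executor).
native_only = executors_per_node(NODE_GB, onheap_gb=4, offheap_gb=12)

# Mixed query: both pools sized large (32 GB per executor).
mixed = executors_per_node(NODE_GB, onheap_gb=16, offheap_gb=16)

print(native_only, mixed)
```

With these assumed sizes, the mixed configuration fits half as many executors on the node as the native-only one, even though each stage only really uses one of the two pools at a time.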

Another solution is to spill offheap memory to onheap memory. We have implemented this, but the performance isn't good, so it's not used.

The third solution is to use Spark's stage-level resource management (stage-level scheduling, available since Spark 3.1): when we detect that a stage has a fallback, we can fall back the whole stage, then restart the executors with high onheap memory.
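A minimal sketch of the per-stage decision described here, in plain Python so it runs without a cluster. The profile sizes, the operator names, and the `plan_stage` helper are all hypothetical and not Gluten code. In a real job the chosen profile would presumably be expressed through Spark's `ResourceProfileBuilder` / `ExecutorResourceRequests` and attached with `RDD.withResources`; that wiring is an assumption, not something this issue specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorProfile:
    """Hypothetical per-stage executor sizing (values in GB)."""
    name: str
    onheap_gb: int
    offheap_gb: int


# Assumed sizes: native stages favor offheap, fallback stages favor onheap.
NATIVE = ExecutorProfile("native", onheap_gb=4, offheap_gb=20)
FALLBACK = ExecutorProfile("fallback", onheap_gb=20, offheap_gb=4)


def plan_stage(operators, native_supported):
    """Return (profile, run_natively) for one stage.

    If any operator is not supported natively, the whole stage falls
    back to vanilla Spark and gets the onheap-heavy profile.
    """
    has_fallback = any(op not in native_supported for op in operators)
    if has_fallback:
        return FALLBACK, False
    return NATIVE, True


# Hypothetical operator sets for two stages of one query.
supported = {"Scan", "Filter", "Project", "HashAggregate"}
stage_a = ["Scan", "Filter", "HashAggregate"]
stage_b = ["Scan", "Project", "PythonUDF"]

profile_a, native_a = plan_stage(stage_a, supported)
profile_b, native_b = plan_stage(stage_b, supported)
print(profile_a.name, native_a)
print(profile_b.name, native_b)
```

The point of the design is that each stage gets executors sized for the one memory pool it actually uses, rather than every executor paying for both.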

@zhouyuan @PHILO-HE

Labels: enhancement (New feature or request)
