Description
This issue tracks the issues related to MemoryManager.
Why
Gluten previously had three ways of running computation:
- Whole-stage computation, similar to Spark whole-stage codegen.
- Different threads within the same Spark task scope, such as PythonRunner, which creates a writer thread to compute the fragment's result and then reads that result in the Spark executor task thread.
- Data conversion on the driver: the DPP result is converted to unsafe rows on the Spark driver side, where it is used in a filter expression.
These cases add a lot of complexity to resource management. On the Java side, Gluten uses TaskResource to abstract each resource that needs to be managed, and groups resources by priority to control the release order. On the native side, Gluten uses the TaskStorage concept, which is thread-local, to bind all native resources to that thread-local storage. As a result, all managed resources can be released in priority order when the Spark task completes.
However, this approach cannot handle cases 2 and 3, so Gluten uses a fallback storage to manage the resources that live outside the Spark task's scope, and releases them only when the process ends.
In practice, the only resource we need to care about is memory, so managing memory in a unified way is important.
What MemoryManager can do
Define a clear boundary: we put the resources (memory pool, allocator, etc.) into MemoryManager, which is initialized when the computation part is first entered and released when it is left.
On the Java side, initialize and hold the handle of the native memory manager, and destroy the handle when the Spark task completes. On the native side, hold a unique ptr to each type of memory resource and expose only raw-pointer getter APIs to callers, so a caller should not care about a resource's lifecycle or try to hold it; it should just use it.
For cases 1 and 2, MemoryManager is initialized when the Spark task begins and released by the related listener when the Spark task completes. For case 3, it is initialized when the conversion begins and released when the conversion ends.
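The native-side ownership rule could look roughly like the sketch below. `Allocator` and `MemoryPool` here are simplified, hypothetical stand-ins (not the real Gluten or Velox types), and the getter names are assumptions; the point is only that MemoryManager owns everything through `std::unique_ptr` and hands out non-owning raw pointers.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for a native allocator.
class Allocator {
 public:
  void* allocate(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p != nullptr) ++live_;
    return p;
  }
  void free(void* p) {
    if (p == nullptr) return;
    std::free(p);
    --live_;
  }
  std::size_t liveAllocations() const { return live_; }

 private:
  std::size_t live_ = 0;
};

// Hypothetical stand-in for a memory pool that tracks usage.
class MemoryPool {
 public:
  explicit MemoryPool(Allocator* allocator) : allocator_(allocator) {}
  void* allocate(std::size_t bytes) {
    void* p = allocator_->allocate(bytes);
    if (p != nullptr) usedBytes_ += bytes;
    return p;
  }
  void free(void* p, std::size_t bytes) {
    if (p == nullptr) return;
    allocator_->free(p);
    usedBytes_ -= bytes;
  }
  std::size_t usedBytes() const { return usedBytes_; }

 private:
  Allocator* allocator_;  // non-owning; MemoryManager owns the allocator
  std::size_t usedBytes_ = 0;
};

// Owns every memory resource; callers only ever see raw pointers.
class MemoryManager {
 public:
  MemoryManager()
      : allocator_(std::make_unique<Allocator>()),
        pool_(std::make_unique<MemoryPool>(allocator_.get())) {}
  MemoryManager(const MemoryManager&) = delete;
  MemoryManager& operator=(const MemoryManager&) = delete;

  // Raw-pointer getters: valid only while this MemoryManager is alive.
  Allocator* allocator() { return allocator_.get(); }
  MemoryPool* pool() { return pool_.get(); }

 private:
  // Declaration order matters: the pool is destroyed before the
  // allocator it points to.
  std::unique_ptr<Allocator> allocator_;
  std::unique_ptr<MemoryPool> pool_;
};
```

A caller would write something like `mm.pool()->allocate(64)` and never store the pointer beyond the lifetime of `mm`; destroying the manager releases all resources in one place.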
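The two lifecycles above can be sketched on the native side as follows. This is an illustration under stated assumptions, not the actual Gluten JNI layer: `HandleRegistry`, `ScopedMemoryManager`, and `MemoryManagerStub` are hypothetical names. The explicit `create`/`release` pair models what a task-completion listener on the Java side would drive through an integer handle, while the scoped guard models the conversion case.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>

// Placeholder for the real manager; it would own pools, allocators, etc.
struct MemoryManagerStub {};

// Hypothetical handle table: the Java side holds only the integer handle.
class HandleRegistry {
 public:
  // Called when a Spark task (or a conversion) begins.
  std::int64_t create() {
    std::lock_guard<std::mutex> lock(mu_);
    const std::int64_t handle = next_++;
    managers_.emplace(handle, std::make_unique<MemoryManagerStub>());
    return handle;
  }

  // Non-owning lookup; returns nullptr for unknown or released handles.
  MemoryManagerStub* get(std::int64_t handle) {
    std::lock_guard<std::mutex> lock(mu_);
    auto it = managers_.find(handle);
    return it == managers_.end() ? nullptr : it->second.get();
  }

  // Called from the completion listener; destroys the manager.
  bool release(std::int64_t handle) {
    std::lock_guard<std::mutex> lock(mu_);
    return managers_.erase(handle) > 0;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> lock(mu_);
    return managers_.size();
  }

 private:
  std::mutex mu_;
  std::int64_t next_ = 1;
  std::unordered_map<std::int64_t, std::unique_ptr<MemoryManagerStub>> managers_;
};

// RAII guard for the conversion case: created when the conversion
// begins, released when the enclosing scope ends.
class ScopedMemoryManager {
 public:
  explicit ScopedMemoryManager(HandleRegistry& registry)
      : registry_(registry), handle_(registry.create()) {}
  ~ScopedMemoryManager() { registry_.release(handle_); }
  ScopedMemoryManager(const ScopedMemoryManager&) = delete;
  ScopedMemoryManager& operator=(const ScopedMemoryManager&) = delete;

  MemoryManagerStub* get() const { return registry_.get(handle_); }

 private:
  HandleRegistry& registry_;
  std::int64_t handle_;
};
```

With this shape, no resource depends on thread-local storage: a writer thread or a driver-side conversion can use a manager as long as it has the handle or the guard, and release still happens at a well-defined point.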
In progress issues