forked from root-project/root
[df] Reduce the memory footprint of the computation graph
Every node of the computation graph needs to know which columns it has access to. This information is stored in the RColumnRegister class, which holds a map associating every column name available to a certain node with the corresponding RDefineReader. This object can become quite heavy, as each column name is stored as a std::string and the readers are held by an RDefinesWithReaders object, which itself is not a trivial type.

For very deep computation graphs (e.g. O(10K) `Define` calls chained one after another in the same branch), just the creation of the graph can take up several GBs of memory, and a large portion of the runtime is spent in the creation and subsequent destruction of such heavy objects. A similar logic is used for the map of registered variations, but the number of variations grows much more slowly than the number of calls to Define, so the effects of that are even more difficult to notice.

This commit proposes a complete refactoring of how these objects are handled within the RDataFrame computation graph. First, both the collection of define readers and the collection of variation readers are stripped of their ownership responsibilities. RDefinesWithReaders and RVariationsWithReaders objects are created within the RColumnRegister class API, but they are registered centrally by the RLoopManager, which now manages them all via unique_ptr. The RColumnRegister class now only holds references to those objects.

As a further memory optimization measure, all the strings related to the column/variation names are also cached centrally in the RLoopManager, and only views to those strings are kept in the RColumnRegister.

To avoid circular references in the shared_ptr ownership of the RLoopManager itself, RColumnRegister does not own the RLoopManager anymore. The owner(s) of the RLoopManager are the nodes of the computation graph themselves (via RInterfaceBase).
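The central-ownership and string-caching scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code touched by this commit: `CentralManager`, `LightRegister`, `HeavyReaders` and `ChainDefines` are hypothetical names standing in for the roles of RLoopManager, RColumnRegister and the reader collections, and the sketch still copies a map of views and pointers per node, which the real implementation may avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a heavy per-column object (a collection of readers).
struct HeavyReaders {
   std::vector<double> fPayload = std::vector<double>(64);
};

// Hypothetical central owner, playing the role the commit assigns to RLoopManager.
class CentralManager {
public:
   // Cache a name once and return a view on the single stored copy.
   // std::unordered_set is node-based, so stored strings never move on rehash
   // and the returned views stay valid for the lifetime of the manager.
   std::string_view CacheName(const std::string &name) { return *fNames.insert(name).first; }

   // Take ownership of the heavy object and hand back a non-owning reference.
   HeavyReaders &Register(std::unique_ptr<HeavyReaders> readers)
   {
      fReaders.push_back(std::move(readers));
      return *fReaders.back();
   }

   std::size_t NNames() const { return fNames.size(); }
   std::size_t NReaders() const { return fReaders.size(); }

private:
   std::unordered_set<std::string> fNames;
   std::vector<std::unique_ptr<HeavyReaders>> fReaders;
};

// Hypothetical lightweight register: it stores only views and raw pointers,
// so copying it from node to node never copies a string or a reader collection.
class LightRegister {
public:
   explicit LightRegister(CentralManager &manager) : fManager(&manager) {}

   // Return a copy of this register extended with one more column.
   LightRegister WithDefine(const std::string &name) const
   {
      assert(fManager != nullptr);
      LightRegister next(*this);
      const std::string_view cached = fManager->CacheName(name);
      next.fColumns[cached] = &fManager->Register(std::make_unique<HeavyReaders>());
      return next;
   }

   bool Has(std::string_view name) const { return fColumns.count(name) != 0; }
   std::size_t Size() const { return fColumns.size(); }

private:
   CentralManager *fManager; // non-owning: the register does not keep the manager alive
   std::unordered_map<std::string_view, HeavyReaders *> fColumns;
};

// Chain n Define-like calls in one branch and return the size of the last register.
std::size_t ChainDefines(CentralManager &manager, std::size_t n)
{
   LightRegister reg(manager);
   for (std::size_t i = 0; i < n; ++i)
      reg = reg.WithDefine("col" + std::to_string(i));
   return reg.Size();
}
```

Calling `ChainDefines(manager, n)` on a fresh `CentralManager` leaves exactly `n` cached names and `n` reader collections in the manager, however many intermediate registers were created and destroyed along the way.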
Now, when the last node of the computation graph is destroyed, it will also trigger the destruction of the RLoopManager. In turn, this triggers the deregistration of all the define and variation readers.

Fixes root-project#14510
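The resulting lifetime rule can be illustrated with a minimal sketch, assuming hypothetical `Manager`, `Register` and `Node` types rather than the actual ROOT classes: nodes share ownership of the manager through shared_ptr, the register only observes it, and the manager dies together with the last node.

```cpp
#include <cassert>
#include <memory>

// Hypothetical manager that reports its own destruction through a flag.
struct Manager {
   explicit Manager(bool &destroyed) : fDestroyed(destroyed) {}
   ~Manager() { fDestroyed = true; } // owned readers would be released here
   bool &fDestroyed;
};

// Hypothetical register: it observes the manager but does not own it,
// so it cannot take part in a shared_ptr ownership cycle.
struct Register {
   Manager *fManager;
};

// Hypothetical graph node: the nodes are the only owners of the manager.
struct Node {
   std::shared_ptr<Manager> fManager;
   Register fRegister;
};

// Create the first node of a graph together with its manager.
Node MakeRoot(bool &destroyed)
{
   auto manager = std::make_shared<Manager>(destroyed);
   return Node{manager, Register{manager.get()}};
}

// Create a downstream node that shares ownership of the parent's manager.
Node Derive(const Node &parent)
{
   assert(parent.fManager != nullptr);
   return Node{parent.fManager, Register{parent.fManager.get()}};
}
```

In this sketch, destroying the root while a derived node is still alive leaves the manager untouched; only when the last remaining node goes away does the `Manager` destructor run.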
1 parent 22c7722, commit 37482cc
Showing 22 changed files with 392 additions and 233 deletions.