Reduce peak memory usage of kfold_varsel()
#419
Merged
This (more precisely, commit e11edc7) massively reduces the peak memory usage of kfold_varsel() by no longer storing the output of select() and get_submodls() from all CV folds in common (very large) objects (search_path_cv and submodls_cv). Moving these large portions of code should also benefit a future parallelization across CV folds (because with this PR, all code that needs to run on one worker sits within a single function) and, in my opinion, it also enhances the readability of kfold_varsel().

The readability of kfold_varsel()-related functions is also enhanced by commits 23b274e to c64ac57, which are just refactoring commits.
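
For readers who want the memory pattern in isolation, here is a minimal, language-agnostic sketch in Python. It is not the R code from this PR. The helpers run_search(), fit_submodels(), summarize(), one_fold(), and peak_bytes() are hypothetical stand-ins invented for illustration, with run_search() and fit_submodels() loosely playing the roles of select() and get_submodls(). The sketch only assumes that each fold produces large intermediate objects and a small per-fold result.

```python
import tracemalloc


def run_search(fold):
    # Hypothetical stand-in for the search step: returns a large object.
    return [fold] * 100_000


def fit_submodels(search_path):
    # Hypothetical stand-in for the submodel step: another large object.
    return [x + 1 for x in search_path]


def summarize(submodels):
    # Small per-fold result that is all the caller needs to keep.
    return sum(submodels) / len(submodels)


def kfold_before(folds):
    # Old pattern: large outputs of ALL folds are alive at the same time.
    search_path_cv = [run_search(f) for f in folds]
    submodls_cv = [fit_submodels(sp) for sp in search_path_cv]
    return [summarize(s) for s in submodls_cv]


def one_fold(fold):
    # New pattern: everything one fold needs lives in a single function,
    # so the large locals can be freed as soon as the function returns.
    search_path = run_search(fold)
    submodels = fit_submodels(search_path)
    return summarize(submodels)


def kfold_after(folds):
    # Only the small per-fold summaries are accumulated.
    return [one_fold(f) for f in folds]


def peak_bytes(fn, folds):
    # Peak traced allocation (in bytes) while running fn(folds).
    tracemalloc.start()
    fn(folds)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak
```

Both variants return the same per-fold summaries, and comparing peak_bytes(kfold_before, folds) with peak_bytes(kfold_after, folds) shows the lower peak of the per-fold variant. Because one_fold() is self-contained, it is also the natural unit to hand to a worker if the folds are later run in parallel.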