[MRG+1] ENH Add working_memory global config for chunked operations #10280
We often get issues related to memory consumption and don't deal with them particularly well. Scikit-learn should be at home on commodity hardware such as developer and researcher laptops.
Some operations can be performed in chunks, so that the result is computed in constant (or otherwise bounded) memory.
It's not very helpful to expose this "how much working memory" parameter in each function (because such functions are often called from nested code), so this PR instead makes it a global config parameter. The optimisation is then transparent to the user, but still configurable.
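As a sketch of the intended usage, the global setting can be changed process-wide or within a scoped context (names follow the `set_config`/`config_context`/`get_config` API that this PR builds on; the exact default value is an assumption here):

```python
import sklearn

# Raise the soft limit on temporary working arrays to 512 MiB globally.
sklearn.set_config(working_memory=512)

# Or override it only within a scope; the previous value is restored on exit.
with sklearn.config_context(working_memory=64):
    assert sklearn.get_config()["working_memory"] == 64

# Outside the context manager, the global setting is back in effect.
assert sklearn.get_config()["working_memory"] == 512
```

Because the limit lives in the global config, chunked routines deep inside nested library code can honour it without every public function growing an extra parameter.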
This PR (building upon my work with @dalmia) will therefore:
Changed the title from "[MRG] ENH Add working_memory global config for chunked operations" to "[MRG+1] ENH Add working_memory global config for chunked operations" on Mar 8, 2018. Referenced this pull request on May 2, 2018.
Is there a plan to also use this in `pairwise_distances` in the future?
`pairwise_distances_chunked` is only useful if the chunks can be reduced, so no, there's no point in including this in `pairwise_distances`. (There might, however, be particular distance functions that can themselves be calculated in a chunked way to avoid n_samples * n_samples * n_features arrays, which may be what you are thinking of.) If you would like to merge this, I'll happily post follow-ups to close more issues!
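To illustrate why reduction is the point: with a `reduce_func`, each chunk of the distance matrix is consumed and discarded, so the full n_samples x n_samples array is never materialised. A minimal sketch (the data and the 1 MiB limit are illustrative assumptions):

```python
import numpy as np
from sklearn.metrics import pairwise_distances_chunked

X = np.random.RandomState(0).rand(500, 5)

def reduce_func(D_chunk, start):
    # D_chunk holds rows [start, start + len(D_chunk)) of the distance matrix.
    # Mask each row's self-distance, then keep only the nearest-neighbour
    # distance per row; the chunk itself is freed after this returns.
    D_chunk = D_chunk.copy()
    rows = np.arange(D_chunk.shape[0])
    D_chunk[rows, rows + start] = np.inf
    return D_chunk.min(axis=1)

# Chunk sizes are chosen so each temporary fits in ~1 MiB of working memory.
nearest = np.concatenate(list(
    pairwise_distances_chunked(X, reduce_func=reduce_func, working_memory=1)
))
assert nearest.shape == (500,)
```

Without `reduce_func`, the generator would just yield the raw chunks, and concatenating them reproduces the full matrix, which is exactly what `pairwise_distances` already does.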