Excessive memory usage in v0.3.0 due to full SVD #574
Labels
Priority: High (nasty bugs leading to incorrect results or crashes)
Status: In Progress (issues being worked on)
Type: Serious Bug (crashes, broken code, security issues)
Description
Running a dataset with a large number of datapoints in any dimension, e.g. 20,000 timepoints, results in excessive memory usage, not during but just after optimization, at the result-creation stage. This is because at that point the (full) singular value decomposition of the residual matrix is calculated (since the default for numpy.linalg.svd is full_matrices=True). In the context of global analysis a full SVD is almost never needed; an economy-sized SVD is what is needed. Further optimization (e.g. making the SVD calculation optional altogether, or using a memory-efficient implementation) is possible, but is left as a future exercise.
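A minimal sketch of the difference (the residual-matrix shape is an assumption for illustration): with full_matrices=True, numpy.linalg.svd returns a square U of shape (n_timepoints, n_timepoints), which for 20,000 float64 timepoints is roughly 3.2 GB, whereas the economy SVD keeps U at (n_timepoints, n_components):

```python
import numpy as np

# Hypothetical residual matrix: many timepoints, few spectral components.
residuals = np.random.rand(20_000, 50)

# full_matrices=True (the default) would allocate U as (20000, 20000),
# i.e. ~3.2 GB of float64. The economy SVD below keeps U at (20000, 50).
u, s, vh = np.linalg.svd(residuals, full_matrices=False)

print(u.shape, s.shape, vh.shape)  # (20000, 50) (50,) (50, 50)

# The economy SVD still reconstructs the matrix exactly (up to rank).
reconstruction = (u * s) @ vh
```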
What I Did
Ran the _create_svd function, decorated with memory_profiler's @profile decorator, before and after changing the call to numpy.linalg.svd to use full_matrices=False.
The same patch can be applied in the _prepare_dataset function.
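The profiling setup described above can be sketched as follows (the function name and matrix shape are illustrative, not the package's actual internals; the import fallback is there only so the sketch runs without memory_profiler installed):

```python
import numpy as np

try:
    # memory_profiler reports line-by-line memory usage of decorated functions.
    from memory_profiler import profile
except ImportError:
    # Fallback no-op decorator so the sketch still runs without memory_profiler.
    def profile(func):
        return func

@profile
def create_svd(residuals):
    # full_matrices=False avoids allocating a square (n_timepoints, n_timepoints) U.
    return np.linalg.svd(residuals, full_matrices=False)

if __name__ == "__main__":
    # Run e.g. `python -m memory_profiler this_script.py` to compare memory
    # usage before and after toggling full_matrices.
    u, s, vh = create_svd(np.random.rand(20_000, 50))
```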