
Excessive memory usage in v0.3.0 due to full SVD #574

Closed
jsnel opened this issue Feb 28, 2021 · 1 comment · Fixed by #576
Assignees
Labels
Priority: High Nasty bugs leading to incorrect results or crashes Status: In Progress Issues being worked on Type: Serious Bug Crashes, Broken code, Security Issues

Comments

@jsnel
Member

jsnel commented Feb 28, 2021

  • glotaran version: v0.3.0
  • Python version: any
  • Operating System: any

Description

Running a dataset with a large number of datapoints in any dimension, e.g. 20,000 timepoints, will result in excessive memory usage, not during optimization but just after it, at the result creation stage. This is because at this point the (full) singular value decomposition of the residual matrix is calculated (since the default for numpy.linalg.svd is full_matrices=True).

In the context of global analysis a full SVD is almost never needed; an economy-size SVD is what is needed. Further optimization (e.g. making the SVD calculation optional altogether, or using a more memory-efficient implementation) is possible, but is left as a future exercise.
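To illustrate the difference (a minimal sketch with illustrative matrix sizes, not the actual glotaran data): for an (m, n) matrix with m >> n, the full SVD materializes an (m, m) left-singular matrix, while the economy-size SVD only returns the n columns that are actually used.

```python
import numpy as np

# A tall residual-like matrix: many timepoints, few spectral channels
# (sizes are illustrative only).
rng = np.random.default_rng(0)
a = rng.random((2000, 50))

# Full SVD (the numpy default): U is (2000, 2000).
u_full, s_full, vt_full = np.linalg.svd(a, full_matrices=True)

# Economy-size SVD: U is only (2000, 50); singular values are identical.
u, s, vt = np.linalg.svd(a, full_matrices=False)

print(u_full.shape)  # (2000, 2000)
print(u.shape)       # (2000, 50)
```

The economy-size result still reconstructs the matrix exactly (u @ np.diag(s) @ vt == a up to floating-point error), so nothing is lost for global analysis.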

What I Did

Ran the _create_svd function, decorated with memory_profiler's @profile decorator, before and after changing the call to numpy.linalg.svd:

- l, v, r = np.linalg.svd(dataset[name])
+ l, v, r = np.linalg.svd(dataset[name], full_matrices=False)
# Before
-  1042    223.1 MiB    223.1 MiB           1       @profile
-  1043                                             def _create_svd(self, name: str, dataset: xr.Dataset):
-  1044   3276.3 MiB   3053.2 MiB           1           l, v, r = np.linalg.svd(dataset[name])
# After
+  1038    221.7 MiB    221.7 MiB           1       @profile
+  1039                                             def _create_svd(self, name: str, dataset: xr.Dataset):
+  1040    227.2 MiB      5.6 MiB           1           l, v, r = np.linalg.svd(dataset[name], full_matrices=False)
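A back-of-envelope check of the ~3 GiB increment reported by the profiler (assuming roughly 20,000 timepoints, as in the description): the full SVD allocates an (n, n) float64 left-singular matrix.

```python
# Hypothetical size matching the issue description: 20,000 timepoints.
n = 20_000

# Full SVD materializes an (n, n) float64 U matrix: 8 bytes per element.
full_u_bytes = n * n * 8
full_u_gib = full_u_bytes / 1024**3

print(f"{full_u_gib:.2f} GiB")  # ~2.98 GiB, in line with the ~3053 MiB jump above
```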

The same patch can be applied in the _prepare_dataset function:

     if "data_singular_values" not in dataset:
-        l, s, r = np.linalg.svd(dataset.data)
+        l, s, r = np.linalg.svd(dataset.data, full_matrices=False)
@jsnel jsnel added Type: Serious Bug Crashes, Broken code, Security Issues Status: In Progress Issues being worked on Priority: High Nasty bugs leading to incorrect results or crashes labels Feb 28, 2021
@jsnel jsnel self-assigned this Feb 28, 2021
@jsnel jsnel added this to the v0.3.1 - maintenance release milestone Feb 28, 2021
@jsnel
Member Author

jsnel commented Feb 28, 2021

Memory profiling results attached:
memory_profiler_results_before.txt
memory_profiler_results_after.txt
