
memory usage in scaloa #10

Open
dksasaki opened this issue Aug 11, 2021 · 5 comments

Comments

dksasaki commented Aug 11, 2021

Hey guys,

I've used your objective mapping function scaloa and noticed there is a simple way to reduce memory usage.

The variables d2 and dc2 can occupy a huge amount of memory, so deleting them after the correlation and cross-correlation matrices (A and C, respectively) have been defined, and before the matrix inversion, is useful. In one of my cases it freed up a few GB of memory (of course, this depends on both the grid and the data).

(...)
    d2 = ((np.tile(x, (n, 1)).T - np.tile(x, (n, 1))) ** 2 +
          (np.tile(y, (n, 1)).T - np.tile(y, (n, 1))) ** 2)
    nv = len(xc)
    xc, yc = np.reshape(xc, (1, nv)), np.reshape(yc, (1, nv))
    # Squared distance between the observations and the grid points.
    dc2 = ((np.tile(xc, (n, 1)).T - np.tile(x, (nv, 1))) ** 2 +
           (np.tile(yc, (n, 1)).T - np.tile(y, (nv, 1))) ** 2)
    # Correlation matrix between stations (A) and cross-correlation
    # between stations and grid points (C).
    A = (1 - err) * np.exp(-d2 / corrlen ** 2)
    C = (1 - err) * np.exp(-dc2 / corrlen ** 2)
    if 0:  # NOTE: if the parameter zc is used (`scaloa2.m`)
        A = (1 - d2 / zc ** 2) * np.exp(-d2 / corrlen ** 2)
        C = (1 - dc2 / zc ** 2) * np.exp(-dc2 / corrlen ** 2)

    # here!  <-- free the large intermediates before inverting A
    del d2, dc2

(...)
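As a side note, a minimal sketch of how NumPy broadcasting could avoid the tiled intermediates entirely, allocating only the final distance matrix. This assumes flat coordinate arrays, and `squared_distances` is a hypothetical helper, not part of scaloa:

```python
import numpy as np

def squared_distances(x, y, xc, yc):
    """Squared distances between points (x, y) and points (xc, yc).

    Broadcasting the (n, 1) and (1, nv) views avoids materialising the
    np.tile copies, so only the final (n, nv) matrix is allocated.
    """
    x, y = np.ravel(x), np.ravel(y)
    xc, yc = np.ravel(xc), np.ravel(yc)
    # (n, 1) against (1, nv) broadcasts to the (n, nv) result.
    return (x[:, None] - xc[None, :]) ** 2 + (y[:, None] - yc[None, :]) ** 2
```

With float inputs this gives the same values as the tiled version, without the temporary tile copies ever existing.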
iuryt (Member) commented Aug 17, 2021

Hi @dksasaki,

Thanks for raising this issue.

Do you mean there is a memory leak after running the function, or that cleaning these variables before running the rest of the interpolation reduces the peak memory usage?

We could check how other packages usually deal with this problem and see whether del is the best solution.

@dantecn and I were also thinking about adding an option to break the grid points into blocks, trading some performance for lower memory usage.

We could also simply add an example of this to the documentation.

dksasaki (Author) commented
Hi @iuryt,

There is no memory leak. When the method runs, these extra matrices can contribute significantly to the memory usage, raising the peak even further. The del call was just a quick fix I added, but given the simplicity of this solution I wonder what problems could arise from it.

Breaking the grid into chunks is a good idea, although the whole process gets slower due to the multiple matrix inversions. Let me know if you plan to implement it; I have written a few lines that may help.

iuryt (Member) commented Aug 17, 2021

If you want to implement breaking into blocks, go ahead. You could add an argument like nblocks=None to scaloa and vectoa.
Despite losing some performance, I believe this is a nice way to avoid memory overload. You could also add a verbose=False argument that activates a progress bar during the block-by-block interpolation.
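To illustrate, the block-splitting idea could be sketched roughly like this (names are hypothetical: `map_in_blocks` is not part of scaloa, and `interp_block` stands in for the actual OA step applied to one chunk of grid points):

```python
import numpy as np

def map_in_blocks(grid_points, interp_block, nblocks=None):
    """Split the grid points into nblocks chunks and interpolate each
    chunk separately, so the cross-correlation matrix C only ever has
    shape (len(chunk), n_obs) instead of (n_grid, n_obs)."""
    if nblocks is None:
        # Fall back to the current single-pass behaviour.
        return interp_block(grid_points)
    chunks = np.array_split(grid_points, nblocks)
    return np.concatenate([interp_block(chunk) for chunk in chunks])
```

Since the observation-observation matrix A depends only on the data, it could in principle be factorised once and reused for every block.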

Could you check how other packages, such as xarray, deal with freeing memory?
I believe @Ryukamusa may be the best person in the group to check that as well.

Once you have made some of the modifications in your forked repo, you can open a pull request and link it to this issue.

Please let me know if you have any questions; we just started the group and are still learning how to manage the development process here.

iuryt (Member) commented Jul 18, 2023

It turns out that I came back here for some reason. I think we could make this package better by making it work with xarray; that would make it easy to parallelize, or to run it lazily with dask when needed.

dksasaki (Author) commented
Sorry for not replying, I also forgot about this issue. I developed a way to make this step faster without using as much memory: for each grid point, I only consider observations within a certain distance range. I'm not sure how to use dask and xarray with it, though, but we can give it a try.
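For reference, the distance-cutoff selection can be sketched in plain NumPy (`local_obs` is a hypothetical helper; for many grid points a KD-tree such as scipy.spatial.cKDTree would scale better):

```python
import numpy as np

def local_obs(xg, yg, xo, yo, radius):
    """Indices of the observations (xo, yo) lying within radius of the
    grid point (xg, yg); only these feed the local A and C matrices."""
    d2 = (np.asarray(xo) - xg) ** 2 + (np.asarray(yo) - yg) ** 2
    return np.flatnonzero(d2 <= radius ** 2)
```

Because each grid point then only sees a handful of nearby observations, the local correlation matrices stay small regardless of the total data size.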
