Clear unused intermediate DataFrame memory in channel averaging function #559

alex-l-kong · 2022-05-20T21:15:52Z

What is the purpose of this PR?

Closes #548. Pixel channel averaging has to be an iterative process because the pixel cluster data is batched per FOV. These individual pixel cluster files can be extremely large, and Python does not automatically release them from memory even if the variable is overwritten on the next iteration. For massive datasets, this means Docker could easily run out of memory. This PR should prevent that from happening.

How did you implement your changes

compute_pixel_cluster_channel_avg now explicitly flushes these large, unused DataFrames with del. We do this for the intermediate fov_pixel_data, sum_by_cluster, count_by_cluster, and agg_results variables. This is especially important for the middle two, because groupby results can be very expensive to keep in memory.
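A minimal sketch of the pattern described above, not the actual compute_pixel_cluster_channel_avg code. The function name, the `fov_tables` iterable, and the column names are hypothetical; only the variable names fov_pixel_data, sum_by_cluster, and count_by_cluster come from this PR.

```python
import pandas as pd


def average_channels_by_cluster(fov_tables, channels, cluster_col="cluster"):
    """Average channel expression per cluster, processing one FOV at a time.

    `fov_tables` is assumed to be an iterable (ideally a generator) that
    yields one per-FOV DataFrame, standing in for reading each file from disk.
    """
    total_sum = None
    total_count = None

    for fov_pixel_data in fov_tables:
        grouped = fov_pixel_data.groupby(cluster_col)
        sum_by_cluster = grouped[channels].sum()
        count_by_cluster = grouped.size()

        if total_sum is None:
            total_sum, total_count = sum_by_cluster, count_by_cluster
        else:
            # fill_value=0 keeps clusters that appear in only some FOVs
            total_sum = total_sum.add(sum_by_cluster, fill_value=0)
            total_count = total_count.add(count_by_cluster, fill_value=0)

        # Drop this iteration's names before the next FOV is loaded, so the
        # large intermediates are not kept alive while the next one is built
        del fov_pixel_data, grouped, sum_by_cluster, count_by_cluster

    if total_sum is None:
        raise ValueError("no FOV tables were provided")

    # Divide each channel's summed expression by the per-cluster pixel count
    return total_sum.div(total_count, axis=0)
```

Note that del only removes a name. If the caller passes a list that still holds every FOV table, nothing is freed, which is why the sketch assumes a generator that loads one FOV at a time.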

Remaining issues

This may not be the only memory adjustment we need to make, but it's a start.

alex-l-kong · 2022-05-20T21:16:18Z

@ngreenwald can you see if this fixes your kernel dying error?

UPDATE: if so, I'll see if there are any other places we can use explicit del commands.

ngreenwald · 2022-05-20T22:28:11Z

Yeah, I won't get to it until this weekend. Can you check whether it resolves the memprofiler issues?


alex-l-kong · 2022-05-20T22:34:40Z

@ngreenwald it does on my end.

ngreenwald · 2022-05-22T20:12:03Z

Nope, still died

ngreenwald · 2022-05-22T21:53:33Z

I closed the docker session, restarted, and it worked. So it seems like there are leftover memory leaks from previously run cells, which then cause the kernel to die when that one is run. Can you run profiling on the entire pixel clustering workflow to try and figure out what's causing it?

ngreenwald · 2022-05-23T15:49:12Z

Different version of the trained SOM; this time it died on the first step, "using remapping scheme." This is even though I had just started up the docker. Given that all of these different versions of the clustering pipeline are using the same data, I don't understand why it sometimes dies and sometimes doesn't.

ngreenwald · 2022-05-23T15:52:26Z

Actually, this was after I switched to a different branch. So it seems like this change successfully addressed the issue underlying the remapping step, since that error cropped up again as soon as I switched, but not the overall memory issues

alex-l-kong · 2022-05-23T16:25:03Z

@ngreenwald yeah I kind of suspected this might happen, which is why I alluded to needing to flush out the memory elsewhere using del. I just wanted to see if the remapping step memory problems were addressed; seems so, which means I'll now memprof and run del (or equivalent commands in R, if needed) on previous steps in the pipeline.

alex-l-kong · 2022-05-23T17:01:18Z

@ngreenwald create_pixel_matrix is another offending function, and it might be the worst of the bunch. These are the stats after just 15 FOVs on Candace's smaller dataset:

These numbers are much larger because we have to store both a pixel_mat and a pixel_mat_subset in memory for each FOV. Since we now know that Python doesn't automatically clear these on the next iteration (in spite of the variable being overwritten), this will blow up memory usage.

Explicitly calling del on the no-longer-needed DataFrames and DataArrays should eliminate this problem.
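One mechanism consistent with this observation: when a loop rebinds a variable, the old object stays alive until the new one has been fully built, so two FOVs' worth of data can coexist at the peak. The sketch below measures this with the standard library's tracemalloc, using bytearrays as a stand-in for the real DataFrames and DataArrays. `load_fov`, `peak_bytes`, and the sizes are hypothetical and not part of create_pixel_matrix.

```python
import tracemalloc

SIZE = 10_000_000  # bytes per simulated FOV (arbitrary, for illustration)


def load_fov():
    # Hypothetical stand-in for reading one FOV's pixel matrix from disk
    return bytearray(SIZE)


def peak_bytes(free_early, n_fovs=3):
    """Return the peak traced allocation while looping over simulated FOVs."""
    tracemalloc.start()
    for _ in range(n_fovs):
        pixel_mat = load_fov()
        pixel_mat_subset = pixel_mat[: SIZE // 10]  # slicing copies the data
        if free_early:
            # Release both objects before the next FOV is loaded
            del pixel_mat, pixel_mat_subset
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


peak_without_del = peak_bytes(free_early=False)
peak_with_del = peak_bytes(free_early=True)
print(f"peak without del: {peak_without_del / 1e6:.1f} MB")
print(f"peak with del:    {peak_with_del / 1e6:.1f} MB")
```

In this toy setup the run without del peaks at roughly two FOVs' worth of memory, while the run with del stays near one. Whether the real pipeline behaves the same way depends on what else holds references to the data.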

alex-l-kong · 2022-05-23T21:27:00Z

@ngreenwald the update should ensure large, intermediate DataFrames (especially during loops) are released. Tested it on my end without memory errors, can you see if it helps on your end?

ngreenwald · 2022-05-24T03:12:04Z

Crashed on the cluster_pixels function when computing average channel expression. I merged this into the branch I'm working on and named the result combined_branch. I just pushed it; can you take a look to make sure all of your changes were included? If so, then there's still an issue.

alex-l-kong · 2022-05-24T17:49:45Z

@ngreenwald yeah everything in combined_branch is there. Since even freeing memory during the run is an issue, we might need to come up with a different way to address this.

One potential option is to pre-compute this average in R instead, while run_pixel_som.R is processing each FOV. I'm not a huge fan of this option since it would require another intermediate file to be saved. However, it does allow for one fewer per-FOV loop, meaning one fewer pass in which we have to read each FOV into memory.

Another option I'm trying out right now is directly invoking the Python garbage collector using gc. This generally doesn't immediately remove memory, but in certain cases with millions of objects, it can help avoid fragmented memory. Also not normally a huge fan of this option (since the garbage collector should normally know how to do its job), but in our case where we have millions of objects in memory at once, it could help ease things up a bit.
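For reference, a small sketch of what an explicit gc.collect() call does and does not do. CPython frees most objects through reference counting as soon as the last reference goes away; the gc module only handles objects caught in reference cycles. The `Node` class and `demo` function below are hypothetical and exist only to build such a cycle. Whether the pipeline's DataFrames are involved in cycles is an open question here.

```python
import gc
import weakref


class Node:
    """Tiny object used to build a reference cycle."""

    def __init__(self):
        self.partner = None


def make_cycle():
    a, b = Node(), Node()
    a.partner, b.partner = b, a  # a and b now keep each other alive
    return weakref.ref(a)  # weak reference: lets us observe a without owning it


def demo():
    gc.disable()  # turn off automatic collection so the demo is deterministic
    try:
        ref = make_cycle()
        alive_before = ref() is not None  # the cycle survives leaving the function
        found = gc.collect()  # explicit full collection
        alive_after = ref() is not None
    finally:
        gc.enable()
    return alive_before, alive_after, found


alive_before, alive_after, found = demo()
print(alive_before, alive_after, found)
```

If the large intermediates are not part of any cycle, del (or simply letting the name go out of scope) already frees them, and gc.collect() would not be expected to change much.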

ngreenwald · 2022-05-24T18:40:18Z

Just had the kernel die on a completely new docker instance, running only the remapping function, not the matrix creation, som training, etc. This makes me think it's not an issue with leftover memory from other functions

alex-l-kong · 2022-05-24T19:01:01Z

@ngreenwald that's good to know. Let me focus on memory profiling that function to see if there's anything popping up. There wasn't anything when I ran it on my end, but I'll double-check on your branch.

ngreenwald · 2022-08-26T05:01:18Z

This issue was never replicated outside my laptop

Free large data frame memory in pixel channel averaging function

d2c082c

alex-l-kong self-assigned this May 20, 2022

alex-l-kong added 3 commits May 23, 2022 10:56

Eliminate memory bottlenecks in create_pixel_matrix, create_pixel_som…

d1529f5

…, and run_pixel_som

Change check to is not None for seg_label memory free

ae64caa

Free memory in consensus clustering too

79aeae0

Merge branch 'master' into pixel_channel_mem

a44d581

ngreenwald closed this Aug 26, 2022

ngreenwald deleted the pixel_channel_mem branch August 26, 2022 05:01
