Hi, thank you for writing such a wonderful and useful tool!
I am currently using this for my MERFISH data, and I noticed that even with a single sample of 30,000 cells, the function for constructing the edgelist causes a memory leak. It seems to first calculate the distances between all pairs of cells for the spatial edgelist, and only then filter for the k closest neighbors. For single-cell resolution spatial data, this can quickly get out of hand; in my case, my 64 GB of RAM was no match.
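For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes distances are stored as 8-byte doubles in a dense matrix; the package's actual allocations are not traced here, and a long-format table of all pairs (with cell names) would take considerably more.

```r
# Rough memory estimate; the numbers are illustrative assumptions, not measurements.
n <- 30000               # cells in one sample
k <- 10                  # neighbors kept per cell
bytes_per_double <- 8

# One dense n x n matrix of pairwise distances
dense_gb <- n^2 * bytes_per_double / 1e9
# Distances for only the k + 1 closest cells (self included) per cell
knn_mb <- n * (k + 1) * bytes_per_double / 1e6

dense_gb  # 7.2
knn_mb    # 2.64
```

A single dense copy is already about 7.2 GB, so a few intermediate copies or a reshape to long format could plausibly exhaust 64 GB.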
I have made a slower but more RAM-friendly workaround by hacking together a version of the edgelist construction function that finds the neighboring cells iteratively and only retains the k closest neighbor cells as it goes.
I'm including a snippet of it here (with an added option for parallelization via the future package), since I hope it may be a useful step toward solving this leak for others using your software.
# example for 10 neighbors
# meta is a data.frame with your cell metadata information
# spatial_1 and spatial_2 are column names inside the meta data frame for the x and y coordinates
get_niche_neighbors <- function(meta, spatial_1 = 'spatial_1', spatial_2 = 'spatial_2', k = 10) {
  # pblapply() comes from the pbapply package; cl = "future" also needs future.apply installed
  library(pbapply)
  library(future)
  df <- data.frame(x = meta[, spatial_1], y = meta[, spatial_2])
  rownames(df) <- rownames(meta)
  edgelist <- pblapply(X = rownames(df),
                       cl = "future",
                       FUN = function(cell) {
    # distances from this cell to every cell, named by cell ID
    d <- setNames(sqrt((df[cell, "x"] - df$x)^2 + (df[cell, "y"] - df$y)^2), rownames(df))
    # only keep the k closest cells for each iteration; don't let it save the rest
    data.frame(
      from = cell,
      # k + 1 because the cell itself (distance 0) normally sorts first
      to = names(sort(d))[1:(k + 1)]
    )
  })
  # stack the per-cell data frames into one edgelist
  edgelist <- do.call(rbind, edgelist)
  return(edgelist)
}
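The same idea can also be applied to blocks of cells rather than one cell at a time, which trades a little memory for fewer R-level iterations. The following is a minimal sketch in base R, not the package's method: `knn_edgelist_chunked` and its `chunk_size` argument are hypothetical names, and the toy coordinates are made up for illustration.

```r
# Hypothetical helper: chunked k-nearest-neighbor edgelist in base R.
# Peak memory is roughly chunk_size x n distances instead of n x n.
knn_edgelist_chunked <- function(coords, k = 10, chunk_size = 1000) {
  coords <- as.matrix(coords)
  n <- nrow(coords)
  stopifnot(ncol(coords) == 2, k < n)
  ids <- rownames(coords)
  if (is.null(ids)) ids <- as.character(seq_len(n))
  starts <- seq(1, n, by = chunk_size)
  edges <- lapply(starts, function(s) {
    idx <- s:min(s + chunk_size - 1, n)
    # squared distances from each cell in the chunk to every cell
    d2 <- outer(coords[idx, 1], coords[, 1], "-")^2 +
          outer(coords[idx, 2], coords[, 2], "-")^2
    # per chunk cell, indices of the k + 1 smallest distances (self included)
    nn <- apply(d2, 1, function(row) order(row)[seq_len(k + 1)])
    # nn is stored column by column, one column per cell in the chunk
    data.frame(from = rep(ids[idx], each = k + 1),
               to = ids[as.vector(nn)],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, edges)
}

# Toy example: four cells on a line, one neighbor each, two cells per chunk
toy <- cbind(x = c(0, 1, 2, 10), y = c(0, 0, 0, 0))
rownames(toy) <- c("a", "b", "c", "d")
toy_edges <- knn_edgelist_chunked(toy, k = 1, chunk_size = 2)
```

Like the snippet above, this keeps each cell's self-edge as the first of its k + 1 rows. For large samples, a k-d tree search such as `RANN::nn2()` would likely be faster than either version, at the cost of an extra dependency.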
Sorry if this would be better placed as a pull request; I am honestly not much of a developer!