Seemingly random crashes with amp_drift_corr and convert_measurement loops #1

Closed
sginot opened this issue Nov 28, 2022 · 2 comments

sginot commented Nov 28, 2022

Hi Peter,

I was unable to reproduce the error I got in the first run, but had a few more crashes.
It seems to be related to memory usage indeed, as memory use by RStudio keeps on increasing while the loop runs (see screenshots below).

Adding gc() within the loop seemed to help, although the task manager still shows memory use increasing.

Here is the code run on the raw data (in blanke-nas-1, under Public/Simon/RAW_Data_Cut):

library(forceR)
library(magrittr)
library(dplyr)
library(stringr)
library(purrr)
library(ggplot2)
library(readr)

raw_fold <- "../RAW_Data_Cut"

files <- file.path(raw_fold, list.files(raw_fold, pattern = "\\.dat$")) # escape the dot so only .dat files match

for (i in seq_along(files)) {
  convert_measurement(file = files[i], path.data = "./converted_files/")
  gc(verbose = FALSE)
} # Loop sometimes crashes R

convert_fold <- "./converted_files"

con_files <- file.path(convert_fold, list.files(convert_fold))

ampdriftcorr_folder <- "./ampdriftcorr"

for (i in seq_along(con_files)) {
  filename <- con_files[i]
  print(filename)
  print(i)
  amp_drift_corr(filename = filename,
                 tau = 9400,
                 res.reduction = 10,
                 plot.to.screen = FALSE,
                 write.data = TRUE,
                 write.PDFs = TRUE,
                 write.logs = TRUE,
                 output.folder = ampdriftcorr_folder,
                 show.progress = FALSE)
  gc()
  print("***********")
} # Loop here also crashes

(Attached screenshots: the crash message and task-manager views showing memory use rising during the loops.)

@Peter-T-Ruehr
Owner

Hi Sam,

thanks for bringing this to my attention. The largest batch I ever looped through with convert_measurement() and amp_drift_corr() was around 1.6k files totalling 1.9 GB, and there were no issues, so I wasn't aware this could happen.

I have implemented a silent gc() command in the functions (it will be available with the next release by passing collect_garbage = TRUE, but calling gc() in the loop as you do should already have the same effect). With the gc() command, R memory usage after looping through your files still came out at 2.39 GB (+/- the size of the files), but the loop finished without issues on my machine. Without it, I get Error: Expected single integer \n value invalid option "error", which I cannot quite explain but may be the result of a memory issue - I'm really not sure.
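A minimal sketch of what such a silent collection could look like inside the functions; the collect_garbage parameter name comes from the comment above and is not yet released, so treat it as illustrative:

```r
# Hypothetical sketch: quiet garbage collection guarded by a parameter,
# as it might run at the end of convert_measurement()/amp_drift_corr().
collect_garbage <- TRUE  # illustrative stand-in for the future argument
if (collect_garbage) {
  # full = TRUE requests a more thorough collection; invisible() suppresses
  # printing of the memory summary that gc() returns
  invisible(gc(verbose = FALSE, full = TRUE))
}
```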

Reading a bit through R memory-management threads on StackOverflow, it seems that gc() may theoretically free memory, but that the freed memory may not be used efficiently afterwards because subsequent R processes cannot fully reclaim it. This is out of my hands - or at least beyond my ability to solve. I'd suggest, as you already tried, splitting the loop into several separate ones.
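A small sketch of that splitting, with an illustrative chunk size and placeholder file names (in the real case, con_files from the code above would take their place):

```r
# Split the file vector into fixed-size chunks; each chunk can then be
# processed in its own loop (or a fresh R session) so memory is released
# between batches. Names and chunk size below are placeholders.
chunk_size <- 3
files_demo <- paste0("file_", 1:7, ".dat")
chunks <- split(files_demo, ceiling(seq_along(files_demo) / chunk_size))
lengths(chunks)  # chunk sizes: 3, 3, 1
```

Each element of chunks could then be fed to the amp_drift_corr() loop in turn, with gc() (or a full R restart) between chunks.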

Sorry I can't help more with this right now.

@sginot
Author

sginot commented Nov 28, 2022

Error: Expected single integer \n value invalid option "error" was also the error I got the first time around.

Splitting the loop does the job indeed. One could probably also use the memory.limit() function, but I am not familiar with it either...
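As an alternative to watching the task manager, the matrix that gc() returns can be logged inside the loop to make memory growth visible; a sketch:

```r
# gc() returns a matrix summarising memory use (rows Ncells/Vcells,
# first column "used"); printing a cell of it inside the loop shows
# whether allocations keep growing between iterations.
mem <- gc(verbose = FALSE)
vcells_used <- mem["Vcells", "used"]  # vector cells currently allocated
print(vcells_used)
```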

@sginot sginot closed this as completed Nov 28, 2022