Large jump in disk space usage at the start of each iteration within a loop #812

@pmac0451

Description

Hello, I use the foreach, future, doFuture and future.callr packages to run some large calculations in parallel as part of a scheduled job which runs every night. Recently the job has routinely failed with an error saying the agent has run out of disk space (NB hard-drive space, not RAM!). I've added more logging to the process, and I can see that at the start or end (it's not 100% clear which) of each iteration within the loop there is indeed a big jump in disk space usage. If the job makes it to the end of all the iterations without being killed, the space is released and it can continue; but if there are too many iterations, the disk fills up and the process is killed.

I haven't changed my code in a long time, and I don't think the size of the data involved in each parallel loop has grown much recently either. Nothing that I'm knowingly running within each iteration should leave behind any large files on disk.

Could any of the recent changes to any of the above packages result in data being written to a temporary file between each iteration, and not deleted until the end of the entire loop? And if that's intentional, is there anything I can do to stop this?
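One way to test the temporary-file theory (a sketch, not from the original post; `dir_size_mb` is a hypothetical helper, not part of any package) is to log the size of the R session's temp directory around the loop, since backends that launch background R sessions may stage serialized data in temporary files:

```r
# Sketch: measure how much disk the session temp directory is using.
# dir_size_mb is a hypothetical helper written for this diagnostic.
dir_size_mb <- function(path = tempdir()) {
  files <- list.files(path, recursive = TRUE, full.names = TRUE, all.files = TRUE)
  sizes <- file.info(files)$size
  sum(sizes, na.rm = TRUE) / 1024^2
}

before <- dir_size_mb()
# ... run the foreach loop here ...
after <- dir_size_mb()
message(sprintf("tempdir grew by %.1f MB", after - before))
```

Logging this before, during, and after the loop should show whether the growth is happening under `tempdir()` or somewhere else on disk.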

From looking at my logs I've noticed that the jump in disk space usage between iterations is roughly the same each time (c. 500 MB), even though the amount of work in each iteration varies quite a lot. This makes me wonder whether what's being stored is the (common) inputs to each iteration, which are roughly the right size. Is this something to do with the ability to resume futures in the event of an error, and is there a way to turn it off?
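A quick way to check that hypothesis (a sketch; the placeholder object below stands in for the real `largeInputDataList` from the script) is to compare the per-iteration jump against the serialized size of the shared inputs, since that is roughly what would be written out each time they are exported to a worker:

```r
# Sketch: how big are the shared inputs once serialized? A small placeholder
# object stands in for the real largeInputDataList here.
largeInputDataList <- list(m = matrix(0, nrow = 1000, ncol = 1000))
serialized_mb <- length(serialize(largeInputDataList, NULL)) / 1024^2
message(sprintf("serialized size: %.1f MB", serialized_mb))
```

If the serialized size of the real inputs is close to the observed jump, that points at the common inputs being re-exported per iteration rather than at the per-iteration results.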

The relevant parts of my script are as follows - I don't think the rest is relevant, but I can provide more detail if required.

library(foreach)
library(doParallel)
library(doFuture)
library(future.callr)

largeInputDataList <- # large list of input data

future::plan(callr, workers = 4) # 4 available cores
options(doFuture.rng.onMisuse = "ignore")

result <-
  foreach(
    id = ids, # vector of 55 elements
    .errorhandling = "pass",
    .options.future = list(chunk.size = 1) # For dynamic load balancing
  ) %dofuture% {
    # Code that ultimately returns a small data frame of a handful of rows,
    # but in the meantime does some large and complex calculations using largeInputDataList
  }
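If the per-iteration jump does turn out to be the exported inputs, one possible mitigation (a sketch against the script above, not a confirmed fix; the chunk size of 5 is illustrative only) is to trade some load balancing for fewer futures, so that the shared globals are serialized and staged fewer times:

```r
result <-
  foreach(
    id = ids,
    .errorhandling = "pass",
    # Fewer, larger chunks mean fewer futures, and hence fewer copies of the
    # shared globals staged per chunk; 5 is an illustrative value only.
    .options.future = list(chunk.size = 5)
  ) %dofuture% {
    # ... calculations as before ...
  }
```

With 55 ids, a chunk size of 5 would mean 11 futures instead of 55, at the cost of coarser dynamic load balancing.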
