Is this suitable for rslurm? #83

Open
garyzhubc opened this issue Dec 17, 2023 · 1 comment

Comments

@garyzhubc

garyzhubc commented Dec 17, 2023

I have several experiments, and I'd like to run one experiment per node. Each experiment is a sequence of executions that uses several cores. Right now my code looks like this:

run_seeds <- c(1,2,3,4,5,6,7,8,9,10)

write_lines(paste("problem", "run_seed", "num_patients", "method", "name", "type", "error", sep="\t"), file=err_file_name)

# initialize loop
for (j in seq_along(run_seeds)) {
    ...
}

# start loop
for (i in range_pat) {

  print(paste("iteration",i))
  
  for (j in seq_along(run_seeds)) {

    run_seed <- run_seeds[[j]]
    set.seed(run_seed)
    ...
    write_lines(paste(problem, run_seed, i, "dst", ind_name, type, error, sep="\t"), file=err_file_name, append=TRUE)

  }
}

Is this suitable for rslurm? If so, how should I change the code? Looking at the example at https://cran.r-project.org/web/packages/rslurm/vignettes/rslurm.html, I don't necessarily want to export an .rds file or generate a Slurm script. I'd like to run everything within one Slurm script. Is that doable, or do I need to change my code to a format that rslurm accepts? Also, the results returned by the nodes need to be in a certain order. Is that still doable?

@qdread
Contributor

qdread commented Jan 5, 2024

Hi @garyzhubc, thanks for your message, and sorry for the slow response. Yes, this is suitable for rslurm, because it is a loop in which no iteration depends on the previous one. rslurm generates the .rds files and Slurm scripts automatically, so you as the user do not need to create them. What you need to do is:

  • define a function that will be executed repeatedly (in other words, make the contents of the loop into a function)
  • instead of appending to err_file_name in each iteration, make the function output the line that will be appended to the error file
  • replace the loop with a call to slurm_map() or slurm_apply()

Without a reproducible example, I cannot be sure this will work exactly as written for you, but this is the basic framework:

my_fn <- function(run_seed, <insert any other needed inputs here>) {
  <insert code here>
  return(data.frame(problem=problem, run_seed=run_seed, num_patients=num_patients, method=method, name=name, type=type, error=error))
}

my_data <- expand.grid(run_seed = c(1,2,3,4,5,6,7,8,9,10), <insert any other needed inputs here>)

my_job <- slurm_apply(my_fn, my_data, <insert any other slurm options here>)
error_log <- get_slurm_out(my_job) # This code will run when the job finishes

This will generate the Slurm scripts in a temporary directory within the current working directory, submit the job if a Slurm job scheduler is detected on your system, run the job, collect the output in that temporary directory, and then import the output into R.
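
Before submitting anything to the cluster, it may help to check the refactored function locally in plain R. The sketch below is only an illustration: the body of my_fn, the num_patients values, and the rnorm() computation are hypothetical stand-ins, since the real loop body is elided in the question. It shows the function returning one row per call instead of appending to a file, and a local dry run that calls it once per row of the parameter data frame.

```r
# Hypothetical stand-in for the loop body; the real computation is not shown in the thread.
my_fn <- function(run_seed, num_patients) {
  set.seed(run_seed)
  error <- mean(rnorm(num_patients))  # placeholder for the real error metric
  # Return the row instead of appending it to err_file_name.
  data.frame(run_seed = run_seed, num_patients = num_patients, error = error)
}

# One row per parameter combination; column names match the arguments of my_fn.
# The num_patients values here are made up for the example.
my_data <- expand.grid(run_seed = 1:10, num_patients = c(50, 100))

# Local dry run: call my_fn once per row, in row order, and stack the results.
rows <- Map(my_fn, my_data$run_seed, my_data$num_patients)
error_log <- do.call(rbind, rows)

# If a specific order is required, sort explicitly on the returned columns.
error_log <- error_log[order(error_log$num_patients, error_log$run_seed), ]
```

Once this behaves as expected, the same my_fn and my_data should be usable in the slurm_apply() call above. Passing outtype = "table" to get_slurm_out() should bind the per-call data frames into a single data frame. Keeping run_seed and num_patients in each returned row means the result can always be sorted afterwards, so the order in which nodes finish should not matter.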
