squeue cmd not found when running R targets on singularity docker image #294

Open

@ailtonpcf opened this issue May 16, 2023 · 1 comment
To whom it may concern,
Thank you for the templates to use on HPC :D

I'm running an R {targets} pipeline from Snakemake, inside an R Docker image executed via Singularity. The pipeline:

```r
#!/usr/bin/env R

work_dir <- "06-fungal-control"
# defaults.R bootstraps the global environment (presumably loading {targets})
source(here::here(paste("src", work_dir, "defaults.R", sep = "/")))

tar_option_set(
  packages = c("tidyverse", "tarchetypes"),
  format = "qs",
  memory = "transient",
  garbage_collection = TRUE,
  storage = "worker",
  retrieval = "worker"
)

library(future)
library(future.batchtools)

future::plan(
  tweak(
    future.batchtools::batchtools_slurm,
    template = "src/06-fungal-control/slurm.tmpl",
    resources = list(
      walltime = 259200, # minutes
      memory = 62500,
      ncpus = 4,
      ntasks = 1,
      partition = "standard",
      chunks.as.arrayjobs = TRUE
    )
  )
)

list(
  tar_target(
    metadata,
    read_tsv("raw/04-tedersoo-global-mycobiome/Tedersoo L, Mikryukov V, Anslan S et al. Fungi_GSMc_sample_metadata.txt")
  ),
  tar_target(
    continent_countries,
    read_csv("raw/05-countries-continent/countries.csv")
  ),
  tar_target(
    subset_samples,
    european_samples(metadata, continent_countries)
  ),
  tar_target(
    raw_abundance,
    read_tsv("raw/04-tedersoo-global-mycobiome/Fungi_GSMc_OTU_Table.txt")
  ),
  tar_target(
    taxonomy,
    get_taxonomy("raw/04-tedersoo-global-mycobiome/Tedersoo L, Mikryukov V, Anslan S et al. Fungi_GSMc_data_biom.biom")
  ),
  tar_target(
    raw_abundance_long,
    long_abundance(raw_abundance, subset_samples)
  )
)
```
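For reference, the `slurm.tmpl` above is a batchtools brew template. A minimal sketch along the lines of the `slurm-simple.tmpl` shipped with batchtools (the `resources$...` names must match the `resources` list above; also note the file must end with a trailing newline, otherwise `readLines()` emits the "incomplete final line" warning that shows up in the log below):

```bash
#!/bin/bash
#SBATCH --job-name=<%= job.name %>
#SBATCH --output=<%= log.file %>
#SBATCH --error=<%= log.file %>
#SBATCH --time=<%= resources$walltime %>
#SBATCH --ntasks=<%= resources$ntasks %>
#SBATCH --cpus-per-task=<%= resources$ncpus %>
#SBATCH --mem=<%= resources$memory %>
#SBATCH --partition=<%= resources$partition %>
<%= if (array.jobs) sprintf("#SBATCH --array=1-%i", nrow(jobs)) else "" %>

# Launch the batchtools worker that executes this job collection
Rscript -e 'batchtools::doJobCollection("<%= uri %>")'
```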

However, it doesn't work. R complains that the `squeue` command is not found. Here's the log:

```
Date = Tue May 16 10:59:08 CEST 2023
Hostname = node069
Working Directory = /home/qi47rin/proj/02-compost-microbes/src/06-fungal-control

Number of Nodes Allocated = 1
Number of Tasks Allocated = 1
Number of Cores/Task Allocated = 1

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job                   count    min threads    max threads
get_fungal_spikein        1              1              1
targets                   1              1              1
total                     2              1              1

Select jobs to execute...

[Tue May 16 10:59:15 2023]
rule get_fungal_spikein:
    input: src/06-fungal-control/analyze_server.R
    output: logs/06-fungal-control/spike.log
    jobid: 1
    reason: Missing output files: logs/06-fungal-control/spike.log
    resources: tmpdir=/tmp

Activating singularity image /home/qi47rin/proj/02-compost-microbes/.snakemake/singularity/8c1aaca4ec464428d6d90db9c1dc0fbf.simg
running
'/usr/local/lib/R/bin/R --no-echo --no-restore --no-save --no-restore --file=src/06-fungal-control/analyze_server.R'

here() starts at /home/qi47rin/proj/02-compost-microbes
Global env bootstraped.
here() starts at /home/qi47rin/proj/02-compost-microbes
Global env bootstraped.
✔ skip target continent_countries
✔ skip target metadata
✔ skip target subset_samples
✔ skip target taxonomy
• start target raw_abundance
✔ skip pipeline
Warning message:
In readLines(template) :
  incomplete final line found on '/home/qi47rin/proj/02-compost-microbes/src/06-fungal-control/slurm.tmpl'
Error : Listing of jobs failed (exit code 127);
cmd: 'squeue --user=$USER --states=R,S,CG --noheader --format=%i -r'
output:
command not found
Error in tar_throw_run():
! ! in callr subprocess.
Caused by error:
! Listing of jobs failed (exit code 127);
cmd: 'squeue --user=$USER --states=R,S,CG --noheader --format=%i -r'
output:
command not found
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Backtrace:
  1. └─targets::tar_make_future(workers = 4)
  2. └─targets:::callr_outer(...)
  3. └─base::tryCatch(...)
  4.   └─base (local) tryCatchList(expr, classes, parentenv, handlers)
  5.     └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
  6.       └─value[[3L]](cond)
  7.         └─targets::tar_throw_run(...)
  8.           └─rlang::abort(...)
Execution halted
[Tue May 16 10:59:27 2023]
Error in rule get_fungal_spikein:
    jobid: 1
    output: logs/06-fungal-control/spike.log
    shell:
        Rscript --no-save --no-restore --verbose src/06-fungal-control/analyze_server.R | tee logs/06-fungal-control/spike.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job get_fungal_spikein since they might be corrupted:
logs/06-fungal-control/spike.log
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: src/06-fungal-control/.snakemake/log/2023-05-16T105913.649364.snakemake.log
```

This worked before with conda, because activating an environment keeps every host command on the PATH. Inside a container, however, `squeue` fails when it queries the user ID and the Slurm system IDs. I also tried to bind-mount the Slurm volumes into the container, but that didn't work either. So: is there a way to avoid the `squeue` command when using `tar_make_future()` to submit jobs to Slurm?
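One idea I haven't tried yet would be to invert the nesting: run `tar_make_future()` on the host, so `squeue`/`sbatch` execute where they exist, and let the batchtools template start each worker inside the container. A sketch (the image name and bind paths are placeholders):

```bash
# Hypothetical last line of slurm.tmpl: launch the batchtools worker
# inside the container. Slurm commands are then only ever run by the
# host-side R session. Image name and binds are placeholders.
singularity exec --bind "$PWD" r-analysis.simg \
  Rscript -e 'batchtools::doJobCollection("<%= uri %>")'
```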

Thanks in advance,
Ailton.

@tmspvn commented Jul 5, 2024

Hi, have you found a solution?

I half did:

```bash
export SINGULARITY_BINDPATH="$SINGULARITY_BINDPATH,/etc/passwd,/var/run/munge,/usr/lib64/libmunge.so.2.0.0:/usr/lib64/libmunge.so.2,/run/slurm/conf/slurm.conf:/etc/slurm/slurm.conf,/usr/lib64/slurm,/usr/bin/sbatch,/usr/bin/squeue,/usr/bin/scancel"
```
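For what it's worth, my reading of what each mount supplies, built up one bind at a time (all paths are cluster-specific; treat them as assumptions for your site):

```bash
# Same bind list as above, annotated. Adjust paths to your cluster.
BINDS="/etc/passwd"                                                   # resolve the submitting user's UID
BINDS="$BINDS,/var/run/munge"                                         # munge socket for Slurm authentication
BINDS="$BINDS,/usr/lib64/libmunge.so.2.0.0:/usr/lib64/libmunge.so.2"  # munge client library
BINDS="$BINDS,/run/slurm/conf/slurm.conf:/etc/slurm/slurm.conf"       # cluster configuration
BINDS="$BINDS,/usr/lib64/slurm"                                       # Slurm plugin libraries
BINDS="$BINDS,/usr/bin/sbatch,/usr/bin/squeue,/usr/bin/scancel"       # the Slurm client binaries
export SINGULARITY_BINDPATH="$SINGULARITY_BINDPATH,$BINDS"
```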

Edit: it runs, but on my cluster it returns a BatchtoolsExpiration error. I opened a discussion here but have had no answer from the developers to date (23/7/24).
