8. FAQ
Here you can find the official tutorial: https://snakemake.readthedocs.io/en/stable/tutorial/tutorial.html
This is the link to the NBIS reproducible research workshop, with a Snakemake tutorial: https://nbis-reproducible-research.readthedocs.io/en/latest/snakemake/
The pipeline runs for a very long time, so it is necessary to send the Snakemake process to the background so that you don't have to keep the terminal open until it is finished. To do that, you can use terminal multiplexers such as tmux or screen. Here is a link to a crash-course in tmux: https://robots.thoughtbot.com/a-tmux-crash-course
To run GenErode, you actually only need these very basic tmux commands:
- check which sessions are currently running: `tmux ls`
- start a new session with the name "mysession": `tmux new -s mysession`
- detach from a running session: type `CTRL + b`, then `d`
- re-attach to the session: `tmux a -t mysession`
- kill a session: `tmux kill-session -t mysession`
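For example, a GenErode run inside tmux could look like the sketch below. The session name, profile, config file and log file name are placeholders; adjust the Snakemake call to however you normally start the pipeline on your system.

```bash
# start a named tmux session for the pipeline run
tmux new -s generode

# inside the session: start Snakemake and redirect the standard output to a log file
snakemake --profile slurm --configfile config/config.yaml &> generode_run.log

# detach with CTRL + b, then d -- the run keeps going in the background;
# later, re-attach to check on the progress
tmux a -t generode
```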
If the pipeline run failed, you will get an error message in the log file (i.e. the standard output that we redirect to a file) that reads like this: "Error in rule XYZ". Each rule generates a log file that you can find in the directory `results/logs` and the subdirectories therein. If you are using a system with the slurm workload manager, a few lines below that you will find the slurm job ID that corresponds to the job run on the cluster: "cluster_jobid: Submitted batch job 1234567". Under this ID you can then identify the slurm file `slurm-1234567.out`, where you will hopefully find another error message explaining why the job failed.
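A quick way to locate the failing rule and the matching slurm output file could look like this; the log file name and job ID are placeholders taken from the example above.

```bash
# find the failing rule and the corresponding slurm job ID in the redirected Snakemake log
grep -A 5 "Error in rule" generode_run.log

# inspect the slurm output file of that job for the actual error message
less slurm-1234567.out

# the per-rule log files are collected under results/logs/ and its subdirectories
ls results/logs/
```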
I changed a metadata table and now GenErode attempts to rerun everything from the start. How do I rerun GenErode only for the new samples, or only for the set of samples remaining in the metadata table and for the remaining rules?
Snakemake changed its rerun behaviour in version 7.8 (see https://github.com/snakemake/snakemake/issues/1694). This means that after changing metadata tables, Snakemake will now run everything from the beginning, stating "Set of input files has changed since last execution". To get around this, use `--rerun-triggers mtime` in the Snakemake command when starting the pipeline from the command line. This also applies to any local changes in code or other parameters.
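As a sketch, the flag is simply appended to whatever Snakemake command you normally use to start GenErode (the profile, config file and log file names are placeholders):

```bash
# only rerun jobs whose input files are newer than their outputs,
# ignoring changes to the set of input files, code or parameters
snakemake --profile slurm --configfile config/config.yaml \
    --rerun-triggers mtime &> generode_run.log
```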
I'm trying to run the pipeline with the Snakemake slurm profile, but I'm getting the following error: 'snakemake: error: unrecognized arguments: --cluster-cancel=scancel'. How do I solve this?
This error is related to updates to the Snakemake slurm profile. The argument `--cluster-cancel=scancel` is only compatible with Snakemake version 7, not with version 6 that previous versions of the GenErode pipeline require.
You can modify the slurm profile by hand so that it works with Snakemake version 6. To do that, please move into the slurm profile folder (e.g. called `slurm`) and open the file `config.yaml`. There, delete the following line (or comment it out with `#`): `cluster-cancel: "scancel"`. The slurm profile will then no longer submit jobs with the argument `--cluster-cancel=scancel` that caused the error.
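After the edit, the relevant part of the profile's `config.yaml` might look like the sketch below; the surrounding key is just illustrative, only the commented-out line matters.

```yaml
# slurm/config.yaml (Snakemake slurm profile) -- illustrative excerpt
jobs: 100                        # placeholder value
# cluster-cancel: "scancel"      # commented out so the profile works with Snakemake version 6
```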
If you are using the slurm profile with the profile configuration file, make sure that you copied the file `config/slurm/profile/config.yaml` to `slurm/config.yaml` after setting up the slurm profile with cookiecutter. In case parameters like the number of cores or the job duration for one of the slurm jobs need to be adjusted, follow these steps (an illustrative snippet follows after the list):
- Adjust the number of cores under `set-threads:` for each rule, and the memory according to the number of cores under `set-resources:` and `rule:mem_mb` for each rule in the file `slurm/config.yaml`. The pre-set compute requirements for GenErode are based on a cluster with 6400 MB RAM per CPU. Adjust the duration under `set-resources:` and `rule:runtime` for each rule in the file `slurm/config.yaml`. This will control the number of cores, available memory and maximum run time assigned to that rule for the slurm job.
- Please note that in several cases, rules were grouped together to be run as one job on the cluster. In that case, you need to adjust the parameters for the entire group ID in the file `slurm/config.yaml`.
- Some rules (in the `.smk` files within the `workflow/rules/` directory) have numbers of threads specified under `threads` that correspond to the number under `set-threads` in the `slurm/config.yaml` file. If you change this number in the `slurm/config.yaml` file, the number of `threads` should be adjusted automatically.
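As an illustration only (the rule name and all values below are made up, and the exact YAML syntax can differ between Snakemake and profile versions), the entries in `slurm/config.yaml` look roughly like this:

```yaml
# slurm/config.yaml -- illustrative rule name and values
set-threads:
  - some_rule=8                  # number of cores for this rule
set-resources:
  - some_rule:mem_mb=51200       # memory, e.g. 8 cores * 6400 MB RAM per CPU
  - some_rule:runtime=1440       # maximum run time in minutes
```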
Make sure the header section of the cluster configuration file contains the correct project number from which the CPU hours are taken. In case parameters like the number of cores or the job duration for one of the slurm jobs need to be adjusted, follow these steps (an illustrative snippet follows after the list):
- Adjust the number of cores (`cpus-per-task`) and/or duration (`time`) in the file `config/slurm/cluster.yaml`. This will control the number of cores and maximum run time assigned to that rule for the slurm job.
- Please note that in several cases, rules were grouped together to be run as one job on the cluster. In that case, you need to adjust the parameters for the entire group ID in the file `config/slurm/cluster.yaml`.
- Some rules (in the `.smk` files within the `workflow/rules/` directory) have numbers of threads specified under `threads` that correspond to `cpus-per-task` in the `config/slurm/cluster.yaml` file. If you change this number in the `config/slurm/cluster.yaml` file, the number of `threads` should be adjusted automatically.
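An illustrative excerpt of the cluster configuration file is sketched below; the rule/group name, the project ID and all values are placeholders, and the real `config/slurm/cluster.yaml` already contains entries for all rules and groups.

```yaml
# config/slurm/cluster.yaml -- illustrative entries only
__default__:
  account: "your_project_id"     # project number from which the CPU hours are taken
  cpus-per-task: 2
  time: "1-00:00:00"
some_rule_or_group:
  cpus-per-task: 8               # number of cores for this rule or group
  time: "3-00:00:00"             # maximum run time
```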
The slurm jobs submitted by Snakemake are gone from the list of running jobs, but the pipeline is still running. What happened, and how do I fix it?
This only happens if you run GenErode without a Snakemake profile. When you are using the cluster configuration file `config/slurm/cluster.yaml` with `--cluster-config`, Snakemake and slurm communicate about the jobs that Snakemake submits to the slurm queue. Unfortunately, this does not apply to jobs that failed because they ran out of time. In that case, the jobs stop running but the Snakemake process keeps waiting for them and does not throw any error message. If this is the case, you can stop the Snakemake process (`CTRL + c`). To restart the pipeline, you need to double check which jobs ran out of time (see the `sacct` sketch below), change the runtime parameter for that specific rule (see "How do I change parameters for slurm jobs" and "If you are using the cluster configuration file `config/slurm/cluster.yaml`"), double check that there are no results files (delete any files already produced by that rule), and restart the Snakemake pipeline in dry mode. It should then attempt to run the rule that failed due to the timeout.
The pipeline rules have quite long run times to avoid this problem, but some specific samples may be bigger than the ones used to test the pipeline.
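If you noted the slurm job IDs, `sacct` can confirm whether a job ended in the TIMEOUT state (the job ID is a placeholder):

```bash
# check the final state and elapsed time of a specific job
sacct -j 1234567 --format=JobID,JobName,State,Elapsed,Timelimit

# or list all of today's jobs that hit their time limit
sacct --starttime today --state=TIMEOUT --format=JobID,JobName,Elapsed,Timelimit
```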
Each step of GenErode depends on most previous steps (except the mitogenome mapping, which depends on the fastq file processing but is not automatically loaded for the subsequent steps). The pipeline is written so that all required upstream steps are automatically included if you set a step to `True` in the config file. If you set several steps to `True` at the same time, Snakemake therefore tries to include the same steps multiple times and throws this warning message.
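As a hypothetical example (the option names below are placeholders, not the actual GenErode config keys), a config file with two steps switched on at the same time would make Snakemake include the shared upstream rules twice and print the warning:

```yaml
# config/config.yaml -- hypothetical step switches, names are placeholders
fastq_processing: True
mapping: True        # also pulls in the fastq processing rules, hence the warning
```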
I want to rerun the pipeline with changed parameters settings, but I get the message "nothing to be done". How do I fix that?
GenErode checks for the presence of the final output file of each step to decide whether it should rerun the analyses. For most steps of the pipeline, these are the output files of the MultiQC analysis. You can find them in the `stats` directory of the step you were running and the subdirectories therein. Deleting, moving or renaming these files will force the pipeline to rerun the analyses leading to these files, using the parameters specified in the config file.
For downstream analyses (mlRho, PCA, ROH, snpEff, gerp), delete, move or rename the final output files (tables, figures) to trigger a rerun (see next question).
Alternatively, you can add the flag `-R path/to/file.out` to the Snakemake command to start the pipeline, with `path/to/file.out` being the file you want to re-create.
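For example, to re-create one particular output file (the path is the placeholder from above, and so is the rest of the command), the call could look like this:

```bash
# force re-creation of a specific output file and everything downstream of it
snakemake --profile slurm --configfile config/config.yaml \
    -R path/to/file.out &> generode_run.log
```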
Is it possible to rerun the pipeline with a different optional filtering step in the same directory, or will it overwrite everything? For example, would the output from a run without subsampling be overwritten when rerunning the pipeline with subsampling?
Subsampled BAM files, the resulting mlRho output and the BCF files per individual have different file names than the same files without subsampling, so a rerun would not overwrite the non-subsampled files. The merged BCF files and all downstream files, however, have the same file names for any individual filtering, so they would be overwritten. If it is important to keep both versions, please rename the file that should be protected from overwriting before rerunning the pipeline (or move it to a new directory).
I want to keep intermediate files that would be otherwise automatically deleted by Snakemake (marked as "temporary"). How do I do that?
This is only recommended when you have double checked that you have enough storage space to keep intermediate files, as GenErode creates a very large number of (large) files. Also, please remove them as soon as you don't need them anymore. If you are really sure you want to prevent intermediate files from being deleted, run the pipeline from the command line, adding the flag `--notemp`.
Once you want to remove all temporary files at once, you can start a run with the additional flag `--delete-temp-output`. It is recommended to do a dry run first to see which files will be deleted.
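A sketch of both calls, again with placeholder profile and config file names:

```bash
# keep files that are marked as temporary instead of deleting them
snakemake --profile slurm --configfile config/config.yaml --notemp

# later: dry run first to see which temporary files would be removed ...
snakemake --profile slurm --configfile config/config.yaml --delete-temp-output -n

# ... and then actually remove them
snakemake --profile slurm --configfile config/config.yaml --delete-temp-output
```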
Snakemake tells me that the working directory is locked by another Snakemake process. I've tried to run `--unlock` but the error message remains.
GenErode is written in a way that it expects the config file to be `config/config.yaml` and can't unlock the working directory if you saved it under a different name. To unlock the working directory, type `snakemake --unlock --cores 1 --configfile config/my_config.yaml` (or the file name you chose).
What is the solution for the error "sbatch: error: Batch job submission failed: Socket timed out on send/recv operation"?
This is an error related to the slurm system that is used to run jobs. There is not much you can do about it, except waiting for the current Snakemake process (including all slurm jobs) to finish and then restarting the pipeline.
This seemed to be a bug related to certain browsers in GenErode versions prior to 0.5.0. When trying to access the MultiQC files from a report downloaded to a Mac, this happened with Chrome and Firefox, but it worked with Safari.