Is it possible to use "immediate-submit" and "job dependencies"? #10
Comments
Hi @kimin0402, if I understand your enquiry correctly, you are wondering whether we could support job dependencies, i.e. only submit a job if a given dependency expression is true (as described in the documentation)? Snakemake is effectively handling job dependencies for you already, isn't it? I am just wondering what an example use case for this would be.
Hi @mbhall88, yes, I was wondering whether the LSF profile could support job dependencies by using the -w option of LSF. Snakemake does handle the job dependencies, but it only hands them to the scheduler when the "--immediate-submit" option is set to true in the config file (see this link for the --immediate-submit option: https://snakemake.readthedocs.io/en/stable/executing/cli.html). With this option, you can execute a snakemake script and all bash scripts created from it will be submitted to the cluster at once; those that have dependencies will be submitted via bsub with '-w {dependencies}' by the submit.py wrapper. Right now, when I execute a snakemake script with this LSF profile, the shell in which I executed it keeps running snakemake, waiting until the dependent job is finished. If immediate-submit is set to true, all scripts are submitted to the cluster and I can do other work once snakemake has finished submitting them. The scripts that have dependencies are submitted with the -w argument and shown as "PEND" (or "H" on a PBS cluster) in the queue list.
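For context, the Snakemake documentation for --immediate-submit notes that {dependencies} is filled with the space-separated IDs of the jobs a job depends on, assuming the submit script prints the new job's ID on the first line of stdout. A minimal sketch of pulling that ID out of bsub's confirmation message follows. It assumes the usual "Job <id> is submitted to queue <name>." wording; `extract_job_id` is a hypothetical helper for illustration, not a function of the LSF profile.

```python
import re


def extract_job_id(bsub_stdout):
    """Return the numeric job ID from bsub's confirmation message.

    Assumes the message looks like 'Job <12345> is submitted to queue <normal>.'
    """
    match = re.search(r"Job <(\d+)>", bsub_stdout)
    if match is None:
        raise ValueError(f"No job ID found in bsub output: {bsub_stdout!r}")
    return match.group(1)


# A submit wrapper would print only this ID so Snakemake can pass it on
# to dependent jobs through {dependencies}.
print(extract_job_id("Job <12345> is submitted to queue <normal>."))
```

If the cluster is configured to print a different message, the regular expression would need adjusting.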
I see. And how would you describe the dependencies? Regarding your use case where your shell is constantly running snakemake, I would strongly recommend submitting the "master" snakemake process as a job rather than letting it run on the login node. See an example script I use for exactly this with all of my snakemake pipelines.
Thank you for your example script. In the case where 'immediate submit' is not activated, this seems to be the best way to run snakemake with dependencies. However, if you run this master script and it has rules (A, B, C) whose dependencies are set to execute A -> B -> C, my understanding is that snakemake does not submit the job script for B until the job script for A has finished. In this case, if other people submit job scripts while job A is running, there will be lots of scripts queuing in between jobs A and B. What I want to do is submit job scripts A and B together, and specify the dependencies for job B with the -w argument of LSF. So my jobs would appear like this: [listing not preserved] and then [listing not preserved] whereas, in the case with a master script submitted, the job queue list would first appear like this: [listing not preserved] and then [listing not preserved]. I hope my explanation clarified the question a little bit. This kind of behaviour is possible with the pbs-torque profile, so I was wondering whether the lsf profile could do this too.
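The submission pattern described above, where A, B and C enter the queue together and B and C wait on their predecessors, can be sketched roughly as follows. This is an illustration only: the job IDs are made up, `bsub_command` is a hypothetical helper, and the commands are built as strings rather than executed. It relies on LSF's `done(<job id>)` dependency condition, which is satisfied once the named job has finished successfully.

```python
def bsub_command(jobscript, depends_on=()):
    """Build (but do not run) a bsub command that waits on the given job IDs."""
    parts = ["bsub"]
    if depends_on:
        # Every listed job must finish successfully before this one starts.
        expression = " && ".join(f"done({job_id})" for job_id in depends_on)
        parts += ["-w", f"'{expression}'"]
    parts.append(jobscript)
    return " ".join(parts)


# Hypothetical job IDs, as if reported by the scheduler at submission time.
print(bsub_command("A.sh"))            # no dependency, can start right away
print(bsub_command("B.sh", ["1001"]))  # pends until job 1001 (A) is done
print(bsub_command("C.sh", ["1002"]))  # pends until job 1002 (B) is done
```

All three jobs are then in the queue from the start, so B and C keep their place while A runs instead of being submitted only after A finishes.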
Hey @mbhall88, sorry for the late reply. Posts #7 and #13 are indeed good ideas. Post #7 in particular helped me figure out how to assign different -q options to different rules. Thanks a lot. By the way, I think I figured out this dependency issue, mainly by adapting scripts from the pbs-torque profile. This is what I changed:

1) Open ~/.config/snakemake/lsf/config.yaml

1-2) Add:

    immediate-submit: true
    notemp: true

2) Edit ~/.config/snakemake/lsf/lsf-submit.py:

    import argparse  # only needed if lsf-submit.py does not import it already

    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--depend", help="Space separated list of ids for jobs this job should depend on.")
    parser.add_argument("positional", action="append", nargs="?")
    args = parser.parse_args()

    # Turn "id1 id2" into the LSF dependency expression "id1 && id2".
    depend = ""
    if args.depend:
        depend = args.depend
        depend = depend.replace(" ", " && ")
    if depend:
        depend = f" -w '{depend}' "

2-2) Change the variable "submit_cmd" so that it includes the string variable "depend", e.g.:

    submit_cmd = "bsub {resources} {job_info} {queue} {dep} {jobscript}".format(
        resources=resources_cmd,
        job_info=jobinfo_cmd,
        queue=queue_cmd,
        dep=depend,
        jobscript=jobscript,
    )

I removed the cluster_cmd variable because it kept reading the arguments specified by the --depend option of argparse as system arguments. This seems to submit all job scripts at once with the dependencies specified. There might be some problems in the future, but it has worked fine for me so far.
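One possible way to keep --depend apart from the remaining command-line arguments is argparse's `parse_known_args`, which returns the unrecognised arguments separately instead of raising an error. The sketch below is not the profile's actual code: `split_depend` is a hypothetical helper, and whether this would let cluster_cmd stay in place is an untested assumption.

```python
import argparse


def split_depend(argv):
    """Separate --depend from the other arguments.

    Returns (depend_flag, remaining_args). depend_flag is an empty string
    when no dependencies were given.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument(
        "--depend",
        default="",
        help="Space separated list of ids for jobs this job should depend on.",
    )
    # Anything argparse does not recognise ends up in `remaining`.
    args, remaining = parser.parse_known_args(argv)
    job_ids = args.depend.split()  # split() also copes with repeated spaces
    depend_flag = " -w '{}' ".format(" && ".join(job_ids)) if job_ids else ""
    return depend_flag, remaining


flag, rest = split_depend(["--depend", "101 102", "jobscript.sh"])
print(repr(flag), rest)
```

Here `rest` holds the arguments that a script could go on treating as its ordinary system arguments, while `flag` can be dropped into the bsub command string.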
@kimin0402 this is super cool!! Really neat solution. Would you be interested in creating a PR on the |
Hi,
Thank you for this wonderful profile. I was wondering whether it is possible to set "immediate-submit: true" in the config.yaml file and still use job dependencies with the LSF profile.
This kind of setting worked well with the PBS profile but seems to cause trouble for the LSF profile, as the lsf-submit.py file cannot deal with the -w option of LSF.
How can I use this profile so that my snakemake script submits multiple job commands with the specified dependency options?