Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing snakemake.shell from wrappers #160

Closed
wants to merge 13 commits into from
Closed

Removing snakemake.shell from wrappers #160

wants to merge 13 commits into from

Conversation

vsoch
Copy link

@vsoch vsoch commented Aug 20, 2020

This will prepare the snakemake-wrappers repository to not require the pickled Snakemake object. There were also a few snakemake.utils makedirs and I replaced those with os.makedirs. For some of the wrappers there was a line like:

from snakemake.shell import shell
shell.executable("/bin/bash")

which I'm guessing ensures we use bash and not /bin/sh, but I'm not sure how to replicate that without the special snakemake function. I would suspect it mostly works removing it, but let's discuss how to address. If it's absolutely required might want to do like:

os.system(f"/bin/bash -c 'commands here'")

but I'm hoping that's overkill, it certainly doesn't seem safe or reliable to get back a return code, and maybe we'd even totally lose the child shell.

Signed-off-by: vsoch vsochat@stanford.edu

…nakemake pickle object

Signed-off-by: vsoch <vsochat@stanford.edu>
@vsoch
Copy link
Author

vsoch commented Aug 20, 2020

Ahhh every line needs formatting, so if we have:

os.system(
        f"mv {tempdir}/{snakemake.wildcards.assembler}.fasta {snakemake.output.raw_assembly}"
        " && mv {tempdir}/contigs.fa {snakemake.output.contigs}"
    )

It needs to be

os.system(
        f"mv {tempdir}/{snakemake.wildcards.assembler}.fasta {snakemake.output.raw_assembly}"
        f" && mv {tempdir}/contigs.fa {snakemake.output.contigs}"
    )

I have a loooooooooot of work to do, lol.

Signed-off-by: vsoch <vsochat@stanford.edu>
Signed-off-by: vsoch <vsochat@stanford.edu>
Signed-off-by: vsoch <vsochat@stanford.edu>
@christopher-schroeder
Copy link
Contributor

I am not skilled enough to know if this leads to unexpected behavior. I just wanted to hint, that we prevent the f"{something}" strings for backward compatibility for python < 3.6.

@vsoch
Copy link
Author

vsoch commented Aug 21, 2020

Hmm @johanneskoester said explicitly to use them, e.g snakemake/snakemake#532 (comment). Also the setup.py of Snakemake indicates minimum Python of 3.5 so this would make sense.

@christopher-schroeder
Copy link
Contributor

-.- He should have also told the others, we are always removing them, thanks!

@vsoch
Copy link
Author

vsoch commented Aug 21, 2020

I'm going to hold off working on this further then, it could have been an oversight. I wish I had known before putting about 5 hours into this! Thanks for the insight!

@vsoch
Copy link
Author

vsoch commented Aug 21, 2020

But hey, at least we can all agree it's no longer reasonable to support Python 2 :P

@johanneskoester
Copy link
Contributor

@christopher-schroeder that advice is still correct in general. However, what @vsoch is working on here is a change in how wrappers are used. So far, Snakemake had to invoke them with its own Python binary, thereby pickling the job properties into the script. @vsoch is working on lifting this requirement, so that the job properties are put into the script via code generation. Once that is done, one can also safely switch to a later Python in the wrapper script, because it does not anymore need to work together with the Python from the Snakemake that is invoking it.

@johanneskoester johanneskoester marked this pull request as draft August 25, 2020 10:06
Signed-off-by: vsoch <vsochat@stanford.edu>
@vsoch
Copy link
Author

vsoch commented Aug 25, 2020

okay I think we might have a missing dependency install in the conda recipe - with this updated snakemake (using a library called pulp in the scheduler) installing from conda has this error for every job:

	1
Traceback (most recent call last):
  File "/usr/share/miniconda/envs/snakemake/lib/python3.8/site-packages/snakemake/__init__.py", line 631, in snakemake
    success = workflow.execute(
  File "/usr/share/miniconda/envs/snakemake/lib/python3.8/site-packages/snakemake/workflow.py", line 957, in execute
    success = scheduler.schedule()
  File "/usr/share/miniconda/envs/snakemake/lib/python3.8/site-packages/snakemake/scheduler.py", line 402, in schedule
    else self.job_selector_ilp(needrun)
  File "/usr/share/miniconda/envs/snakemake/lib/python3.8/site-packages/snakemake/scheduler.py", line 538, in job_selector_ilp
    import pulp
ModuleNotFoundError: No module named 'pulp'

This must be a new release because it wasn't needed for the last test! @johanneskoester want to bring this to your attention! In the meantime I'll install pulp for the CI.

@vsoch
Copy link
Author

vsoch commented Aug 25, 2020

I don't see pulp here https://bioconda.github.io/recipes/snakemake/README.html.

Signed-off-by: vsoch <vsochat@stanford.edu>
@FelixMoelder
Copy link
Contributor

I don't see pulp here https://bioconda.github.io/recipes/snakemake/README.html.

This should be fixed now.

Signed-off-by: vsoch <vsochat@stanford.edu>
@vsoch
Copy link
Author

vsoch commented Aug 26, 2020

Thank you @FelixMoelder ! I just removed it from here so we can test.

Signed-off-by: vsoch <vsochat@stanford.edu>
@vsoch
Copy link
Author

vsoch commented Aug 26, 2020

@FelixMoelder I added it back, removing it made all tests fail, so possibly it's not being grabbed from conda. I'll leave this for now and we can address after the wrappers are finished being updated.

@vsoch
Copy link
Author

vsoch commented Aug 26, 2020

So close! Just one failed, samtools. I need to figure out the issue - it previously used bash, but when I used /bin/bash -c it couldn't find the executable. When I removed that, it turns into a different error.

…iple lines

Signed-off-by: vsoch <vsochat@stanford.edu>
@FelixMoelder
Copy link
Contributor

@FelixMoelder I added it back, removing it made all tests fail, so possibly it's not being grabbed from conda. I'll leave this for now and we can address after the wrappers are finished being updated.

That's odd.
When creating a new environment pulp is being installed and everything works fine.
Still on the bioconda website pulp is not listed as dependency.
Hopefully there is just some delay in distribution or the PR has pinned a specific version.

@vsoch
Copy link
Author

vsoch commented Aug 26, 2020

I think we shouldn't worry for now, it's likely a delay like that. :)

@FelixMoelder
Copy link
Contributor

I think we shouldn't worry for now, it's likely a delay like that. :)

It looks like snakemake-minimal is installed and also requires pulp. I‘ll update the recipe later. Thank‘s for bringing that to the table!

@vsoch
Copy link
Author

vsoch commented Aug 26, 2020

Holy cow all tests passed! Converting from draft now...

@vsoch vsoch marked this pull request as ready for review August 26, 2020 21:46
@FelixMoelder
Copy link
Contributor

If you don't mind you could try removing pulp as dependency again. ;)
snakemake-minimal has been updated and should contain pulp.

Copy link
Contributor

@johanneskoester johanneskoester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the wrappers now additionally need python >=3.6 in their environment.yaml.

Signed-off-by: vsoch <vsochat@stanford.edu>
@vsoch
Copy link
Author

vsoch commented Aug 27, 2020

Leaving a note for myself - several packages have dependency issues that can't be resolved with adding python. I'll need to remove this addition and rewrite the shell commands to not require format strings.

FAILED test.py::test_fastq_screen - subprocess.CalledProcessError: Command '[...
FAILED test.py::test_happy_prepy - subprocess.CalledProcessError: Command '['...
FAILED test.py::test_strelka_germline - subprocess.CalledProcessError: Comman...
FAILED test.py::test_strelka_somatic - subprocess.CalledProcessError: Command...
FAILED test.py::test_fasterq_dump - subprocess.CalledProcessError: Command '[...

This shouldn't take too long - will try to make some time toward end of today!

Signed-off-by: vsoch <vsochat@stanford.edu>
@vsoch
Copy link
Author

vsoch commented Aug 27, 2020

All set! We're green again!

@johanneskoester
Copy link
Contributor

johanneskoester commented Sep 1, 2020

Leaving a note for myself - several packages have dependency issues that can't be resolved with adding python. I'll need to remove this addition and rewrite the shell commands to not require format strings.

FAILED test.py::test_fastq_screen - subprocess.CalledProcessError: Command '[...
FAILED test.py::test_happy_prepy - subprocess.CalledProcessError: Command '['...
FAILED test.py::test_strelka_germline - subprocess.CalledProcessError: Comman...
FAILED test.py::test_strelka_somatic - subprocess.CalledProcessError: Command...
FAILED test.py::test_fasterq_dump - subprocess.CalledProcessError: Command '[...

This shouldn't take too long - will try to make some time toward end of today!

Ok, that's unfortunate. Since we anyway do code generation for the dynamically generated preamble of the wrapper script, we could generate a Python-version-independent shell function like this:

def shell(cmd):
    context = inspect.currentframe().f_back.f_locals
    return os.system(cmd.format(**context))

This should work on Python 2 as well. Further, it makes the wrappers a bit more readable again because the f before every string disappears again. @vsoch sorry for coming with this after you have converted everything :-(! I should've suggested that from the beginning.

@vsoch
Copy link
Author

vsoch commented Sep 1, 2020

@johanneskoester I already fixed the Python 2 errors above - but are you changing your mind that you want none of them to use format strings now? It’s a lot of (different) work and probably means starting over... :( So far this has easily been about 8-10 hours of work.

@jafors
Copy link
Contributor

jafors commented Sep 1, 2020

I don't have much issues with the format strings, but the Python 2 wrappers really lack a bit of readability - maybe Johannes' idea can be applied only on those wrappers?

However, if you guys decide on removing the format strings again, I could probably chime in and take some wrappers - I'm sure this is a task that can be parallelized.

@johanneskoester
Copy link
Contributor

@johanneskoester I already fixed the Python 2 errors above - but are you changing your mind that you want none of them to use format strings now? It’s a lot of (different) work and probably means starting over... :( So far this has easily been about 8-10 hours of work.

Yes, I am indeed changing my mind (for the sake of readability). Sorry for that. However, you don't need to do it, I can do it, no worries. Also, adding this code-generated shell function means that the wrappers can basically stay the same as in the master branch now. The only thing to change is to remove the line from snakemake.shell import shell in each wrapper.

@johanneskoester
Copy link
Contributor

Nevertheless, I am deeply sorry to hear that you already worked >8 hours on this. I wished there was a way to keep this, but it just feels better to have a consistent and nicely readable shell function for all python wrappers. I can take over any other future boring task and you can sit on the sofa and relax while I am working on it. Would that be ok?

@vsoch
Copy link
Author

vsoch commented Sep 1, 2020

haha, I don’t mind the tasks, it just feels sad to not use the work, or think about the opportunity cost of the time. I’m not one to sit on a couch, I don’t own one actually :p

@johanneskoester
Copy link
Contributor

haha, I don’t mind the tasks, it just feels sad to not use the work, or think about the opportunity cost of the time. I’m not one to sit on a couch, I don’t own one actually :p

If you're inside, you are always on this treadmill, right :-)?

@vsoch
Copy link
Author

vsoch commented Sep 3, 2020

Correct :)

@vsoch vsoch closed this Sep 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants