Announcement: Snakemake 8.0 released, including the removal of deprecated syntax #2409

Open
johanneskoester opened this issue Aug 15, 2023 · 21 comments
Labels
enhancement New feature or request

Comments

@johanneskoester
Contributor

johanneskoester commented Aug 15, 2023

Update: Snakemake 8.0 has been released on Dec 20, 2023!
We keep this issue as it outlines the roadmap for some features that have been postponed to Snakemake 8.x.
Important: Read the migration guide when using Snakemake 8.0 for the first time with your workflows from before.

First, the major highlight is a plugin architecture which will enable a more distributed, less centralized development of Snakemake functionality. Plugin support is already done for execution backends and storage. It will come for deployment options (conda/container/...), scheduling algorithms, and probably more. This way, Snakemake opens up towards research (and publications) in all of these areas without requiring people to do this under the umbrella of the Snakemake organization. This work is a collaboration between @vsoch and @johanneskoester, and builds upon some marvelous ideas previously explored by a group of Dutch students (@hwalinga, @Maarten0110, @je3we3, @LvKvA, also see here).

Second, I used this opportunity to redesign and clean up the Snakemake API.

Third, we have introduced a mechanism to define global conda packages for a Snakemake workflow, which will be injected into an overlay environment at runtime via https://github.com/koesterlab/conda-inject. This will let us drop the heavy default dependencies of the Snakemake conda package and make it minimal and lightweight again.

Fourth (in fact based on the above mechanism), the Snakemake language will be separated (in 8.x) from the Snakemake runtime, such that workflows can pin a supported version range of the Snakemake language that can then be loaded for parsing at runtime. In the future, this will to a certain extent avoid the need to update older workflows when language features are deprecated (see below), because the language and the runtime can be updated and deployed independently of each other.
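
To illustrate the idea of pinning a version range, here is a minimal sketch in plain Python of how a runtime could pick a language version that falls inside a pinned range. All names (`parse_version`, `in_range`, `select_language_version`) and the half-open range convention are assumptions made for illustration; the announcement does not specify how the pinning will actually be implemented.

```python
def parse_version(text):
    """Turn a dotted version string such as "8.4" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, minimum, maximum):
    """Return True if minimum <= version < maximum (half-open range, assumed)."""
    return parse_version(minimum) <= parse_version(version) < parse_version(maximum)


def select_language_version(available, minimum, maximum):
    """Pick the newest available version inside the pinned range.

    Hypothetical helper: raises LookupError if nothing matches.
    """
    candidates = [v for v in available if in_range(v, minimum, maximum)]
    if not candidates:
        raise LookupError(f"no language version in [{minimum}, {maximum})")
    return max(candidates, key=parse_version)


# A workflow pinned to the 8.x language would be parsed with the newest 8.x
# language version that is installed, regardless of the runtime version.
chosen = select_language_version(["7.32", "8.0", "8.4", "9.0"], "8.0", "9.0")
```

The point of the sketch is only that the range check is independent of the runtime's own version, which is what allows the two to be updated separately.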

Fifth, the Snakemake language will gain some new helper functions (in 8.x) for collecting, branching, and looking up values from config or sheets. These functions are designed to improve readability, as they allow more semantic code instead of lots of very technical input functions.
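
As a rough illustration of the readability argument, the following plain-Python sketch contrasts a hand-written input function with a small generic helper. The helper `from_config` is hypothetical and is not the Snakemake API; the `config` layout and the `SimpleNamespace` stand-in for wildcards are assumptions as well.

```python
from types import SimpleNamespace

# Assumed config layout, for illustration only.
config = {
    "samples": {
        "A": {"fastq": "data/A.fastq"},
        "B": {"fastq": "data/B.fastq"},
    }
}


def fastq_for_sample(wildcards):
    """A typical 'technical' input function: one per lookup, written by hand."""
    return config["samples"][wildcards.sample]["fastq"]


def from_config(path_template, within):
    """Hypothetical helper: build an input function from a '/'-separated path.

    Placeholders such as {sample} are filled from the wildcards object.
    """
    def resolve(wildcards):
        value = within
        for key in path_template.format(**vars(wildcards)).split("/"):
            value = value[key]
        return value
    return resolve


# Stand-in for the wildcards object that is passed to input functions.
wildcards = SimpleNamespace(sample="A")
by_hand = fastq_for_sample(wildcards)
by_helper = from_config("samples/{sample}/fastq", within=config)(wildcards)
```

Both calls resolve to the same path; the helper version states what is looked up in a single expression instead of a separate function per lookup.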

Finally, we have removed some old functionality that has been superseded by new concepts:

  • subworkflow directive (use module directive instead)
  • version directive (use conda/container directive instead)
  • dynamic output (use checkpoints instead)

These concepts have been marked as deprecated for a long time already, hence we hope that the impact will be minimal.

@johanneskoester johanneskoester added the enhancement New feature or request label Aug 15, 2023
@johanneskoester johanneskoester pinned this issue Aug 15, 2023
@vsmalladi

@johanneskoester for the language will there be a new repo to propose language changes?

@johanneskoester
Contributor Author

Yes, that is the current plan. A new repo and a new package, called snakemake-language. The snakemake package will depend on it, but there will additionally be a new directive to pin a certain version range from within the workflow. I need to see while implementing how much independence between the runtime and the language is achievable.

@hwalinga
Contributor

Thanks for the shoutout. It's great to see Snakemake still improving its flexibility beyond its current "feature completeness".

For anybody interested, our analysis and ideas can still be read here:

https://desosa2022.netlify.app/projects/snakemake/

I also introduced Snakemake at my job, and I realized that Snakemake offers more than just easier scaling and the like. It is like a microservice framework at the process level, where you have to think of your software architecture boundaries not as an HTTP interface, but as a command-line interface.

@cmeesters
Contributor

not sure what this means with respect to executors - is there a devel branch already? (Honestly, I have not seen one.)

@vsoch
Contributor

vsoch commented Sep 12, 2023

> not sure what this means with respect to executors - is there a devel branch already? (Honestly, I have not seen one.)

Not yet, haven't started yet.

@johanneskoester
Contributor Author

See updates above: Most of the executors have been transferred into plugins (github.com/snakemake/snakemake-executor-plugin-*). The plugins are not stable yet, and will be finalized and properly tested as soon as possible.

@vsoch
Contributor

vsoch commented Sep 20, 2023

My ETA for starting is likely early October - I got hit with two major talks and experiments, and I'm operating in full steam mode, at least until the first talk is done and the experiments are run (and I've started on the second one). Apologies to everyone for the delay! On my queue is refactor for the flux executor, and of course, batch.

@mwort

mwort commented Sep 27, 2023

Looking forward to all these amazing new developments! I was wondering if that will also make it easier to construct workflows from pure python, e.g. by subclassing sm objects, as opposed to using the snakemake language? I've found it's one thing to persuade larger (operational) groups to adopt a new python package, but quite another (insurmountable) thing to persuade them to adopt a new language.

@cmeesters
Contributor

> ... and I'm operating in full steam mode, ...

Same here, no apologies needed. I have so many "how can X be done then" questions, that I can figure that adjusting the docs alone is a lot of work.

@BEFH

BEFH commented Sep 30, 2023

Please, please reconsider the deprecation of subworkflows. It would be a major breaking change for most of my complicated pipelines. Subworkflows and modules can and do coexist.

Having the simpler subworkflows allows me to use a pipeline with subworkflows as a module and have all of the subworkflows of that module have the same prefix.

Additionally, I use subworkflows to make single pipelines flexible by including rules or not depending on the config file. This change would make that more difficult or impossible. An example is here:

if qc_type['ancestry']:
    include: 'rules/ancestry.smk'

if qc_type['popstrat']:
    if config['pcair']:
        if not qc_type['relatedness']:
            warnings.warn("PCAiR requires relatedness QC. Enabling.")
            qc_type['relatedness'] = True
        include: 'rules/relatedness.smk'
        relatedness_included = True
    include: 'rules/stratification.smk'

if qc_type['relatedness'] and not relatedness_included:
    include: 'rules/relatedness.smk'
    relatedness_included = True

It also is problematic to have to rewrite all the code.

Could you consider keeping the subworkflow directive while using modules as a backend in order to maintain compatibility and features while still simplifying the codebase?

@cmeesters
Contributor

@BEFH It is not the functionality that will be deprecated or dropped, but rather the directive, which will be dropped in favour of the module directive only. Your code shows optionally included rules, which is functionality inherent to modularization and is, to my understanding, unaffected by the upcoming changes.
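
The optional-include logic in the example above boils down to deciding, from the configuration, which rule files to include, each exactly once. A minimal plain-Python sketch of that decision follows; the function name `rule_files_to_include` is hypothetical, and `qc_type` and `config` are assumed to be dictionaries shaped like those in the example. Tracking the selected files in a list also removes the need for a separate `relatedness_included` flag.

```python
import warnings


def rule_files_to_include(qc_type, config):
    """Return the rule files to include, each at most once, in include order.

    Hypothetical helper mirroring the conditional includes shown above.
    """
    qc_type = dict(qc_type)  # work on a copy; the caller's dict is not modified
    files = []

    def add(path):
        # Skip files that were already selected by an earlier condition.
        if path not in files:
            files.append(path)

    if qc_type.get("ancestry"):
        add("rules/ancestry.smk")

    if qc_type.get("popstrat"):
        if config.get("pcair"):
            if not qc_type.get("relatedness"):
                warnings.warn("PCAiR requires relatedness QC. Enabling.")
                qc_type["relatedness"] = True
            add("rules/relatedness.smk")
        add("rules/stratification.smk")

    if qc_type.get("relatedness"):
        add("rules/relatedness.smk")

    return files
```

Because the function only computes a list of paths, it can be unit-tested without running a workflow.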

@BEFH

BEFH commented Oct 5, 2023 via email

@johanneskoester
Contributor Author

johanneskoester commented Nov 8, 2023

It is, and I am sorry if it causes trouble for you. But subworkflows have been marked as deprecated for a long time. Indeed, some workflows would need refactoring, but in general the module system is a superset of subworkflows from a functional perspective, and it does not make sense to maintain two equivalent functionalities (we have already done so for far more than a year). I am happy to receive an email and schedule a meeting in case you have specific issues with it or features that I missed. Also note that the module system had some bugs causing problems with rule names in nested scenarios, but as far as I know they are all resolved in the upcoming Snakemake 8. If not, I am of course eager to fix them. For example, I am happy to take PRs against the main branch (which is already Snakemake 8) containing test cases where the module system currently still fails. I would like to fix all of them before the release of Snakemake 8.

@johanneskoester
Contributor Author

@BEFH wait, I just checked your example. This is actually not subworkflows, but the include directive. That one will not be removed at all. Of course it stays and everybody uses it a lot! With subworkflows I was referring to this: https://snakemake.readthedocs.io/en/stable/snakefiles/modularization.html#snakefiles-sub-workflows

@BEFH

BEFH commented Nov 8, 2023 via email

@johanneskoester
Contributor Author

Indeed!

@vsoch
Contributor

vsoch commented Nov 23, 2023

hey folks! For those previously using the Google Life Sciences executor (or with interest in using Google Batch) development is underway, and I think we are at a point where any interested snakemake developers can come in and start playing around and contributing to development. To be clear - this is not ready for any kind of testing - we are still fairly early on, but I wanted to be inclusive in this process and get more eyes on the work. There have been a lot of changes (and a lot of moving pieces).

The current branch I'm working from is this PR: snakemake/snakemake-executor-plugin-googlebatch#10 and the issue that I'm running into locally (primarily because I'm inexperienced with using snakemake, which is funny since I've developed a lot) is getting this basic hello world example to work: https://github.com/snakemake/snakemake-executor-plugin-googlebatch/tree/main/example/hello-world.

I had it working when I implemented a custom upload strategy for the workflow artifacts (akin to the life sciences executor) but @johanneskoester added a new feature to snakemake for it to handle these bundles snakemake/snakemake-executor-plugin-googlebatch#9 and I haven't gotten a dummy example of the previously working workflow running yet with this new setup. Any help would be awesome! I have credentials I can use to develop and test locally (but can't put into the CI). If someone (or someones!) wants to tag team with me on this, that would be super fun <3

I've posted this message to a GLS issue, and apologies for the redundancy. I think we can make faster progress working together / with many eyes! Happy Thanksgiving! 🥧

@metadatadriven

Is there any estimate on the timescale for v8? We are looking into possibility of integrating with databricks storage and looks like a storage plugin is the future rather than the remote function calls?

@johanneskoester
Contributor Author

> Is there any estimate on the timescale for v8? We are looking into possibility of integrating with databricks storage and looks like a storage plugin is the future rather than the remote function calls?

Awesome! I hope to release an alpha version in the coming days. The final release still has to wait for adding back all old remote providers as plugins and for finalizing the google and azure executors.

@jlumpe

jlumpe commented Dec 14, 2023

I am really looking forward to the API redesign. I use Snakemake heavily at my job and have been developing an automated testing framework of sorts. Currently this involves a somewhat hacky process to "sneak" information about the workflow out of Snakemake by using a special rule to dump it to a JSON file. This is then used in testing code to reconstruct a representation of the workflow and rules using my own classes, because to my knowledge their Snakemake counterparts aren't fully documented or available outside of a running workflow.

I am hoping the new API and plugin system will make it easier for external code to read and inspect the data defined in a Snakefile.
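
For readers unfamiliar with this kind of setup, the test-side half of such a workaround might look roughly like the sketch below: JSON written by a special rule is parsed back into plain classes that tests can query. The JSON layout, the `RuleInfo` class, and the helper names are all assumptions made for illustration; they are neither the format described in the comment above nor anything Snakemake itself emits.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RuleInfo:
    """Test-side representation of one rule (hypothetical fields)."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def load_rules(json_text):
    """Parse the dumped JSON into a dict mapping rule name to RuleInfo."""
    data = json.loads(json_text)
    return {
        entry["name"]: RuleInfo(
            name=entry["name"],
            inputs=entry.get("input", []),
            outputs=entry.get("output", []),
        )
        for entry in data["rules"]
    }


def rules_producing(rules, path):
    """Return the sorted names of rules that list `path` among their outputs."""
    return sorted(name for name, rule in rules.items() if path in rule.outputs)


# Example dump, as a special rule in the workflow might write it (assumed layout).
EXAMPLE_DUMP = json.dumps({
    "rules": [
        {"name": "align", "input": ["reads.fastq"], "output": ["aligned.bam"]},
        {"name": "call", "input": ["aligned.bam"], "output": ["calls.vcf"]},
    ]
})

rules = load_rules(EXAMPLE_DUMP)
```

A test could then assert, for instance, that `rules_producing(rules, "calls.vcf")` names exactly the rule expected to create that file.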

@johanneskoester johanneskoester changed the title Announcement: Upcoming Snakemake 8.0, including the removal of deprecated syntax Announcement: Snakemake 8.0 released, including the removal of deprecated syntax Dec 20, 2023
@BEFH

BEFH commented Dec 21, 2023

Congrats! @johanneskoester, now that 8.0 is released, could you please take a look at #2515?


9 participants