Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove of sort pipe in bwa wrappers #216

Closed
fgvieira opened this issue Oct 11, 2020 · 9 comments
Closed

Remove of sort pipe in bwa wrappers #216

fgvieira opened this issue Oct 11, 2020 · 9 comments
Labels
enhancement New feature or request

Comments

@fgvieira
Copy link
Collaborator

fgvieira commented Oct 11, 2020

Is your feature request related to a problem? Please describe.
Several bwa wrappers include a pipe to convert to bam or to sort:
bwa/mem
bwa/mem-samblaster
bwa/samse
bwa/sampe
bwa-mem2/mem
bwa-mem2/mem-samblaster

I believe this is not needed, does not make sense and is, sometimes, inconvenient.

Not needed since the pipe can be easily implemented in snakemake with other rules/wrappers. Since these are actually wrappers of two wrappers, shouldn't they actually be meta wrappers? And inconvenient when you actually want a sam file (and not a bam).

Describe the solution you'd like
Remove the optional sorting from all bwa wrappers and have them output a sam file. Optionally, they could be re-implemented as meta-wrappers.
The same would apply to bwa/mem-samblaster and bwa-mem2/mem-samblaster.

@fgvieira fgvieira added the enhancement New feature or request label Oct 11, 2020
@moldach
Copy link

moldach commented Oct 15, 2020

I've just ran into this very same issue.

It's a pain if the user wants to use alternative tools for sorting (i.e. sambamba wrapper) in their pipeline (or different versions of samtools)

@jafors
Copy link
Contributor

jafors commented Oct 15, 2020

Thanks for the input! I guess this is sth. worthy to discuss. At the moment, the bwa-mem wrapper will output an unsorted bam file when setting sort="none", using samtools view. I could imagine adding a param to not do this and directly output the resulting sam file, since I (from my side) would like to avoid a breaking change. This should allow using the wrapper both without any use of samtools and as before.

Edit: Similar for the other wrappers you mentioned.

@johanneskoester
Copy link
Contributor

johanneskoester commented Oct 15, 2020

Conversion to the optimal format (here BAM, in other cases VCF) should be part of the wrapper, because it comes almost for free, and makes workflows easier to read (hiding an unimportant technical detail). However, I agree that sorting does not need to be part of the BWA wrapper.

It should be possible to deactivate the conversion though, simply by specifying .sam instead of .bam as output.

@fgvieira
Copy link
Collaborator Author

@johanneskoester I agree that sometimes it is nice to hide unimportant technical detail, but isn't that why there are the meta wrappers? I might be missing something here but, as I understand it, the idea would be that main wrappers only do one thing (i.e. they are wrappers to a single tool/command). When it is nice to hide those technical details, then we'd have the meta-wrappers that are exactly that, a wrapper of wrappers.

In this case, I think it would make more sense to have the bwa-mem wrapper output a sam file and then have meta-wrappers called (e.g.) meta/bio/bwa/mem/bam or meta/bio/bwa/mem/sort/samtools or meta/bio/bwa/mem/sort/picard.

But I'd be happy just with the possibility of getting a sam file from bwa-mem 😄

@johanneskoester
Copy link
Contributor

Meta-Wrappers still show as more than one rule in a workflow. However, format conversion is so trivial that I think it in general rather just hampers clarity of a workflow. We do not even need an option for it. Just by specifying a .sam output file, the wrapper can switch to non-samtools-piping behavior.

In a way, those format conversion functionality just fixes usability issues of the respective tools.

@johanneskoester
Copy link
Contributor

@fgvieira do you want to clean up the bwa wrapper family accordingly in a PR?

@fgvieira
Copy link
Collaborator Author

Sure, I can change the bwa wrapper family to output sam if the extension of the output file is .sam in a PR.

Unless @jafors prefers to do it himself...

@fgvieira
Copy link
Collaborator Author

I've recently made a PR (#186) with a new combined bwa sam(pe/se) wrapper that automatically infers if the input is single-end or paired-end (basically combines bwa sampe and bwa samse into a single wrapper).

As an example, I've changed this wrapper so that the output file format is inferred from the output file extension, including when the output is sorted (both by picard and samtools). That means that we can get both sam or bam files for unsorted, samtools sorted and picard sorted.

If everybody agrees, I can do the same for the rest of the bwa wrapper family...

PS - that PR has a bunch of other wrappers, since I kinda got carried away.. 😄

@fgvieira
Copy link
Collaborator Author

Issue has been fixed on PR #186

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants