FR: give some control over minimap2 behaviour | write both (un)aligned BAMs #196

Closed
sklages opened this issue May 24, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@sklages

sklages commented May 24, 2023

With versions prior to 0.3.0 we ran basecalling (with modified bases) separately from mapping and methylation calling.
On the one hand this was because dorado didn't support it yet, and on the other hand it gave us some flexibility.
E.g. we have far more CPU servers (for all non-basecalling work) than GPU servers, so we could use our resources more efficiently.

Now that dorado is capable of mapping "on-the-fly" without much loss in overall speed, I'd like to propose two enhancements that would make it more flexible:

  • give some control over minimap2
    In contrast to bonito, dorado now keeps "supplementary alignments", but it also emits "secondary alignments", which are of no use to some people. So I'd like to have control over minimap2's --secondary=yes|no parameter. Other basic parameters may be of interest as well.
  • optionally output unaligned BAM along with the mapping
    In case I'd like to re-map my data to something like hg40, I don't want to have to convert my aligned BAM to unaligned BAM. Ideally I could direct dorado to also write unaligned BAM data (just as it would if we omitted --reference) along with the aligned BAM.

What do you think?
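For readers unsure how the two alignment classes differ: the SAM specification marks secondary alignments with FLAG bit 0x100 and supplementary alignments with bit 0x800. The following is a minimal sketch, not part of dorado, of how secondary records could be dropped after the fact from SAM text while supplementary ones are kept. The function names are hypothetical and chosen only for illustration.

```python
# FLAG bits as defined in the SAM specification.
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100       # secondary alignment (not the primary line)
FLAG_SUPPLEMENTARY = 0x800   # supplementary (split/chimeric) alignment


def classify(flag: int) -> str:
    """Name the alignment class encoded in a SAM FLAG value."""
    if flag & FLAG_UNMAPPED:
        return "unmapped"
    if flag & FLAG_SECONDARY:
        return "secondary"
    if flag & FLAG_SUPPLEMENTARY:
        return "supplementary"
    return "primary"


def drop_secondary(sam_lines):
    """Yield header lines and every record without the secondary bit set."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if not flag & FLAG_SECONDARY:
            yield line
```

On a BAM file the same filter is usually applied with `samtools view -F 0x100`, which excludes records that have the secondary bit set.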

@tijyojwad
Collaborator

Hi! Thanks for your feedback and request.

give some control over minimap2

This is a very reasonable ask. We would like to expose more options without having to re-implement all mm2 options. If you could share the subset you're interested in, that would be great.

optionally output unaligned BAM along with the mapping

This one I'll need to discuss with our team. Writing to two outputs is possible but not ideal, since it may interfere with overall basecalling speed. Perhaps in your use case generating FASTQ/unaligned BAMs from dorado is better, so you can map them against any reference?

tijyojwad added the enhancement label May 25, 2023
@iiSeymour
Member

iiSeymour commented May 25, 2023

@sklages dorado aligner allows you to remap previously aligned calls and avoids conversion.

$ dorado aligner hg40.mmi mapped.bam > hg40.mapped.bam 
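If the remapping has to be scripted, for example from a pipeline, the command above can be wrapped as in the following sketch. It assumes only the `dorado aligner <index> <reads>` form shown in this comment, with the BAM written to stdout. The helper names and file names are hypothetical.

```python
import subprocess


def aligner_cmd(index: str, reads: str) -> list[str]:
    """Build the argument list for `dorado aligner <index> <reads>`."""
    return ["dorado", "aligner", index, reads]


def remap(index: str, reads: str, out_bam: str) -> None:
    """Run dorado aligner and write its stdout to out_bam (like `> out.bam`)."""
    with open(out_bam, "wb") as out:
        # check=True raises CalledProcessError if dorado exits non-zero.
        subprocess.run(aligner_cmd(index, reads), stdout=out, check=True)


# Example call (requires dorado on PATH; file names are placeholders):
# remap("hg40.mmi", "mapped.bam", "hg40.mapped.bam")
```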

@sklages
Author

sklages commented May 25, 2023

@iiSeymour - cool, I wasn't aware of that. dorado aligner --help isn't very precise here: reads just requires any "HTS format", and I didn't think of alignments :-) That is the best solution.

@tijyojwad - my second point is moot now that I know dorado aligner accepts aligned BAM as a source of reads.
As for the parameters, just a few are interesting (for me) when run as part of a dorado basecalling job:

--secondary=yes|no
-N INT	Output at most INT secondary alignments [5].

Maybe -K ..

.. at least from my current perspective :-)

@wilsonte-umich

I'd also like more minimap2 control. In addition to --secondary, I routinely adjust the -r (bandwidth) option.
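For reference, the options requested in this thread (--secondary, -N, -K, -r) are all standalone minimap2 flags. The sketch below shows how they fit together on a minimap2 command line when mapping is run separately from basecalling. The function name, the `map-ont` preset choice, and the file names are assumptions for illustration, and the option names that dorado itself exposes may differ.

```python
def minimap2_cmd(index, reads, secondary=True, max_secondary=None,
                 batch=None, bandwidth=None, preset="map-ont"):
    """Build a minimap2 argument list; options left as None are omitted."""
    cmd = ["minimap2", "-a", "-x", preset]  # -a: SAM output, -x: preset
    cmd.append("--secondary=" + ("yes" if secondary else "no"))
    if max_secondary is not None:
        cmd += ["-N", str(max_secondary)]   # at most N secondary alignments
    if batch is not None:
        cmd += ["-K", str(batch)]           # minibatch size, e.g. "500M"
    if bandwidth is not None:
        cmd += ["-r", str(bandwidth)]       # chaining/alignment bandwidth
    return cmd + [index, reads]
```

Calling `minimap2_cmd("ref.mmi", "reads.fastq", secondary=False, bandwidth=2000)` yields an argument list that can be passed to `subprocess.run`.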

@iiSeymour
Member

@sklages @wilsonte-umich we've added these options to dorado aligner in v0.4.0.


4 participants