An option to output SEQ field for secondary alignment #687

Merged on Apr 21, 2023 (5 commits)

mikolmogorov
Contributor

Hi!

I have been using minimap2 as part of the Flye pipeline. I use secondary alignments during consensus/polishing to account for possible duplications in disjointigs and to improve the base quality of long repeats. Storing the SEQ field for secondary alignments in SAM/BAM files makes parsing the alignment file much easier, since each alignment can then be processed independently.

The existing -Y option solves the problem, but it also forces supplementary alignments to use soft clipping, which on some datasets dramatically increases the size of the alignment file. I therefore added a new option, --secondary-seq, that enables SEQ output for secondary alignments and uses hard clipping for both supplementary and secondary alignments.
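
To illustrate why a populated SEQ field simplifies parsing, here is a minimal Python sketch that classifies SAM records by their FLAG bits and reports secondary alignments whose SEQ is `*`. It is not taken from the patch or from Flye: the function names (`summarize_sam_line`, `secondary_without_seq`) and the example records are hypothetical. Only the FLAG bits (0x100 secondary, 0x800 supplementary) and the column layout come from the SAM format.

```python
from typing import Iterable, List, NamedTuple

SECONDARY = 0x100      # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM FLAG bit: supplementary alignment


class AlignmentSummary(NamedTuple):
    query_name: str
    kind: str      # "primary", "secondary", or "supplementary"
    has_seq: bool  # False when the SEQ column is "*"


def summarize_sam_line(line: str) -> AlignmentSummary:
    """Classify one SAM alignment line (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    flag = int(fields[1])
    if flag & SUPPLEMENTARY:
        kind = "supplementary"
    elif flag & SECONDARY:
        kind = "secondary"
    else:
        kind = "primary"
    # SEQ is the 10th column; "*" means the sequence is not stored.
    return AlignmentSummary(fields[0], kind, fields[9] != "*")


def secondary_without_seq(lines: Iterable[str]) -> List[AlignmentSummary]:
    """Return secondary alignments that cannot be processed on their own."""
    result = []
    for line in lines:
        if line.startswith("@"):  # skip header lines
            continue
        summary = summarize_sam_line(line)
        if summary.kind == "secondary" and not summary.has_seq:
            result.append(summary)
    return result


# Made-up records for illustration only (not real minimap2 output).
example_sam = [
    "@HD\tVN:1.6",
    "read1\t0\tctg1\t100\t60\t8M\t*\t0\t0\tACGTACGT\t*",
    "read1\t256\tctg2\t500\t0\t8M\t*\t0\t0\t*\t*",        # SEQ omitted
    "read1\t256\tctg3\t900\t0\t2H6M\t*\t0\t0\tGTACGT\t*",  # SEQ present, hard-clipped
]

if __name__ == "__main__":
    for record in secondary_without_seq(example_sam):
        print(record.query_name, record.kind, "has no SEQ")
```

A record with SEQ present can be consumed on its own. A record with `*` forces the parser to find the matching primary alignment first, which is the extra bookkeeping the new option is meant to avoid.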

The option has been tested extensively as part of the Flye pipeline for almost a year. However, I have not tested it outside of a genome assembly setting.

Best,
Mikhail

@tillea

tillea commented May 5, 2022

It would be really great if this patch could be applied to enable building flye smoothly.

@mikolmogorov
Contributor Author

Hi all,

I will be happy to bring this pull request up to date with master once you confirm that you are interested in adopting it. The conflict is due to new command-line options, so it is an easy fix.

@olechnwin

@fenderglass,
I don't know how pull requests work, but I just posted the same request here.
It would be great if sequences for secondary mappings could be turned on. Thanks!

@cmdcolin

I just saw this PR. I made a program that adds SEQ to secondary alignments as a post-processing step, though this PR would be interesting too! https://github.com/cmdcolin/secondary_rewriter

@lh3 lh3 merged commit 704fbc6 into lh3:master Apr 21, 2023