clipoverlap on raw output straight from bwa #51

Closed
bjkelly opened this issue Aug 31, 2018 · 3 comments

Comments

bjkelly commented Aug 31, 2018

A recent update broke the primary way we use clipoverlap: --readname now enforces alphanumeric sorting of read names. BWA's raw SAM output is read-name grouped, in that the two reads of a pair (which share a name) are adjacent, but the pairs appear in the order they were read from the FASTQ files, not in alphanumeric order.
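To make the distinction concrete, here is a minimal Python sketch of the two ordering conditions. It is not bamUtil's validation code: the function names are hypothetical, and plain string comparison stands in for whatever comparison the tool actually applies.

```python
def names_are_grouped(names):
    """True if every read name occupies one contiguous run (mates are adjacent)."""
    seen = set()
    previous = None
    for name in names:
        if name != previous:
            if name in seen:
                # The name reappears after a different name: mates are separated.
                return False
            seen.add(name)
            previous = name
    return True


def names_are_sorted(names):
    """True if names are in non-decreasing order (plain string comparison, an assumption)."""
    return all(a <= b for a, b in zip(names, names[1:]))


# Hypothetical read names in FASTQ input order, as raw aligner output would list them.
bwa_like = ["r3", "r3", "r1", "r1", "r2", "r2"]

grouped = names_are_grouped(bwa_like)   # mates adjacent
in_order = names_are_sorted(bwa_like)   # but not alphanumerically sorted
```

Output grouped like `bwa_like` satisfies the first condition but fails the second, which is the case described above.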

The issue is the order of operations and how the commands are piped together. With the latest versions, we have to align, then coordinate sort, then run clipoverlap and mark completely clipped reads as unmapped, which then essentially ruins the proper coordinate-sorted order. What we'd like to do, and what makes the most sense computationally, is to align, pipe that raw output into clipoverlap while marking completely clipped reads as unmapped, then coordinate sort and index.

Would it be possible for the --readname option to drop its alphanumeric enforcement and go back to requiring only that the reads of a pair be adjacent to each other? Or to add a --bwasam option for streaming straight from BWA?

Thanks!
Ben

Contributor

mktrost commented Sep 3, 2018

I added an option to turn off the ReadName validation. That fix is in the current master branch.
It was fixed in commit: 9d3eec0
If you update to the current version and specify "--noRNValidate", it should no longer enforce alphanumeric sorting. Alternatively, you can modify bamUtil/src/ClipOverlap.cpp and remove the call to "samIn.setSortedValidation(SamFile::QUERY_NAME);"

I hope that fixes your issue. Let me know if you have any questions.
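For reference, the streaming order of operations requested in this issue could look roughly like the sketch below. It only builds the command string and does not run anything. The file names are placeholders, and apart from "--noRNValidate" the clipOverlap arguments ("--in -", "--out -", "--readName", "--unmapped") are assumptions written from memory, so check them against the tool's own usage output before relying on them.

```python
import shlex


def build_pipeline(ref, fq1, fq2, out_bam):
    """Return a shell command string: align, clip overlaps, coordinate sort, index."""
    align = ["bwa", "mem", ref, fq1, fq2]
    # Assumed flags: "-" for stdin/stdout, --unmapped to mark fully clipped reads
    # as unmapped. --noRNValidate is the option described in this thread.
    clip = ["bam", "clipOverlap", "--in", "-", "--out", "-",
            "--readName", "--unmapped", "--noRNValidate"]
    sort = ["samtools", "sort", "-o", out_bam, "-"]
    index = ["samtools", "index", out_bam]
    stream = " | ".join(shlex.join(cmd) for cmd in (align, clip, sort))
    return f"{stream} && {shlex.join(index)}"


# Placeholder file names, for illustration only.
command = build_pipeline("ref.fa", "r1.fq", "r2.fq", "out.bam")
```

Clipping before the coordinate sort means reads that become unmapped are placed correctly by the sort, rather than disturbing an already sorted file.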

Author

bjkelly commented Sep 4, 2018

Thank you!

We'll update, test it out, and follow up with how it's working.

Ben

Author

bjkelly commented Sep 5, 2018

Your update seems to be working as advertised. Thank you!

bjkelly closed this as completed Sep 5, 2018