In a more recent update, the primary way we use clipoverlap was broken by --readname enforcing alphanumeric sorting of read names. BWA's raw SAM output is read-name sorted only in the sense that the two reads of a pair (which of course share the same name) are next to each other, in the order they were read from the fastq files; it is not in alphanumeric sorted order.
The issue is the order of operations and how you pipe commands together. As it stands with the latest versions, we have to align, then coordinate sort, then clipoverlap and mark completely clipped reads as unmapped, which then essentially ruins the proper coordinate-sorted order. What we'd like to do, and what makes the most sense computationally, would be to align, pipe that raw output into clipoverlap while marking completely clipped reads as unmapped, then coordinate sort and index.
Would it be possible for the --readname option to drop its alphanumeric enforcement and go back to requiring only that read pairs be adjacent to each other? Or add a --bwasam option for streaming straight from BWA?
Thanks!
Ben
I added an option to turn off the ReadName validation. That fix is in the current master branch.
It was fixed in commit: 9d3eec0
If you update to the current version, and specify "--noRNValidate" it should not enforce alphanumeric sorting. Alternatively, you can modify bamUtil/src/ClipOverlap.cpp and remove the call to "samIn.setSortedValidation(SamFile::QUERY_NAME);"
I hope that fixes your issue. Let me know if you have any questions.
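For anyone following along, the difference between the two orderings discussed in this thread can be sketched in a few lines of Python. This is only an illustration, not bamUtil's actual validation code: the read names are made up, the function names are hypothetical, and "alphanumeric" is approximated here by plain string comparison, which may not match the exact comparison bamUtil uses.

```python
from itertools import groupby


def mates_adjacent(names):
    """Return True if each read name appears as one contiguous run,
    i.e. both reads of a pair sit next to each other."""
    seen = set()
    for name, _run in groupby(names):
        if name in seen:
            # The same name started a second, separate run,
            # so its records are not adjacent.
            return False
        seen.add(name)
    return True


def alphanumerically_sorted(names):
    """Return True if names are in non-decreasing order.
    Assumption: plain string comparison stands in for the real check."""
    return all(a <= b for a, b in zip(names, names[1:]))


# Hypothetical read names in the order an aligner might emit them
# (same order as the fastq input, mates next to each other).
fastq_order = ["read10", "read10", "read2", "read2", "read1", "read1"]

print(mates_adjacent(fastq_order))           # pairs are adjacent
print(alphanumerically_sorted(fastq_order))  # but not sorted by name
```

Input like `fastq_order` passes the adjacency check but fails the sortedness check, which is the case the original report describes for raw BWA output.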