properly handle pairing information in surjection #25
Comments
I encountered this issue today trying to use vg to align to the HGVM bake-off graphs. From looking at the SAM spec, it seems that it may be legal to set RNEXT to the next read's name while leaving PNEXT as 0 ("unavailable"), since it's potentially expensive to find the other alignment and work out where it mapped.
That seems reasonable. We may need to make a second pass to resolve things. On Fri, Aug 7, 2015 at 8:01 PM, adamnovak notifications@github.com wrote:
Sorry, what I proposed is completely wrong, because I don't understand SAM. In SAM, the QNAME field specifies the name of the fragment, not of the read. So the two ends of a paired-end read are linked together by sharing a QNAME, and there's no need to actually fill in the RNEXT and PNEXT fields unless you want to be efficient. RNEXT doesn't hold the name of the next read on the fragment, but rather the reference contig against which the next read on the fragment was aligned. If it's the same as the reference for the current read, it can be "=". It looks like what really needs to happen is that ... Interestingly, right now paired-end information isn't read from BAM, and the "/1" and "/2" aren't added, so if you throw in a paired-end BAM you get multiple VG alignments with the same name.
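To make the naming point concrete, here is a minimal sketch (not vg's code) of moving between a shared SAM QNAME and distinct per-read names. The flag bits 0x1, 0x40 and 0x80 are taken from the SAM spec; the function names `read_name_with_suffix` and `fragment_name` are hypothetical, and the "/1" and "/2" suffix convention is the one mentioned above.

```python
# SAM FLAG bits, as defined in the SAM specification.
FLAG_PAIRED = 0x1   # template has multiple segments
FLAG_READ1 = 0x40   # first segment in the template
FLAG_READ2 = 0x80   # last segment in the template


def read_name_with_suffix(qname, flag):
    """Hypothetical helper: turn a shared QNAME into a per-read name.

    Paired reads get "/1" or "/2" appended so the two ends no longer
    collide; unpaired or ambiguous records keep the QNAME unchanged.
    """
    if not flag & FLAG_PAIRED:
        return qname
    is_first = bool(flag & FLAG_READ1)
    is_second = bool(flag & FLAG_READ2)
    if is_first and not is_second:
        return qname + "/1"
    if is_second and not is_first:
        return qname + "/2"
    return qname  # both or neither bit set: position in template unknown


def fragment_name(read_name):
    """Hypothetical inverse: strip "/1" or "/2" to recover the fragment name."""
    if read_name.endswith("/1") or read_name.endswith("/2"):
        return read_name[:-2]
    return read_name
```

Going in the other direction (writing SAM/BAM), both ends would be given `fragment_name(...)` as their QNAME, with the 0x40 or 0x80 bit set to say which end each record is.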
@adamnovak you haven't addressed this, have you? Is it still a problem on your end?
I don't think I actually need to surject paired reads for anything I am doing. @glennhickey may want it if he wants to run a read-pair-aware BAM-based variant caller as a control. I haven't checked recently, but I think the handling of the BAM fragment names (not giving paired reads the same fragment name) is still wrong. With the right fragment names you can go through and reconstruct the mate-finding fields, but without them you have a BAM file that doesn't express pairing at all.
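As a rough illustration of reconstructing the mate-finding fields from shared fragment names, the sketch below groups records by QNAME and fills in RNEXT and PNEXT. It is only a sketch under assumptions: records are plain dicts with hypothetical keys `qname`, `flag`, `rname` and `pos`, each fragment has at most one record per end (no secondary or supplementary alignments), and anything that cannot be resolved is marked unavailable ("*" and 0). It is not how vg surject does or should do it.

```python
from collections import defaultdict

FLAG_READ1 = 0x40  # first segment in the template (SAM spec)
FLAG_READ2 = 0x80  # last segment in the template (SAM spec)


def fill_mate_fields(records):
    """Set 'rnext' and 'pnext' on each record dict, pairing by shared QNAME."""
    by_fragment = defaultdict(list)
    for rec in records:
        by_fragment[rec["qname"]].append(rec)

    for group in by_fragment.values():
        firsts = [r for r in group if r["flag"] & FLAG_READ1]
        seconds = [r for r in group if r["flag"] & FLAG_READ2]
        # Only handle the simple case: exactly one distinct record per end.
        if len(firsts) != 1 or len(seconds) != 1 or firsts[0] is seconds[0]:
            for rec in group:
                rec["rnext"], rec["pnext"] = "*", 0
            continue
        first, second = firsts[0], seconds[0]
        for rec, mate in ((first, second), (second, first)):
            if mate["rname"] == "*":
                # Mate has no reference contig: mark as unavailable.
                rec["rnext"], rec["pnext"] = "*", 0
            else:
                # "=" means the mate is on the same contig as this record.
                same = mate["rname"] == rec["rname"]
                rec["rnext"] = "=" if same else mate["rname"]
                rec["pnext"] = mate["pos"]
    return records
```

This is the "second pass" idea: a single scan to group by fragment name, then a pass over each group. For a large, unsorted file the grouping would need to be done out of core or on name-sorted input.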
I think the problem is how the reads are pulled out of the bam. The
We now do this.
This is now included in the GAM stream but not handled by surject.