Description
1. What were you trying to do?
I have a diploid graph which contains the MAT and PAT haplotypes of a cell line along with CHM13. I'm trying to surject all graph placements to the haplotypes of the cell line in order to convert alignments from graph space to linear space.
2. What actually happened?
I observe that whenever the nodes a read is aligned to in the graph fall in a segdup region, so that the same nodes cover two different intervals of the same path, vg surject with default options only surjects to one of the two intervals. The alignment to the other interval is only produced when I turn on the --supplementary option.
So I get surjections that look like this without --supplementary
S32_9020 16 KOLF2.1J#2#chr3_2 61483584 60 AS:i:15557
S32_9020 272 KOLF2.1J#1#chr3_1 61480544 60 AS:i:15425
and like this with --supplementary
S32_9020 16 KOLF2.1J#2#chr3_2 61483584 60 AS:i:15557
S32_9020 2064 KOLF2.1J#2#chr3_2 62260247 60 AS:i:15557
S32_9020 272 KOLF2.1J#1#chr3_1 61480544 60 AS:i:15425
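For reference, here is a minimal sketch of how the FLAG values in the records above decode, assuming the standard SAM FLAG bit definitions. The helper name `decode_flag` is hypothetical and is not part of vg or samtools.

```python
# SAM FLAG bits as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all FLAG bits set in `flag` (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# The three FLAG values that appear in the surjected records.
for flag in (16, 272, 2064):
    print(flag, decode_flag(flag))
```

So the record at 62260247 (2064 = 2048 + 16) is a reverse-strand supplementary alignment, while the KOLF2.1J#1#chr3_1 record (272 = 256 + 16) is a reverse-strand secondary alignment.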
3. What did you want to happen?
I would like to get multiple surjections to the same path from a single graph placement without turning on the --supplementary option. Right now, a complete read alignment gets tagged as supplementary, alongside chimeric alignments.
5. What data and command can the vg dev team use to make the problem happen?
Command for surject:
vg surject -x kolf2.1j-dg-sample.gbz --into-ref KOLF2.1J --multimap --progress --bam-output S32_9020.gam > S32_9020_without_u.bam
All files present in: /private/groups/patenlab/sagorika/read_simulation/surject_issue
6. What does running vg version say?
vg version v1.69.0-172-g0beef571a "Bologna"