-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
iVAR inappropriately shifts position of reverse mapped reads for ONT data #175
Comments
hey - thank you for providing all the info and files, we'll take a look! |
hi @cmaceves! do you have any timeline on when a fix for this issue might be released? Thank you! |
hey @jessebyoder ! this is in progress now, but I'd estimate 2-3 weeks until a fix is pushed. |
hey @Alan-Collins - I just pushed a fix for this to the branch issue_175. If you wouldn't mind testing it to confirm you're getting the desired output, that would be great! |
Hi @cmaceves, I tested the branch issue_175 but still see the same issue. Did you not see the issue with that branch? Perhaps I have made a mistake in testing? I built the version of ivar on the issue_175 branch by modifying the dockerfile on your repo's main branch to clone the branch you specified instead of master. i.e., line 29 of the dockerfile was changed to I'm still seeing the reverse reads shifted so that they don't line up with the reference correctly. You can see it in the attached IGV screenshot showing the ivar output in the top track and the input bam in the bottom track. I ran the same commands I specified in the original issue using the same files. |
I wonder if the issue might have something to do with this reverse_cigar function? I haven't worked in C++ before to be able to work through the code to figure out exactly what it is doing and why iVAR would need to reverse the CIGAR string. However, I wonder if soft clipping the right end of the read (3' relative to the reference), which adds soft clips to the right end of the CIGAR is then being treated as if the read's left end (5' relative to the reference) had been shifted because the cigar was processed in reverse? |
hey, apologies for the delay, I'm looking into this now! @Alan-Collins |
we've pushed additional fixes to the same branch, if you could test it out that would be great! @Alan-Collins |
Hi Chrissy, Thanks! It looks like the shifting issue is fixed in the latest commit on that branch. However, it doesn't look like ivar is trimming the primer of the right end of reverse mapped reads. Would you prefer I opened a different issue for this behavior? Some description of what I mean (using the same files that I described in the first comment on this issue): Looking at the depth histogram for an amplicon before and after iVAR, you can see that the left of the amplicon, where a primer is found, has been removed, but the right side less so. Looking at the read pileup, it looks like iVAR is trimming both primers from forward reads, but only the left primer from reverse reads. A small number of reads are remaining untrimmed in both directions. |
Hey @Alan-Collins , on a quick glance, it seems like the reverse primers are trimmed from the right of the reverse reads according to the BED file. If you could you please point to specific coordinates or sets of reads where the primers are not trimmed, that would really help us debug this issue. |
My apologies. I didn't look closely enough. There isn't a strand-based lack of trimming. However, I am surprised to see so many reads that still map to primer binding regions. Specifically, after running iVAR, most of the remaining reads still map to a primer region
Is this the expected behavior or shouldn't iVAR be trimming all reads to remove primer sequences? An example: After running iVAR, 3 reads map to the second primer in the file linked in my original post. The primer entry in the BED file is The first 4 columns of the SAM entries for those reads are SRR21789993.85486 0 NC_045512.2 418 i.e., all three reads map to the first position of the primer onwards and contain the whole forward primer. |
Oh no worries! Thank you for going over this with us. In the example with pos 418 you pointed out, the primer actually starts at position 419 (BED file format is 0-based and samtools view is 1-based; please see figure(s) below). For illumina data, we typically use an exact match to trim those reads that start within the primer region. We did however, build a fuzzier version of this approach where we allow for the read to start a few bases before the start of the primer. Please see the example below with Exact match,
Fuzzy match with
|
Great! Thank you for explaining and for the work to resolve this issue! |
When processing ONT reads, iVar shifts the position of the read for reverse mapped reads. This results in incorrect alignment of the reads with the reference assembly. iVar seems to only change the position of reads that are clipped.
Steps to reproduce:
For the below example, ONT reads (SRR21789993) were mapped to a reference (NC_045512.2) using minimap2 version 2.27-r1193 and were then processed with samtools version 1.19.2 and iVar version 1.4.2.
minimap2 -ax map-ont wuhan.fa SRR21789993.fastq.gz
ivar trim -i SRR21789993.sorted.bam -m 30 -q 5 -e -b neb_vss2a.primer.bed -p ivar > ivar.log
The bed file used is here.The issue can be clearly seen in IGV. The blow screenshot shows a region of mapping with the original BAM in the top track and the BAM output by iVar in the bottom track.
After running iVar, reads now map to a region just 3' of the area with coverage in the top track. The reads mapping to this 3' region are all mapped in reverse. Further down, it can be seen that most (but not all) reverse mapped reads both extend further 3' than they should and do not align well with the reference,
Examining BAM entry for some of these mismapped reads indicates that they are now mapping further 3' in the reference and their cigar strings are modified. For example:
Original BAM
samtools view -N <(echo SRR21789993.141467) SRR21789993.sorted.bam| cut -f1,4,6
SRR21789993.141467 12646 53M1I140M1I217M1D112M1D1M1D3M1D14M57S
iVar output BAM
samtools view -N <(echo SRR21789993.141467) ivar.sorted.bam| cut -f1,4,6
SRR21789993.141467 12875 53M1I140M1I173M231S
This read should not have been moved as its CIGAR string begins with "M". However, it was moved to account for added soft clipping on the end of the CIGAR string.
Some reverse mapped reads were not shifted. These seem to be cases where iVar did not introduce any soft clipping. For example:
samtools view -N <(echo SRR21789993.118934) SRR21789993.sorted.bam| cut -f1,4,6
SRR21789993.118934 12633 66S19M7D134M1D32M1I7M2D3M1D93M1D164M1I5M2D73M
samtools view -N <(echo SRR21789993.118934) ivar.sorted.bam | cut -f1,4,6
SRR21789993.118934 12633 66S19M7D134M1D32M1I7M2D3M1D93M1D164M1I5M2D73M
The text was updated successfully, but these errors were encountered: