Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ultra long reads mapping to 1 nt #43

Closed
noncodo opened this issue Oct 25, 2017 · 5 comments
Closed

ultra long reads mapping to 1 nt #43

noncodo opened this issue Oct 25, 2017 · 5 comments
Labels

Comments

@noncodo
Copy link

noncodo commented Oct 25, 2017

I've mapped some ONT reads using -k15 and discovered that some of the (very long) reads don't align well (see below). minimap2 aligns them to the right place, but with a score of 'inf' and only at one nucleotide (!). Also, the read positions start and stop at 0 (just like when the reads don't map).
Manual inspection (i.e. BLAST) suggests that these reads should map to the chromosomes (albeit at around 82% identity). Other ultra long reads form the same run with similar base calling scores map correctly.
Is this normal/expected behaviour? I tried changing kmer and chaining parameters to no avail.

Sample output:
htsbox samview -p minimap2.srtd.bam | grep inf
9cb8f5d7-ab9c-48ea-88bd-2477e02d7b60 702394 0 0 - chr20 64444167 17799575 17799575 inf 60 mm:125885;ins:0,0;del:0,0;score:718346
eb5d9390-96a6-42ce-ae87-b99e649e6d80 658312 658312 658312 + chr4 190214555 94057298 94057298 inf 60 mm:115078;ins:0,0;del:0,0;score:695835

@lh3
Copy link
Owner

lh3 commented Oct 25, 2017

What htsbox version you are using and what tool you used to sort the bam? For 700kb reads, minimap2 is likely to produce a cigar string with >65535 operators. In this case, the cigar is written to a tag. It seems that the tag has been lost during processing.

Long cigar is something I am actively working on to convince others. For now, please use precompiled samtools-1.6 here. It works with long cigar. Let me know if that works.

@lh3 lh3 added the question label Oct 25, 2017
@lh3
Copy link
Owner

lh3 commented Oct 25, 2017

Download link to the modified samtools-1.6:

http://lh3lh3.users.sourceforge.net/data/cigar-64k/samtools-1.6+CG_x64-linux.tar.bz2

@noncodo
Copy link
Author

noncodo commented Oct 25, 2017

Upgrading my rusty apps solved it, thanks! (forgot to mention I used minimap2's -L option)
FYI:
9cb8f5d7-ab9c-48ea-88bd-2477e02d7b60 702394 12 702393 - chr20 64444167 17799575 18516354 626055 751935 60 mm:i:41170 io:i:22564 in:i:35156 do:i:27595 dn:i:49554 AS:i:718429

@lh3
Copy link
Owner

lh3 commented Oct 25, 2017

For the time being, only the latest htsbox and the samtools at the above link fully work with extra long reads. For older tools, sorting, merging and flagstat all work, but depth/coverage and retrieval by region will silently ignore these extra long reads.

I am actively pushing the same solution to all tools such that they all work properly.

@lh3
Copy link
Owner

lh3 commented Oct 25, 2017

Forgot to mention: adding -L or not has no effect to the latest htsbox and the modified samtools. -L is mainly meant for old tools that can't work with extra long reads.

@lh3 lh3 closed this as completed Oct 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants