Multiple empty hits? #164
Comments
When you have a huge database like "nt", minimap2 will split it into multiple parts and align all queries against each part independently. For most parts, minimap2 will print unmapped records. This is a limitation of minimap2. I think @hasindu2008 has a preliminary solution to this issue. See #141.
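As a stopgap, the redundant records described above can be collapsed after the run. The following is only a minimal sketch, not part of minimap2: the function name `collapse_unmapped` is hypothetical, and it assumes single-end reads, tab-separated SAM records, and that the unmapped state is signalled by bit 0x4 of the FLAG column. It keeps every mapped record and at most one unmapped record per read, and only when the read was not mapped in any part of the index.

```python
def collapse_unmapped(sam_lines):
    """Drop redundant unmapped records from SAM produced with a multi-part index.

    Hypothetical helper (not part of minimap2). Assumes single-end reads.
    """
    lines = [ln.rstrip("\n") for ln in sam_lines if ln.strip()]

    # Pass 1: collect the names of reads that have at least one mapped record.
    mapped = set()
    for ln in lines:
        if ln.startswith("@"):  # header line
            continue
        fields = ln.split("\t")
        if (int(fields[1]) & 0x4) == 0:  # FLAG bit 0x4 unset -> mapped
            mapped.add(fields[0])

    # Pass 2: keep headers and mapped records; keep one unmapped record
    # only for reads that never mapped anywhere.
    seen_unmapped = set()
    kept = []
    for ln in lines:
        if ln.startswith("@"):
            kept.append(ln)
            continue
        fields = ln.split("\t")
        if int(fields[1]) & 0x4:
            name = fields[0]
            if name in mapped or name in seen_unmapped:
                continue
            seen_unmapped.add(name)
        kept.append(ln)
    return kept


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6",
        "r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
        "r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
        "r2\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\t*",
    ]
    for record in collapse_unmapped(example):
        print(record)
```

The sketch holds the whole file in memory for simplicity; for output from a database the size of nt, reading the file twice from disk would be the more practical variant of the same idea.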
Thanks Heng. I specifically gave the job 4x as much RAM as the size of nt (uncompressed), so technically there is no reason for it to split the index. Is the creation of a multi-part index (in memory) turned on by default?
Yes, multi-part is turned on by default. You can increase option
You can try the merging solution I implemented from the following method. It is still preliminary and only works for single-end reads at the moment. It basically merges the results from a multi-part index to produce output considerably similar to that of a single-part index. I have only tested it on some simulated PacBio reads. Make sure that you use the option --multi-prefix to enable merging.
The latest minimap2 has a --split-prefix option:
minimap2 -a --split-prefix tmp ref.fa query.fa
With this option, minimap2 won't write an unmapped record multiple times.
Hello
I'm using minimap2 to map against NCBI nt. The SAM output contains multiple entries for the same query that are basically "no hit", i.e. column 3 is a *.
The last read I checked had 27 such entries (the run hadn't finished) which were identical.
Is this a feature? What am I missing? Why would minimap2 output multiple "no hit" lines for the same read?
I have checked and the input only has one copy of this read.
It's not unique to this read; it happens to all of them.
Cheers
Mick
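To check how widespread the repeated "no hit" lines are, one could tally them per read. This is a rough sketch under stated assumptions: the function name `count_unmapped` is hypothetical, and it treats a record as "no hit" when column 3 (the reference name) is `*`, as described in the post above.

```python
from collections import Counter


def count_unmapped(sam_lines):
    """Count "no hit" records (column 3 == "*") per query name.

    Hypothetical helper for inspecting a SAM file; header lines are skipped.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[2] == "*":
            counts[fields[0]] += 1
    return counts


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6",
        "r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
        "r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
        "r2\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\t*",
    ]
    # Report reads that have more than one "no hit" record.
    for name, n in count_unmapped(example).items():
        if n > 1:
            print(name, n)
```

With a real file, the function can be fed an open file handle directly, for example `count_unmapped(open("out.sam"))`, where `out.sam` is a placeholder name.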