Possible false negatives in query #5
Comments
Hope the new version works.
Thanks for your help! I'll let you know how things go.
Since commit b1720a1 was only applied to the classic index, can I assume that my compact indexes are correct?
Yes, compacts are built out of classic indexes.
I reran my script and it seems like the results are much closer, but still a bit off. Now, when I query, I get the following
The greater number of matches for
Did you add the I saw Mantis mirrors lexicographically larger k-mers. COBS doesn't by default atm.
Ok, that does indeed fix the problem; I had forgotten to re-enable it during my testing. Thanks for your help! I'll close this issue then!
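The k-mer mirroring mentioned above is commonly called canonicalization: each k-mer is replaced by the lexicographically smaller of itself and its reverse complement, so both strands map to the same key. If one tool does this and the other does not, match counts will differ. Below is a minimal sketch of the idea in Python; the function names are hypothetical and are not taken from COBS or Mantis.

```python
# Illustrative only: these helpers are hypothetical, not part of COBS or Mantis.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of a DNA k-mer (A<->T, C<->G)."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Mirror the k-mer if it is lexicographically larger than its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def canonical_kmers(seq: str, k: int) -> list:
    """Extract all k-mers of seq in canonical form."""
    return [canonical(seq[i:i + k]) for i in range(len(seq) - k + 1)]


# A sequence and its reverse complement yield the same set of canonical k-mers,
# so a query matches regardless of which strand was indexed.
forward = set(canonical_kmers("ACGTT", 3))
reverse = set(canonical_kmers(reverse_complement("ACGTT"), 3))
print(forward == reverse)
```

Without canonicalization, the k-mers of the reverse strand would be treated as different keys and could be reported as misses.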
…ingmann#5) When combining classic indices, the combined rows from each constituent index are written to an output block for each batch. The output block is reused for the next batch. Because rows from the constituent indices are combined with a bitwise OR, the output block must be reset to all 0s before it is reused. Otherwise, previously set bits are carried over to the next batch and false positives accumulate until the end of the batch processing loop.
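The effect described in this commit message can be reproduced with a simplified model. The sketch below is not the COBS implementation; `combine_batches` and its `reset_block` parameter are hypothetical names, and rows are modelled as plain byte strings that are ORed into a reused buffer.

```python
# Simplified model of OR-combining rows into a reused output block.
# Hypothetical names; this is not the actual COBS code.
def combine_batches(batches, block_size, reset_block=True):
    block = bytearray(block_size)  # output block, reused for every batch
    results = []
    for batch in batches:
        if reset_block:
            # Clear the block to all 0s so bits from the previous batch do not leak.
            block[:] = bytes(block_size)
        for row in batch:  # one row per constituent index
            for i, byte in enumerate(row):
                block[i] |= byte
        results.append(bytes(block))
    return results


batches = [
    [b"\x01", b"\x02"],  # batch 0: rows from two constituent indices
    [b"\x04", b"\x00"],  # batch 1
]
print(combine_batches(batches, 1, reset_block=True))   # batch 1 contains only its own bits
print(combine_batches(batches, 1, reset_block=False))  # batch 1 also carries bits from batch 0
```

Because OR can only set bits, a stale block never loses anything; it can only report extra bits, which is why skipping the reset shows up as a growing number of spurious set bits.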
Hi,
I've been using COBS in a pipeline I'm working on, but I've noticed what appear to be false negatives in COBS' querying results.
I've used the following script to build compressed COBS and Mantis indexes for the attached input sequences
When I query with Mantis, I get
Whereas when I query the file with COBS, I get
If I exclude PREFIX2 (the second input file), I get the following result in COBS
So it seems like the addition of extra samples leads to a reduction in the number of reported matches. I observe the same behavior if I construct a classic index as well. I've also done some tests with larger data sets where no matches are reported in cases where Mantis reports several.
Overall, the reported numbers are much lower than those reported by Mantis, so I'm not sure how to interpret these results.
inputs.tar.gz
queries.tar.gz
Please let me know if there's any other info I can provide to help look into this.
Best,
Harun