Why is my abundance estimation value zero for all classifications? How can this be turned on? #158
Comments
I'm also very intrigued by this. I went over the docs and I don't understand how to turn it on. |
Yes, makes sense. I faced the same with nanopore data, minion v9.4.1. |
Hi sir, how can this be resolved? Thank you. |
I have the same issue too. |
I've seen the same thing. One species has >80% of all reads assigned to it, according to the output TSV, yet its abundance is still listed as 0.0, just like every other row. |
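One way to check the mismatch described above is to compute each row's share of uniquely assigned reads directly from the report TSV and compare it with the abundance column. The sketch below assumes the report has tab-separated columns named `name`, `numUniqueReads`, and `abundance`; the sample rows are made up for illustration and are not taken from this thread.

```python
import csv
import io

def unique_read_shares(tsv_text):
    """Return {name: share of uniquely assigned reads} for a report TSV.

    Assumes a header row with at least 'name' and 'numUniqueReads' columns.
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    total = sum(int(row["numUniqueReads"]) for row in rows)
    if total == 0:
        return {row["name"]: 0.0 for row in rows}
    return {row["name"]: int(row["numUniqueReads"]) / total for row in rows}

# Hypothetical report content (illustrative values only).
sample_report = (
    "name\ttaxID\ttaxRank\tgenomeSize\tnumReads\tnumUniqueReads\tabundance\n"
    "Species A\t1001\tspecies\t4000000\t850\t800\t0.0\n"
    "Species B\t1002\tspecies\t5000000\t250\t200\t0.0\n"
)

shares = unique_read_shares(sample_report)
for name, share in sorted(shares.items()):
    print(f"{name}: {share:.2f} of uniquely assigned reads")
```

A large share next to an abundance of 0.0 reproduces the symptom reported here. Note that a read share is not the same quantity as an abundance estimate, which also depends on genome size.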
I have the same issue. I don't know why. |
Can you check whether these are unique assignments or not? Thanks. |
Yes. I am sure there are unique reads here. |
Has anyone been able to resolve this? I am having the same issue with nanopore reads. |
I'm checking on this issue. The abundance estimation is on by default. Are any of the reads assigned at the subspecies (leaf) level? Can you show me a few lines of the report file? Thanks. |
Hi, are there any updates with this issue? I seem to be getting zero abundances for every species. Here are the commands I have been using:
You can download the output here: https://www.dropbox.com/s/a5j415ixyts9lox/Archive.zip?dl=0 |
Thanks for sharing the files. I'll look into this. |
You are using the p_compressed+h+v index, however the seqId column from the output is not in the form of cid|XXX from the compression. I guess the index you are using is actually p+h+v. Could you please check whether the index is correct? |
Hi, thank you for the response. Sorry, I sent you data that I had classified with a custom database built from Bacteria and Archaea genomes. The commands I used to build the database are:
I think that succeeded, because as you can see the kraken report gives reasonable classification. I am now sending new data: https://www.dropbox.com/s/cefkjfz0a4kq1ig/Data.zip?dl=0
Here is the dataset I've been using. It's quite large so I am sending it separately: |
@mourisl Hi, one more thing. I don't know if it can help. But my integration.fastq dataset also works with custom index that I created, so the problem might not be in the indexes. Here are the results of the classification with the custom index: https://www.dropbox.com/s/iojc2br7q17ru1m/integration_custom.zip?dl=0 |
@mourisl Thank You. |
@jmaricb You can use the abundance from kreport directly. For reads with multiple assignments, the count is added to their lowest common ancestor in the taxonomy tree. You can also pass "--no-lca" to kreport, which instead splits the count across the assigned strains, each receiving 1/(number of assignments). |
@mourisl In the report (let's say kreport), this read will be assigned to lowest ancestor of these three tax ids (106654, 470, 2420300), which is Acinetobacter (tax id = 469)? Am I right? Does this mean that only reads that map to single species will be assigned to that species? Thank You. |
@jmaricb Yes, that is the default behavior of kreport. You can use "--no-lca" in centrifuge-kreport to assign a fraction of a read to each species. Note that Centrifuge already assigns a read to its lowest common ancestor if it is assigned to too many species (-k option). |
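The lowest-common-ancestor behavior discussed above can be sketched with a toy taxonomy. This is an illustration only, not Centrifuge's implementation: the `parent` map is a simplified assumption that places the three tax ids from the question directly under Acinetobacter (469), whereas the real NCBI taxonomy may contain intermediate nodes.

```python
def lineage(tax_id, parent):
    """Return the path from tax_id up to the root (a node that is its own parent)."""
    path = [tax_id]
    while parent[tax_id] != tax_id:
        tax_id = parent[tax_id]
        path.append(tax_id)
    return path

def lowest_common_ancestor(tax_ids, parent):
    """Return the deepest node shared by the lineages of all tax_ids."""
    first = lineage(tax_ids[0], parent)
    shared = set(first)
    for tax_id in tax_ids[1:]:
        shared &= set(lineage(tax_id, parent))
    # 'first' runs from leaf to root, so the first shared node is the lowest one.
    for node in first:
        if node in shared:
            return node
    raise ValueError("tax ids do not share a root")

# Toy child -> parent map (assumed for illustration); the root points to itself.
parent = {
    1: 1,
    469: 1,        # Acinetobacter
    470: 469,
    106654: 469,
    2420300: 469,
}

lca = lowest_common_ancestor([106654, 470, 2420300], parent)
print(lca)
```

Under this toy tree, a read hitting all three tax ids is counted at 469, while a read hitting only 470 stays at 470.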
@mourisl May I just know one last thing. How do you calculate count fractions for each species from multiple assignments when you use --no-lca? |
@jmaricb If a read is assigned to 4 species, each of the four species' abundances is increased by 0.25. |
Thank you for your help. |
I am also using a compressed index (p_compressed hosted on the site) with nanopore reads, and am getting an abundance of 0. I am building a custom index of bacteria from refseq to test if the compressed indexes are the problem, but was wondering if there is anything else you would recommend trying? Sample output -
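The fractional counting described in this answer can be sketched as follows. This is a minimal illustration of the arithmetic, not the centrifuge-kreport code; the function name, read ids, and species labels are hypothetical.

```python
from collections import defaultdict

def fractional_counts(read_assignments):
    """Split each read evenly across the distinct taxa it is assigned to.

    read_assignments maps a read id to the list of taxa it hit.
    """
    counts = defaultdict(float)
    for taxa in read_assignments.values():
        distinct = set(taxa)
        if not distinct:
            continue  # unclassified read contributes nothing
        share = 1.0 / len(distinct)
        for taxon in distinct:
            counts[taxon] += share
    return dict(counts)

# Hypothetical assignments: read1 hits four species, read2 hits only one.
assignments = {
    "read1": ["A", "B", "C", "D"],
    "read2": ["A"],
}

counts = fractional_counts(assignments)
for taxon, count in sorted(counts.items()):
    print(taxon, count)
```

Here species A ends up with 1.25 (one full read plus a quarter of the shared read), and B, C, and D each receive 0.25.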
Sample report -
|
I have the same issue. The abundance value is always 0 when I use the latest version of Centrifuge and the h+p+v+c database to analyze nanopore data. Could you help me fix it? Thank you. |
I am having an issue with the abundance estimation: I get an abundance of 0 for every species except one (which has an abundance of 1). In centrifuge_report.txt there are species with high abundance; however, centrifuge_report.tsv shows their abundance as 0. Here are the Centrifuge commands I have been using:
    centrifuge-build -p 24 --conversion-table $REF_SEQ_DIR/accession2taxid_cent.map --taxonomy-tree $REF_SEQ_DIR/nodes.dmp --name-table $REF_SEQ_DIR/names.dmp $DB.fa $DB > $DB.log
    centrifuge -p 24 -x $DB -q in.fq > out.txt
    centrifuge-kreport -x $DB out.txt > centrifuge_report.txt
How can I get proper (non-zero) abundance values? I would appreciate any help. Thank you! |
Hi, I have exactly the same issue, which has not been resolved. The abundance is also zero. |
same issue with the latest Centrifuge. |
I just fixed an issue with estimating average genome sizes, which was also related to the abundance estimation procedure. Could you please try the new version and check whether the abundance values become normal? You don't need to rebuild the index. |
The problem still exists in the current version; only a few columns have an abundance value. |
Unfortunately, still having the same issue. All abundances stay equal to 0.0 and no iteration was performed. |
I can reproduce the zero abundance issue on one of the data sets. I'm working on it now, and it seems more complex than I thought. |
@mourisl Thank you for considering the issue (and all your nice work with centrifuge) |