Why is my abundance estimation value zero for all classifications? How can this be turned on? #158

harisankar991 · 2018-11-21T06:04:35Z

No description provided.

chilltrout · 2019-02-12T23:54:38Z

I'm also very intrigued by this. I went over the docs and I don't understand how to turn it on.
I'm assuming abundance is always on, because I have gotten populated values with a very high coverage genome on PacBio data, but never on my nanopore data.

harisankarsadasivan · 2019-02-13T01:14:26Z

Yes, makes sense. I faced the same with nanopore data, minion v9.4.1.

shashibioinfo · 2019-02-23T07:59:51Z

Hi,
I had the same issue. I analyzed my MinION nanopore data with Centrifuge, and the output files show abundance as zero.

How can I resolve this?
Any suggestions would be appreciated.

Thank you

shashibioinfo · 2019-02-23T08:00:43Z

> Yes, makes sense. I faced the same with nanopore data, minion v9.4.1.

I have the same issue. If you have solved this, could you please help me resolve it?
Thank you

ExplodingCabbage · 2019-06-04T18:49:18Z

I've seen the same thing. One species has >80% of all reads assigned to it, according to the output TSV, yet its abundance is still listed as 0.0, just like every other row.

guokai8 · 2019-10-24T12:10:48Z

I have the same issue. I don't know why.

mourisl · 2019-10-24T19:50:48Z

Can you check whether these are unique assignments or not? Thanks.

guokai8 · 2019-10-25T04:52:02Z

Yes. I am sure there are unique reads here.

Aiswarya-prasad · 2019-12-17T16:00:45Z

Has anyone been able to resolve this? I am having the same issue with nanopore reads.

mourisl · 2020-01-07T20:31:18Z

I'm checking on this issue. The abundance estimation is on by default. Are any of the reads assigned at the subspecies (leaf) level? Can you show me a few lines of the report file? Thanks.

jmaricb · 2020-04-07T11:16:20Z

Hi, are there any updates with this issue? I seem to be getting zero abundances for every species. Here are the commands I have been using:

centrifuge \
  -x data/classifiers-DB/centrifuge/p_compressed+h+v \
  -p 8 \
  -f data/reads-fastq/ONT/communities-synthetic/integration_dataset.fasta \
  -S out \
  --report-file report

After that I also used this command to get a Kraken-style report:

centrifuge-kreport \
  -x data/classifiers-DB/centrifuge/p_compressed+h+v \
  out > kraken_report

You can download the output here: https://www.dropbox.com/s/a5j415ixyts9lox/Archive.zip?dl=0
You can see that there are species in the kraken_report file that have high abundance, and also in the report file you can see that there are species with high number of unique reads, but the abundance is still zero for all the rows.

mourisl · 2020-04-07T14:15:04Z

Thanks for sharing the files. I'll look into this.

mourisl · 2020-04-07T16:28:21Z

You are using the p_compressed+h+v index; however, the seqID column in the output is not in the cid|XXX form that the compressed index produces. I guess the index you are using is actually p+h+v. Could you please check whether the index is correct?
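
One way to run this check is to count how many seqID values in the classification output start with `cid|`. The following is a minimal Python sketch, assuming the tab-separated output written by -S has a header row with a `seqID` column; the helper name `cid_fraction` and the toy rows are made up for illustration and are not part of Centrifuge.

```python
import csv
import io


def cid_fraction(lines):
    """Return (rows whose seqID starts with 'cid|', total rows)."""
    reader = csv.DictReader(lines, delimiter="\t")
    total = 0
    compressed = 0
    for row in reader:
        total += 1
        if row["seqID"].startswith("cid|"):
            compressed += 1
    return compressed, total


# Toy input: the "cid|562" seqID in the second row is invented for illustration.
sample = io.StringIO(
    "readID\tseqID\ttaxID\tscore\t2ndBestScore\thitLength\tqueryLength\tnumMatches\n"
    "read1\tNC_018695.1\t1229205\t225\t225\t30\t215\t2\n"
    "read2\tcid|562\t562\t121\t121\t26\t439\t1\n"
)
print(cid_fraction(sample))
```

On real data, pass an open file handle for the file given to -S instead of the toy `sample`. If no row has a `cid|` seqID, the output was probably not produced with a compressed index.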

jmaricb · 2020-04-07T20:34:34Z

Hi, thank you for the response. Sorry, I sent you data that I had classified with a custom database built from Bacteria and Archaea genomes. The commands I used to build the database are:

centrifuge-download -o taxonomy taxonomy
centrifuge-download -o library -m -d "archaea,bacteria" refseq > seqid2taxid.map
cat library/*/*.fna > input-sequences.fna
centrifuge-build -p 10 --conversion-table seqid2taxid.map --taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp input-sequences.fna abv

I think that succeeded, because as you can see the kraken report gives reasonable classification.

I am now sending new data: https://www.dropbox.com/s/cefkjfz0a4kq1ig/Data.zip?dl=0

There is the folder 'custom' which contains the output, report and kraken-report for the dataset classified with that custom index.
There is the folder 'default' which contains the output, report and kraken-report for the same dataset classified with the p_compressed+h+v index.
And finally there is the folder 'integration', which contains a small dataset that gives non-zero abundances when classified with the p_compressed+h+v index. I don't know why. The output and the report are inside the directory.

Here is the dataset I've been using. It's quite large so I am sending it separately:
https://www.dropbox.com/s/jeuaho0slc45p9w/silico.fastq.zip?dl=0

jmaricb · 2020-04-08T18:05:13Z

@mourisl Hi, one more thing, in case it helps: my integration.fastq dataset also works with the custom index that I created, so the problem might not be in the indexes.

Here are the results of the classification with the custom index: https://www.dropbox.com/s/iojc2br7q17ru1m/integration_custom.zip?dl=0

jmaricb · 2020-04-09T21:29:23Z

@mourisl
Hi,
could you help me calculate the abundances myself? I would like to do that, but in the Centrifuge output reads are classified to multiple species. How can I determine which species each read should be assigned to? Is there a way for Centrifuge to pick a single species for a given read?

Thank You.

mourisl · 2020-04-09T21:46:51Z

@jmaricb You can directly use the abundance from kreport. For reads with multiple assignments, the count is added to their lowest common ancestor in the taxonomy tree. You can also use "--no-lca" in kreport, which instead adds a fraction of the count to each assigned taxon (e.g. a strain), split by the number of assignments.

jmaricb · 2020-04-09T21:59:14Z

@mourisl
Sorry for bothering you, but just one more question.
If I have a read that is mapped to three tax ids, like this:
SRR5891470.22869 species 106654 676 676 41 2302 3
SRR5891470.22869 species 470 676 676 41 2302 3
SRR5891470.22869 NZ_CP033858.1 2420300 676 676 41 2302 3

In the report (let's say kreport), this read will be assigned to the lowest common ancestor of these three tax IDs (106654, 470, 2420300), which is Acinetobacter (tax ID = 469). Am I right?

Does this mean that only reads that map to single species will be assigned to that species?

Thank You.

mourisl · 2020-04-09T22:09:30Z

@jmaricb Yes, that is the default behavior of kreport. You can use "--no-lca" in centrifuge-kreport to assign a fraction of a read to each species. Note that Centrifuge itself already assigns a read to the lowest common ancestor if it is assigned to too many species (-k option).
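
To make the default behavior concrete, here is a small Python sketch that finds the lowest common ancestor of several tax IDs from a child-to-parent map. The `parent` map below is a simplified assumption for illustration: it places each of the three tax IDs from the question directly under 469, whereas the real taxonomy has intermediate ranks and would be loaded from nodes.dmp. The function names are hypothetical and this is not the code centrifuge-kreport uses.

```python
def lineage(taxid, parent):
    """Return the path from taxid up to the root, inclusive."""
    path = [taxid]
    # The root is modelled as a node that is its own parent (or has none).
    while parent.get(taxid, taxid) != taxid:
        taxid = parent[taxid]
        path.append(taxid)
    return path


def lowest_common_ancestor(taxids, parent):
    """Return the deepest tax ID found on the lineage of every input ID."""
    common = set(lineage(taxids[0], parent))
    for t in taxids[1:]:
        common &= set(lineage(t, parent))
    # Walk up from the first ID; the first shared node is the lowest one.
    for node in lineage(taxids[0], parent):
        if node in common:
            return node
    return None


# Simplified, assumed tree: every ID sits directly under 469, and 469 under root 1.
parent = {106654: 469, 470: 469, 2420300: 469, 469: 1, 1: 1}
print(lowest_common_ancestor([106654, 470, 2420300], parent))  # 469
```

With this toy tree the read from the question lands on 469, while a read with a single assignment stays at its own tax ID.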

jmaricb · 2020-04-09T22:18:09Z

@mourisl
Thank you very much. I think I got everything I need to calculate the abundances.

May I just know one last thing.
"--no-lca Do not report the LCA of multiple assignments, but report count fractions at the taxa."

How do you calculate count fractions for each species from multiple assignments when you use --no-lca?

mourisl · 2020-04-09T22:19:48Z

@jmaricb If a read is assigned to 4 species, each of the four species' counts increases by 0.25.
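
The fractional counting described here can be sketched in Python as follows. The sketch assumes the tab-separated -S output has a header row and one line per (read, assignment) pair, and it splits each read evenly over the distinct tax IDs it was assigned to. `fractional_counts` is a hypothetical helper and the tax IDs in the toy input are made up; the actual centrifuge-kreport implementation may differ in details such as how unclassified reads are handled.

```python
import csv
import io
from collections import defaultdict


def fractional_counts(lines):
    """Split each read evenly across the tax IDs it is assigned to."""
    assignments = defaultdict(set)  # readID -> set of taxIDs
    for row in csv.DictReader(lines, delimiter="\t"):
        assignments[row["readID"]].add(row["taxID"])

    counts = defaultdict(float)
    for taxids in assignments.values():
        share = 1.0 / len(taxids)  # e.g. 0.25 for a read with 4 assignments
        for taxid in taxids:
            counts[taxid] += share
    return dict(counts)


# Toy input: readA hits four (invented) tax IDs, readB hits only one of them.
sample = io.StringIO(
    "readID\tseqID\ttaxID\n"
    "readA\tspecies\t101\n"
    "readA\tspecies\t102\n"
    "readA\tspecies\t103\n"
    "readA\tspecies\t104\n"
    "readB\tspecies\t101\n"
)
print(fractional_counts(sample))
```

In the toy input, tax ID 101 ends up with 1.25 reads (0.25 from readA plus 1 from readB) and the other three with 0.25 each.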

jmaricb · 2020-04-10T08:54:15Z

Thank you for your help.

Adoni5 · 2020-05-21T14:42:20Z

@mourisl

I am also using a compressed index (p_compressed, hosted on the site) with nanopore reads, and am getting an abundance of 0. I am building a custom index of bacteria from RefSeq to test whether the compressed indexes are the problem, but was wondering if there is anything else you would recommend trying?

Sample output -

readID	seqID	taxID	score	2ndBestScore	hitLength	queryLength	numMatches
2bef9c72-eeab-4b54-b7a0-4f4696866878	NC_018695.1	1229205	225	225	30	215	2
a6d6c54d-b1e2-45ee-858f-0cb61d0fc2f5	NZ_CP016077.1	1612551	121	121	26	439	2

Sample report -

name	taxID	taxRank	genomeSize	numReads	numUniqueReads	abundance
Myxococcales	29	order	9697933	1	0	0.0
Cystobacter fuscus	43	species	12349744	1	1	0.0
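
To quantify the symptom in a report like the one above, the following Python sketch lists rows that have at least one uniquely assigned read but an abundance of 0.0. It assumes the tab-separated report layout with the header shown above; `zero_abundance_rows` is a hypothetical helper name, not part of Centrifuge.

```python
import csv
import io


def zero_abundance_rows(lines):
    """Return names of taxa with numUniqueReads > 0 but abundance == 0."""
    flagged = []
    for row in csv.DictReader(lines, delimiter="\t"):
        if int(row["numUniqueReads"]) > 0 and float(row["abundance"]) == 0.0:
            flagged.append(row["name"])
    return flagged


# The two rows from the sample report above.
report = io.StringIO(
    "name\ttaxID\ttaxRank\tgenomeSize\tnumReads\tnumUniqueReads\tabundance\n"
    "Myxococcales\t29\torder\t9697933\t1\t0\t0.0\n"
    "Cystobacter fuscus\t43\tspecies\t12349744\t1\t1\t0.0\n"
)
print(zero_abundance_rows(report))  # ['Cystobacter fuscus']
```

If every taxon with unique reads is flagged, the report shows the all-zero pattern discussed in this thread rather than a few low-abundance taxa rounding to zero.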

xiechangxiao · 2020-06-22T08:08:06Z

I have the same issue. The abundance value is always 0 when I use the latest version of Centrifuge with the h+p+v+c database to analyze nanopore data. Could you help me fix it? Thank you.
Here is my code.
centrifuge -x database/centrifuge_databases/hpvc/hpvc -U BC_25.fq.gz --report-file BC_25.report -S BC_25.output

tanushrin · 2020-11-16T16:45:42Z

I am having an issue with the abundance estimation: I get an abundance of 0 for all species except one (which has abundance 1). In centrifuge_report.txt there are species with high abundance; however, centrifuge_report.tsv shows abundance as 0.
I created a custom database containing archaea, bacteria, protozoa, fungi, plants and algae.

Here are the centrifuge commands I have been using:

centrifuge-build -p 24 --conversion-table $REF_SEQ_DIR/accession2taxid_cent.map --taxonomy-tree $REF_SEQ_DIR/nodes.dmp --name-table $REF_SEQ_DIR/names.dmp $DB.fa $DB > $DB.log

centrifuge -p 24 -x $DB -q in.fq > out.txt

centrifuge-kreport -x $DB out.txt > centrifuge_report.txt

How can I get proper (non-zero) abundance values? I would appreciate any help.

Thank you!

Kumereng · 2021-01-22T10:33:02Z

Hi, I have exactly the same issue, and it has not been resolved. The abundance is also zero.

lixiaopi1985 · 2021-04-10T19:28:28Z

Same issue with the latest Centrifuge.

mourisl · 2021-08-16T19:29:47Z

I just fixed an issue with estimating average genome sizes, which was also related to the abundance estimation procedure. Could you please try the new version and check whether the abundance values become normal? You don't need to rebuild the index.

BaylorLyu · 2021-09-09T04:40:37Z

> I just fixed an issue with estimating average genome sizes, which was also related to the abundance estimation procedure. Could you please try the new version and check whether the abundance values become normal? You don't need to rebuild the index.

The problem still exists in the current version; only a few entries have an abundance value.

sybrohee · 2023-03-09T15:44:13Z

Unfortunately, still having the same issue. All abundances stay equal to 0.0 and no iteration was performed.

mourisl · 2023-03-09T18:41:23Z

I can reproduce the zero abundance issue on one of the data sets. I'm working on it now, and it seems more complex than I thought.

sybrohee · 2023-03-10T09:11:37Z

@mourisl Thank you for considering the issue (and all your nice work with centrifuge)

harisankar991 changed the title from "Why is my abundance estimation value zero for all classifications?" to "Why is my abundance estimation value zero for all classifications? How can this be turned on?" on Nov 21, 2018

ExplodingCabbage mentioned this issue Jun 4, 2019

Meaning of abundance in classification summary #152

Open

Aiswarya-prasad mentioned this issue Jun 13, 2020

issue with a specific taxon (ID: 853) #192

Closed

tim488 mentioned this issue Apr 27, 2022

/src/mmp2_processing.py:57: ParserWarning: Length of header or names does not match length of data microbiome-gastro-UMG/MeTaPONT#3

Closed

mansi-aai mentioned this issue Oct 30, 2023

Where are you Sulfobacillus? #255

Open
