input-rna returns zero fold-change from non-rsem input #425

umasstr · 2019-03-12T23:15:20Z

Given a 2-column input .tsv (from kallisto abundance.tsv, or abundance.h5 processed by sleuth), import-rna generates a .cnr with zero log fold-change for every gene. If I remove the normalization step from the code, import-rna returns non-zero values.

I would appreciate any insight as to why values may be thrown out at the quantile step of normalization.

etal · 2019-03-18T23:04:07Z

How many samples did you use? If just one, then it normalizes against itself and the result is all zeros. (I'll see about improving the error reporting there.)
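
A minimal sketch of why a single sample yields all zeros, assuming a simplified normalization in which each gene's value is divided by the per-gene median across the cohort. This is not CNVkit's actual code; the function name `log2_ratios_vs_cohort` and the pseudocount are made up for illustration.

```python
import math
from statistics import median

def log2_ratios_vs_cohort(samples, pseudocount=1.0):
    """Per-gene log2 ratio of each sample against the per-gene cohort median.

    samples: dict mapping sample name -> list of expression values,
    with genes in the same order for every sample.
    Hypothetical helper for illustration only; not CNVkit's normalization.
    """
    names = list(samples)
    n_genes = len(samples[names[0]])
    # The reference for each gene is the median across all samples.
    reference = [median(samples[s][i] for s in names) for i in range(n_genes)]
    return {
        s: [
            math.log2((samples[s][i] + pseudocount) / (reference[i] + pseudocount))
            for i in range(n_genes)
        ]
        for s in names
    }

# One sample: the reference equals the sample itself, so every ratio is 1.
one = log2_ratios_vs_cohort({"sample1": [5.0, 0.0, 120.0]})
print(one["sample1"])

# Two samples: the reference differs from each sample where they disagree.
two = log2_ratios_vs_cohort(
    {"sample1": [5.0, 0.0, 120.0], "sample2": [15.0, 0.0, 30.0]}
)
print(two["sample1"])
```

With a single sample every log2 ratio is exactly zero, while a second, different sample produces non-zero ratios for the genes where the two disagree.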

umasstr · 2019-04-10T21:45:14Z

This is using four samples with this command:

cnvkit.py import-rna -f counts -g ~/cnvkit/data/ensembl-gene-info.mm10.tsv -o ~/output /data/sample1/abundance.tsv /data/sample2/abundance.tsv /data/sample3/abundance.tsv /data/sample4/abundance.tsv

Lines in the gene info file look something like this:
ENSMUSG00000102693 34.2056074766 1 3073253 3074322 4933401J01Rik 0 1070 tsl5
ENSMUSG00000064842 36.3636363636 1 3102016 3102125 Gm26206 1 110 tsl5
ENSMUSG00000051951 50.1375894331 1 3205901 3671498 Xkr4 2 465598 tsl5
ENSMUSG00000102851 39.7916666667 1 3252757 3253236 Gm18956 3 480 tsl5

and the sample abundance files look like this:
ENSMUSG00000102693 0.006857
ENSMUSG00000064842 0.000000
ENSMUSG00000051951 0.078329
ENSMUSG00000102851 0.019840

The values are TPMs. I have also tried multiplying them by the read depth to approximate the number of counts, and I have tried the --no-txlen and --no-gc flags, but every combination produces .cnr files with a weight of 1 and a log fold-change of 0 for all genes.
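
One way to confirm this symptom in an output file is a quick check like the sketch below. It assumes the .cnr is a tab-separated table with a header row that includes `log2` and `weight` columns; the helper name `summarize_cnr` and the example rows are made up for illustration and are not real output.

```python
import csv
import io

def summarize_cnr(handle):
    """Count rows, and how many have log2 == 0 and weight == 1.

    handle: an open text file (or file-like object) holding a
    tab-separated table with 'log2' and 'weight' header columns.
    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    total = zero_log2 = unit_weight = 0
    for row in reader:
        total += 1
        zero_log2 += float(row["log2"]) == 0.0
        unit_weight += float(row["weight"]) == 1.0
    return {"rows": total, "zero_log2": zero_log2, "unit_weight": unit_weight}

# Made-up rows shaped like the symptom described above.
example = io.StringIO(
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tweight\n"
    "1\t3073253\t3074322\t4933401J01Rik\t0\t0.006857\t1\n"
    "1\t3205901\t3671498\tXkr4\t0\t0.078329\t1\n"
)
summary = summarize_cnr(example)
print(summary)
```

If `zero_log2` and `unit_weight` both equal `rows`, the file shows the all-zero pattern. For a real file, pass an open handle instead of the `io.StringIO` example.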

umasstr · 2019-04-10T21:50:03Z

hg38-based TPMs generated by kallisto show a similar normalization problem. However, if I submit .rsem files from the same dataset, it works.

etal · 2019-04-11T05:49:47Z

Thanks for the details. This sounds like a bug in reading the plain 2-column format. I'll look into it.

etal · 2019-11-29T18:42:32Z

It looks like the 2-column input format is now working with import-rna -f counts in the development version. I'll roll another release.

etal added the documentation label Mar 18, 2019

etal added the bug label Apr 11, 2019

etal mentioned this issue Apr 17, 2019

CNVkit for RNA-seq: import-rna -f counts not working #437

Closed

etal added the rna label May 27, 2019

SJRussell mentioned this issue Nov 15, 2019

Input for the import-rna command #479

Open

etal closed this as completed Nov 29, 2019
