Remove redundant salmon TPM output #575

j-andrews7 · 2021-03-04T16:32:01Z

Description of the bug

Currently, the pipeline generates multiple TPM outputs from salmon, e.g. salmon.merged.gene_tpm.tsv and salmon.merged.gene_tpm_length_scaled.tsv. These are redundant, as the TPM values don't change. The files are output on both a per-sample and a merged basis, so they create a fair amount of clutter.

Expected behaviour

A single TPM file should be produced instead.


grst · 2021-03-04T16:37:57Z

Actually, they should not be the same...

TPM is scaled by effective gene length per sample
TPM length-scaled is scaled by average effective gene length across all samples

(see tximport Vignette)

So yes, on a per-sample basis, they are redundant. The merged files should be (slightly?) different.

j-andrews7 · 2021-03-04T16:39:41Z

No, TPMs are TPMs. The counts derived from them will change with those different parameters, but the TPM values themselves do not. The salmon.merged.gene_counts.tsv, salmon.merged.gene_counts_length_scaled.tsv, etc. files are still necessary and will change slightly as you describe.

grst · 2021-03-04T16:42:18Z

Are you sure? That's not how I understand this section...

We could alternatively generate counts from abundances, using the argument countsFromAbundance, scaled to library size, "scaledTPM", or additionally scaled using the average transcript length, averaged over samples and to library size, "lengthScaledTPM"

j-andrews7 · 2021-03-04T16:47:26Z

Yes. That section applies to the counts that are derived from the abundance (TPMs). The abundance values themselves do not change. You can run with countsFromAbundance set to any option - the abundance matrix will be the same for them all.

This is also confirmed by Mike Love here.
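
To make the distinction concrete, here is a minimal Python sketch of how counts can be derived from abundance under the two options. It is an illustration with toy numbers, not tximport's actual code: the function name counts_from_abundance and all data are hypothetical, and the rescaling logic is an assumption based on the vignette passage quoted above.

```python
import copy

# Toy data (hypothetical): rows are genes, columns are samples.
abundance = [[500000.0, 200000.0], [300000.0, 300000.0], [200000.0, 500000.0]]  # TPM
lengths = [[1000.0, 1200.0], [2000.0, 1800.0], [500.0, 700.0]]  # effective lengths
counts = [[100.0, 50.0], [120.0, 110.0], [20.0, 70.0]]  # estimated counts


def column_sums(matrix):
    """Sum each column (sample) of a list-of-rows matrix."""
    return [sum(col) for col in zip(*matrix)]


def counts_from_abundance(counts, abundance, lengths, mode):
    """Derive counts from TPM abundance; the abundance input is never modified."""
    if mode == "scaledTPM":
        # Start from the abundance values as they are.
        new_counts = [row[:] for row in abundance]
    elif mode == "lengthScaledTPM":
        # Weight each gene by its effective length averaged over samples.
        new_counts = []
        for ab_row, len_row in zip(abundance, lengths):
            mean_len = sum(len_row) / len(len_row)
            new_counts.append([a * mean_len for a in ab_row])
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Rescale every sample so its total matches the original library size.
    target = column_sums(counts)
    current = column_sums(new_counts)
    return [[v * t / c for v, t, c in zip(row, target, current)]
            for row in new_counts]


abundance_before = copy.deepcopy(abundance)
scaled = counts_from_abundance(counts, abundance, lengths, "scaledTPM")
length_scaled = counts_from_abundance(counts, abundance, lengths, "lengthScaledTPM")

print("abundance unchanged:", abundance == abundance_before)
print("counts differ between modes:", scaled != length_scaled)
```

In this sketch the two modes return different count matrices while the abundance matrix passed in is identical for both, which is why a second TPM file adds nothing.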

grst · 2021-03-04T16:52:03Z

Ah, now I get it! Thanks for clearing that up!

drpatelh · 2021-04-16T22:27:32Z

Fixed in #598

j-andrews7 added the bug label Mar 4, 2021

drpatelh added this to the 3.1 milestone Apr 11, 2021

lpantano mentioned this issue Apr 16, 2021

Fix salmon files and add tips to docs #598

Merged

6 tasks

drpatelh closed this as completed Apr 16, 2021
