RATS analysis fails with Ensembl annotations #40

Klim314 · 2017-11-03T06:20:24Z

When kallisto quantifications are used with Ensembl annotations, RATs produces nothing but NA results because of the Ensembl ".N" version numbers:

[1] Summary of DTU results:
                       category  tally
1         DTU genes (gene test)      0
2     non-DTU genes (gene test)      0
3          NA genes (gene test)  52636
4      DTU genes (transc. test)      0
5  non-DTU genes (transc. test)      0
6       NA genes (transc. test)  52636
7        DTU genes (both tests)      0
8    non-DTU genes (both tests)      0
9         NA genes (both tests)  52636
10              DTU transcripts      0
11          non-DTU transcripts      0
12               NA transcripts 131195

Looking at the Genes table, RATs fails to detect any genes or transcripts:

dtus_subset$Genes %>% head()
            parent_id  elig sig elig_fx quant_reprod rep_reprod DTU transc_DTU known_transc detect_transc elig_transc maxDprop
1: ENSMUSG00000000001 FALSE  NA      NA           NA         NA  NA         NA            1             0           0       NA
2: ENSMUSG00000000003 FALSE  NA      NA           NA         NA  NA         NA            2             0           0       NA
3: ENSMUSG00000000028 FALSE  NA      NA           NA         NA  NA         NA            3             0           0       NA

Examining the raw data reveals this to be due to the Ensembl gene/transcript version numbers. Stripping the .N suffix resolves this issue.

$boot_data_A
$boot_data_A[[1]]
                    target_id      bs0      bs1    bs10     bs11     bs12     bs13    bs14     bs15     bs16     bs17     bs18
     1:    ENSMUST00000000001       NA       NA      NA       NA       NA       NA      NA       NA       NA       NA       NA
     2:  ENSMUST00000000001.4 23.28946 22.50972 24.6344 24.70217 23.98399 25.11433 24.7658 23.61903 24.20159 23.83014 26.02804
     3:    ENSMUST00000000003       NA       NA      NA       NA       NA       NA      NA       NA       NA       NA       NA
     4: ENSMUST00000000003.13  0.00000  0.00000  0.0000  0.00000  0.00000  0.00000  0.0000  0.00000  0.00000  0.00000  0.00000
     5:    ENSMUST00000000010       NA       NA      NA       NA       NA       NA      NA       NA       NA       NA       NA
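
The workaround mentioned above (stripping the `.N` suffix) could look roughly like this in base R. This is only a sketch and not part of RATs: `strip_version` is a hypothetical helper, and the pattern assumes the version is a trailing dot followed only by digits, as in the IDs shown above.

```r
# Hypothetical helper (not a RATs function): drop a trailing ".N" version
# suffix from Ensembl-style IDs. Assumes the version is a final dot followed
# only by digits; IDs without such a suffix are returned unchanged.
strip_version <- function(ids) {
  sub("\\.[0-9]+$", "", ids)
}

# Example IDs taken from the bootstrap table above.
ids <- c("ENSMUST00000000001.4", "ENSMUST00000000003.13", "ENSMUST00000000010")
stripped <- strip_version(ids)
print(stripped)
```

The same function would be applied to the `target_id` column of each quantification table (or to the annotation instead), so that both sides use one ID form. It is worth checking afterwards that stripping has not produced duplicate IDs.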

The issue seems similar to the one reported for the Pachter lab's sleuth here:
pachterlab/sleuth#58

fruce-ki · 2017-11-03T09:20:39Z

RATs does not process or interpret the IDs in any way. Any string is used 'as is'. As such, the IDs in your annotation must match exactly those in the quantification files. It is your responsibility to ensure that the same IDs are used across all the analysis steps.
Note the section about Annotation Discrepancies in the input vignette. RATs will use the provided annotation as its guide. Any IDs in the annotation that are not matched exactly in the quantifications will be assumed to have 0 expression. Any IDs in your quantifications that do not match the annotation will be ignored completely.
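
One way to spot such discrepancies before running an analysis is to compare the two sets of IDs directly. The following is a minimal base R sketch, not a RATs feature; `annot_ids` and `quant_ids` are made-up vectors standing in for the transcript IDs of an annotation and of a quantification table.

```r
# Made-up example IDs standing in for real data (assumption for illustration).
annot_ids <- c("ENSMUST00000000001", "ENSMUST00000000003", "ENSMUST00000000010")
quant_ids <- c("ENSMUST00000000001.4", "ENSMUST00000000003.13", "ENSMUST00000000010")

# IDs that match exactly on both sides.
matched <- intersect(annot_ids, quant_ids)
# Annotation IDs with no exact match in the quantifications
# (these are the ones that would be assumed to have 0 expression).
annot_only <- setdiff(annot_ids, quant_ids)
# Quantification IDs with no exact match in the annotation
# (these are the ones that would be ignored).
quant_only <- setdiff(quant_ids, annot_ids)

cat("matched:", length(matched), "\n")
cat("annotation only:", length(annot_only), "\n")
cat("quantification only:", length(quant_only), "\n")
```

If most IDs end up in the two "only" groups, the ID formats differ and need to be reconciled before the analysis.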

fruce-ki · 2017-11-03T09:28:01Z

Yes it is the same "problem" as the one reported for sleuth.
From my perspective this is a user error, not a program error. I consider as a liability any code "magic" that assumes a certain ID format and changes the provided IDs to conform to that presumed format. I don't think a program should be taking such initiative, because if the presumption is wrong, then the result will be worthless and the error may go unnoticed. I want RATs to work with any format of ID, including non-official formats, so automatically messing with the provided IDs is not a good idea.

fruce-ki · 2017-11-03T09:37:54Z

If, however, you did use the same annotation but kallisto chopped off the version numbers, thus creating the mismatch in the IDs, then I may need to consider adding some optional ID "magic", as it is not really a user error if a third-party program edits the IDs.

It wasn't clear from your question which form of IDs is in your annotation, which form is in your quantifications, and whether the same annotation file was used for both quantification and DTU.

fruce-ki · 2017-11-08T18:49:25Z

Hi!
Do you have anything to add to this issue? Did you resolve the problem?

Thanks!
Kimon

fruce-ki added the "question" label (User query about usage.) on Nov 3, 2017

fruce-ki closed this as completed Nov 20, 2017

brettvanderwerff mentioned this issue Feb 13, 2019 in "All transcripts/genes ineligible? #64" (Closed)
