Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No annotation while using aggregation_column #86

Closed
ersgupta opened this issue Oct 11, 2016 · 6 comments
Closed

No annotation while using aggregation_column #86

ersgupta opened this issue Oct 11, 2016 · 6 comments

Comments

@ersgupta
Copy link

I recently updated sleuth from 0.28.0 to 0.28.1 for gene level analysis i.e. aggregation_column option. One issue that I face is, the annotation columns get dropped in sleuth_results option if I use aggregation_column in sleuth_prep.
Can you confirm this? Ofcourse I can add the annotations on my own using basic R stuff (eg. merge) but thought to check with you guys, in case its supposed to work like this or I am missing some option in any of the steps.

Here are the commands I used:

so <- sleuth_prep(s2c, ~ treatment, target_mapping = t2g, aggregation_column = "ens_gene")
mm<-model.matrix(~ treatment)
so <- sleuth_fit(so, mm)
models(so)
so <- sleuth_wt(so, 'treatmentA')
results_cultivar <- sleuth_results(so, 'treatmentA', show_all=F)

PS: It works fine without the aggregation_column option. I posted this google groups as well, but I thought it could be considered as a bug as well, so posting here.

regards
Saurabh

@warrenmcg
Copy link
Collaborator

Hi @ersgupta,

Could you show the head of your t2g table (do head(t2g))? That would help in solving your problem.

@ersgupta
Copy link
Author

ersgupta commented Mar 15, 2017

Hi @warrenmcg,

Thank you for your reply.

Here you go:

> head(t2g)
    target_id  ens_gene ext_gene description
1 AT3G11415.1 AT3G11415                     
2 AT1G31258.1 AT1G31258                     
3 AT5G24735.1 AT5G24735                     
4 AT2G45780.1 AT2G45780                     
5 AT2G42425.1 AT2G42425                     
6 AT4G01533.1 AT4G01533

@warrenmcg
Copy link
Collaborator

Hi @ersgupta,

Still not clear on what's happening. I should have asked this before, but could you also show the head of your kallisto table and the results file before and after gene aggregation?

  1. After you run at the transcript level (no gene aggregation), do head(so$kal[[1]]$abundance) and head(results_cultivar)
  2. After you run at the gene level, do head(results_cultivar).

Thanks,
Warren

@ersgupta
Copy link
Author

Hi @warrenmcg,

Sure. I am also going through the code to see if I can find something. Basically, its just skipping to add the annotation columns after aggregation.

No gene aggregation:

> head(results_cultivar)
    target_id          pval          qval         b       se_b mean_obs  var_obs     tech_var    sigma_sq smooth_sigma_sq final_sigma_sq  ens_gene ext_gene                                                                 description
1 AT4G29780.1  0.000000e+00  0.000000e+00  4.850874 0.12719963 7.934130 7.078710 0.0022694551 0.022000162     0.005306600    0.022000162 AT4G29780                                                                                     
2 AT5G13930.1  0.000000e+00  0.000000e+00  3.965661 0.07480014 7.164071 4.724270 0.0029832652 0.004927742     0.005409326    0.005409326 AT5G13930      CHS                  Chalcone synthase [Source:UniProtKB/Swiss-Prot;Acc:P13114]
3 AT5G59550.1  0.000000e+00  0.000000e+00  2.833183 0.06665922 7.408626 2.411574 0.0013375536 0.003033070     0.005327625    0.005327625 AT5G59550    DURF2  E3 ubiquitin-protein ligase RDUF2 [Source:UniProtKB/Swiss-Prot;Acc:Q940T5]
4 AT3G18080.1 1.199293e-285 7.636198e-282 -2.222089 0.06152367 8.466105 1.483412 0.0002953235 0.002339488     0.005382420    0.005382420 AT3G18080   BGLU44                Beta-glucosidase 44 [Source:UniProtKB/Swiss-Prot;Acc:Q9LV33]
5 AT5G45340.1 2.450046e-266 1.248004e-262  3.650858 0.10471161 6.671383 4.011788 0.0094664905 0.006980293     0.006004214    0.006980293 AT5G45340 CYP707A3     Abscisic acid 8'-hydroxylase 3 [Source:UniProtKB/Swiss-Prot;Acc:Q9FH76]
6 AT2G30020.1 5.364989e-260 2.277348e-256  4.078554 0.11840801 7.831432 5.007207 0.0015765107 0.019454176     0.005303231    0.019454176 AT2G30020          Probable protein phosphatase 2C 25 [Source:UniProtKB/Swiss-Prot;Acc:O80871]

> head(so$kal[[1]]$abundance)
    target_id est_counts  eff_len len       tpm sample
1 ATMG00010.1          0 275.6981 462 0.0000000    OC1
2 ATMG00030.1          1 145.9368 324 0.2439780    OC1
3 ATMG00040.1          0 760.2741 948 0.0000000    OC1
4 ATMG00050.1          0 211.9070 396 0.0000000    OC1
5 ATMG00060.1         17 354.6882 542 1.7065442    OC1
6 ATMG00070.1          7 385.5595 573 0.6464307    OC1

After gene aggregation:

> head(results_cultivar)
  target_id          pval          qval         b       se_b mean_obs  var_obs     tech_var    sigma_sq smooth_sigma_sq final_sigma_sq
1 AT4G29780  0.000000e+00  0.000000e+00  4.850874 0.12719963 7.934130 7.078710 0.0022694551 0.022000162     0.005243659    0.022000162
2 AT5G13930  0.000000e+00  0.000000e+00  3.965661 0.07262234 7.164071 4.724270 0.0029832652 0.004927742     0.004926685    0.004927742
3 AT5G59550  0.000000e+00  0.000000e+00  2.833183 0.06488461 7.408626 2.411574 0.0013375536 0.003033070     0.004977465    0.004977465
4 AT3G57520 3.067187e-301 1.511587e-297  3.330369 0.08977308 8.799584 3.337078 0.0004189715 0.011669837     0.005648870    0.011669837
5 AT5G16030 1.943880e-288 7.663940e-285 -2.266261 0.06244013 8.079297 1.544245 0.0005327627 0.003795169     0.005315393    0.005315393
6 AT3G18080 5.964233e-280 1.959549e-276 -2.222089 0.06215131 8.466105 1.483412 0.0002953235 0.002339488     0.005498855    0.005498855

Thanks,
Saurabh

@warrenmcg
Copy link
Collaborator

Oh, I see the problem: buried in the code for sleuth_results is a condition for adding the annotations to transcript-level results using target mappings, but not gene-level results. I was confused because my results had the annotations, but then I realized that I had modified the code myself a while ago to add in the annotations. I'll see if I can get these changes incorporated into a pull request. For the time being, you should just do it yourself after you get the results.

tl;dr if one is expecting the results table to include annotations, there is indeed a "bug" in the program.

Best,
Warren

@ersgupta
Copy link
Author

Cool. Yes, I had added these myself. Thanks :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants