-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refining the table of polytomous result #19
Comments
Have you read in the 6 mmdiff comparison files using the polyclass() R function? If so then you should have a posterior probability column in the table returned by that function, which you can rank on. You should find that if you plot the "mu"s for the features with high posterior probabilities, you observe strong evidence of clustering within groups The "eta" columns will tell you for each group whether it is up or down regulated relative to the global mean bw
|
Hi Ernest, Yes, indeed I havres1=polyclass(files=c("E18vsOthers.mmdiff", res1=polyclass(files=c("E18vsOthers.mmdiff", "P1vsOthers.mmdiff", However, I do not get the eta column in my output file. I have attached few lines of my out put file for your reference. Please guide, thank you, Best regards On Mon, Sep 26, 2016 at 3:41 PM, Ernest Turro notifications@github.com
"feature_id" "mu_E18_1.namesorted.bam.hits.gene" "mu_E18_2.namesorted.bam.hits.gene" "mu_E18_3.namesorted.bam.hits.gene" "mu_E18_4.namesorted.bam.hits.gene" "mu_P1_1.namesorted.bam.hits.gene" "mu_P1_2.namesorted.bam.hits.gene" "mu_P1_3.namesorted.bam.hits.gene" "mu_P3_1.namesorted.bam.hits.gene" "mu_P3_2.namesorted.bam.hits.gene" "mu_P3_3.namesorted.bam.hits.gene" "mu_P10_1.namesorted.bam.hits.gene" "mu_P10_2.namesorted.bam.hits.gene" "mu_P10_3.namesorted.bam.hits.gene" "mu_Mo3_1.namesorted.bam.hits.gene" "mu_Mo3_2.namesorted.bam.hits.gene" "mu_Mo3_3.namesorted.bam.hits.gene" "mu_Mo3_4.namesorted.bam.hits.gene" "mu_Mo12_1.namesort.bam.hits.gene" "mu_Mo12_2.namesort.bam.hits.gene" "mu_Mo12_3.namesort.bam.hits.gene" "mu_Mo12_4.namesort.bam.hits.gene" "mu_Mo24_1.namesort.bam.hits.gene" "mu_Mo24_2.namesort.bam.hits.gene" "mu_Mo24_3.namesort.bam.hits.gene" "mu_Mo24_4.namesort.bam.hits.gene" "mu_Mo24_5.namesort.bam.hits.gene" "mu_Mo24_6.namesort.bam.hits.gene" "mu_Mo24_7.namesort.bam.hits.gene" "postprob_model0" "postprob_model1" "postprob_model2" "postprob_model3" "postprob_model4" "postprob_model5" "postprob_model6" "postprob_model7" |
That looks fine postprob_model0 is the posterior probability of the baseline model (global mean, I assume) The "mu" values are the estimated log expression values You're right that the etas are not read in but you can find them in the *mmdiff files if you need them Best wishes
|
Hi, Thank you. Possibly the last one, I have a lot of NaNs, is it wise to remove them? Best |
Where are the NaNs?
|
Hi, Sorry, it is NA not NAN. Shall remove the lines carrying NA or replace them with 0? what is best way Best regards On Mon, Sep 26, 2016 at 4:27 PM, Ernest Turro notifications@github.com
|
Which column of which file do you see NAs in?
|
here it is "feature_id" "mu_E18_1,namesorted,bam,hits,gene" "mu_E18_2,namesorted,bam,hits,gene" "mu_E18_3,namesorted,bam,hits,gene" "mu_E18_4,namesorted,bam,hits,gene" "mu_P1_1,namesorted,bam,hits,gene" "mu_P1_2,namesorted,bam,hits,gene" "mu_P1_3,namesorted,bam,hits,gene" "mu_P3_1,namesorted,bam,hits,gene" "mu_P3_2,namesorted,bam,hits,gene" "mu_P3_3,namesorted,bam,hits,gene" "mu_P10_1,namesorted,bam,hits,gene" "mu_P10_2,namesorted,bam,hits,gene" "mu_P10_3,namesorted,bam,hits,gene" "mu_Mo3_1,namesorted,bam,hits,gene" "mu_Mo3_2,namesorted,bam,hits,gene" "mu_Mo3_3,namesorted,bam,hits,gene" "mu_Mo3_4,namesorted,bam,hits,gene" "mu_Mo12_1,namesort,bam,hits,gene" "mu_Mo12_2,namesort,bam,hits,gene" "mu_Mo12_3,namesort,bam,hits,gene" "mu_Mo12_4,namesort,bam,hits,gene" "mu_Mo24_1,namesort,bam,hits,gene" "mu_Mo24_2,namesort,bam,hits,gene" "mu_Mo24_3,namesort,bam,hits,gene" "mu_Mo24_4,namesort,bam,hits,gene" "mu_Mo24_5,namesort,bam,hits,gene" "mu_Mo24_6,namesort,bam,hits,gene" "mu_Mo24_7,namesort,bam,hits,gene" "postprob_model0" "postprob_model1" "postprob_model2" "postprob_model3" "postprob_model4" "postprob_model5" "postprob_model6" "postprob_model7" "ENSRNOG00000000044,6" 1,93077 1,77034 1,94492 2,01215 2,07374 2,27246 2,10575 2,20234 2,41065 2,22659 2,37298 2,2594 2,47746 2,67266 2,6812 2726 2,34917 2,43676 2,2224 2,29195 2,35387 1,93058 2,06998 7,05513 2,37321 2,24386 2,29025 2,29686 0 NA NA 0 0 NA NA NA |
Hmmm, did you use version 1.0.9 of mmdiff? Would you mind printing out the posterior probabilities and the Bayes factors recorded in the seven comparison mmdiff files for one of the features, e.g. ENSRNOG00000000098.7? Thanks
|
The version of mmdiff is mmdiff-1.0.8 feature_id bayes_factor posterior_probability alpha0 beta0_0 beta0_1 beta0_2 beta0_3 beta0_4 beta0_5 beta0_6 alpha1 beta1_0 beta1_1 beta1_2 beta1_3 beta1_4 beta1_5 beta1_6 eta1_0 mu_E18_1.namesorted.bam.hits.gene mu_E18_2.namesorted.bam.hits.gene mu_E18_3.namesorted.bam.hits.gene mu_E18_4.namesorted.bam.hits.gene mu_P1_1.namesorted.bam.hits.gene mu_P1_2.namesorted.bam.hits.gene mu_P1_3.namesorted.bam.hits.gene mu_P3_1.namesorted.bam.hits.gene mu_P3_2.namesorted.bam.hits.gene mu_P3_3.namesorted.bam.hits.gene mu_P10_1.namesorted.bam.hits.gene mu_P10_2.namesorted.bam.hits.gene mu_P10_3.namesorted.bam.hits.gene mu_Mo3_1.namesorted.bam.hits.gene mu_Mo3_2.namesorted.bam.hits.gene mu_Mo3_3.namesorted.bam.hits.gene mu_Mo3_4.namesorted.bam.hits.gene mu_Mo12_1.namesort.bam.hits.gene mu_Mo12_2.namesort.bam.hits.gene mu_Mo12_3.namesort.bam.hits.gene mu_Mo12_4.namesort.bam.hits.gene mu_Mo24_1.namesort.bam.hits.gene mu_Mo24_2.namesort.bam.hits.gene mu_Mo24_3.namesort.bam.hits.gene mu_Mo24_4.namesort.bam.hits.gene mu_Mo24_5.namesort.bam.hits.gene mu_Mo24_6.namesort.bam.hits.gene mu_Mo24_7.namesort.bam.hits.gene sd_E18_1.namesorted.bam.hits.gene sd_E18_2.namesorted.bam.hits.gene sd_E18_3.namesorted.bam.hits.gene sd_E18_4.namesorted.bam.hits.gene sd_P1_1.namesorted.bam.hits.gene sd_P1_2.namesorted.bam.hits.gene sd_P1_3.namesorted.bam.hits.gene sd_P3_1.namesorted.bam.hits.gene sd_P3_2.namesorted.bam.hits.gene sd_P3_3.namesorted.bam.hits.gene sd_P10_1.namesorted.bam.hits.gene sd_P10_2.namesorted.bam.hits.gene sd_P10_3.namesorted.bam.hits.gene sd_Mo3_1.namesorted.bam.hits.gene sd_Mo3_2.namesorted.bam.hits.gene sd_Mo3_3.namesorted.bam.hits.gene sd_Mo3_4.namesorted.bam.hits.gene sd_Mo12_1.namesort.bam.hits.gene sd_Mo12_2.namesort.bam.hits.gene sd_Mo12_3.namesort.bam.hits.gene sd_Mo12_4.namesort.bam.hits.gene sd_Mo24_1.namesort.bam.hits.gene sd_Mo24_2.namesort.bam.hits.gene sd_Mo24_3.namesort.bam.hits.gene sd_Mo24_4.namesort.bam.hits.gene sd_Mo24_5.namesort.bam.hits.gene sd_Mo24_6.namesort.bam.hits.gene sd_Mo24_7.namesort.bam.hits.gene ENSRNOG00000000098.7 0.0190623 0.00211356 1.82557 -0.604319 -0.175977 0.0598454 0.22367 0.338942 0.176893 0.0316759 1.4831 -0.502818 -0.0845974 0.169414 0.315968 0.435606 0.265885 0.441127 -0.414926 1.23244 1.01857 1.35909 1.41719 1.71721 1.84907 1.40537 2.08355 1.6536 2.06811 1.75018 1.94563 2.55977 2.33215 1.96623 2.21867 2.33136 1.86095 2.04603 2.19401 2.09354 1.00136 1.81151 -2.88187 1.73583 1.92559 2.05915 2.16223 0.167869 0.165434 0.16887 0.108137 0.0938489 0.118504 0.214196 0.105806 0.108486 0.101371 0.110449 0.129985 0.0992743 0.101388 0.0961584 0.100015 0.080088 0.0841759 0.0798832 0.0635577 0.0931624 0.438508 0.192649 9.84077 0.108151 0.0894878 0.0723886 0.101675 |
Could you print out this line across all seven files please? This one looks fine (Bayes factor of 0.0190623 and posterior probability of 0.00211356) but I think some of the other files may have NA or NaN due to an issue that I resolved in version 1.0.9
|
Here they are: feature_id bayes_factor posterior_probability alpha0 beta0_0 beta0_1 beta0_2 beta0_3 beta0_4 beta0_5 beta0_6 alpha1 beta1_0 beta1_1 beta1_2 beta1_3 beta1_4 beta1_5 beta1_6 eta1_0 mu_E18_1.namesorted.bam.hits.gene mu_E18_2.namesorted.bam.hits.gene mu_E18_3.namesorted.bam.hits.gene mu_E18_4.namesorted.bam.hits.gene mu_P1_1.namesorted.bam.hits.gene mu_P1_2.namesorted.bam.hits.gene mu_P1_3.namesorted.bam.hits.gene mu_P3_1.namesorted.bam.hits.gene mu_P3_2.namesorted.bam.hits.gene mu_P3_3.namesorted.bam.hits.gene mu_P10_1.namesorted.bam.hits.gene mu_P10_2.namesorted.bam.hits.gene mu_P10_3.namesorted.bam.hits.gene mu_Mo3_1.namesorted.bam.hits.gene mu_Mo3_2.namesorted.bam.hits.gene mu_Mo3_3.namesorted.bam.hits.gene mu_Mo3_4.namesorted.bam.hits.gene mu_Mo12_1.namesort.bam.hits.gene mu_Mo12_2.namesort.bam.hits.gene mu_Mo12_3.namesort.bam.hits.gene mu_Mo12_4.namesort.bam.hits.gene mu_Mo24_1.namesort.bam.hits.gene mu_Mo24_2.namesort.bam.hits.gene mu_Mo24_3.namesort.bam.hits.gene mu_Mo24_4.namesort.bam.hits.gene mu_Mo24_5.namesort.bam.hits.gene mu_Mo24_6.namesort.bam.hits.gene mu_Mo24_7.namesort.bam.hits.gene sd_E18_1.namesorted.bam.hits.gene sd_E18_2.namesorted.bam.hits.gene sd_E18_3.namesorted.bam.hits.gene sd_E18_4.namesorted.bam.hits.gene sd_P1_1.namesorted.bam.hits.gene sd_P1_2.namesorted.bam.hits.gene sd_P1_3.namesorted.bam.hits.gene sd_P3_1.namesorted.bam.hits.gene sd_P3_2.namesorted.bam.hits.gene sd_P3_3.namesorted.bam.hits.gene sd_P10_1.namesorted.bam.hits.gene sd_P10_2.namesorted.bam.hits.gene sd_P10_3.namesorted.bam.hits.gene sd_Mo3_1.namesorted.bam.hits.gene sd_Mo3_2.namesorted.bam.hits.gene sd_Mo3_3.namesorted.bam.hits.gene sd_Mo3_4.namesorted.bam.hits.gene sd_Mo12_1.namesort.bam.hits.gene sd_Mo12_2.namesort.bam.hits.gene sd_Mo12_3.namesort.bam.hits.gene sd_Mo12_4.namesort.bam.hits.gene sd_Mo24_1.namesort.bam.hits.gene sd_Mo24_2.namesort.bam.hits.gene sd_Mo24_3.namesort.bam.hits.gene sd_Mo24_4.namesort.bam.hits.gene sd_Mo24_5.namesort.bam.hits.gene sd_Mo24_6.namesort.bam.hits.gene sd_Mo24_7.namesort.bam.hits.gene ENSRNOG00000000098.7 inf 1 -nan -0.435103 -0.0140217 0.226919 0.385137 0.507219 0.349306 0.186145 1.87446 -0.491195 -0.294448 -0.0502627 0.109003 0.224025 0.0621883 -0.0840703 -0.223738 1.23244 1.01857 1.35909 1.41719 1.71721 1.84907 1.40537 2.08355 1.6536 2.06811 1.75018 1.94563 2.55977 2.33215 1.96623 2.21867 2.33136 1.86095 2.04603 2.19401 2.09354 1.00136 1.81151 -2.88187 1.73583 1.92559 2.05915 2.16223 0.167869 0.165434 0.16887 0.108137 0.0938489 0.118504 0.214196 0.105806 0.108486 0.101371 0.110449 0.129985 0.0992743 0.101388 0.0961584 0.100015 0.080088 0.0841759 0.0798832 0.0635577 0.0931624 0.438508 0.192649 9.84077 0.108151 0.0894878 0.0723886 0.101675 ENSRNOG00000000098.7 inf 1 -nan -0.438551 -0.014951 0.235706 0.384479 0.505696 0.35142 0.181437 1.89396 -0.622727 -0.206974 0.0436585 0.201801 0.318719 0.156792 0.00888626 0.00289205 1.23244 1.01857 1.35909 1.41719 1.71721 1.84907 1.40537 2.08355 1.6536 2.06811 1.75018 1.94563 2.55977 2.33215 1.96623 2.21867 2.33136 1.86095 2.04603 2.19401 2.09354 1.00136 1.81151 -2.88187 1.73583 1.92559 2.05915 2.16223 0.167869 0.165434 0.16887 0.108137 0.0938489 0.118504 0.214196 0.105806 0.108486 0.101371 0.110449 0.129985 0.0992743 0.101388 0.0961584 0.100015 0.080088 0.0841759 0.0798832 0.0635577 0.0931624 0.438508 0.192649 9.84077 0.108151 0.0894878 0.0723886 0.101675 ENSRNOG00000000098.7 inf 1 -nan -0.446658 -0.0015702 0.235258 0.389591 0.502541 0.32965 0.202072 1.99038 -0.702985 -0.282921 -0.0720392 0.120476 0.239042 0.0766737 -0.0709585 0.0332761 1.23244 1.01857 1.35909 1.41719 1.71721 1.84907 1.40537 2.08355 1.6536 2.06811 1.75018 1.94563 2.55977 2.33215 1.96623 2.21867 2.33136 1.86095 2.04603 2.19401 2.09354 1.00136 1.81151 -2.88187 1.73583 1.92559 2.05915 2.16223 0.167869 0.165434 0.16887 0.108137 0.0938489 0.118504 0.214196 0.105806 0.108486 0.101371 0.110449 0.129985 0.0992743 0.101388 0.0961584 0.100015 0.080088 0.0841759 0.0798832 0.0635577 0.0931624 0.438508 0.192649 9.84077 0.108151 0.0894878 0.0723886 0.101675 ENSRNOG00000000098.7 0 0 1.78696 -0.518377 -0.0942221 0.149615 0.308135 0.423671 0.26336 0.115039 -nan -0.555334 -0.135602 0.11226 0.303931 0.386602 0.222534 0.0869579 -nan 1.23244 1.01857 1.35909 1.41719 1.71721 1.84907 1.40537 2.08355 1.6536 2.06811 1.75018 1.94563 2.55977 2.33215 1.96623 2.21867 2.33136 1.86095 2.04603 2.19401 2.09354 1.00136 1.81151 -2.88187 1.73583 1.92559 2.05915 2.16223 0.167869 0.165434 0.16887 0.108137 0.0938489 0.118504 0.214196 0.105806 0.108486 0.101371 0.110449 0.129985 0.0992743 0.101388 0.0961584 0.100015 0.080088 0.0841759 0.0798832 0.0635577 0.0931624 0.438508 0.192649 9.84077 0.108151 0.0894878 0.0723886 0.101675 |
Yes, so you can see that for this feature the Bayes factor is infinite in several of the comparisons. In these cases, you have very strong evidence against the baseline model but it's not clear which of the alternative models is a better fit because each one is so overwhelmingly better fitting than the baseline model that gamma was estimated as being 1 (and the Bayes factor as infinite) in several comparison. As a quick workaround you could manipulate the Bayes factors that are infinite and change them to a very high value. Similarly increase slightly the ones that are zero to just above zero. In mmseq.R, in this loop:
You could add:
where large_val is something slightly larger than the largest Bayes factor you've observed and small_val (near zero) is something slightly closer to zero than any other Bayes factor you've observed. It's a bit of a hack but a fairly principled one and it will allow you to include these features in the ranking. Note that if your prior is the same across the different models, and you have a mixture of infinite and zero Bayes factors, you'll essentially end up with posterior probability = 1 / number of models with an infinite Bayes factor, which could be quite low in your situation, given you are comparing seven models. best wishes
|
Hi,
I have applied the polytomous analysis on my dataset consisting of 7 groups and each group with a replicate. The results are in a tabular form for all the genes.
I would like to know on the basis of which column shall I sort or filter to get the signature genes for each group.
Thank you.
Best regards
Abhishek
The text was updated successfully, but these errors were encountered: