Skip to content

5. Posterior summary

Catalina Vallejos edited this page Jun 7, 2020 · 4 revisions

‼️ THIS WIKI IS NO LONGER MAINTAINED - PLEASE REFER TO THE VIGNETTE INSTEAD ‼️

The following code generates a synthetic example to illustrate BASiCS:

Data <- makeExampleBASiCS_Data()
Chain <- BASiCS_MCMC(Data = Data, N = 20000, Thin = 20, Burn = 10000, 
                     PrintProgress = FALSE)

To access the MCMC chains associated to individual parameter use the function displayChainBASiCS. For example,

displayChainBASiCS(Chain, Param = "mu")[1:5,1:5]

As a summary of the posterior distribution, the Summary methods calculates posterior medians and the High Posterior Density (HPD) intervals for each model parameter. As a default option, HPD intervals contain 0.95 probability.

ChainSummary <- Summary(Chain)

Subsequently, the displaySummaryBASiCS method can extract posterior summaries for individual parameters. For example

head(displaySummaryBASiCS(ChainSummary, Param = "mu"))

Above,

  • First column: posterior medians
  • Second column: lower limit of the 95% HPD interval
  • Third column: upper limit of the 95% HPD interval

The following figures display posterior medians and the corresponding HPD 95% intervals for gene-specific parameters $\mu_i$ (mean) and $\delta_i$ (over-dispersion)

par(mfrow = c(2,2))
plot(ChainSummary, Param = "mu", main = "All genes", log = "y")
plot(ChainSummary, Param = "mu", Genes = 1:10, main = "First 10 genes")
plot(ChainSummary, Param = "delta", main = "All genes")
plot(ChainSummary, Param = "delta", Genes = c(2,5,10,50), main = "5 custom genes")

It is also possible to obtain similar summaries for the normalising constants $\phi_j$ and $s_j$.

par(mfrow = c(1,2))
plot(ChainSummary, Param = "phi")
plot(ChainSummary, Param = "s", Cells = 1:5)

Finally, it is also possible to create a scatterplot of posterior estimates for gene-specific parameters. Typically, this plot will exhibit the confounding effect that is observed between mean and over-dispersion.

par(mfrow = c(1,2))
plot(ChainSummary, Param = "mu", Param2 = "delta", log = "x", SmoothPlot = FALSE)
plot(ChainSummary, Param = "mu", Param2 = "delta", log = "x", SmoothPlot = TRUE)

The option SmoothPlot = TRUE is generally recommended as this plot will contain thousands of genes when analysing real datasets.