# Poisson-Gamma model

# _Josep Fortiana_ $\hskip6cm$ 2022-03-28

### Adapted from [Brian Reich - NC State University](https://statistics.sciences.ncsu.edu/people/bjreich/) - [Poisson/Gamma model](https://www4.stat.ncsu.edu/~bjreich/BSMdata/JAGS2.html)

***
###### LaTeX macros
$\def\prob{P}$
$\def\argmax{\operatorname{arg\,max}}$
$\def\argmin{\operatorname{arg\,min}}$
$\def\borel{\operatorname{Borel}}$
$\def\cE{\cal E}$
$\def\cP{\cal P}$
$\def\R{\mathbb{R}}$ 
$\def\N{\mathbb{N}}$
$\def\Z{\mathbb{Z}}$
$\def\Ee{\operatorname{E}}$
$\def\va{\text{v.a.}}$
$\def\var{\operatorname{var}}$
$\def\cov{\operatorname{cov}}$
$\def\cor{\operatorname{cor}}$
$\def\binomdist{\operatorname{Binom}}$
$\def\berndist{\operatorname{Ber}}$
$\def\betabinomdist{\operatorname{Beta-Binom}}$
$\def\betadist{\operatorname{Beta}}$
$\def\gammadist{\operatorname{Gamma}}$
$\def\hyperdist{\operatorname{Hypergeom}}$
$\def\hypergeomdist{\operatorname{Hypergeom}}$
$\def\poissondist{\operatorname{Poisson}}$
$\def\geomdist{\operatorname{Geom}}$
$\def\normaldist{\operatorname{N}}$
$\def\unifdist{\operatorname{Unif}}$
$\DeclareMathOperator{\indica}{\mathbb{1}}$
$\def\CondTo{\mathbin{|\mskip0.5mu}}$
***

# 00 - Data and model

We sample **$n$ square miles** of the state and observe **$y\in\{0,1,2,\dots\}$ animals** of the species of interest. 

Our objective is to estimate the density $\lambda$, the expected number of animals per square mile. The statistical model (likelihood and prior) is:

$$
    y\,|\,\lambda\sim\poissondist(n\cdot\lambda),\mskip50mu \lambda\sim\gammadist(\alpha,\beta).
$$
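
To make the generative reading of this model concrete, the following sketch simulates it forward: draw $\lambda$ from the prior, then draw a count $y$ given $\lambda$. It assumes the values $n=20$ and $\alpha=\beta=0.5$ that this notebook sets in the cells below; the names `n.sim`, `lambda.sim` and `y.sim` are illustrative only.

```r
set.seed(1)          # arbitrary seed, for reproducibility only
n       <- 20        # square miles sampled
prior.a <- 0.5       # prior shape (alpha)
prior.b <- 0.5       # prior rate (beta)
n.sim   <- 1e5       # number of simulated surveys

lambda.sim <- rgamma(n.sim, shape = prior.a, rate = prior.b)  # lambda ~ Gamma(alpha, beta)
y.sim      <- rpois(n.sim, n * lambda.sim)                    # y | lambda ~ Poisson(n * lambda)

# Marginally, E[y] = n * alpha / beta
c(simulated = mean(y.sim), theoretical = n * prior.a / prior.b)
```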

## Problem constant ($n$) and observed data ($y$)

In [None]:
n<-20
y<-11

## Prior parameters

With the shape-rate parameterization used here, the expectation of a $\gammadist(a,b)$ is $a/b$ and its variance is $a/b^2$.
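
As a quick check of these formulas, this sketch evaluates them for the prior values $a=b=0.5$ used in this notebook, and adds the central 90% prior interval from `qgamma` to show how diffuse that prior is. Note that R's gamma functions accept either `rate` or `scale = 1/rate`; the formulas above correspond to `rate`. The names `prior.mean`, `prior.var` and `prior.ci` are illustrative.

```r
prior.a <- 0.5   # shape
prior.b <- 0.5   # rate

prior.mean <- prior.a / prior.b      # a/b   = 1
prior.var  <- prior.a / prior.b^2    # a/b^2 = 2
prior.ci   <- qgamma(c(0.05, 0.95), shape = prior.a, rate = prior.b)

prior.mean
prior.var
prior.ci
```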

In [None]:
prior.a<-0.5
prior.b<-0.5

# 01 - Exact treatment, using the conjugate property

The prior is $\gammadist(\alpha,\beta)$. Given the observed $y$ in $n$ square miles, the posterior is: 

$$
    \lambda\,|\,y\sim\gammadist(\alpha',\beta').
$$

$$
    \left\{\begin{array}{lcl}
    \alpha'  &=& \alpha+y,\\[0.25cm]
    \beta'   &=& \beta+n.
    \end{array}\right.
$$
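
The update rule follows from multiplying the likelihood by the prior density and keeping only the factors that depend on $\lambda$:

$$
    p(\lambda\,|\,y)\;\propto\;\underbrace{(n\lambda)^{y}\,e^{-n\lambda}}_{\text{likelihood}}\;\cdot\;\underbrace{\lambda^{\alpha-1}\,e^{-\beta\lambda}}_{\text{prior}}\;\propto\;\lambda^{\alpha+y-1}\,e^{-(\beta+n)\lambda},
$$

which is the kernel of a $\gammadist(\alpha+y,\beta+n)$ density.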

Therefore, the posterior mean, standard deviation, and 90% interval can be found exactly.

In [None]:
posterior.a<-prior.a+y
posterior.b<-prior.b+n

In [None]:
# Posterior mean 
round(posterior.a/posterior.b,5)                  

In [None]:
# Posterior sd 
round(sqrt(posterior.a)/posterior.b,5)                    

In [None]:
# Posterior equal-tailed 90% interval (5% and 95% quantiles)
round(qgamma(c(0.05,0.95),shape=posterior.a,rate=posterior.b),5)
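
Two quick sanity checks on these exact results, sketched with the same values as above (the names `w`, `shrunk.mean` and `ci` are introduced here for illustration). First, the posterior mean $(\alpha+y)/(\beta+n)$ is a weighted average of the prior mean $\alpha/\beta$ and the observed rate $y/n$, with weight $\beta/(\beta+n)$ on the prior. Second, an equal-tailed 90% interval should contain 90% of the posterior mass.

```r
n <- 20; y <- 11
prior.a <- 0.5; prior.b <- 0.5
posterior.a <- prior.a + y
posterior.b <- prior.b + n

# Posterior mean as a compromise between prior mean and observed rate
w <- prior.b / (prior.b + n)                       # weight given to the prior
shrunk.mean <- w * (prior.a / prior.b) + (1 - w) * (y / n)
all.equal(shrunk.mean, posterior.a / posterior.b)  # TRUE

# Posterior mass inside the equal-tailed 90% interval
ci <- qgamma(c(0.05, 0.95), shape = posterior.a, rate = posterior.b)
diff(pgamma(ci, shape = posterior.a, rate = posterior.b))
```

With $n=20$ and $\beta=0.5$ the prior weight `w` is small, so the posterior mean sits close to the observed rate $y/n$.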

# 02 - Numerical treatment with discretization

## Discretization grid

In [None]:
# Grid of lambda values on (0, 2], an interval chosen to contain most of the probability mass.
# It starts at 0.01 rather than 0 because the Gamma(0.5, 0.5) prior density is unbounded at 0.
grid<-seq(0.01,2,by=0.01)

## Likelihood, prior, posterior

In [None]:
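Lik <- dpois(y,n*grid)
Lik <- Lik/sum(Lik)                     # normalize to sum to 1 over the grid

Prior <- dgamma(grid,shape=prior.a,rate=prior.b)
Prior <- Prior/sum(Prior)               # normalize to sum to 1 over the grid

Joint  <- Lik*Prior                     # unnormalized posterior on the grid

Post   <- Joint/sum(Joint)              # posterior probabilities on the grid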
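
Because `Post` is a set of probabilities on the grid, posterior summaries reduce to weighted sums. The sketch below rebuilds the grid posterior and compares its mean and standard deviation with the exact conjugate values; `grid.mean` and `grid.sd` are illustrative names. It skips the separate normalization of `Lik` and `Prior`, since those constants cancel when `Post` is normalized.

```r
n <- 20; y <- 11
prior.a <- 0.5; prior.b <- 0.5
grid <- seq(0.01, 2, by = 0.01)

# Unnormalized posterior on the grid, then normalize once
Joint <- dpois(y, n * grid) * dgamma(grid, shape = prior.a, rate = prior.b)
Post  <- Joint / sum(Joint)

grid.mean <- sum(grid * Post)
grid.sd   <- sqrt(sum((grid - grid.mean)^2 * Post))

# Side by side with the exact Gamma(prior.a + y, prior.b + n) values
rbind(grid  = c(mean = grid.mean, sd = grid.sd),
      exact = c(mean = (prior.a + y) / (prior.b + n),
                sd   = sqrt(prior.a + y) / (prior.b + n)))
```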

## Plots

In [None]:
options(repr.plot.width=6.5,repr.plot.height=6.5)
plot(grid,Lik,type="l",lty=3,col="blue",lwd=3,
     ylim=c(0,0.025),xlab=expression(lambda),ylab="Density",main="Prior, posterior, likelihood",
     cex.lab=1.3,cex.main=1.5)
lines(grid,Prior,col="cyan",lwd=3)
lines(grid,Post,col="magenta",lwd=3)
legend("topright",c("Likelihood","Prior","Posterior"),lwd=c(3,3,3),lty=c(3,1,1),col=c("blue","cyan","magenta"),inset=0.05)

# 03 - JAGS treatment with `rjags`

In [None]:
#install.packages("rjags",repos= "https://cloud.r-project.org")
library(rjags)

## Write model and data

In [None]:
Poiss.01.data<-list(y=y,n=n,a=prior.a,b=prior.b)
Poiss.01.model_string <- "model{
  # Likelihood (the Poisson mean is defined as a separate deterministic node)
  y  ~  dpois(mu)
  mu <- n*lambda
  # Prior
  lambda ~ dgamma(a, b)
 }"

## Compile model

In [None]:
Poiss.01.model <- jags.model(textConnection(Poiss.01.model_string), data = Poiss.01.data)

## Generate and summarize samples

In [None]:
update(Poiss.01.model, 10000, progress.bar="none")

In [None]:
Poiss.01.samples <- coda.samples(Poiss.01.model,variable.names=c("lambda"),n.iter=20000, progress.bar="none")
summary(Poiss.01.samples)
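
Since the exact posterior is known here, it offers a yardstick for any sampler. The helper below, `compare_to_exact()` (a hypothetical name, not part of `rjags` or `coda`), takes a numeric vector of posterior draws and tabulates its mean and standard deviation against the conjugate values. To keep the sketch runnable without JAGS, it is demonstrated on stand-in draws from `rgamma`; with the objects above one would presumably pass `as.matrix(Poiss.01.samples)[, "lambda"]` instead.

```r
# Hypothetical helper: compare sampled moments with the exact Gamma(shape, rate) ones
compare_to_exact <- function(draws, shape, rate) {
  rbind(sampled = c(mean = mean(draws), sd = sd(draws)),
        exact   = c(mean = shape / rate, sd = sqrt(shape) / rate))
}

set.seed(1)                    # arbitrary seed
posterior.a <- 0.5 + 11        # prior.a + y
posterior.b <- 0.5 + 20        # prior.b + n

# Stand-in draws from the exact posterior; these are NOT MCMC output
stand.in <- rgamma(20000, shape = posterior.a, rate = posterior.b)

res <- compare_to_exact(stand.in, posterior.a, posterior.b)
round(res, 4)
```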

In [None]:
options(repr.plot.width=13,repr.plot.height=6.5)
plot(Poiss.01.samples)

# 04 - Stan treatment 

In [None]:
#install.packages("rstan", dependencies=TRUE,repos= "https://cloud.r-project.org")
library(rstan)

In [None]:
# Following the recommendations printed when rstan is loaded:
# For execution on a local, multicore CPU with excess RAM we recommend calling
# options(mc.cores = parallel::detectCores()).
# To avoid recompilation of unchanged Stan programs, we recommend calling
# rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)

## Specify model

In [None]:
modelString = "
  data{
    int<lower=0> n ;
    int<lower=0> y ; 
    real<lower=0> a ;
    real<lower=0> b ;
    }
  parameters{
    real<lower=0> lambda ;
    }
  model{
    lambda ~ gamma(a,b) ;
    y ~ poisson(n*lambda) ; 
    }"

## Compile model and sample from the posterior pdf

In [None]:
# Translate model to C++ and compile to DSO:
stanDso <- stan_model( model_code=modelString ) 

In [None]:
# Specify data:
n<-20
y<-11
prior.a<-0.5
prior.b<-0.5
a <- prior.a
b <- prior.b
dataList = list(n=n,y=y,a=a,b=b)

In [None]:
# Generate posterior sample:
stanFit <- sampling( object=stanDso, 
                     data = dataList, 
                     chains = 3,
                     iter = 3000, 
                     warmup = 200, 
                     thin = 1)
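
In `rstan`, the `iter` argument counts warmup iterations too, so the number of retained draws is smaller than `chains * iter`. A small arithmetic sketch for the settings above (the names `draws.per.chain` and `total.draws` are illustrative):

```r
chains <- 3
iter   <- 3000   # total iterations per chain, warmup included
warmup <- 200
thin   <- 1

draws.per.chain <- (iter - warmup) / thin
total.draws     <- chains * draws.per.chain
total.draws      # 8400
```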

In [None]:
S<-summary(stanFit)
round(S$summary,5)

# Diagnostic diagrams with the `bayesplot` package

In [None]:
#install.packages("bayesplot", dependencies=TRUE,repos= "https://cloud.r-project.org")
library(bayesplot)

In [None]:
color_scheme_set("green")
options(repr.plot.width=10,repr.plot.height=6)
mcmc_trace(stanFit, pars = c("lambda"))

In [None]:
options(repr.plot.width=10,repr.plot.height=6)
mcmc_acf(stanFit,pars=c("lambda"))

In [None]:
color_scheme_set("viridisC")
options(repr.plot.width=10,repr.plot.height=6)
mcmc_acf_bar(stanFit,pars=c("lambda"))

# Analysis of posterior pdf properties

### Posterior credible interval 

In [None]:
color_scheme_set("yellow")
options(repr.plot.width=7,repr.plot.height=4)
mcmc_intervals(stanFit, pars = c("lambda"),prob=0.75,prob_outer=0.95)
# Defaults: prob = 0.5, prob_outer = 0.9
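
The inner and outer intervals drawn by `mcmc_intervals` should be central (quantile-based) intervals, so comparable numbers can be read off with `quantile()`. In this sketch the draws are stand-ins generated with `rgamma` from the exact posterior, an assumption made to keep the example self-contained; with the fit above one would presumably use `as.matrix(stanFit)[, "lambda"]`. The names `int.inner` and `int.outer` are illustrative.

```r
set.seed(1)                                        # arbitrary seed
draws <- rgamma(8400, shape = 11.5, rate = 20.5)   # stand-in for posterior draws of lambda

prob       <- 0.75
prob.outer <- 0.95

# Central intervals: cut equal tail mass on each side
int.inner <- quantile(draws, c((1 - prob) / 2, 1 - (1 - prob) / 2))
int.outer <- quantile(draws, c((1 - prob.outer) / 2, 1 - (1 - prob.outer) / 2))

int.inner
int.outer
```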

### Areas diagram

In [None]:
options(repr.plot.width=7,repr.plot.height=7)
mcmc_areas(stanFit, pars = c("lambda"),prob=0.75,prob_outer=0.95)

### Histogram

In [None]:
color_scheme_set("brightblue")
options(repr.plot.width=7,repr.plot.height=7)
mcmc_hist(stanFit, pars = c("lambda"),binwidth=0.05)

### Density plot

In [None]:
options(repr.plot.width=7,repr.plot.height=7)
mcmc_dens(stanFit, pars = c("lambda"))

In [None]:
color_scheme_set("viridisE")
options(repr.plot.width=7,repr.plot.height=7)
mcmc_dens_chains(stanFit, pars = c("lambda"))

### Violin plot

In [None]:
color_scheme_set("brightblue")
options(repr.plot.width=7,repr.plot.height=7)
mcmc_violin(stanFit, pars = c("lambda"))