### [Bayesian Methods for Small Population Analysis](https://sites.nationalacademies.org/cs/groups/dbassesite/documents/webpage/dbasse_184766.pdf)



Inferences from small populations can seldom stand on their own, because the estimates are unstable (have low precision). Modeling or some other form of stabilization (information enhancement) is therefore necessary. Strategies include:
- Aggregation
- Regression both within and across populations
- Hierarchical (Bayesian/EB) modeling to ‘borrow information’ within and between data sources


Stabilization/enrichment targets include:
- Estimated regression slopes and residual variances
- Small Area (Domain) estimates (SAEs)
- Estimated standardized mortality ratios (SMRs) and the challenges of low information
- Survey weights (Gelman, 2007)

#### Shrinkage can be controversial (Normand et al., 2016)
- Direct estimates with greatest uncertainty are shrunken closest to the regression surface, potentially conferring undue benefits or punishments. 
- Especially troublesome when the model is mis-specified (always true!) and sample size is informative so that the degree of shrinkage is ‘connected at the hip’ to the underlying truth
- Standard model fitting gives more weight to the stable units; consequently, the units that ‘care about’ the regression model have less influence on it
- Recent approaches increase the weights for the relatively unstable units, paying some variance, but improving estimation performance for mis-specified models (Chen et al., 2015; Jiang et al., 2011)

#### Trading off Variance and Bias (for the linear model)

Consider units $k = 1, \dots, K$ (individuals, clusters, regions, ...), each with an underlying feature of interest ($θ_k$).


Each unit has a direct (unbiased) estimate of that feature ($Y_k$), with estimated variance ($\hat{σ}_k^2$).


Unit-specific attributes $X_k$ produce:
$$ \begin{aligned}
\text{regression prediction} &= \hat{β}X_k \\
\text{residual} &= Y_k - \hat{β}X_k
\end{aligned} $$

This invites three choices for estimating the $θ_k$:
- Direct: Use the $Y_k$ (unbiased, but possibly unstable)
- Regression: Use the regression (stable, but possibly biased)
- Middle ground: A weighted average of Regression and Direct

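$$ \begin{aligned}
\hat{θ}_k &= \text{regression prediction} + (1 - \hat{B}_k) \times \text{residual} \\
&= \hat{β}X_k + (1 - \hat{B}_k)(Y_k - \hat{β}X_k)
\end{aligned} $$

where $\hat{B}_k = \frac{\hat{σ}_k^2}{\hat{σ}_k^2 + \hat{\tau}^2}$ is the weight placed on the regression prediction, and $\hat{\tau}^2$ is the residual (unexplained) variance, which reflects model lack of fit.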
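As a minimal sketch of the weighted-average estimator above, the Python snippet below computes $\hat{B}_k$ and $\hat{θ}_k$ for a handful of made-up units. The function name `shrinkage_estimates`, the toy data, the slope-only least-squares fit, and the simple moment estimate of $\hat{\tau}^2$ are all assumptions made for illustration; the notes do not say how $\hat{β}$ or $\hat{\tau}^2$ are obtained.

```python
def shrinkage_estimates(y, x, sigma2):
    """Return (theta_hat, B_hat, beta_hat, tau2_hat) for paired lists y, x, sigma2."""
    n = len(y)
    # Slope-only least squares through the origin, matching the beta * X_k form.
    beta_hat = sum(xk * yk for xk, yk in zip(x, y)) / sum(xk * xk for xk in x)
    resid = [yk - beta_hat * xk for xk, yk in zip(x, y)]

    # Assumption: crude moment estimate of tau^2, truncated at zero.
    # The mean squared residual reflects roughly tau^2 plus the average sampling variance.
    tau2_hat = max(sum(r * r for r in resid) / n - sum(sigma2) / n, 0.0)

    # B_k: share of the weight given to the regression prediction (sigma2 must be > 0).
    B_hat = [s2 / (s2 + tau2_hat) for s2 in sigma2]
    theta_hat = [beta_hat * xk + (1.0 - b) * r for xk, b, r in zip(x, B_hat, resid)]
    return theta_hat, B_hat, beta_hat, tau2_hat


# Made-up data: the third unit has a much noisier direct estimate than the others.
y = [1.0, 2.5, 2.0, 4.5]
x = [1.0, 2.0, 3.0, 4.0]
sigma2 = [0.05, 0.05, 0.5, 0.05]
theta_hat, B_hat, beta_hat, tau2_hat = shrinkage_estimates(y, x, sigma2)
for k in range(len(y)):
    print(f"unit {k}: direct={y[k]:.2f}  B={B_hat[k]:.2f}  shrunk={theta_hat[k]:.2f}")
```

In this toy data the third unit has the largest $\hat{σ}_k^2$, so its $\hat{B}_k$ is the largest and its estimate moves the greatest share of the way toward the regression prediction.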



#### Stabilizing Variance Estimates

The variances underlying the $\hat{σ}_k^2$ are assumed to come from a (Gamma) prior with:
- Estimated mean $\hat{m}$
- Estimated effective sample size $\hat{M}$

The empirical Bayes estimates are:
$$\tilde{σ}_k^2 = \hat{m} + (1 - B_k)(\hat{σ}_k^2 - \hat{m}) $$

$$ B_k = \frac{\hat{M}}{\hat{M} + d_k}, \qquad \tilde{d}_k \approx B_k d_+ + (1 - B_k) d_k $$

where $d_k = n_k - 2$ is the degrees of freedom for unit $k$.
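The variance shrinkage above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `stabilize_variances` is hypothetical, and $\hat{m}$ and $\hat{M}$ are passed in as given numbers, whereas in practice they would be estimated from the data by a method the notes do not specify. The adjusted degrees of freedom $\tilde{d}_k$ are left out.

```python
def stabilize_variances(sigma2_hat, d, m_hat, M_hat):
    """Shrink each estimated variance toward the prior mean m_hat."""
    # B_k is large when a unit has few degrees of freedom relative to M_hat.
    B = [M_hat / (M_hat + dk) for dk in d]
    sigma2_tilde = [m_hat + (1.0 - b) * (s2 - m_hat) for b, s2 in zip(B, sigma2_hat)]
    return sigma2_tilde, B


# Hypothetical inputs: three units with increasing degrees of freedom.
sigma2_hat = [0.5, 2.0, 8.0]
d = [3, 10, 50]
sigma2_tilde, B = stabilize_variances(sigma2_hat, d, m_hat=2.0, M_hat=10.0)
for k in range(len(d)):
    print(f"unit {k}: raw={sigma2_hat[k]:.2f}  B={B[k]:.2f}  stabilized={sigma2_tilde[k]:.2f}")
```

Units with few degrees of freedom get a $B_k$ near 1 and are pulled strongly toward $\hat{m}$, while well-measured units keep a value close to their own $\hat{σ}_k^2$.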


The distribution of $\tilde{σ}_k^2$ isn’t chi-square, but a fully Bayesian analysis (possibly via MCMC) produces the joint posterior distribution of the slopes and variances and supports valid intervals and other inferences.



### [Bayesian methods for small population analysis](https://www.nap.edu/read/25112/chapter/8)



Hoyle suggested several ways to strategically think about adjustments in the study plan that would help to support inference in a small sample situation. He showed a general t-test—the ratio of a parameter estimate to its standard error. If the goal is to detect a significant effect, there are two options for increasing t: increase the parameter estimate or decrease the standard error.
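To make the two options concrete, the test statistic can be written as below. The standard-error form $s/\sqrt{n}$ is an assumption that holds for a simple sample mean and is used here only as an illustration; the presentation may have used a more general form.

$$ t = \frac{\hat{θ}}{\mathrm{SE}(\hat{θ})}, \qquad \mathrm{SE}(\hat{θ}) = \frac{s}{\sqrt{n}} \quad \text{(sample mean case)} $$

With $n$ fixed and small, $t$ grows only if the numerator increases (for example, a stronger effect) or the standard error shrinks (for example, a smaller $s$ through less noisy measurement).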

Louis said that, as discussed by Hoyle, regression functions and equations are attractive. However, he said that he would focus on Bayesian and empirical Bayes hierarchical models that borrow information across units. Bayesian formalism, the engine of Bayesian analysis, forces thinking that produces effective approaches. He gave an example of trading off bias and variance for a linear model.

Nothing in the Bayesian approach eliminates the hard work needed to develop a good model, he stressed. The regression model can be used to produce estimates that may be quite stable; however, the unit-level estimates are likely to be biased. Now there are two estimates, the direct estimate that is unbiased but has high variance, and the regression estimate that may be biased but has lower variance. In this context, Bayesian modeling suggests a middle ground—an estimate that is partway between the direct estimate and the regression estimate. The Bayes estimate is a third estimate, a weighted average of the direct estimate and the regression estimate. The weight applied to the direct estimate will be close to 1 if the direct estimate has low variance relative to the regression estimate. However, if the direct estimate has high variance relative to the regression estimate, the weight on the regression estimate will be higher.
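As a small worked illustration of how the weights behave, take the shrinkage factor $\hat{B}_k = \hat{σ}_k^2 / (\hat{σ}_k^2 + \hat{\tau}^2)$ from the linear-model notes earlier in this document. The numbers are invented for illustration.

$$ \begin{aligned}
\hat{σ}_k^2 = 1,\ \hat{\tau}^2 = 4 &\;\Rightarrow\; \hat{B}_k = \tfrac{1}{5} = 0.2, \quad \text{weight on direct estimate} = 1 - \hat{B}_k = 0.8 \\
\hat{σ}_k^2 = 4,\ \hat{\tau}^2 = 1 &\;\Rightarrow\; \hat{B}_k = \tfrac{4}{5} = 0.8, \quad \text{weight on direct estimate} = 1 - \hat{B}_k = 0.2
\end{aligned} $$

A precise direct estimate keeps most of the weight, while a noisy one is pulled mostly toward the regression estimate.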