### [Bayes or Bootstrap?](https://stats.stackexchange.com/questions/25286/when-to-use-bootstrap-vs-bayesian-technique)

If you have useful prior information, or the problem has a hierarchical (nested) structure, then a Bayesian technique is probably better, especially if the number of model parameters is large relative to the amount of available data, so that estimation benefits from Bayesian shrinkage. Otherwise MLE with the bootstrap is sufficient.

The Bayesian approach models hierarchical data very naturally, whereas the bootstrap was originally designed for data modelled as i.i.d. While the bootstrap has been extended to hierarchical data, such extensions are relatively underdeveloped.

Another possible approach is to use mixed-effects models (e.g. using the R package lme4) to model the hierarchical structure you've alluded to. That would also help to stabilize the estimates for (hierarchical) models with a large number of parameters.

From page 267 of Elements of Statistical Learning:

Bootstrap versus Maximum Likelihood

In essence the bootstrap is a computer implementation of nonparametric or parametric maximum likelihood. The advantage of the bootstrap over the maximum likelihood formula is that it allows us to compute maximum likelihood estimates of standard errors and other quantities in settings where no formulas are available. 
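
A minimal sketch of that point, using only the Python standard library. The data and the helper name `bootstrap_se` are illustrative assumptions, not taken from the book. For the mean, the bootstrap standard error can be checked against the textbook formula; for the median there is no equally simple formula, and the same resampling loop still applies.

```python
import random
import statistics


def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    """Nonparametric bootstrap estimate of the standard error of `statistic`."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Resample n observations with replacement from the empirical distribution.
        resample = rng.choices(data, k=n)
        replicates.append(statistic(resample))
    # The spread of the replicates approximates the sampling variability.
    return statistics.stdev(replicates)


# Illustrative data only (an assumption, not from the source).
data_rng = random.Random(42)
sample = [data_rng.gauss(10.0, 2.0) for _ in range(50)]

se_mean = bootstrap_se(sample, statistics.mean)
se_median = bootstrap_se(sample, statistics.median)
formula_se_mean = statistics.stdev(sample) / len(sample) ** 0.5

print(f"bootstrap SE of mean:   {se_mean:.3f}")
print(f"formula SE of mean:     {formula_se_mean:.3f}")
print(f"bootstrap SE of median: {se_median:.3f}")
```

The two standard errors for the mean should agree closely, which is the sense in which the bootstrap reproduces the formula where one exists.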

### [The bootstrap and Markov chain Monte Carlo (Efron)](http://citeseerx.ist.psu.edu/viewdoc/download?rep=rep1&type=pdf&doi=10.1.1.220.9156)

This note concerns the use of parametric bootstrap sampling to carry out Bayesian
inference calculations. This is only possible in a subset of those problems amenable to
MCMC analysis, but when feasible the bootstrap approach offers both computational and
theoretical advantages. The discussion here is in terms of a simple example, with no attempt
at a general analysis.

It is not really surprising that the bootstrap and MCMC share some overlapping territory.
Both are general-purpose computer-based algorithmic methods for assessing statistical accuracy, and both enable the statistician to deal effectively with nuisance parameters, obtaining
inferences for the interesting part of the problem. 
On the less salubrious side, both share the
tendency of general-purpose algorithms toward overuse.

Of course the two methodologies operate in competing inferential realms: frequentist for
the bootstrap, Bayesian for MCMC. The working assumption is
that we have a Bayesian prior in mind and the only question is how to compute its posterior
distribution. Arguments about the merits of Bayesian versus frequentist analysis will not be
taken up here, except for our main point that the two camps share some common ground.


To calculate the posterior expectation of $\theta = t(\beta)$ under the prior $\pi(\beta)$, $$E(\theta | \hat{\beta}) =\frac{ \int_{\mathcal{B}} t(\beta)\pi(\beta)g_\beta(\hat{\beta})d\beta}{\int_{\mathcal{B}} \pi(\beta)g_\beta(\hat{\beta})d\beta}$$ from bootstrap replications, the likelihood function $g_\beta(\hat{\beta})$ is exchanged for the bootstrap density $g_{\hat{\beta}}(\beta)$, from which the replications are drawn.

Define the conversion ratio
$$R(\beta) = \frac{g_\beta(\hat{\beta})}{g_{\hat{\beta}}(\beta)}$$

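In this way, any Bayesian posterior expectation can be evaluated from parametric bootstrap
replications $\beta_1, \dots, \beta_B$ drawn from $g_{\hat{\beta}}(\cdot)$, writing $t_i = t(\beta_i)$, $\pi_i = \pi(\beta_i)$ and $R_i = R(\beta_i)$: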

$$\hat{E}(\theta | \hat{\beta}) =\frac{ \sum_{i=1}^{B} t_i\pi_i R_i}{\sum_{i=1}^{B} \pi_i R_i}$$
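
A minimal sketch of this reweighting estimator, using only the Python standard library. The setup is an assumption chosen for illustration and is not Efron's example: a single normal estimate $\hat{\beta} \sim N(\beta, s^2)$ with known $s$, a $N(0, \tau^2)$ prior, and $t(\beta) = \beta$, so that the exact posterior mean is available for comparison. For this location family the conversion ratio equals 1 by symmetry; the code still computes it explicitly to mirror the formula.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_mean_by_bootstrap(beta_hat, se, prior, t, n_boot=100_000, seed=0):
    """Estimate E(t(beta) | beta_hat) as sum(t_i pi_i R_i) / sum(pi_i R_i)."""
    rng = random.Random(seed)
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_boot):
        # Parametric bootstrap replication: beta_i drawn from g_{beta_hat}(.).
        beta_i = rng.gauss(beta_hat, se)
        # Conversion ratio R_i = g_{beta_i}(beta_hat) / g_{beta_hat}(beta_i).
        r_i = normal_pdf(beta_hat, beta_i, se) / normal_pdf(beta_i, beta_hat, se)
        weight = prior(beta_i) * r_i
        numerator += t(beta_i) * weight
        denominator += weight
    return numerator / denominator


# Hypothetical numbers for illustration only.
beta_hat, se, tau = 1.5, 0.5, 1.0

estimate = posterior_mean_by_bootstrap(
    beta_hat,
    se,
    prior=lambda b: normal_pdf(b, 0.0, tau),
    t=lambda b: b,
)
# Conjugate normal-normal result, used here only as a check.
exact = beta_hat * tau**2 / (tau**2 + se**2)

print(f"bootstrap-reweighted posterior mean: {estimate:.3f}")
print(f"exact posterior mean:                {exact:.3f}")
```

The weights $\pi_i R_i$ play the role of importance weights, so the estimate is only as stable as those weights are well behaved; in this toy case they are bounded.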

A connection between the nonparametric bootstrap and Bayesian inference was suggested under the
name “Bayesian bootstrap” in Rubin (1981), and also in Section 10.6 of Efron (1982). Newton
and Raftery (1994) make the connection more tangible, applying the paper's equation (2.14) with nonparametric
bootstrap samples.
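
As a rough illustration of Rubin's idea (not of the Newton and Raftery procedure), the sketch below replaces the bootstrap's multinomial resampling counts with Dirichlet(1, ..., 1) weights on the observed points and reads the resulting weighted means as posterior draws for the mean. The data and the function name are assumptions made for the example.

```python
import random
import statistics


def bayesian_bootstrap_means(data, n_draws=4000, seed=0):
    """Posterior draws for the mean under Dirichlet(1, ..., 1) weights."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Normalised Exp(1) variates give one Dirichlet(1, ..., 1) weight vector.
        gaps = [rng.expovariate(1.0) for _ in data]
        total = sum(gaps)
        draws.append(sum(g * x for g, x in zip(gaps, data)) / total)
    return draws


# Illustrative data only (an assumption, not from the source).
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]

draws = bayesian_bootstrap_means(data)
posterior_mean = statistics.mean(draws)
posterior_sd = statistics.stdev(draws)
# Closed-form standard deviation of a Dirichlet(1, ..., 1) weighted mean.
analytic_sd = statistics.pstdev(data) / (len(data) + 1) ** 0.5

print(f"posterior mean of the mean: {posterior_mean:.3f}")
print(f"posterior sd (simulated):   {posterior_sd:.3f}")
print(f"posterior sd (closed form): {analytic_sd:.3f}")
```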

![image.png](attachment:image.png)

The figure graphs the Boot and Bayes counts. We see that the Jeffreys Bayes
distribution is shifted to the right of the Boot distribution. It can be shown that the Jeffreys
prior $\pi(\beta) = 1/\sigma^2$
is almost exactly correct from a frequentist point of view. (The Welch–Peers theory of 1963 shows this
kind of Bayes-frequentist agreement holding asymptotically for all priors of the form $\pi(\beta) = h(\alpha_0)/\sigma^2$, with $h(\cdot)$ any smooth positive function. The components of variance problem is
unusual in allowing a simple expression for the Welch–Peers prior; see Section 6 of Efron
(1993).)

The BCa system of confidence intervals (“bias-corrected and accelerated,” Efron, 1987) adjusts
the raw bootstrap distribution (represented by the dashed curve in the figure) to achieve second-order accurate frequentist coverage. The corresponding “confidence distribution” requires a different conversion factor $R_{BCa}(\beta)$ that does not involve any Bayesian input. In the example of the figure, the BCa
confidence density almost perfectly matches the Jeffreys posterior density.
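
For orientation, here is a minimal sketch of a nonparametric BCa interval, using the commonly stated formulas for the bias correction $z_0$ and the jackknife acceleration $a$. It does not reproduce the conversion factor $R_{BCa}(\beta)$ or the components of variance example from the paper; the data, the choice of the sample variance as the statistic, and the function name are assumptions.

```python
import random
from statistics import NormalDist, mean, variance

_STD_NORMAL = NormalDist()


def bca_interval(data, statistic, alpha=0.05, n_boot=4000, seed=0):
    """Return (low, high, z0, a) for a nonparametric BCa interval."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = statistic(data)

    # Raw bootstrap distribution of the statistic, sorted for quantile lookup.
    reps = sorted(statistic(rng.choices(data, k=n)) for _ in range(n_boot))

    # Bias correction z0: where theta_hat sits within the bootstrap distribution.
    below = sum(r < theta_hat for r in reps)
    below = min(max(below, 1), n_boot - 1)  # keep inv_cdf's argument inside (0, 1)
    z0 = _STD_NORMAL.inv_cdf(below / n_boot)

    # Acceleration a from the jackknife (leave-one-out) values.
    jack = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den > 0 else 0.0

    def adjusted_level(level):
        # Map a nominal level to the BCa-adjusted percentile level.
        z = _STD_NORMAL.inv_cdf(level)
        return _STD_NORMAL.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))

    def quantile(p):
        # Simple order-statistic quantile of the bootstrap replicates.
        return reps[min(max(int(p * n_boot), 0), n_boot - 1)]

    low = quantile(adjusted_level(alpha / 2))
    high = quantile(adjusted_level(1 - alpha / 2))
    return low, high, z0, a


# Illustrative data only (an assumption, not from the source).
data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(40)]

theta_hat = variance(sample)
low, high, z0, a = bca_interval(sample, variance)

print(f"sample variance: {theta_hat:.3f}")
print(f"95% BCa interval: ({low:.3f}, {high:.3f}), z0={z0:.3f}, a={a:.3f}")
```

Setting `z0` and `a` to zero in `adjusted_level` would recover the plain percentile interval, which makes the size of the adjustment easy to inspect.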

### [A Simulation Study Comparing the Performance of Bayesian Markov Chain Monte Carlo Sampling and Bootstrapping](http://lutzonilab.org/publications/lutzoni_file189.pdf)

Bayesian Markov chain Monte Carlo sampling has become increasingly popular in phylogenetics as a method for both
estimating the maximum likelihood topology and for assessing nodal confidence. Despite the growing use of posterior
probabilities, the relationship between the Bayesian measure of confidence and the most commonly used confidence
measure in phylogenetics, the nonparametric bootstrap proportion, is poorly understood. 

We used computer simulation to
investigate the behavior of three phylogenetic confidence methods:
- Bayesian posterior probabilities calculated via Markov chain Monte Carlo sampling (BMCMC-PP)
- Maximum likelihood bootstrap proportion (ML-BP)
- Maximum parsimony bootstrap proportion (MP-BP): in phylogenetics, maximum parsimony is an optimality criterion under which the phylogenetic tree that minimizes the total number of character-state changes is to be preferred

BMCMC-PP and ML-BP were often strongly correlated with one another but could provide substantially
different estimates of support on short internodes. 
In contrast, BMCMC-PP correlated poorly with MP-BP across most of
the simulation conditions that we examined. For a given threshold value, more correct monophyletic groups were
supported by BMCMC-PP than by either ML-BP or MP-BP. 

Result: Bayes or Bootstrap?

To answer this question, phylogeneticists must have
some idea of what they would like their confidence method
to measure. Nonparametric bootstrapping is appropriate if
one is interested in the sensitivity of observed results to the
sampling error associated with collecting characters from
a hypothesized underlying character distribution. If one is
willing to specify a fully probabilistic model of character
evolution and wishes to place confidence limits on the
results of an analysis conditioned on the observed data and
that model, Bayesian posterior probabilities are the appropriate confidence measure to use. 

In cases where one
decides to bootstrap, it is useful to note that it may require
a relatively large amount of data to obtain high confidence
on short internodes (Berbee, Carmean, and Winka 2000)
compared with BMCMC-PP. 

When assessing posterior
probabilities, it is important to remember that confidence
values estimated on extremely short internodes may
sometimes be sensitive to the underlying stochastic process.

Additional work is needed to determine the
circumstances where the more conservative nature of
likelihood bootstrapping may be preferred to the increased
power of BMCMC-PP.