# Bayesian Hypothesis Test Specification

When using Bayesian models for hypothesis testing, `spearmint` lets you configure a number of settings, including the model structure, the model's hyperparameters, and the parameter estimation method:

```python
from spearmint import HypothesisTest

# Custom prior: Beta(alpha=100, beta=100)
#
# Defines a strong prior belief that the conversion probability = 0.5.
# These priors can be estimated from historical data or using domain expertise.
bayesian_model_params = dict(prior_alpha=100., prior_beta=100.)

bayesian_test = HypothesisTest(
    ...,
    inference_method="bayesian",
    bayesian_model_name="bernoulli",  # Use the Bernoulli likelihood model rather than the default Binomial likelihood model
    bayesian_model_params=bayesian_model_params,  # Use the strong Beta prior defined above
    bayesian_parameter_estimation_method="advi"  # The Bernoulli (but not the Binomial) model supports ADVI parameter estimation
)
```


## Prior Specification

When using Bayesian models for hypothesis testing, you can explicitly declare the model using the `bayesian_model_name` parameter. Furthermore, hyperparameters for the model can be specified during `HypothesisTest` initialization by providing a `bayesian_model_params` dictionary. If no `bayesian_model_params` are provided, default hyperparameters are used, which generally correspond to weak or non-informative priors.
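
As the opening example notes, priors can be estimated from historical data. One common approach is to match the mean and variance of historical conversion rates to a Beta distribution (the method of moments). This is an assumption for illustration, not something `spearmint` is documented to provide. The helper name `beta_prior_from_history` and the sample rates below are hypothetical; only the `prior_alpha` and `prior_beta` keys come from this document.

```python
from statistics import fmean, pvariance


def beta_prior_from_history(rates):
    """Method-of-moments Beta(alpha, beta) fit to historical rates in (0, 1)."""
    m = fmean(rates)
    v = pvariance(rates, mu=m)
    if not 0.0 < v < m * (1.0 - m):
        raise ValueError("rates must vary and lie strictly inside (0, 1)")
    # For Beta(a, b): mean = a / (a + b), var = mean * (1 - mean) / (a + b + 1)
    concentration = m * (1.0 - m) / v - 1.0  # equals alpha + beta
    return dict(
        prior_alpha=m * concentration,
        prior_beta=(1.0 - m) * concentration,
    )


# Hypothetical conversion rates from five past experiments
historical_rates = [0.48, 0.52, 0.50, 0.47, 0.53]
bayesian_model_params = beta_prior_from_history(historical_rates)
print(bayesian_model_params)
```

The tighter the historical rates cluster around their mean, the larger `prior_alpha + prior_beta` becomes, and the stronger the resulting prior.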

## Parameter Estimation Method Specification
In addition to specifying the model form and parameters, `spearmint` also allows the user to customize the parameter estimation method used by specifying one of three possible `bayesian_parameter_estimation_method`s:

- `"mcmc"`: [Markov Chain Monte Carlo (MCMC)](https://pymcmc.readthedocs.io/en/latest/theory.html). This is the most general parameter estimation method, as all models are supported, but MCMC is computationally expensive. Using the `"mcmc"` parameter estimation method can therefore be prohibitive with large datasets (greater than a few thousand observations).
- `"advi"`: [Automatic Differentiation Variational Inference (ADVI)](https://www.pymc.io/projects/docs/en/latest/api/generated/pymc.ADVI.html). The `"advi"` method uses gradient-based optimization on an objective that assumes the posterior is smooth and unimodal (i.e. Gaussian). Due to these smoothness constraints, models that have discrete components, like the `"binomial"` and `"poisson"` models, are not supported. The `"advi"` method scales well to larger datasets, but may become inaccurate for more complex distributions that are not unimodal or not well described by a quadratic log likelihood.
- `"analytic"`: Uses the [conjugate prior](https://en.wikipedia.org/wiki/Conjugate_prior) analytic solution. For some hierarchical models that incorporate priors that are conjugate to the likelihood, there are simple posterior update rules based on moments of the sample distribution. The `"analytic"` method scales very well to large datasets and supports all variable types, though not every model (see the supported methods listed for each model below).
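
The model sections below list which estimation methods each model supports. As a quick reference, that information can be collected into a lookup table. The sketch only restates this document; `SUPPORTED_METHODS` and `check_estimation_method` are hypothetical names and are not part of the `spearmint` API.

```python
# Supported estimation methods per model, as listed in this document
SUPPORTED_METHODS = {
    "gaussian": {"mcmc", "advi", "analytic"},
    "student_t": {"mcmc", "advi"},
    "binomial": {"mcmc", "analytic"},
    "bernoulli": {"mcmc", "advi", "analytic"},
    "poisson": {"mcmc", "analytic"},
}


def check_estimation_method(model_name, method):
    """Raise ValueError if `method` is not listed for `model_name`."""
    supported = SUPPORTED_METHODS[model_name]
    if method not in supported:
        raise ValueError(
            f"{method!r} is not listed for the {model_name!r} model; "
            f"choose one of {sorted(supported)}"
        )
    return method


check_estimation_method("bernoulli", "advi")  # passes silently
# check_estimation_method("binomial", "advi") would raise ValueError
```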


## Continuous Variables, $X \in \mathbb{R}$

These models make the assumption that the underlying data have a symmetric, real-valued distribution that is characterized well by the Normal (Gaussian) or Student's-*t* distributions. The Student's-*t* model will generally be more robust to outliers in the data.
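
The robustness claim can be made concrete by comparing how strongly each likelihood penalizes an outlying observation. The following is a minimal sketch using only the Python standard library; the choice of $\nu = 3$ and of an observation 10 standard deviations from the mean are arbitrary and purely illustrative.

```python
import math


def normal_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of Normal(mu, sigma) at x."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def student_t_logpdf(x, nu, mu=0.0, sigma=1.0):
    """Log-density of StudentT(nu, mu, sigma) at x."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1.0) / 2.0 * math.log1p(z * z / nu)
    )


outlier = 10.0  # ten standard deviations from the mean
print(normal_logpdf(outlier))           # about -50.9
print(student_t_logpdf(outlier, nu=3))  # about -8.1
```

Because the Student's-*t* log-density falls off much more slowly in the tails, a single extreme value pulls the estimate of $\mu$ far less than it would under the Normal likelihood.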


### Hierarchical Gaussian Model

Models observed variables $X$ as Normally distributed, with a Normal prior on the mean $\mu$ and a Half-Normal prior on the standard deviation $\sigma$:

$$
\begin{align*}
\mu &\sim \text{Normal}(\mu_{{\mu}}, \sigma_{{\mu}}) \\
\sigma &\sim \text{HalfNormal}_{[0, \infty]}(\mu_{{\sigma}}, \sigma_{{\sigma}}) \\
X &\sim \text{Normal}(\mu, \sigma),
\end{align*}
$$

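Here $\mu_{\mu}$, $\sigma_{\mu}$, $\mu_{\sigma}$, and $\sigma_{\sigma}$ are the prior hyperparameters listed below, and $\bar{x}$ and $\text{std}(x)$ denote the estimated mean and standard deviation of the observations.
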
#### Model Specification:
##### `bayesian_model_name`:
- `'gaussian'`

##### `bayesian_model_params` (hyperparameters):

- `prior_mean_mu` ($\mu_{\mu}$, default=0.0): the mean of the Gaussian prior for the expected value of each group
- `prior_var_mu` ($\sigma_{\mu}$, default=0.0): the variance of the Gaussian prior for the expected value of each group
- `prior_mean_sigma` ($\mu_{\sigma}$, default=1.0): the mean of the Half-Normal prior (bounded to the left at zero) for the variance of each group
- `prior_var_sigma` ($\sigma_{\sigma}$, default=5.0): the variance of the Half-Normal prior (bounded to the left at zero) for the variance of each group

##### Supported `bayesian_parameter_estimation_method`s:
- `"mcmc"`
- `"advi"`
- `"analytic"`


### Hierarchical Student's-*t* Model

Models observed variables $X$ as Student's-*t*-distributed, with a Normal prior on the mean $\mu$, a Half-Normal prior on the scale $\sigma$, and an Exponential prior on the degrees of freedom $\nu$:

$$
\begin{align*}
\mu &\sim \text{Normal}(\mu_{{\mu}}, \sigma_{{\mu}}) \\
\sigma &\sim \text{HalfNormal}_{[0, \infty]}(\mu_{{\sigma}}, \sigma_{{\sigma}}) \\
\nu &\sim \text{Exponential}(1 / \Lambda) \\
X &\sim \text{StudentT}(\nu, \mu, \sigma)
\end{align*}
$$

#### Model Specification:
##### `bayesian_model_name`:
- `'student_t'`

##### `bayesian_model_params` (hyperparameters):

- `prior_mean_mu` ($\mu_{\mu}$, default=0.0): the mean of the Gaussian prior for the expected value of each group
- `prior_var_mu` ($\sigma_{\mu}$, default=0.0): the variance of the Gaussian prior for the expected value of each group
- `prior_mean_sigma` ($\mu_{\sigma}$, default=1.0): the mean of the Half-Normal prior (bounded to the left at zero) for the variance of each group
- `prior_var_sigma` ($\sigma_{\sigma}$, default=5.0): the variance of the Half-Normal prior (bounded to the left at zero) for the variance of each group
- `prior_nu_precision` ($1 / \Lambda$, default=5.0): the precision $1 / \Lambda$ of the Exponential prior on the degrees of freedom $\nu$

##### Supported `bayesian_parameter_estimation_method`s:
- `"mcmc"`
- `"advi"`

## Binary / Proportion Variables, $X \in (0, 1) \subset \mathbb{R}$

### Beta-Binomial Model

Models the number of successes $s$ observed over $N$ binary trials as a Binomial distribution, with a Beta prior on the success probability $p$ for the trials.

$$
\begin{align*}
p &\sim \text{Beta}(\alpha_p, \beta_p) \\
s &\sim \text{Binomial}(N, p)
\end{align*}
$$

#### Model Specification:
##### `bayesian_model_name`:  
- `'binomial'`

##### `bayesian_model_params` (hyperparameters):

- `prior_alpha` ($\alpha_p$, default=1.0): the success shape parameter for the Beta prior distribution
- `prior_beta` ($\beta_p$, default=1.0): the failure shape parameter for the Beta prior distribution

The intuition is that the larger $\alpha_p$ and $\beta_p$ are, the stronger the prior belief, as the parameters can be thought of as "mock counts" of successes and failures, respectively. The expected value of the prior is located at

$$
\begin{align*}
\text{E}[p] &= \frac{\alpha_p}{\alpha_p + \beta_p}
\end{align*}
$$
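
To make the "mock counts" intuition concrete, the standard conjugate Beta-Binomial update adds the observed successes to $\alpha_p$ and the observed failures to $\beta_p$. The sketch below is plain Python and is not taken from `spearmint`'s implementation of the `"analytic"` method; the function names and the observed counts are hypothetical.

```python
def beta_binomial_posterior(prior_alpha, prior_beta, successes, trials):
    """Posterior Beta parameters after observing `successes` out of `trials`."""
    return prior_alpha + successes, prior_beta + (trials - successes)


def beta_mean(alpha, beta):
    """Expected value of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 30 conversions out of 100 trials
for a0, b0 in [(1.0, 1.0), (100.0, 100.0)]:
    a1, b1 = beta_binomial_posterior(a0, b0, successes=30, trials=100)
    print(
        f"prior Beta({a0}, {b0}) mean={beta_mean(a0, b0):.3f} -> "
        f"posterior Beta({a1}, {b1}) mean={beta_mean(a1, b1):.3f}"
    )
```

With the default Beta(1, 1) prior the posterior mean (about 0.304) sits close to the observed rate of 0.30, while the strong Beta(100, 100) prior keeps it at about 0.433, much nearer to the prior mean of 0.5.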

##### Supported `bayesian_parameter_estimation_method`s:
- `"mcmc"`
- `"analytic"`

### Beta-Bernoulli Model

Models each binary trial outcome $p$ as a draw from a Bernoulli distribution with success probability $\theta$, with a Beta prior over possible success probabilities.

$$
\begin{align*}
\theta &\sim \text{Beta}(\alpha_\theta, \beta_\theta) \\
p &\sim \text{Bernoulli}(\theta)
\end{align*}
$$

#### Model Specification:
##### `bayesian_model_name`: 
- `'bernoulli'`

##### `bayesian_model_params` (hyperparameters):

- `prior_alpha` ($\alpha_\theta$, default=1.0): the success shape parameter for the Beta prior distribution
- `prior_beta` ($\beta_\theta$, default=1.0): the failure shape parameter for the Beta prior distribution

##### Supported `bayesian_parameter_estimation_method`s:
- `"mcmc"`
- `"advi"`
- `"analytic"`
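
The Bernoulli model works on raw binary outcomes rather than aggregated counts. Under the standard conjugate update (a textbook result, not a description of `spearmint` internals), processing outcomes one at a time gives the same posterior as first aggregating them into a Binomial count. The outcomes and the helper name `update_beta` below are hypothetical.

```python
def update_beta(alpha, beta, outcome):
    """One conjugate update of Beta(alpha, beta) for a single 0/1 outcome."""
    return alpha + outcome, beta + (1 - outcome)


outcomes = [1, 0, 0, 1, 1, 0, 1, 1]  # hypothetical binary trial results

# Sequential (Bernoulli-style) updating, starting from the default Beta(1, 1)
alpha, beta = 1.0, 1.0
for x in outcomes:
    alpha, beta = update_beta(alpha, beta, x)

# Aggregated (Binomial-style) updating with s successes out of n trials
s, n = sum(outcomes), len(outcomes)
alpha_agg, beta_agg = 1.0 + s, 1.0 + (n - s)

print((alpha, beta), (alpha_agg, beta_agg))
```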


## Count / Rate Variables, $X \in \mathbb{N} \cup {\{0\}}$

### Gamma-Poisson Model

Models observed counts of event successes $X$ recorded over multiple trials of standardized sampling time as a Poisson distribution, with a Gamma prior on the underlying success rate parameter $\lambda$.

$$
\begin{align*}
\lambda &\sim \text{Gamma}(\alpha_\lambda, \beta_\lambda) \\
X &\sim \text{Poisson}(\lambda)
\end{align*}
$$


#### Model Specification:
##### `bayesian_model_name`: 
- `'poisson'`

##### `bayesian_model_params` (hyperparameters):

- `prior_alpha` ($\alpha_\lambda$, default=1.0): the shape parameter for the Gamma prior over the Poisson rate
- `prior_beta` ($\beta_\lambda$, default=1.0): the rate parameter for the Gamma prior over the Poisson rate

##### Supported `bayesian_parameter_estimation_method`s:
- `"mcmc"`
- `"analytic"`
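
For reference, the conjugate update behind a Gamma-Poisson model is short enough to write out. With a Gamma prior in the shape/rate parameterization used above, observing counts $x_1, \dots, x_n$ gives a $\text{Gamma}(\alpha_\lambda + \sum_i x_i, \beta_\lambda + n)$ posterior. The sketch below is plain Python rather than `spearmint`'s own code, and the function name and counts are hypothetical.

```python
def gamma_poisson_posterior(prior_alpha, prior_beta, counts):
    """Posterior Gamma (shape, rate) after observing Poisson counts."""
    return prior_alpha + sum(counts), prior_beta + len(counts)


counts = [3, 5, 4, 6, 2]  # hypothetical events per sampling interval
shape, rate = gamma_poisson_posterior(1.0, 1.0, counts)
posterior_mean_rate = shape / rate  # mean of a Gamma(shape, rate) distribution
print(shape, rate, posterior_mean_rate)
```

Here the default Gamma(1, 1) prior acts like one prior interval containing one event, so the posterior mean rate (3.5) lies slightly below the sample mean of 4.0.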