2022-03-28 Ludovico Massaccesi

# Interval estimation

## Confidence Level
$$CL(f)=\inf_{\mu\in A}\int_{x:f(x)\ni\mu}p(x;\mu)\text dx$$
where $x$ is the data, $\mu$ are the parameters of the distribution, $A$ is the set of allowed values of $\mu$, and the function $f(x)$ gives the interval based on the data.
The _confidence level_ is a property of the algorithm $f$.

## Coverage
The coverage is introduced to make the confidence level easier to understand and compute.
It is the probability, for a given $\mu$, that the interval $f(x)$ contains $\mu$:
$$C(\mu)=\int_{x:f(x)\ni\mu}p(x;\mu)\text dx$$
so that
$$CL(f)=\inf_{\mu\in A}C(\mu)$$

In practice we want $C(\mu)=CL$.
If we have $C(\mu)>CL$ we are overcovering (making too-large intervals).
If we have $C(\mu)<CL$ we are undercovering, i.e. our algorithm does not obtain the required CL (it is _wrong!_).
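As an illustration of how $C(\mu)$ can be checked numerically, here is a minimal sketch for a discrete (Poisson) observable, where the integral becomes a sum over $n$. The interval algorithm `naive_interval` ($n\pm\sqrt n$) and the truncation `n_cut` are assumptions made only for this example; they are not part of the notes.

```python
import math

def poisson_pmf(n, mu):
    """p(n; mu) = exp(-mu) mu^n / n!, evaluated in log space."""
    if mu == 0:
        return 1.0 if n == 0 else 0.0
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))

def coverage(f, mu, n_cut=200):
    """C(mu): total probability of the outcomes n whose interval f(n) contains mu.

    n_cut truncates the infinite sum (assumption: the tail beyond it is negligible).
    """
    total = 0.0
    for n in range(n_cut):
        lo, hi = f(n)
        if lo <= mu <= hi:
            total += poisson_pmf(n, mu)
    return total

def naive_interval(n):
    """Hypothetical algorithm f, used only as an example: n +/- sqrt(n)."""
    return n - math.sqrt(n), n + math.sqrt(n)

for mu in (0.5, 2.0, 5.0, 10.0):
    print(mu, coverage(naive_interval, mu))
```

For $\mu=0.5$ only $n=1$ gives an interval containing $\mu$, so $C(0.5)=0.5\,e^{-0.5}\approx0.30$. Since the CL is the infimum of $C(\mu)$, this algorithm cannot claim the roughly 68% one might naively expect from $\pm\sqrt n$.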

## Neyman's confidence band construction
We look at the (possible) data $x$ as a function of the parameters $\mu$.

 - For each value of $\mu$, we consider the integral of the probability of $x$ over a set of $x$ values, $\int p(x;\mu)\text dx$; for the set we choose, this will be our coverage $C(\mu)$.
 - For each value of $\mu$, we choose a set of $x$ over which the integral is $CL$ (there are many possible ways to do this).
 - Then we take the observed data and swap the axes: our confidence region is the set of $\mu$ values whose chosen set contains the observed $x$, i.e. the intersection of the line $x=\text{our data}$ with the band defined before.
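The three steps above can be sketched as follows for a Poisson observable. The choice of the set (`acceptance_low`, which takes the lowest outcomes first) is just one of the many possible ways; it, the grid in $\mu$, and the values of `n_obs` and `cl` are assumptions made for this example.

```python
import math

def poisson_pmf(n, mu):
    """p(n; mu) = exp(-mu) mu^n / n!, evaluated in log space."""
    if mu == 0:
        return 1.0 if n == 0 else 0.0
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))

def acceptance_low(mu, cl):
    """One possible choice: the smallest set {0, ..., n_max} with probability >= cl."""
    total, n = 0.0, 0
    while True:
        total += poisson_pmf(n, mu)
        if total >= cl:
            return range(0, n + 1)
        n += 1

def confidence_region(n_obs, cl, mu_grid):
    """Swap the axes: keep every mu whose chosen set contains the observed n."""
    return [mu for mu in mu_grid if n_obs in acceptance_low(mu, cl)]

mu_grid = [0.01 * i for i in range(2001)]      # mu in [0, 20], step 0.01
region = confidence_region(n_obs=3, cl=0.9, mu_grid=mu_grid)
print(min(region), max(region))
```

With this particular choice of set the region is bounded only from below: it extends up to the last point of the grid, so the upper edge printed here is an artefact of the finite scan, not a limit.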

### Ordering algorithms
To choose the intervals, we usually define an _ordering function_ $o(x)$, so that we can build the band by requiring that
$$\int_{o(x)>c}p(x;\mu)\text dx\geq CL$$
where $c$ is a threshold chosen separately for each $\mu$.
One usually does this by including the points $x$ with the largest $o(x)$ first, and then moving on to points with lower $o(x)$ until the required $CL$ is reached.

There are many possible $o(x)$ to choose:

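 - low/high $o(x)=\pm x$, usually for upper/lower limits (for a distribution like the Poisson, where a larger $\mu$ favours larger $x$, $o(x)=+x$ leads to an upper limit on $\mu$ and $o(x)=-x$ to a lower limit);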
 - central $o(x)$ such that half of the excluded probability, $(1-CL)/2$, is taken from the low-$x$ tail and the other half from the high-$x$ tail (the $o(x)$ function can become very complicated in this case);
 - P-ordering $o(x)=p(x;\mu)$ (fast to compute);
 - ...
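A minimal sketch of the "largest $o(x)$ first" procedure for a discrete observable, with three of the orderings listed above. The function names, the extra `mu` argument of the ordering functions, and the truncation `n_cut` of the outcome space are assumptions of this example.

```python
import math

def poisson_pmf(n, mu):
    """p(n; mu) = exp(-mu) mu^n / n!, evaluated in log space."""
    if mu == 0:
        return 1.0 if n == 0 else 0.0
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))

def acceptance_set(mu, cl, order, n_cut=100):
    """Add outcomes n by decreasing o(n) until their total probability reaches cl.

    n_cut truncates the outcome space (assumption: the tail beyond it is negligible).
    """
    candidates = sorted(range(n_cut), key=lambda n: order(n, mu), reverse=True)
    accepted, total = [], 0.0
    for n in candidates:
        accepted.append(n)
        total += poisson_pmf(n, mu)
        if total >= cl:
            break
    return sorted(accepted), total

def high_first(n, mu):
    return n                       # o(x) = +x

def low_first(n, mu):
    return -n                      # o(x) = -x

def p_ordering(n, mu):
    return poisson_pmf(n, mu)      # o(x) = p(x; mu)

for order in (high_first, low_first, p_ordering):
    accepted, total = acceptance_set(3.0, 0.9, order)
    print(order.__name__, accepted[0], accepted[-1], total)
```

Because the outcomes are discrete, the total probability generally overshoots $CL$: the set stops growing at the first outcome that brings the sum to at least $CL$.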

# Exercise: Poisson
We use the Poisson distribution rather than the Gaussian, because for the Gaussian everything can be done analytically and very easily.
$$p(n;\mu)=\frac{e^{-\mu}\mu^n}{n!}$$
Build the confidence band with all four methods:

1. lower limit $\sum_{n=0}^{n_{max}(\mu)}p(n;\mu)\geq CL$;
2. upper limit $\sum_{n=n_{min}(\mu)}^\infty p(n;\mu)=1-\sum_{n=0}^{n_{min}(\mu)-1}p(n;\mu)\geq CL$;
3. central $\sum_{n=n_{max}(\mu)+1}^\infty p(n;\mu)=1-\sum_{n=0}^{n_{max}(\mu)}p(n;\mu)\leq\frac{1-CL}{2}$ and, separately, $\sum_{n=0}^{n_{min}(\mu)-1}p(n;\mu)\leq\frac{1-CL}{2}$; and
4. P-ordering $\sum_{n:p(n;\mu)>\bar p}p(n;\mu)\geq CL$.
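As a starting point, method 3 (central) could be sketched as below; the other three follow the same pattern with a different choice of set. This is one possible implementation, not a reference solution: the $\mu$ grid, its step, and the example values `n_obs=3`, `cl=0.9` are assumptions.

```python
import math

def poisson_pmf(n, mu):
    """p(n; mu) = exp(-mu) mu^n / n!, evaluated in log space."""
    if mu == 0:
        return 1.0 if n == 0 else 0.0
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))

def poisson_cdf(n, mu):
    """P(N <= n; mu); an empty sum (n < 0) gives 0."""
    return sum(poisson_pmf(k, mu) for k in range(n + 1))

def central_band(mu, cl):
    """Return (n_min, n_max) with each excluded tail <= (1 - cl) / 2."""
    alpha = (1.0 - cl) / 2.0
    n_min = 0
    # Raise n_min while the lower tail after the increment, P(N <= n_min), is still <= alpha.
    while poisson_cdf(n_min, mu) <= alpha:
        n_min += 1
    n_max = n_min
    # Raise n_max until the upper tail, P(N > n_max), is <= alpha.
    while 1.0 - poisson_cdf(n_max, mu) > alpha:
        n_max += 1
    return n_min, n_max

def central_interval(n_obs, cl, mu_grid):
    """Scan mu and keep the values whose band [n_min, n_max] contains n_obs."""
    accepted = []
    for mu in mu_grid:
        n_min, n_max = central_band(mu, cl)
        if n_min <= n_obs <= n_max:
            accepted.append(mu)
    return min(accepted), max(accepted)

mu_grid = [0.01 * i for i in range(2001)]      # mu in [0, 20], step 0.01
print(central_band(3.0, 0.9))
print(central_interval(3, 0.9, mu_grid))
```

The resolution of the resulting interval is limited by the grid step; a finer grid or a root finder on the band edges would sharpen it.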

# Bayesian interval estimation
This method is based on the _posterior_:
$$\pi_x(\mu)=\pi(\mu)\frac{L_x(\mu)}{p(x)}$$
where $\pi(\mu)$ is the _prior_ "probability" of $\mu$, $\pi_x(\mu)$ is the _posterior_ "probability" of $\mu$, $p(x)$ is the probability of the data $x$, and $L_x(\mu)$ is the likelihood of $\mu$ given $x$ (in practice it is $p(x;\mu)$, the probability of $x$ given $\mu$, read as a function of $\mu$).

The interval is derived from the posterior:
$$f(x)=\phi:\int_\phi \pi_x(\mu)\text d\mu=c$$
where $c$ is a constant value called _credibility_ or _credibility level_.
This is more straightforward than the frequentist method, but it involves credibilities / fake probabilities, and requires a prior.

In order to choose $\phi$ we can define an ordering, like before, but in the _parameter space_ instead of the _observation space_.
The common choice is the _posterior ordering_, similar to the probability ordering mentioned before.

Of course, like before, we could choose any $f$ we want.

We can also define the credibility of our $f$, like we defined the CL before,
$$Cr(x)=\int_{\mu\in f(x)} \pi_x(\mu)\text d\mu$$
Note that this is a function of the data $x$, not of the parameters $\mu$ (everything is complementary in the Bayesian method).
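A minimal sketch of posterior ordering on a grid in $\mu$: points are added by decreasing posterior until the accumulated credibility reaches $c$. The uniform grid, the numerical normalisation (which stands in for dividing by $p(x)$), the unimodal posterior, and the example posterior itself are assumptions of this sketch.

```python
import math

def credible_region(posterior, mu_grid, c):
    """Posterior ordering on a uniform grid.

    `posterior` may be unnormalised; it is normalised numerically on the grid.
    Returns the edges of the selected region and its credibility Cr(x).
    The (min, max) summary assumes a unimodal posterior.
    """
    step = mu_grid[1] - mu_grid[0]
    weights = [posterior(mu) * step for mu in mu_grid]
    norm = sum(weights)
    order = sorted(range(len(mu_grid)), key=lambda i: weights[i], reverse=True)
    chosen, mass = [], 0.0
    for i in order:
        chosen.append(mu_grid[i])
        mass += weights[i] / norm
        if mass >= c:
            break
    return min(chosen), max(chosen), mass

def unnormalised_posterior(mu, n_obs=3):
    """Illustrative choice: constant prior times Poisson likelihood, up to constants."""
    return mu ** n_obs * math.exp(-mu)

mu_grid = [0.01 * i for i in range(3001)]      # mu in [0, 30], step 0.01
lo, hi, mass = credible_region(unnormalised_posterior, mu_grid, c=0.9)
print(lo, hi, mass)
```

The returned `mass` is the credibility $Cr(x)$ of the selected region for this particular data value; on a grid it slightly exceeds $c$.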

# Exercise: Poisson but Bayesian
Compute credible intervals using a constant prior $\pi(\mu)=\text{const}$ (this prior has an infinite integral, i.e. it cannot be normalised, and this must be taken into account).
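One way to handle the infinite integral of the prior is to normalise the posterior directly: since $\int_0^\infty\mu^n e^{-\mu}\text d\mu=n!$, the posterior for an observed $n$ is $\pi_n(\mu)=e^{-\mu}\mu^n/n!$, which is a proper density. The sketch below uses this to compute a credible upper limit, i.e. $\phi=[0,\mu_{up}]$, which is only one possible choice of $\phi$. The bisection bounds and tolerance are assumptions of the example.

```python
import math

def posterior_cdf(mu, n_obs):
    """Posterior mass of [0, mu] for a constant prior and Poisson likelihood.

    Uses the identity: integral_0^mu t^n e^{-t} / n! dt = 1 - sum_{k=0}^{n} e^{-mu} mu^k / k!.
    """
    if mu <= 0:
        return 0.0
    return 1.0 - sum(math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
                     for k in range(n_obs + 1))

def upper_limit(n_obs, c, hi=1000.0, tol=1e-10):
    """Bisection for mu_up such that the posterior mass of [0, mu_up] equals c."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if posterior_cdf(mid, n_obs) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(upper_limit(0, 0.9))
print(upper_limit(3, 0.9))
```

For $n=0$ the result can be checked by hand: $1-e^{-\mu_{up}}=c$ gives $\mu_{up}=-\ln(1-c)$, i.e. $\ln 10\approx2.30$ for $c=0.9$.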