# AMS (The statistical model)

## Introduction
**Goal: discover the Higgs boson**
* How to discover a new particle?
  * find an excess of events (i.e. more events than predicted by the Standard Model without the new particle) that can be explained by e.g. Higgs production and decay
  * define phase-space region ("search region" or "signal region") enriched in signal events and with as few background events as possible
    * _signal_ = collision events in which the new particle is produced (here: $pp\to h^0 + X$, $h^0\to\tau^+\tau^-$)
    * _background_ = all other events (here e.g. $pp\to Z^0 + X$, $Z^0\to\tau^+\tau^-$)
    * definition done based on expected event yields from simulation
  * then count events in actual data and compare to expected event yields 
  * note: we will not define the phase-space region by hand but have an MVA (multivariate analysis) algorithm do this
* How to quantify an excess? Is it a discovery or just a fluke?
  * use a test based on the profile-likelihood ratio to reject the background-only hypothesis
  * for details, see [this article][1] or the [manual][2] of the Higgs boson machine learning challenge

[1]: https://arxiv.org/abs/1007.1727
[2]: http://opendata.cern.ch/record/329/

## Derivation
The observed number of events $n$ follows a Poisson distribution:
$$P(n|\mu_s,\mu_b) = \frac{(\mu_s + \mu_b)^n}{n!}\exp(-(\mu_s + \mu_b))$$
* $\mu_s$: number of signal events predicted by simulation
* $\mu_b$: number of background events predicted by simulation
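
As a minimal sketch, the Poisson probability above can be evaluated numerically as follows. The function name `poisson_pmf` is an illustrative choice, not part of the original material; the computation is done in log space so that large event counts do not overflow.

```python
import math


def poisson_pmf(n: int, mu_s: float, mu_b: float) -> float:
    """P(n | mu_s, mu_b) for a Poisson distribution with mean mu_s + mu_b."""
    mu = mu_s + mu_b
    if n < 0 or mu <= 0:
        raise ValueError("need n >= 0 and mu_s + mu_b > 0")
    # log P = n*ln(mu) - mu - ln(n!), with ln(n!) = lgamma(n + 1)
    log_p = n * math.log(mu) - mu - math.lgamma(n + 1)
    return math.exp(log_p)


# Hypothetical yields: 10 signal and 100 background events expected.
print(poisson_pmf(110, 10.0, 100.0))
```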

**To claim a signal discovery, we need to reject the background-only hypothesis**, the hypothesis of $\mu_s = 0$.
* I.e. we show that the probability (the _p_-value) of observing at least as many events as we did, assuming only background events and no signal production, is below a predefined threshold.

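To do so, we will use the likelihood ratio (with $\hat\mu_s = n - \mu_b$ being the maximum-likelihood estimate of $\mu_s$ for the observed $n$)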
$$\lambda = \frac{P(n|0,\mu_b)}{P(n|\hat\mu_s,\mu_b)} = \left( \frac{\mu_b}{\hat\mu_s+\mu_b} \right)^n \exp(\hat\mu_s)$$

to define a test statistic
$$q_0 = \begin{cases}
    -2\ln\lambda & \text{if } n>\mu_b \\
    0 & \text{otherwise}
\end{cases}$$
According to Wilks' theorem, the distribution of $q_0$ under the background-only hypothesis can be approximated by a simple analytic expression (for large enough event counts).
In particular, the _p_-value of the background-only hypothesis is given by
$$p=1-\Phi(\sqrt{q_0})$$
with $\Phi$ being the cumulative distribution function of the standard normal distribution.

The _p_-value can be converted to a significance:
$$Z = \Phi^{-1}(1-p)$$
* _p_ = 50% (pure chance) corresponds to $Z=0$, _p_ = 5% (95% confidence level) to $Z\simeq1.64$
* Note: in particle physics, we use a significance of $5\sigma$ as the threshold for a discovery ($p < 2.9\cdot10^{-7}$).
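
The conversion between _p_-value and significance can be sketched with the Python standard library alone. The helper names below are illustrative assumptions; the tail probability is computed via `math.erfc`, which stays accurate for the very small _p_-values relevant at $5\sigma$.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def significance_from_p(p: float) -> float:
    """Z = Phi^{-1}(1 - p) for a one-sided p-value with 0 < p < 1."""
    # By symmetry Phi^{-1}(1 - p) = -Phi^{-1}(p); this avoids forming 1 - p,
    # which loses precision for tiny p.
    return -_STD_NORMAL.inv_cdf(p)


def p_from_significance(z: float) -> float:
    """p = 1 - Phi(Z), written as a complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


print(significance_from_p(0.05))  # about 1.64
print(p_from_significance(5.0))   # about 2.9e-7
```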

We obtain
$$Z = \sqrt{q_0} = \begin{cases}
    \sqrt{ 2 \left[ n \ln \left(\frac n {\mu_b} \right) - n + \mu_b \right] } & \text{if } n>\mu_b \\
    0 & \text{otherwise}
\end{cases}
$$
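
A minimal sketch of this formula for a single counting experiment is given below. The function name `observed_significance` and the example numbers are hypothetical and only serve as illustration.

```python
import math


def observed_significance(n: float, mu_b: float) -> float:
    """Z = sqrt(q0) for n observed events and mu_b expected background events."""
    if mu_b <= 0:
        raise ValueError("mu_b must be positive")
    # No excess over the background expectation: q0 = 0 by definition.
    if n <= mu_b:
        return 0.0
    q0 = 2.0 * (n * math.log(n / mu_b) - n + mu_b)
    return math.sqrt(q0)


# Hypothetical example: 120 events observed, 100 background events expected.
print(observed_significance(120, 100.0))  # about 1.94
```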

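Assuming our simulation is correct, the expected background is $\mu_b = b$ and we expect to observe $n = s+b$ events, where $s$ and $b$ are the simulated signal and background yields in the search region. Substituting these into $Z$ gives the _approximate median significance_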
$$AMS = \sqrt{2 \left[ (s+b) \ln\left(1+\frac sb\right) - s\right] }$$
which we will use to quantify the effectiveness of our MVA algorithms.
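
The AMS formula translates directly into a few lines of Python. This is only a sketch: the function name `ams` and the example yields are assumptions made for illustration.

```python
import math


def ams(s: float, b: float) -> float:
    """Approximate median significance for expected yields s (signal), b (background)."""
    if s < 0 or b <= 0:
        raise ValueError("need s >= 0 and b > 0")
    # log1p(x) = ln(1 + x) stays accurate when s/b is small.
    radicand = 2.0 * ((s + b) * math.log1p(s / b) - s)
    # The radicand is non-negative analytically; max() guards against rounding.
    return math.sqrt(max(radicand, 0.0))


# Hypothetical yields in the search region.
print(ams(10.0, 100.0))  # about 0.984
```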

## Intuitive (simplifying) interpretation
For $b\gg s$
$$AMS \approx \frac{s}{\sqrt b}.$$
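
As a sketch of where this limit comes from, one can expand the logarithm in $x = s/b \ll 1$:
$$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \dots$$
Inserting this into the expression under the square root gives
$$(s+b)\ln\left(1+\frac sb\right) - s = b\left[(1+x)\ln(1+x) - x\right] = b\left[\frac{x^2}{2} - \frac{x^3}{6} + \mathcal{O}(x^4)\right]$$
so that
$$AMS = \frac{s}{\sqrt b}\sqrt{1 - \frac{s}{3b} + \mathcal{O}\left(\frac{s^2}{b^2}\right)} \approx \frac{s}{\sqrt b}.$$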

* The typical size of a background fluctuation (std. dev. of the Poisson distribution) is $\sqrt{b}$, i.e. AMS roughly tells us by how many standard deviations (of the background) our signal is expected to "stick out" from the background.
* $b\gg s$ is a strong assumption
    * may be the case in measurements but typically not in searches in extreme phase-space regions
    * will use a regularization term here, i.e. $b\to b+b_\text{reg} \overset{\text{here}}{=} b+10$
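
The effect of the regularization term can be illustrated with a short sketch. The function names and example yields are hypothetical; the regularized version simply applies the substitution $b\to b+b_\text{reg}$ to the AMS formula.

```python
import math


def ams_regularized(s: float, b: float, b_reg: float = 10.0) -> float:
    """AMS with the substitution b -> b + b_reg (b_reg = 10 as chosen here)."""
    b_eff = b + b_reg
    if s < 0 or b < 0 or b_eff <= 0:
        raise ValueError("need s >= 0, b >= 0 and b + b_reg > 0")
    radicand = 2.0 * ((s + b_eff) * math.log1p(s / b_eff) - s)
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(radicand, 0.0))


def simple_significance(s: float, b: float) -> float:
    """The b >> s approximation s / sqrt(b)."""
    return s / math.sqrt(b)


# Hypothetical low-background region: regularization lowers the significance.
print(ams_regularized(5.0, 2.0, b_reg=0.0))  # about 2.75
print(ams_regularized(5.0, 2.0))             # about 1.36

# Hypothetical high-background region: AMS and s/sqrt(b) nearly agree.
print(ams_regularized(10.0, 10000.0, b_reg=0.0), simple_significance(10.0, 10000.0))
```

With large $b$ the regularization term is negligible, whereas for small $b$ it noticeably reduces the quoted significance.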