# Toy Signal + Background Likelihood with Marked Poisson Models

Consider $n$ observed events and the hypothesis that there are both background and signal processes present in the events. For each event there is an observation of a property of the event, $x$, such that all collected data are given by $\vec{x} = \left\{x_1, \ldots, x_n\right\}$.

To model the behavior of this process, first consider the probability of observing $n$ events. If we assume that this is a counting experiment, then this can be modeled by a Poisson distribution with $n$ observed events and an expected number of events $N_{\textrm{expected}} = \mu N_{\textrm{signal}} + N_{\textrm{background}} \equiv \mu S + B$, where $\mu$ is the signal strength that scales the nominal signal yield,

$$
f_{\textrm{counting}} \left(n\right) = \textrm{Pois}\left(n\,\middle| \,\mu S + B\right).
$$
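As a minimal sketch of the counting term, the Poisson probability can be evaluated in log space using only the Python standard library. The function names `poisson_pmf` and `counting_term` and the yields `S = 10`, `B = 50` are assumptions made for illustration and are not part of the model definition above.

```python
import math


def poisson_pmf(n, expected):
    """Pois(n | expected), computed in log space to avoid overflow in n!."""
    if expected <= 0:
        raise ValueError("expected number of events must be positive")
    return math.exp(n * math.log(expected) - expected - math.lgamma(n + 1))


def counting_term(n, mu, S, B):
    """f_counting(n) = Pois(n | mu * S + B)."""
    return poisson_pmf(n, mu * S + B)


# Hypothetical yields, chosen only for illustration
S, B = 10.0, 50.0
for mu in (0.0, 1.0, 2.0):
    expected = mu * S + B
    print(f"mu = {mu}: Pois(55 | {expected}) = {counting_term(55, mu, S, B):.4f}")
```

Scanning $\mu$ in this way shows how the counting term alone already carries information on the signal strength, before any per-event observable is considered.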

Then consider the probability density that a particular value of $x$ was observed for each event, given the [mixture model](https://en.wikipedia.org/wiki/Mixture_distribution) composed of signal and background samples,

\begin{align*}
f\left(x\,\middle|\,\mu\right) &= \frac{1}{\nu_{\textrm{total}}} \sum_{s\,\in \textrm{samples}} \nu_{s}\, f_{s}\left(x\,\middle|\,\mu\right)\\
    &= \frac{\mu S \,f_{S}\left(x\right) + B \,f_{B}\left(x\right)}{\mu S + B}
\end{align*}

such that the joint density for all $n$ statistically independent events is

$$
\prod_{i=1}^{n} f\left(x_{i}\,\middle|\,\mu\right) = \prod_{i=1}^{n} \frac{\mu S \,f_{S}\left(x_{i}\right) + B \,f_{B}\left(x_{i}\right)}{\mu S + B}
$$
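The mixture density and the joint density above can be sketched as follows. The shapes are assumptions chosen purely for illustration: a Gaussian for $f_{S}$ and an exponential on $x \geq 0$ for $f_{B}$, with hypothetical yields and observations. Nothing in the derivation depends on these choices.

```python
import math


def signal_pdf(x, mean=5.0, sigma=0.5):
    """Assumed signal shape f_S(x): a normalized Gaussian."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def background_pdf(x, scale=10.0):
    """Assumed background shape f_B(x): a normalized exponential on x >= 0."""
    return math.exp(-x / scale) / scale if x >= 0 else 0.0


def mixture_pdf(x, mu, S, B):
    """f(x | mu) = (mu S f_S(x) + B f_B(x)) / (mu S + B)."""
    return (mu * S * signal_pdf(x) + B * background_pdf(x)) / (mu * S + B)


def joint_log_density(xs, mu, S, B):
    """Log of prod_i f(x_i | mu), accumulated as a sum of logs."""
    return sum(math.log(mixture_pdf(x, mu, S, B)) for x in xs)


S, B = 10.0, 50.0  # hypothetical yields
xs = [0.7, 4.8, 5.1, 12.3]  # hypothetical observations
print(joint_log_density(xs, mu=1.0, S=S, B=B))
```

Because the weights $\mu S / (\mu S + B)$ and $B / (\mu S + B)$ sum to one, the mixture stays normalized for any $\mu \geq 0$ as long as $f_{S}$ and $f_{B}$ are each normalized.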

Then the probability density of the observed data is the joint density of having observed $n$ events and these values of $x$,

$$
\mathcal{P}\left(\vec{x}\,\middle|\,\mu\right) = \textrm{Pois}\left(n\,\middle| \,\mu S + B\right) \prod_{i=1}^{n} \frac{\mu S \,f_{S}\left(x_{i}\right) + B \,f_{B}\left(x_{i}\right)}{\mu S + B}
$$

or, in a more explicit and general notation,

$$
\mathcal{P}\left(\vec{x}\,\middle|\,\mu\right) = \textrm{Pois}\left(N_{\textrm{observed}}\,\middle| \,N_{\textrm{expected}}\right) \prod_{i=1}^{N_{\textrm{observed}}} \frac{\mu N_{S} \,f_{S}\left(x_{i}\right) + N_{B}\,f_{B}\left(x_{i}\right)}{N_{\textrm{expected}}}.
$$

Taking the observed data, $\vec{x}$, as fixed, the likelihood is seen to take the familiar form of an extended likelihood (for a [marked Poisson model](https://en.wikipedia.org/wiki/Poisson_point_process#Marked_Poisson_point_process))

\begin{align}
L\left(\mu\right) &= \textrm{Pois}\left(N_{\textrm{observed}}\,\middle| \,N_{\textrm{expected}}\right) \prod_{i=1}^{N_{\textrm{observed}}} \frac{\mu N_{S} \,f_{S}\left(x_{i}\right) + N_{B}\,f_{B}\left(x_{i}\right)}{N_{\textrm{expected}}}\\
    &= \frac{\left(N_{\textrm{expected}}\right)^{N_{\textrm{observed}}} e^{-N_{\textrm{expected}}}}{N_{\textrm{observed}}!} \prod_{i=1}^{N_{\textrm{observed}}} \frac{\mu N_{S} \,f_{S}\left(x_{i}\right) + N_{B}\,f_{B}\left(x_{i}\right)}{N_{\textrm{expected}}}\notag\\
    &= \frac{e^{-N_{\textrm{expected}}}}{N_{\textrm{observed}}!} \prod_{i=1}^{N_{\textrm{observed}}} \left(\mu\, N_{S} \,f_{S}\left(x_{i}\right) + N_{B}\,f_{B}\left(x_{i}\right)\right)\notag
\end{align}

so the log likelihood is

\begin{align*}
\ln L\left(\mu\right) &= \ln\left(\frac{e^{-N_{\textrm{expected}}}}{N_{\textrm{observed}}!}\right) + \ln \left(\prod_{i=1}^{N_{\textrm{observed}}} \left(\mu\, N_{S} \,f_{S}\left(x_{i}\right) + N_{B}\,f_{B}\left(x_{i}\right)\right)\right) \\
    &= - N_{\textrm{expected}} - \ln \left(N_{\textrm{observed}}!\right) + \sum_{i=1}^{N_{\textrm{observed}}} \ln\left(\,\mu\, N_{S} \,f_{S}\left(x_{i}\right) + N_{B}\,f_{B}\left(x_{i}\right)\right)
\end{align*}

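Dropping the term $\ln \left(N_{\textrm{observed}}!\right)$, which depends only on the data and not on $\mu$, and making the dependence on the shape parameters of the signal and background densities, $\vec{\theta}_{S}$ and $\vec{\theta}_{B}$, explicit, the negative log likelihood is then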

$$
-\ln L\left(\mu\,;\,N_{S}, \vec{\theta}_{S}, N_{B}, \vec{\theta}_{B}\right) = N_{\textrm{expected}} - \sum_{i=1}^{N_{\textrm{observed}}} \ln\left(\mu\, N_{S} \,f_{S}\left(x_{i}\,\middle|\,\vec{\theta}_{S}\right) + N_{B}\,f_{B}\left(x_{i}\,\middle|\,\vec{\theta}_{B}\right)\right)
$$

which in the more typical notation of the field is

$$
-\ln L\left(\mu\right) = \left(\,\mu S + B\right) - \sum_{i=1}^{n} \ln\left(\,\mu S \,f_{S}\left(x_{i}\right) + B\,f_{B}\left(x_{i}\right)\right)
$$
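The negative log likelihood above can be sketched and checked numerically against the full extended likelihood. The Gaussian signal shape, exponential background shape, yields, and toy data are all assumptions for illustration. The toy also fixes the number of signal and background events rather than Poisson-fluctuating them, and the minimum is located with a coarse grid scan rather than a proper minimizer.

```python
import math
import random


def signal_pdf(x, mean=5.0, sigma=0.5):
    """Assumed signal shape f_S(x): a normalized Gaussian."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def background_pdf(x, scale=10.0):
    """Assumed background shape f_B(x): a normalized exponential on x >= 0."""
    return math.exp(-x / scale) / scale if x >= 0 else 0.0


def nll(mu, xs, S, B):
    """-ln L(mu) = (mu S + B) - sum_i ln(mu S f_S(x_i) + B f_B(x_i))."""
    return (mu * S + B) - sum(
        math.log(mu * S * signal_pdf(x) + B * background_pdf(x)) for x in xs
    )


def full_nll(mu, xs, S, B):
    """-ln[Pois(n | mu S + B) * prod_i f(x_i | mu)], keeping the ln(n!) term."""
    n, expected = len(xs), mu * S + B
    log_pois = n * math.log(expected) - expected - math.lgamma(n + 1)
    log_marks = sum(
        math.log((mu * S * signal_pdf(x) + B * background_pdf(x)) / expected)
        for x in xs
    )
    return -(log_pois + log_marks)


# Hypothetical toy data: 10 signal-like and 50 background-like events
rng = random.Random(0)
S, B = 10.0, 50.0
xs = [rng.gauss(5.0, 0.5) for _ in range(10)]
xs += [rng.expovariate(1.0 / 10.0) for _ in range(50)]

# Coarse grid scan over mu in [0, 3]; a real fit would use a minimizer
mus = [0.05 * k for k in range(61)]
mu_hat = min(mus, key=lambda m: nll(m, xs, S, B))
print(f"grid minimum at mu = {mu_hat:.2f}")
```

The two functions differ only by $\ln\left(n!\right)$, so they share the same minimum in $\mu$, which is why the constant can be dropped for fitting.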

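## Binned Model

If the signal and background densities are instead approximated by histograms, then for an event with observable $x_i$ each density is replaced by the expected count of that histogram in the bin containing $x_i$, $\nu_{\textrm{bin}}^{\textrm{hist}}\left(x_i\right)$, normalized by the total yield of the histogram, $N^{\textrm{hist}}$, and the width of that bin, $\Delta_{\textrm{bin}}\left(x_i\right)$,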

$$
f_{\textrm{hist}}\left(x_i\right) \to \frac{\nu_{\textrm{bin}}^{\textrm{hist}}\left(x_i\right)}{N^{\textrm{hist}} \Delta_{\textrm{bin}}\left(x_i\right)}
$$

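With $N^{S} = S$ and $N^{B} = B$, the yields cancel against the histogram normalizations, e.g. $\mu S\,f_{S}\left(x_i\right) \to \mu\,\nu_{\textrm{bin}}^{S}\left(x_i\right) / \Delta_{\textrm{bin}}\left(x_i\right)$, so the negative log likelihood is then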

$$
-\ln L\left(\mu\right) = \left(\,\mu S + B\right) - \sum_{i=1}^{n} \ln\left(\,\mu \frac{\nu_{\textrm{bin}}^{S}\left(x_i\right)}{\Delta_{\textrm{bin}}\left(x_i\right)} + \frac{\nu_{\textrm{bin}}^{B}\left(x_i\right)}{\Delta_{\textrm{bin}}\left(x_i\right)}\right)
$$
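A minimal sketch of the binned negative log likelihood follows, assuming the signal and background templates are given as lists of expected counts per bin that sum to $S$ and $B$ respectively. The helper names `find_bin` and `binned_nll`, the bin edges, and the template values are hypothetical and chosen so the result is easy to verify by hand.

```python
import bisect
import math


def find_bin(x, edges):
    """Index of the bin containing x; the last bin includes its right edge."""
    if not edges[0] <= x <= edges[-1]:
        raise ValueError(f"x = {x} lies outside the histogram range")
    return min(bisect.bisect_right(edges, x) - 1, len(edges) - 2)


def binned_nll(mu, xs, edges, nu_S, nu_B):
    """(mu S + B) - sum_i ln(mu nu_S(x_i) / width(x_i) + nu_B(x_i) / width(x_i))."""
    # Assumption: the templates are normalized to the total yields S and B
    S, B = sum(nu_S), sum(nu_B)
    log_sum = 0.0
    for x in xs:
        k = find_bin(x, edges)
        width = edges[k + 1] - edges[k]
        log_sum += math.log(mu * nu_S[k] / width + nu_B[k] / width)
    return (mu * S + B) - log_sum


# Hypothetical templates with non-uniform bin widths (1, 1, 2)
edges = [0.0, 1.0, 2.0, 4.0]
nu_S = [1.0, 8.0, 1.0]     # expected signal counts per bin, S = 10
nu_B = [20.0, 20.0, 20.0]  # expected background counts per bin, B = 60
xs = [0.5, 1.5, 3.0]       # hypothetical observations
print(binned_nll(1.0, xs, edges, nu_S, nu_B))
```

For these inputs at $\mu = 1$ the three events contribute $\ln 21$, $\ln 28$, and $\ln 10.5$, where the last value reflects the division by the bin width of 2.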