# EML Decomposition for Likelihood-free Inference

### Combined EML for multiple sensors

This notebook will follow the notation used in R. Barlow's [*Extended maximum likelihood*](https://www.sciencedirect.com/science/article/pii/0168900290913348) and J. Lanfranchi's [*Likelihoods for Retro*](https://github.com/IceCubeOpenSource/retro/blob/master/notebooks/likelihood_function_derivation.ipynb) unless otherwise stated.

Barlow's Equation 5 expresses the extended likelihood as follows:

$L=\left[ \prod_{i=1}^{K} P(x_i) \right] e^{-\mathcal{N}}$, or 

$L=\left[ \prod_{i=1}^{K}  p(x_i) \right] \mathcal{N}^{K} e^{-\mathcal{N}}$

where $P(x_i) = \mathcal{N} p(x_i)$ and $p(x_i)$ is a probability density function (PDF) for the observable quantity $x$. Each $x_i$ is a separate observation, assumed to be drawn from the same PDF, and $\mathcal{N}$ is the expected number of observations or events. The shape of $p$ and the value of $\mathcal{N}$ are each dependent on an underlying hypothesis. When conducting parameter extraction, the underlying hypothesis is varied to optimize $L$.
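As a minimal sketch of Barlow's Equation 5 in log form, $\ln L = \sum_i \ln p(x_i) + K \ln \mathcal{N} - \mathcal{N}$, the snippet below evaluates the extended log-likelihood for a toy hypothesis. The function name `extended_log_likelihood`, the exponential PDF, and the sample values are assumptions made purely for illustration; they do not come from either reference.

```python
import math


def extended_log_likelihood(xs, pdf, n_expected):
    """Return sum_i log p(x_i) + K * log(N) - N for K = len(xs)."""
    k = len(xs)
    shape_term = sum(math.log(pdf(x)) for x in xs)
    return shape_term + k * math.log(n_expected) - n_expected


def exp_pdf(tau):
    """Hypothetical toy PDF: exponential with decay constant tau, normalized on x >= 0."""
    return lambda x: math.exp(-x / tau) / tau


# Illustrative observations (made-up numbers, K = 4).
observations = [0.3, 1.2, 0.7, 2.5]

# Scan the expected number of events while holding the PDF shape fixed.
scan = {
    n_expected: extended_log_likelihood(observations, exp_pdf(1.0), n_expected)
    for n_expected in (2.0, 4.0, 8.0)
}
for n_expected, log_l in scan.items():
    print(n_expected, log_l)
```

With the PDF shape held fixed, the scanned values are largest at $\mathcal{N} = K = 4$, as expected from the $\mathcal{N}^{K} e^{-\mathcal{N}}$ factor.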

In the context of photosensors, the above formulation of the extended likelihood applies most directly to a single sensor. Assuming this photosensor provides a variable-length series of pulse times, the observable quantity $x$ is the pulse time $t$ and $\mathcal{N}$ is the expected number of pulses. Further, following the treatment by Lanfranchi, variable-charge hits are accommodated by replacing $p(x_i)$ with $p_d\left(t_{d, i}\right)^{q_{d, i}}$ and $\mathcal{N}$ with $\Lambda_d$, where $t_{d, i}$ and $q_{d, i}$ are the time and charge of pulse $i$ in detector $d$, and $\Lambda_d$ is the total charge expected in detector $d$. In this discussion, $\Lambda_d$ includes both noise and signal charge.

Therefore, the extended likelihood for sensor $d$ is

$L_d=\left[ \prod_{i} p_d\left(t_{d, i}\right)^{q_{d, i}} \right] \Lambda_{d} ^{Q_d} e^{-\Lambda_{d}}$, 

where $Q_d = \sum_i q_{d, i}$.
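The charge-weighted single-sensor expression can be sketched in log form as $\ln L_d = \sum_i q_{d,i} \ln p_d(t_{d,i}) + Q_d \ln \Lambda_d - \Lambda_d$. In the snippet below, `sensor_log_likelihood`, `toy_pdf`, and all numerical values are hypothetical placeholders chosen for illustration only.

```python
import math


def sensor_log_likelihood(times, charges, pdf, lam):
    """Return sum_i q_i * log p_d(t_i) + Q_d * log(Lambda_d) - Lambda_d."""
    q_total = sum(charges)  # Q_d, the total observed charge in this sensor
    time_term = sum(q * math.log(pdf(t)) for t, q in zip(times, charges))
    return time_term + q_total * math.log(lam) - lam


def toy_pdf(t, tau=50.0):
    """Hypothetical pulse-time PDF: exponential, normalized on t >= 0."""
    return math.exp(-t / tau) / tau


# Made-up pulse series for one sensor: times and charges.
pulse_times = [12.0, 30.0, 95.0]
pulse_charges = [1.0, 2.5, 0.5]

print(sensor_log_likelihood(pulse_times, pulse_charges, toy_pdf, lam=4.0))

# A sensor with no pulses contributes only -Lambda_d.
print(sensor_log_likelihood([], [], toy_pdf, lam=4.0))
```

When every pulse has unit charge, the expression reduces to the unweighted form with $K = Q_d$.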

Assuming the observations for each detector fluctuate independently, the extended likelihood for all sensors is a product of the respective likelihoods for each individual sensor:

$L = \prod_{d=1}^{D_{tot}} L_d$,

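$L = \prod_{d=1}^{D_{tot}} \left[\left(\prod_{i=1}^{K_d} p_d\left(t_{d, i}\right)^{q_{d, i}} \right)\Lambda_{d} ^{Q_d} e^{-\Lambda_{d}}\right]$,

where $K_d$ is the number of pulses observed by sensor $d$ and $D_{tot}$ is the total number of sensors.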

### DOM(net) formulation

Rearranging the terms of the combined likelihood, one obtains

$\boxed{L = \left[\prod_{d, i} p_d\left(t_{d, i}\right)^{q_{d, i}}\right] \cdot \left[\prod_{d=1}^{D_{tot}} \Lambda_{d} ^{Q_d} e^{-\Lambda_{d}} \right]}$

The term on the left is a product over all observed pulses, and the term on the right is a product over all sensors, including those that have observed 0 pulses. In the context of *freeDOM*, this formulation corresponds to a DOMnet approach, where the product over pulses is handled by hitnet and the product over sensors is handled by DOMnet. Recall that, in this case, $p_{d}\left(t\right)$ is the pulse time PDF for detector $d$ and satisfies $\int p_{d} \left(t\right) dt = 1$. In principle, one could train separate hitnets for each DOM and use them to evaluate the above expression.
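To make the split between the two bracketed terms concrete, the sketch below computes the boxed expression in log form with two plain functions. These are analytic stand-ins with hypothetical names (`hit_term`, `dom_term`), not the freeDOM networks themselves, and the sensor count, PDFs, and expected charges are invented toy values.

```python
import math


def exp_pdf(tau):
    """Hypothetical per-sensor pulse-time PDF, normalized on t >= 0."""
    return lambda t: math.exp(-t / tau) / tau


def hit_term(pulses, pdfs):
    """Sum over observed pulses of q * log p_d(t); pulses are (d, t, q) tuples."""
    return sum(q * math.log(pdfs[d](t)) for d, t, q in pulses)


def dom_term(pulses, lambdas):
    """Sum over ALL sensors, hit or not, of Q_d * log(Lambda_d) - Lambda_d."""
    q_per_sensor = [0.0] * len(lambdas)
    for d, _, q in pulses:
        q_per_sensor[d] += q
    return sum(q_d * math.log(lam) - lam for q_d, lam in zip(q_per_sensor, lambdas))


# Toy detector with four sensors; sensors 1 and 2 record no pulses.
pdfs = [exp_pdf(tau) for tau in (20.0, 35.0, 50.0, 80.0)]
lambdas = [2.0, 0.5, 0.1, 1.0]
pulses = [(0, 10.0, 1.0), (0, 40.0, 1.5), (3, 25.0, 0.8)]

log_l = hit_term(pulses, pdfs) + dom_term(pulses, lambdas)
print(log_l)
```

Note that `dom_term` loops over every sensor: an unhit sensor still contributes $-\Lambda_d$ to the log-likelihood.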

### Total Charge(net) formulation

Defining the quantities $Q_{tot} = \sum_{d,i} q_{d, i}$ and $\Lambda_{tot} = \sum_d \Lambda_d$, the DOM term above can be rewritten as follows:

$\prod_{d=1}^{D_{tot}} \Lambda_{d} ^{Q_d} e^{-\Lambda_{d}} = \left[\prod_{d=1}^{D_{tot}} \Lambda_{d} ^{Q_d}\right]  e^{-\Lambda_{tot}}$,

$\prod_{d=1}^{D_{tot}} \Lambda_{d} ^{Q_d} e^{-\Lambda_{d}} = \left[\prod_{d,i}\Lambda_{d}^{q_{d, i}}\right]  e^{-\Lambda_{tot}}$

Again, $\prod_{d,i}$ is a product over all observed pulses. Continuing,

$\prod_{d=1}^{D_{tot}} \Lambda_{d} ^{Q_d} e^{-\Lambda_{d}} = \left[\prod_{d,i} \left(\frac{\Lambda_{tot}}{\Lambda_{tot}}\right)^{q_{d, i}}\Lambda_{d}^{q_{d, i}}\right]  e^{-\Lambda_{tot}}$

$\prod_{d=1}^{D_{tot}} \Lambda_{d} ^{Q_d} e^{-\Lambda_{d}} = \left[\prod_{d,i} \left(\frac{\Lambda_{d}}{\Lambda_{tot}}\right)^{q_{d, i}}\right] \Lambda_{tot}^{Q_{tot}} e^{-\Lambda_{tot}}$

The term outside of the brackets is a Poisson term for the total observed charge. Further, we are left with a number of factors of the form $\left(\frac{\Lambda_{d}}{\Lambda_{tot}}\right)^{q_{d, i}}$, one per pulse. These can be distributed to their associated terms in the product over all observed pulses as follows:

$L = \left[\prod_{d, i} p_d\left(t_{d, i}\right)^{q_{d, i}}\right] \cdot \left[\prod_{d,i} \left(\frac{\Lambda_{d}}{\Lambda_{tot}}\right)^{q_{d, i}}\right] \Lambda_{tot}^{Q_{tot}} e^{-\Lambda_{tot}}$

$L = \left\{\prod_{d, i} \left[\frac{\Lambda_{d}}{\Lambda_{tot}} p_d\left(t_{d, i}\right)\right]^{q_{d, i}}\right\} \cdot \left [\Lambda_{tot}^{Q_{tot}} e^{-\Lambda_{tot}}\right]$.

Defining $p_d^T\left(t\right) \equiv \frac{\Lambda_{d}}{\Lambda_{tot}} p_d\left(t\right)$, the above can be more concisely written as follows:

$\boxed{L = \left[\prod_{d, i} p^T_d\left(t_{d, i}\right)^{q_{d, i}}\right]\cdot \left [\Lambda_{tot}^{Q_{tot}} e^{-\Lambda_{tot}}\right]}$
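The algebraic equivalence of the two boxed expressions can be checked numerically. The sketch below evaluates both in log form on the same toy event; the function names, the exponential PDFs, and all numbers are assumptions introduced for this check only.

```python
import math


def exp_pdf(tau):
    """Hypothetical per-sensor pulse-time PDF, normalized on t >= 0."""
    return lambda t: math.exp(-t / tau) / tau


def log_l_dom(pulses, pdfs, lambdas):
    """DOM formulation: pulse term with p_d plus a per-sensor term."""
    pulse_term = sum(q * math.log(pdfs[d](t)) for d, t, q in pulses)
    # sum_d Q_d * log(Lambda_d), accumulated pulse by pulse, minus sum_d Lambda_d
    sensor_term = sum(q * math.log(lambdas[d]) for d, _, q in pulses) - sum(lambdas)
    return pulse_term + sensor_term


def log_l_total_charge(pulses, pdfs, lambdas):
    """Total charge formulation: pulse term with p_d^T plus one Poisson-like term."""
    lam_tot = sum(lambdas)
    q_tot = sum(q for _, _, q in pulses)
    pulse_term = sum(
        q * math.log(lambdas[d] / lam_tot * pdfs[d](t)) for d, t, q in pulses
    )
    return pulse_term + q_tot * math.log(lam_tot) - lam_tot


# Toy event: (sensor index, pulse time, pulse charge); sensors 1 and 2 are unhit.
pdfs = [exp_pdf(tau) for tau in (20.0, 35.0, 50.0, 80.0)]
lambdas = [2.0, 0.5, 0.1, 1.0]
pulses = [(0, 10.0, 1.0), (0, 40.0, 1.5), (3, 25.0, 0.8)]

dom_value = log_l_dom(pulses, pdfs, lambdas)
total_value = log_l_total_charge(pulses, pdfs, lambdas)
print(dom_value, total_value)
```

The two values agree to floating-point precision only because the pulse term in `log_l_total_charge` uses $p_d^T$ rather than $p_d$; pairing $p_d$ with the total-charge Poisson term would drop the $\Lambda_d / \Lambda_{tot}$ factors.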

The $T$ superscript distinguishes $p_d^T$ from $p_d$ and indicates that the former is intended for the "total charge" formulation of the combined likelihood. The total charge formulation is algebraically equivalent to the DOM formulation, but groups terms in a conceptually different way. Rather than a product over all pulses and a product over all sensors, whether hit or not, we have a product over all pulses and a single Poisson term for the total charge aggregated over all sensors. When most sensors observe zero pulses, this formulation may lend itself to more computationally efficient reconstructions. However, the PDF used in the product over all pulses in the total charge formulation ($p_d^T$) is not the same as the one used in the DOM formulation ($p_d$).

The change from $p_d$ to $p_d^T$ reflects the conceptual difference between the two formulations. The likelihood expressed in terms of total charge has the same structure as the single-sensor likelihood, except that the "sensor" in question is the aggregate of all photosensors rather than a single photosensor. In the total charge formulation, the detector index $d$ takes the role of an observed pulse feature, similar to $t$. In fact, $p_d^T$ is a joint distribution over time and detector index: a density in the continuous variable $t$ and a mass function in the discrete index $d$. It therefore normalizes to 1 when integrated over $t$ and summed over $d$:

$\sum_{d=1}^{D_{tot}} \int_t p_d^T(t) dt = \sum_d \frac{\Lambda_{d}}{\Lambda_{tot}} \int_t p_d(t) dt = \sum_d \frac{\Lambda_{d}}{\Lambda_{tot}} = 1$.
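The normalization above can be verified numerically for a toy detector. In this sketch the exponential PDFs, the expected charges, the helper `integrate`, and the integration range are all illustrative assumptions.

```python
import math


def exp_pdf(tau):
    """Hypothetical per-sensor pulse-time PDF, normalized on t >= 0."""
    return lambda t: math.exp(-t / tau) / tau


def integrate(f, lo, hi, n=20000):
    """Composite trapezoid rule on [lo, hi] with n intervals."""
    h = (hi - lo) / n
    interior = sum(f(lo + k * h) for k in range(1, n))
    return h * (0.5 * f(lo) + interior + 0.5 * f(hi))


lambdas = [2.0, 0.5, 0.1, 1.0]   # toy expected charge per sensor
taus = [20.0, 35.0, 50.0, 80.0]  # toy decay constants per sensor
lam_tot = sum(lambdas)

# Integrate p_d^T(t) = (Lambda_d / Lambda_tot) * p_d(t) over time for each sensor.
fractions = []
for lam, tau in zip(lambdas, taus):
    p_d = exp_pdf(tau)
    fractions.append(
        integrate(lambda t: lam / lam_tot * p_d(t), 0.0, 40.0 * tau)
    )

total = sum(fractions)
print(fractions)
print(total)
```

Each per-sensor integral approximates the expected charge fraction $\Lambda_d / \Lambda_{tot}$, which is less than 1; only the sum over all sensors recovers unit normalization.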

The above arguments could be repeated for any grouping of sensors, not just all or one-at-a-time. For any grouping of sensors, the PDF $p$ will have to be modified to include the expected charge fraction observed by a given sensor within the group.
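As a sketch of this generalization, suppose the sensors are partitioned into disjoint groups, with $g(d)$ denoting the group containing sensor $d$, $\Lambda_g = \sum_{d \in g} \Lambda_d$, and $Q_g = \sum_{d \in g} Q_d$. This group notation is introduced here for illustration and does not appear in either reference. Repeating the steps of the total charge derivation within each group gives

$L = \left\{\prod_{d, i} \left[\frac{\Lambda_{d}}{\Lambda_{g(d)}} p_d\left(t_{d, i}\right)\right]^{q_{d, i}}\right\} \cdot \left[\prod_{g} \Lambda_{g}^{Q_{g}} e^{-\Lambda_{g}}\right]$.

With one sensor per group, $\Lambda_{g(d)} = \Lambda_d$ and this reduces to the DOM formulation. With a single group containing every sensor, $\Lambda_{g(d)} = \Lambda_{tot}$ and it reduces to the total charge formulation.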

## Conclusion

Treating sensors one at a time or in groups should yield identical results, provided the PDF used in the pulse term is modified appropriately. In the context of *freeDOM*, this means chargenet cannot be changed independently of hitnet.

# To-do: Add illustrative plots from a toy experiment MC in support of the above 