# Bayesian line-aware statistic

The single-detector algorithm described in previous sections returns the most probable track of the loudest signal, under the assumption that the noise is Gaussian. However, an astrophysical signal is not expected to have an amplitude orders of magnitude above the noise floor; its amplitude should instead be comparable to that of the noise. A signal with a very large amplitude is therefore more likely to be of instrumental than astrophysical origin.


To improve the ability of SOAP to detect astrophysical signals we use a Bayesian ``line-aware'' statistic, which favours signals that have a similar SNR in each detector.

We first consider the model of Gaussian noise with no signal present. Within
a single summed segment, the likelihood of the power $F_j$ at
frequency $\nu_j$ in Gaussian noise is given by a $\chi^2$ distribution,

\begin{equation}
p(F_j|\nu_j,M_{\text{N}},I) = \frac{1}{2^{d/2}\Gamma(d/2)}F_j^{d/2 - 1}\exp{\left\{
-\frac{F_j}{2}\right\}},
\end{equation}

where $F_j$ is the frequency-domain power summed over sub-segments within a single day, as described in Sec.~\ref{soap:sumdata}, and $d$ is the number of degrees of freedom, equal to twice the total number of summed SFTs. $M_{\text{N}}$ represents the model in which the data are purely Gaussian noise (written $M_{\rm{G}}$ in the odds ratios below). In the presence of a signal (model $M_{\text{S}}$), the power should follow a noncentral $\chi^2$ distribution in which the noncentrality parameter $\lambda$ is the square of the \gls{SNR}, $\lambda = \rho_{\rm{opt}}^2$, i.e.,

\begin{equation}\label{soap:las:noncentral}
\begin{split}
p(F_j|\nu_j,\lambda,M_{\text{S}},I) = \frac{1}{2} \exp{\left\{ -\frac{F_j+\lambda}{2}\right\}} \left( \frac{F_j}{\lambda} \right)^{d/4 - 1/2} \\
I_{d/2 -1}\left( \sqrt{\lambda F_j}\right).
\end{split}
\end{equation}
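As a rough numerical illustration of the two likelihoods above, the following Python sketch evaluates the central and noncentral $\chi^2$ densities using only the standard library. It does not use the Bessel-function form written above; instead it uses the equivalent Poisson-mixture series, in which the noncentral density is a Poisson($\lambda/2$)-weighted sum of central densities with $d + 2i$ degrees of freedom. The function names, the tolerance, and the example values of $d$, $F_j$ and $\lambda$ are assumptions made for illustration and are not taken from the SOAP implementation.

```python
import math


def log_chi2_pdf(F, d):
    """Log of the central chi-squared density with d degrees of freedom."""
    half_d = 0.5 * d
    return ((half_d - 1.0) * math.log(F) - 0.5 * F
            - half_d * math.log(2.0) - math.lgamma(half_d))


def ncx2_pdf(F, d, lam, tol=1e-12, max_terms=100000):
    """Noncentral chi-squared density, summed as a Poisson mixture.

    Term i is the Poisson(lam / 2) probability of i multiplied by the
    central density with d + 2 i degrees of freedom.
    """
    if lam == 0.0:
        return math.exp(log_chi2_pdf(F, d))
    log_half_lam = math.log(0.5 * lam)
    total, prev = 0.0, 0.0
    for i in range(max_terms):
        log_weight = -0.5 * lam + i * log_half_lam - math.lgamma(i + 1.0)
        term = math.exp(log_weight + log_chi2_pdf(F, d + 2 * i))
        total += term
        # The terms rise to a single peak and then decay, so stop once
        # they are both decreasing and negligible.
        if term < prev and term <= tol * total:
            break
        prev = term
    return total


if __name__ == "__main__":
    d = 96  # hypothetical example: 48 summed SFTs
    for lam in (0.0, 10.0, 50.0):
        print(lam, ncx2_pdf(120.0, d, lam))
```

Where SciPy is available, `scipy.stats.chi2` and `scipy.stats.ncx2` provide the same two densities and would normally be preferred over a hand-written series.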


Here $I_{d/2-1}$ is the modified Bessel function of the first kind. If a signal is present we therefore expect the \gls{SFT} powers in the detector to follow Eq.~\ref{soap:las:noncentral}. We can then determine the evidence for model $M_{\text{S}}$ by marginalising over $\lambda$,

\begin{equation}\label{soap:las:signal:single}
\begin{split}
p(F^{(1)}_{j} \mid \nu_j,M_{\rm{S}},I) = \int_0^{\infty}  p(\lambda,w) 
p(F^{(1)}_{j}|\nu_j,\lambda,M_{\text{S}},I) d\lambda.
\end{split}
\end{equation}

Here we set the prior on $\lambda$ to be an exponential distribution of width $w$. This choice is somewhat arbitrary, but reflects our expectation that the majority of signals have a low \gls{SNR}. The distribution follows
\begin{equation}
p(\lambda,w) = \exp\left( \frac{-\lambda}{w}\right).
\end{equation}
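The marginalisation over $\lambda$ can be sketched numerically as follows. This is a minimal illustration under two stated assumptions that are not in the text above: the prior is taken to be normalised, $p(\lambda \mid w) = w^{-1}\exp(-\lambda/w)$, and the noncentral $\chi^2$ density is written as a Poisson($\lambda/2$)-weighted sum of central $\chi^2$ densities with $d + 2i$ degrees of freedom. Under those assumptions the integral over $\lambda$ can be done term by term, and the Poisson weights become geometric weights $\frac{2}{w+2}\left(\frac{w}{w+2}\right)^{i}$. The function names are hypothetical.

```python
import math


def log_chi2_pdf(F, d):
    """Log of the central chi-squared density with d degrees of freedom."""
    half_d = 0.5 * d
    return ((half_d - 1.0) * math.log(F) - 0.5 * F
            - half_d * math.log(2.0) - math.lgamma(half_d))


def signal_evidence(F, d, w, tol=1e-12, max_terms=100000):
    """Evidence p(F | M_S) for a normalised exponential prior of width w.

    Assumes p(lambda | w) = exp(-lambda / w) / w with w > 0, which gives
    a geometric mixture of central chi-squared densities.
    """
    q = w / (w + 2.0)                    # geometric ratio
    log_q = math.log(q)
    log_norm = math.log(2.0 / (w + 2.0))  # log(1 - q)
    total, prev = 0.0, 0.0
    for i in range(max_terms):
        term = math.exp(log_norm + i * log_q + log_chi2_pdf(F, d + 2 * i))
        total += term
        # Terms rise to a single peak and then decay; stop when they are
        # decreasing and negligible compared with the running sum.
        if term < prev and term <= tol * total:
            break
        prev = term
    return total


if __name__ == "__main__":
    d = 96  # hypothetical example: 48 summed SFTs
    for w in (0.1, 4.0, 50.0):  # a wider prior favours larger powers
        print(w, signal_evidence(130.0, d, w))
```

For $d = 2$ the sum has a closed form, $\exp\left[-F_j/(w+2)\right]/(w+2)$, which gives a quick check of the series. As $w \to 0$ the evidence tends to the central $\chi^2$ noise likelihood, as expected for a prior concentrated at $\lambda = 0$.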


In this single-detector case, we expect an astrophysical signal to look very similar to an instrumental line, other than in its amplitude (or SNR). Therefore, we set the evidence for both an astrophysical signal and an instrumental line to follow Eq.~\ref{soap:las:signal:single}, where the width $w$ differs between the two models.

We then have three models: one for an astrophysical signal, one for an instrumental line and one for Gaussian noise.

The posterior probability of model $M_{\text{GL}}$, which represents either Gaussian noise or Gaussian noise with a line (taken as mutually exclusive), is
\begin{equation}
\begin{split}
p(M_{\rm{GL}} \mid F^{(1)}_{j},\nu_j ,I) = p(M_{\rm{G}} \mid F^{(1)}_{j},\nu_j ,I) \\
+p(M_{\rm{L}} \mid F^{(1)}_{j} ,\nu_j, I).
\end{split}
\end{equation}


We can now find the posterior odds ratio for the presence of a signal over noise or a line,
\begin{equation}
\begin{split}
O^{(1)}_{\rm{S/GL}}(F^{(1)}_{j}\mid\nu_j) &=  \frac{p(M_{\rm{S}} \mid F^{(1)}_{j} ,\nu_j)}{p(M_{\rm{GL}} \mid F^{(1)}_{j},\nu_j)}
= \frac{p(M_{\rm{S}} \mid F^{(1)}_{j} ,\nu_j)}{p(M_{\rm{G}} \mid F^{(1)}_{j} ,\nu_j) +p(M_{\rm{L}} \mid F^{(1)}_{j} ,\nu_j)}\\
&=\frac{p(M_{\rm{S}})p(F^{(1)}_{j} \mid M_{\rm{S}},\nu_j)}{p(M_{\rm{G}})p(F^{(1)}_{j}\mid M_{\rm{G}},\nu_j) + p(M_{\rm{L}})p(F^{(1)}_{j}\mid M_{\rm{L}},\nu_j) } \\
&= \frac{p(F^{(1)}_{j} \mid M_{\rm{S}},\nu_j)p(M_{\rm{S}})/p(M_{\rm{G}})}{p(F^{(1)}_{j}\mid M_{\rm{G}},\nu_j) + p(F^{(1)}_{j}\mid M_{\rm{L}},\nu_j)p(M_{\rm{L}})/p(M_{\rm{G}}) }
\end{split}
\end{equation}
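In practice it is convenient to use the log odds ratio. The prior odds $p(M_{\rm{S}})/p(M_{\rm{G}})$ contribute only an additive constant that does not depend on $F^{(1)}_{j}$, so this term, along with the explicit dependence on $\nu_j$, is dropped, giving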
\begin{equation}
\begin{split}
\log\left[ O^{(1)}_{\rm{S/GL}}(F^{(1)}_{j})\right] &=  \log\left[ p(F^{(1)}_{j} \mid M_{\rm{S}}) \right] \\
&- \left[ \log\left( p(F^{(1)}_{j}\mid M_{\rm{G}}) \right. \right. \\
&\left.\left.+  p(F^{(1)}_{j}\mid M_{\rm{L}})p(M_{\rm{L}})/p(M_{\rm{G}})\right) \right]
\end{split}
\end{equation}
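The log odds ratio above can be evaluated stably by working with log-evidences throughout and combining the two denominator terms with a log-sum-exp. The sketch below assumes the three log-evidences have already been computed; the function and argument names are hypothetical, and the numbers in the usage example are arbitrary placeholders rather than values from a search.

```python
import math


def log_odds_signal_vs_noise_or_line(log_ev_signal, log_ev_noise,
                                     log_ev_line, line_noise_prior_ratio):
    """log p(F | M_S) - log[ p(F | M_G) + p(F | M_L) p(M_L) / p(M_G) ].

    line_noise_prior_ratio is p(M_L) / p(M_G) and must be positive.
    """
    a = log_ev_noise
    b = log_ev_line + math.log(line_noise_prior_ratio)
    # log(exp(a) + exp(b)), factoring out the larger term to avoid
    # underflow when the evidences are very small.
    m = max(a, b)
    log_denominator = m + math.log(math.exp(a - m) + math.exp(b - m))
    return log_ev_signal - log_denominator


if __name__ == "__main__":
    # Placeholder evidences, chosen only to exercise the function.
    value = log_odds_signal_vs_noise_or_line(
        math.log(0.2), math.log(0.1), math.log(0.3), 0.5)
    print(value)
```

Factoring out the larger of the two denominator terms keeps the result finite even when the individual evidences would underflow if exponentiated directly.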


The largest improvements can be made when running over multiple detectors. The multi-detector line-aware statistic can be defined by following a derivation similar to the single-detector case above,

\begin{equation}
\begin{split}
\log\left[ O^{(2)}_{\rm{S/GL}}(F^{(1)}_{j},F^{(2)}_{j})\right] &=  \log\left[ p(F^{(1)}_{j},F^{(2)}_{j} \mid M_{\rm{S}}) \right] \\
&- \left[ \log\left( p(F^{(1)}_{j},F^{(2)}_{j}\mid M_{\rm{G}}) \right. \right. \\
&\left.\left.+  p(F^{(1)}_{j},F^{(2)}_{j}\mid M_{\rm{L}})p(M_{\rm{L}})/p(M_{\rm{G}})\right) \right]
\end{split}
\end{equation}
