# Homework 5

##### Tianyu
For a simple counting experiment, let $b$ be the expected number of background events and $n$ the observed number of events.
The best estimator of the signal yield $s$ is:
$$s=n-b.$$

There are different metrics to evaluate discovery significance.

* Simplified Z0
$$ Z_{0, simple} = s/\sqrt{b}$$

* Asymptotic Z0
$$ Z_{0, asymptotic} = \sqrt{2((s+b)\ln (1+s/b)-s)}$$

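* Bayesian Z0
$$ p\text{-value} = \sum_{k=n}^{\infty}\mathrm{Poisson}(k|b)$$
$$Z_{0, Bayesian} = \Phi^{-1}(1 - p\text{-value}),$$
where $\Phi^{-1}$ is the quantile function of the standard normal distribution (one-sided Gaussian conversion).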
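
As a short consistency check between the first two metrics, the asymptotic formula can be expanded for $s \ll b$. This is only a sketch of the small-$s/b$ limit, not an exact statement for low counts:

$$\ln(1+s/b) \approx \frac{s}{b} - \frac{s^2}{2b^2} + \frac{s^3}{3b^3}$$
$$2\left((s+b)\ln(1+s/b) - s\right) \approx \frac{s^2}{b}\left(1 - \frac{s}{3b}\right)$$
$$Z_{0, asymptotic} \approx \frac{s}{\sqrt{b}}\left(1 - \frac{s}{6b}\right)$$

In this limit the asymptotic metric reduces to the simplified one, and the leading correction is negative, so the simplified metric is expected to be the larger of the two.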


In this exercise, we will implement each of these metrics and compare their consistency.


In [10]:
import numpy as np
import matplotlib.pyplot as plt
#import iminuit.minimize as minimize
from tqdm import tqdm
import scipy
from scipy.stats import poisson

# Define the test statistic q_0 for the frequentist approach
# 1. We require a non-negative signal yield (s >= 0).
#    Therefore, the test statistic q_0 is 0 if N_obs <= Nb.
# 2. Compute the Poisson log-likelihoods of
#    a) the background-only model
#    b) the signal+background model
#    and evaluate -2 log of the likelihood ratio between a) and b).


def q0(N_obs, Nb):
    n = np.asarray(N_obs, dtype=float)
    b = np.asarray(Nb, dtype=float)
    n, b = np.broadcast_arrays(n, b)

    val = 2.0 * (n * np.log(np.where(b > 0, np.where(n > 0, n, 1.0) / b, 1.0)) - (n - b))
    q0_out = np.where((b == 0) & (n > 0), np.inf,
                      np.where((n > b) & (b > 0), val, 0.0))
    return q0_out
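
The two steps described in the comments above can be sketched directly from Poisson log-likelihoods, as a cross-check of the closed-form expression. The helper name `q0_from_likelihoods` is hypothetical (it is not part of the homework code), and the sketch assumes scalar inputs with `n_b > 0`.

```python
import numpy as np
from scipy.stats import poisson


def q0_from_likelihoods(n_obs, n_b):
    """-2 ln [ L(n_obs | b) / L(n_obs | s_hat + b) ] with s_hat = max(n_obs - b, 0)."""
    s_hat = max(n_obs - n_b, 0.0)                    # signal yield constrained to be non-negative
    log_l_bkg = poisson.logpmf(n_obs, n_b)           # a) background-only model
    log_l_sig = poisson.logpmf(n_obs, s_hat + n_b)   # b) best-fit signal+background model
    return -2.0 * (log_l_bkg - log_l_sig)


q0_value = q0_from_likelihoods(5, 0.5)
closed_form = 2.0 * (5 * np.log(5 / 0.5) - (5 - 0.5))

print("q0 from likelihoods =", q0_value)
print("q0 closed form      =", closed_form)
print("sqrt(q0)            =", np.sqrt(q0_value))
```

For `n_obs <= n_b` the best-fit signal is zero, both likelihoods coincide, and the statistic is 0, matching the first requirement in the comments.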



Implement the three metrics:

In [11]:
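def SimplifiedZ0(N_obs, N_b):
    # s / sqrt(b) with s = N_obs - N_b (Gaussian approximation)
    return (N_obs - N_b) / np.sqrt(N_b)


def AsymptoticZ0(N_obs, N_b):
    # sqrt(2((s+b) ln(1+s/b) - s)) with s + b = N_obs; assumes N_obs > 0 and N_b > 0
    return np.sqrt(2 * (N_obs * np.log(N_obs / N_b) - (N_obs - N_b)))


def BayesianZ0(N_obs, N_b):
    # p-value = P(k >= N_obs | N_b); cdf(N_obs - 1) covers k <= N_obs - 1
    pvalue = 1 - poisson.cdf(N_obs - 1, N_b)
    # Convert the one-sided p-value to a Gaussian significance
    return scipy.stats.norm.ppf(1 - pvalue)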


Now, let's apply our code to a numerical example.

Consider a background-only model with expected yield b=0.5 and n=5 observed events.

Calculate the discovery significance for each metric.

In [12]:

Nobs = 5
Nb = 0.5

Z_simple = SimplifiedZ0(Nobs, Nb)
Z_asymp = AsymptoticZ0(Nobs, Nb)
Z_bayes = BayesianZ0(Nobs, Nb)

print("SimplifiedZ0 =", Z_simple)
print("AsymptoticZ0 =", Z_asymp)
print("BayesianZ0  =", Z_bayes)


SimplifiedZ0 = 6.363961030678928
AsymptoticZ0 = 3.7451102693966782
BayesianZ0  = 3.579515933906913
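
The Poisson tail probability behind `BayesianZ0` can also be cross-checked with background-only pseudo-experiments. The following is a minimal sketch under stated assumptions: the toy count `n_toys` and the seed are arbitrary choices, and the toy estimate carries a statistical uncertainty of a few percent at this p-value.

```python
import numpy as np
from scipy.stats import norm, poisson

n_obs, n_b = 5, 0.5
n_toys = 2_000_000  # assumed toy count; precision improves as 1/sqrt(n_toys)

rng = np.random.default_rng(12345)      # arbitrary fixed seed for reproducibility
toys = rng.poisson(n_b, size=n_toys)    # background-only pseudo-experiments

p_toy = np.mean(toys >= n_obs)          # fraction of toys at least as extreme as the observation
p_exact = poisson.sf(n_obs - 1, n_b)    # exact P(k >= n_obs | n_b)

print("p-value (toys)  =", p_toy, "-> Z =", norm.isf(p_toy))
print("p-value (exact) =", p_exact, "-> Z =", norm.isf(p_exact))
```

Here `poisson.sf` and `norm.isf` are the survival function and its inverse; they compute the same quantities as `1 - cdf` and `ppf(1 - p)` but avoid subtracting numbers close to 1.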


Describe the consistency between different metrics.

Write your answers here:

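All three metrics show the same trend: a larger excess gives a higher significance.
The simplified metric ($Z_0 \approx 6.36$) overestimates the significance because it assumes Gaussian statistics, which is a poor approximation for $b=0.5$.
The asymptotic ($Z_0 \approx 3.75$) and Bayesian ($Z_0 \approx 3.58$) results are close to each other and more accurate for small counts.
Here, all three indicate a significant excess (above $3\sigma$), but only the simplified metric exceeds the conventional $5\sigma$ discovery threshold.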
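
To illustrate the claim about Gaussian statistics, the sketch below evaluates the three metrics for increasing background yields, with the observed count chosen near a fixed $3\sqrt{b}$ excess. The helper `three_metrics` and the list of yields are assumptions made for illustration; the formulas are the ones defined at the top of this notebook.

```python
import numpy as np
from scipy.stats import norm, poisson


def three_metrics(n_obs, n_b):
    """Return (simplified, asymptotic, Poisson-tail) significances for n_obs > n_b > 0."""
    z_simple = (n_obs - n_b) / np.sqrt(n_b)
    z_asymp = np.sqrt(2.0 * (n_obs * np.log(n_obs / n_b) - (n_obs - n_b)))
    z_tail = norm.isf(poisson.sf(n_obs - 1, n_b))
    return z_simple, z_asymp, z_tail


results = {}
for n_b in [0.5, 5.0, 50.0, 500.0, 5000.0]:          # assumed background yields
    n_obs = int(np.rint(n_b + 3.0 * np.sqrt(n_b)))   # integer count near a 3*sqrt(b) excess
    results[n_b] = three_metrics(n_obs, n_b)
    z_simple, z_asymp, z_tail = results[n_b]
    print(f"b={n_b:7.1f}  n={n_obs:5d}  simple={z_simple:.3f}  "
          f"asymptotic={z_asymp:.3f}  tail={z_tail:.3f}")
```

The ratio of the simplified to the asymptotic value should approach 1 as the background yield grows, which is where the Gaussian approximation becomes reasonable.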


