# Frequentist vs Bayesian significance

For a simple counting experiment, let $b$ be the expected number of background events and $n$ the observed number of events.
The best estimator for the number of signal events $s$ is
$$s=n-b.$$
In this exercise, we will implement the frequentist significance calculation.

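$$ \text{p-value} = \int_{q_{0,\mathrm{obs}}}^{\infty} f(q_0|b)\, dq_0 $$
$$ Z_{0,\mathrm{Frequentist}} = \Phi^{-1}(1 - \text{p-value}) $$
where $f(q_0|b)$ is the distribution of the test statistic $q_0$ under the background-only hypothesis, $q_{0,\mathrm{obs}}$ is the value of $q_0$ for the observed $n$, and $\Phi^{-1}$ is the inverse of the standard normal cumulative distribution, which gives the one-tailed Z-score.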
The test statistic is defined through the likelihood ratio between the background-only model and the best-fit model:
$$ q_0 = -2 \ln \frac{L(n|s=0,b)}{L(n|s,b)} $$
Here $s$ in the denominator is the best-fit signal yield, $s=n-b$. In the simple counting experiment, the likelihood $L$ is a Poisson distribution, so the test statistic can be written as
$$ q_0 = -2 \ln \frac{\mathrm{Poisson}(n|b)}{\mathrm{Poisson}(n|s+b)}$$
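
Writing out the Poisson terms gives a closed form that is handy as a cross-check. This is a sketch under two assumptions: the best-fit signal is $s=n-b$ (so $s+b=n$), and $q_0$ is set to zero whenever $n \le b$.
$$ \ln \mathrm{Poisson}(n|\mu) = n \ln \mu - \mu - \ln n! $$
$$ q_0 = -2\left[(n \ln b - b) - (n \ln n - n)\right] = 2\left[n \ln\frac{n}{b} - (n-b)\right], \quad n > b $$
The $\ln n!$ terms cancel in the ratio. For example, $n=5$ and $b=0.5$ give $q_0 = 2\,(5 \ln 10 - 4.5) \approx 14.03$.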




In [7]:
!pip install tqdm
import numpy as np
import matplotlib.pyplot as plt
import iminuit.minimize as minimize
from tqdm import tqdm
import scipy
from scipy.stats import poisson

# Define the test statistic q_0 for the frequentist approach
# 1. We require the signal yield s > 0.
#    Therefore, the test statistic q_0 is 0 if N_obs <= N_b.
# 2. Compute the Poisson log-likelihood of
#    a) the background-only model (mean N_b)
#    b) the signal+background model (best fit, mean s + b = N_obs)
#    and evaluate -2 times the log-likelihood ratio between a) and b).

def q0(N_obs,N_b):
    q0_out=0
    if N_obs > N_b:
        ll_b = poisson.logpmf(N_obs, N_b)
        ll_sb = poisson.logpmf(N_obs, N_obs)
        q0_out = -2 * (ll_b - ll_sb)

    return q0_out
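
As a sanity check, the log-likelihood ratio can be compared with the closed form $q_0 = 2[n \ln(n/b) - (n-b)]$ for $n > b$. The sketch below is illustrative only: `q0_closed_form` and `q0_logpmf` are hypothetical helper names introduced here, and the second one simply restates the `logpmf`-based definition so that the block runs on its own.

```python
import numpy as np
from scipy.stats import poisson


def q0_closed_form(n_obs, n_b):
    """Closed form 2*[n*ln(n/b) - (n - b)] for n > b, and 0 otherwise."""
    if n_obs <= n_b:
        return 0.0
    return 2.0 * (n_obs * np.log(n_obs / n_b) - (n_obs - n_b))


def q0_logpmf(n_obs, n_b):
    """Same quantity from Poisson log-likelihoods (best-fit mean = n_obs)."""
    if n_obs <= n_b:
        return 0.0
    ll_b = poisson.logpmf(n_obs, n_b)      # background-only model
    ll_sb = poisson.logpmf(n_obs, n_obs)   # signal+background at the best fit
    return -2.0 * (ll_b - ll_sb)


# Compare the two forms for a few (n_obs, n_b) pairs
for n, b in [(5, 0.5), (5, 4), (3, 4)]:
    print(n, b, q0_closed_form(n, b), q0_logpmf(n, b))
```

The two columns should agree up to floating-point rounding, and both are zero for the pair with `n_obs <= n_b`.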



In [8]:
def FreqnetistZ0(N_obs,N_b):
    #set random seed to guarantee reproducibility
    np.random.seed(seed=8)

    # Step 1. Generate the f(q_0|b) distribution:
    # draw toy experiments from the background-only model and compute q_0 for each one
    n_toys = 100000
    q0_b = np.zeros(n_toys)
    for i in range(n_toys):
        N_toy = np.random.poisson(N_b)
        q0_b[i] = q0(N_toy, N_b)
    q0_obs = q0(N_obs, N_b)

    
    # Step 2. Compute the p-value: the fraction of toy experiments with q_0 greater than or equal to q_{0,obs}
    p_value = np.sum(q0_b >= q0_obs) / n_toys
    # Convert the p-value to a one-tailed Z-score
    Zscore = scipy.stats.norm.ppf(1 - p_value)

    return Zscore
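
For this single-bin counting experiment the toy estimate can be cross-checked exactly. For $n > b$ the test statistic grows monotonically with $n$, so the toys with $q_0 \ge q_{0,\mathrm{obs}}$ are exactly those with $n \ge n_{\mathrm{obs}}$, and the p-value reduces to a Poisson tail sum. The sketch below assumes an integer observed count; `exact_frequentist_z0` is a hypothetical helper name, not part of the exercise.

```python
from scipy.stats import norm, poisson


def exact_frequentist_z0(n_obs, n_b):
    """Exact one-tailed Z-score for a single-bin counting experiment.

    Assumes n_obs is an integer count. When n_obs <= n_b the observed q_0
    is 0, every toy satisfies q_0 >= 0, and the p-value is 1.
    """
    if n_obs <= n_b:
        p_value = 1.0
    else:
        # poisson.sf(k, mu) is P(n > k), so k = n_obs - 1 gives P(n >= n_obs)
        p_value = poisson.sf(n_obs - 1, n_b)
    # norm.isf(p) equals norm.ppf(1 - p) but keeps precision for small p
    return norm.isf(p_value)


print("Z0 exact (n=5, b=0.5): %.2f" % exact_frequentist_z0(5, 0.5))
print("Z0 exact (n=5, b=4):   %.2f" % exact_frequentist_z0(5, 4))
```

The toy-based result fluctuates around this value, because only a small number of the 100000 toys land in the tail when the p-value is of order $10^{-4}$.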


Now, let's apply our code to a numerical example.

Consider a background-only model with yield b=0.5 and observed events n=5.

Calculate the discovery significance.

In [9]:
Nobs=5
Nb=0.5    

print("Z0freq:%.2f"%(FreqnetistZ0(Nobs,Nb)))

Z0freq:3.55


Q1: How does this number compare to the Bayesian significance from homework5?
Ans: The Bayesian significance is 4.19, which is higher than the frequentist value of 3.55. Under the Bayesian framework, which uses prior information, the evidence for the signal is stronger than what the frequentist approach suggests.


Consider a background-only model with yield b=4 and observed events n=5. Calculate the Bayesian significance and the frequentist significance, and compare the results.

In [None]:
Nb = 4
Nobs = 5
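def BayesianZ0(N_obs,N_b):
    # Tail probability P(n > N_obs | N_b); note that N_obs itself is excluded
    pvalue = 1-poisson.cdf(N_obs, N_b)
    # Convert the p-value to a one-tailed Z-score
    Zscore= scipy.stats.norm.ppf(1-pvalue)
    return Zscore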
print("Z0freq:%.2f"%(FreqnetistZ0(Nobs,Nb)))
print("Z0Bayes:%.2f"%(BayesianZ0(Nobs,Nb)))

# As in the b=0.5 case, for Nb=4, Nobs=5 the frequentist Z0 is lower than the Bayesian Z0,
# meaning that the Bayesian approach suggests stronger evidence for the signal than the frequentist approach.

Z0freq:0.33
Z0Bayes:0.79
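
The size of the gap between the two numbers can be traced to a single Poisson term. For $n > b$ the frequentist p-value corresponds to $P(n \ge n_{\mathrm{obs}}|b)$, while `1-poisson.cdf(N_obs, N_b)` evaluates $P(n > n_{\mathrm{obs}}|b)$; the two differ by $P(n = n_{\mathrm{obs}}|b)$. The sketch below only illustrates this arithmetic for b=4, n=5. It makes no claim about how the Bayesian significance is derived in homework5, and the variable names are hypothetical.

```python
from scipy.stats import norm, poisson

n_obs, n_b = 5, 4

# Tail that includes the observed count: P(n >= n_obs | b)
p_inclusive = poisson.sf(n_obs - 1, n_b)
# Tail that excludes the observed count: P(n > n_obs | b) = 1 - cdf(n_obs)
p_exclusive = poisson.sf(n_obs, n_b)
# The difference is the probability of observing exactly n_obs events
gap = poisson.pmf(n_obs, n_b)

z_inclusive = norm.isf(p_inclusive)
z_exclusive = norm.isf(p_exclusive)

print("P(n >= n_obs) = %.4f -> Z = %.2f" % (p_inclusive, z_inclusive))
print("P(n >  n_obs) = %.4f -> Z = %.2f" % (p_exclusive, z_exclusive))
print("P(n == n_obs) = %.4f" % gap)
```

When $b$ is small and $n_{\mathrm{obs}}$ sits far in the tail, the single term $P(n = n_{\mathrm{obs}}|b)$ dominates the inclusive tail, which is why the two significances can differ noticeably.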
