# A simple counting experiment

For our simple "HEP"-inspired counting experiment from the lectures, we can write the likelihood as

$$
L(\mu,\eta) = \lambda(\mu,\eta)^{n}e^{-\lambda(\mu,\eta)}\cdot e^{-\frac{1}{2}\eta^{2}}
$$

where we've dropped the $n!$ in the likelihood (for reasons that will be clear in the future). $\mu$ is the parameter of interest (relating our measured cross-section to some hypothetical one) and $\eta$ is our nuisance parameter that encodes our uncertainty on the luminosity, 

$$
\lambda(\mu,\eta) = \mu\sigma(pp\rightarrow \mathrm{X})A\epsilon l_{0}\cdot(1+\kappa)^{\eta}+B
$$

We'll assume that the following values are given to us:

In [None]:
%matplotlib notebook

import numpy
import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 14})

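sigma_TH = 0.01 # hypothetical cross-section sigma(pp->X)
A   = 0.5       # A in the formula for lambda (acceptance)
eff = 0.9       # epsilon in the formula for lambda (efficiency)
l   = 100.      # nominal luminosity l_0
k   = 0.1       # kappa, the size of the luminosity uncertainty
B   = 1.0       # B in the formula for lambda (expected background)
n   = 2 # our measured data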

And with these we can specify the Poisson term,

In [None]:
# Poisson mean
def lamb(mu,eta):
  return mu*eff*A*l*((1+k)**eta)*sigma_TH + B
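
As a quick sanity check of the Poisson mean, the sketch below re-declares the constants from the cell above (so that it runs on its own) and evaluates $\lambda$ at the nominal point $\mu=1$, $\eta=0$, where the signal term is $\sigma A\epsilon l_{0}=0.45$ on top of $B=1$. It also illustrates that shifting $\eta$ by one unit scales the signal term by $(1+\kappa)$. The variable names `nominal` and `shifted` are only for this illustration.

```python
# Constants copied from the notebook cell above so this snippet is standalone
sigma_TH, A, eff, l, k, B = 0.01, 0.5, 0.9, 100., 0.1, 1.0

def lamb(mu, eta):
  # Poisson mean: signal term scaled by (1+k)^eta, plus B
  return mu*eff*A*l*((1+k)**eta)*sigma_TH + B

nominal = lamb(1.0, 0.0)   # signal term (0.45) + B (1.0)
shifted = lamb(1.0, 1.0)   # signal term scaled by a factor (1+k)
print(nominal, shifted)
```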

Our aim here will be to remove the dependence of the likelihood (and therefore our inference on $\mu$) on the parameter $\eta$. As explained in lectures, there are two common ways to proceed in HEP: profiling or marginalisation.

##  Profiling

Profiling removes the dependence on $\eta$ by finding, at each value of $\mu$, the value of $\eta$ for which $L(\mu,\eta)$ is maximised. Thus $\eta$ becomes a function of $\mu$, i.e.

$$
L(\mu,\eta)\rightarrow L(\mu,\eta(\mu))
$$

Often, since we deal with exponentials, it's easier to minimise a negative log-likelihood than to maximise a likelihood. So let's write down the negative log-likelihood function. In fact, by convention we also multiply by two (more on that in lectures later).

$$
q(\mu,\eta) = -2\ln L(\mu,\eta)  = \eta^{2}+2\lambda(\mu,\eta)-2n\ln\lambda(\mu,\eta)
$$
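
Spelling out the intermediate step: taking the logarithm of the likelihood (with the $n!$ already dropped and the constraint term written as $e^{-\frac{1}{2}\eta^{2}}$) gives

$$
\ln L(\mu,\eta) = n\ln\lambda(\mu,\eta) - \lambda(\mu,\eta) - \frac{1}{2}\eta^{2}
$$

and multiplying by $-2$ reproduces the three terms of $q(\mu,\eta)$ above.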

In [None]:
# q = -2lnL
def q(mu,eta):
  la = lamb(mu,eta)
  return eta*eta + 2*la -2*n*numpy.log(la)

To profile the function, we need to find value(s) of $\eta$ for which,

$$
q^{\prime} = \frac{\partial q}{\partial \eta} = 0
$$

We'll use a numerical method known as "Newton's method" to solve this. The method works as follows:

   * Choose an initial starting point for the parameter, call it $\eta_{0}$.
   * The next point proposed is $\eta_{1}=\eta_{0}-\frac{q^{\prime}(\eta_{0})}{q^{\prime\prime}(\eta_{0})}$. This new point now replaces $\eta_{0}$.
   * Continue iterating until $|q^{\prime}|<\delta$, where $\delta$ is some pre-defined tolerance.
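
The steps above can be sketched on a toy problem. The function `newton_minimise`, its `max_iter` safety guard and the test function $f(x)=x^{2}+e^{x}$ are illustrative assumptions, not part of the counting experiment; the point is only to show the update rule and the stopping criterion.

```python
import math

def newton_minimise(df, d2f, x0, tol=1e-8, max_iter=50):
  # Iterate x -> x - f'(x)/f''(x) until |f'(x)| < tol.
  # max_iter is a safety guard added for this sketch.
  x = x0
  for _ in range(max_iter):
    g = df(x)
    if abs(g) < tol:
      break
    x = x - g/d2f(x)
  return x

# Toy example: f(x) = x^2 + exp(x), so f'(x) = 2x + exp(x), f''(x) = 2 + exp(x)
x_min = newton_minimise(lambda x: 2*x + math.exp(x),
                        lambda x: 2 + math.exp(x),
                        x0=0.0)
print(x_min, 2*x_min + math.exp(x_min))
```

Because this toy function is convex and smooth, only a handful of iterations are needed from the starting point chosen here.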
   
Since we can analytically calculate them, we also write the functions for $q^{\prime}$ and $q^{\prime\prime}$,

In [None]:
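# dq/deta
def dq(mu,eta):
  la = lamb(mu,eta)
  # d(lambda)/d(eta): only the signal part (la - B) depends on eta
  dl = (la-B)*numpy.log(1+k)
  return 2*eta + 2*dl - 2*n*dl/la

# d^2q/deta^2
def d2q(mu,eta):
  log_k = numpy.log(1+k)
  la  = lamb(mu,eta)
  dl  = (la-B)*log_k  # d(lambda)/d(eta)
  d2l = dl*log_k      # d^2(lambda)/d(eta)^2
  return 2 + 2*d2l*(1-n/la) + 2*n*dl*dl/(la*la)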
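
Analytic derivatives are easy to get wrong, so a finite-difference comparison is a cheap safeguard. The sketch below is standalone: it restates the constants, $\lambda$ and $q$ (using the standard `math` module only to avoid dependencies) and writes the derivatives via the chain rule with $\partial\lambda/\partial\eta=(\lambda-B)\ln(1+\kappa)$. The helper names `dq_analytic` and `d2q_analytic`, the test point and the step size `h` are choices made for this illustration.

```python
import math

sigma_TH, A, eff, l, k, B, n = 0.01, 0.5, 0.9, 100., 0.1, 1.0, 2

def lamb(mu, eta):
  return mu*eff*A*l*((1+k)**eta)*sigma_TH + B

def q(mu, eta):
  la = lamb(mu, eta)
  return eta*eta + 2*la - 2*n*math.log(la)

def dq_analytic(mu, eta):
  la = lamb(mu, eta)
  dl = (la - B)*math.log(1+k)          # d(lambda)/d(eta)
  return 2*eta + 2*dl*(1 - n/la)

def d2q_analytic(mu, eta):
  la = lamb(mu, eta)
  dl = (la - B)*math.log(1+k)          # d(lambda)/d(eta)
  d2l = dl*math.log(1+k)               # d^2(lambda)/d(eta)^2
  return 2 + 2*d2l*(1 - n/la) + 2*n*dl*dl/(la*la)

# Central finite differences at an arbitrary test point
mu0, eta0, h = 3.0, 0.7, 1e-4
dq_num  = (q(mu0, eta0+h) - q(mu0, eta0-h))/(2*h)
d2q_num = (q(mu0, eta0+h) - 2*q(mu0, eta0) + q(mu0, eta0-h))/(h*h)
print(dq_analytic(mu0, eta0), dq_num)
print(d2q_analytic(mu0, eta0), d2q_num)
```

If the two numbers on each printed line disagree by more than the expected finite-difference error, the analytic expression is the first place to look.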

Finally, let's put this into a function which returns the profiled value of $\eta$ for a given value of $\mu$.

In [None]:
def profiled_eta(mu):
  # numerical minimisation via the Newton method
  tol = 0.01
  # start below zero if the nominal prediction exceeds the data, above zero otherwise
  eta = -0.5 if lamb(mu,0)-n > 0 else 0.5
  eps = 100
  while eps > tol:
    qp = dq(mu,eta)
    eps = abs(qp)
    eta = eta - qp/d2q(mu,eta)
  return eta
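
A brute-force scan offers an independent cross-check of a Newton-based profile. The sketch below is standalone and illustrative: it restates the constants and $q$, scans $\eta$ on a fine grid at a fixed $\mu$, and compares the grid minimum with the result of Newton iterations. The names `eta_grid_min` and `eta_newton`, the grid range, the tighter tolerance and the test value $\mu=5$ are assumptions made for this example.

```python
import math

sigma_TH, A, eff, l, k, B, n = 0.01, 0.5, 0.9, 100., 0.1, 1.0, 2
log_k = math.log(1+k)

def lamb(mu, eta):
  return mu*eff*A*l*((1+k)**eta)*sigma_TH + B

def q(mu, eta):
  la = lamb(mu, eta)
  return eta*eta + 2*la - 2*n*math.log(la)

def eta_grid_min(mu, lo=-3.0, hi=3.0, steps=6001):
  # Evaluate q on an evenly spaced eta grid and keep the point with the smallest q
  grid = [lo + i*(hi-lo)/(steps-1) for i in range(steps)]
  return min(grid, key=lambda eta: q(mu, eta))

def eta_newton(mu, eta=0.0, tol=1e-10, max_iter=50):
  # Newton iterations on dq/deta = 0, using chain-rule derivatives
  for _ in range(max_iter):
    la = lamb(mu, eta)
    dl = (la - B)*log_k
    g = 2*eta + 2*dl*(1 - n/la)
    if abs(g) < tol:
      break
    h = 2 + 2*dl*log_k*(1 - n/la) + 2*n*dl*dl/(la*la)
    eta -= g/h
  return eta

mu_test = 5.0
print(eta_grid_min(mu_test), eta_newton(mu_test))
```

The two values should agree to within the grid spacing (0.001 here); the scan is far slower than Newton's method but makes no use of derivatives.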


Let's see how it looks now. First, we can plot the value of $q(\mu,\eta)$ in a 2D color map.

We then call our `profiled_eta` function to plot the value of $\eta$ which minimises $q(\mu,\eta)$ for each value of $\mu$ in a reasonably wide range.

Finally, by plugging these values of $\eta$ into our function for $q$, we obtain the *profiled negative log-likelihood* as a function of $\mu$. 

In [None]:
# plotting
fig, (ax1,ax2) = plt.subplots(1,2)

# 1. plot q(mu,eta)
xaxis = numpy.linspace(-0.9,15,50)
yaxis = numpy.linspace(-3,3,50)
z = [ [q(mu,eta) for mu in xaxis] for eta in yaxis ]
X,Y = numpy.meshgrid(xaxis,yaxis)
c = ax1.pcolor(X,Y,z)
fig.colorbar(c,ax=ax1)
ax1.set_xlabel("$\\mu$")
ax1.set_ylabel("$\\eta$")
ax1.set_title("$q(\\mu,\\eta)$")

# 2. plot the profiled value of eta as a function of mu
eta_mu = [ profiled_eta(mu) for mu in xaxis ]
ax1.plot(xaxis,eta_mu,color='red')

# 3. plot the profiled negative log-likelihood
q_mu = [ q(mu,eta) for mu,eta in zip(xaxis,eta_mu)]
ax2.plot(xaxis,q_mu)
ax2.set_xlabel("$\\mu$")
ax2.set_ylabel("$q(\\mu)$")

plt.show()
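
One feature to look for in the plots can be worked out by hand (this check is an addition, not part of the original exercise): $\eta^{2}$ is smallest at $\eta=0$ and $2\lambda-2n\ln\lambda$ is smallest at $\lambda=n$, so the global minimum of $q$ sits at $\eta=0$ and $\hat{\mu}=(n-B)/(\sigma A\epsilon l_{0})$. The standalone sketch below evaluates this with the values used above and confirms that a few nearby points give a larger $q$. The names `mu_hat`, `q_min` and `neighbours` are only for this illustration.

```python
import math

sigma_TH, A, eff, l, k, B, n = 0.01, 0.5, 0.9, 100., 0.1, 1.0, 2

def lamb(mu, eta):
  return mu*eff*A*l*((1+k)**eta)*sigma_TH + B

def q(mu, eta):
  la = lamb(mu, eta)
  return eta*eta + 2*la - 2*n*math.log(la)

# Value of mu for which lambda(mu, eta=0) equals the observed count n
mu_hat = (n - B)/(sigma_TH*A*eff*l)
q_min = q(mu_hat, 0.0)

# Compare with a few neighbouring points in mu and eta
neighbours = [q(mu_hat + dmu, deta)
              for dmu in (-0.5, 0.0, 0.5) for deta in (-0.5, 0.0, 0.5)
              if (dmu, deta) != (0.0, 0.0)]
print(mu_hat, q_min, min(neighbours))
```

In the plots, this corresponds to the profiled $\eta$ curve crossing $\eta=0$ where the profiled $q(\mu)$ reaches its minimum.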