Efficiency Toy
====

A toy example of the efficiency correction procedure using multiple distributions.

In [None]:
# Just some boilerplate code
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

from matplotlib import rcParams
rcParams['figure.figsize'] = (10.0, 8.0)

rng = np.random.default_rng()
pdf_domain = (-3, 7)
pdf_range = (0, 1.0)


def sample(N, pdf):
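    """
    Use accept-reject sampling to draw N samples from pdf.

    Points are generated one at a time in a Python loop, so this is slow
    for large N. pdf must not exceed pdf_range[1] anywhere on pdf_domain.
    """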
    points = np.zeros(N)
    num_generated = 0

    while num_generated < N:
        # Generate a candidate point uniformly in the bounding box
        x = pdf_domain[0] + (pdf_domain[1] - pdf_domain[0]) * rng.random()
        y = pdf_range[0] + (pdf_range[1] - pdf_range[0]) * rng.random()

        # Accept the candidate if it falls under the PDF curve
        if y < pdf(x):
            points[num_generated] = x
            num_generated += 1

    return points

We have three different distributions: WS and RS ("wrong sign" and "right sign") and phsp (phase space).
Each can be generated either directly from its model ("model sample") or via LHCb Monte Carlo ("MC sample").

The MC samples are affected by the efficiency; by comparing the MC samples to the model samples,
we can extract the efficiency function $\epsilon(x)$.

We have three models; their PDFs are $\mathcal{A}_{RS}(x)$, $\mathcal{A}_{WS}(x)$ and $\mathcal{A}_{phsp}(x)$.
The corresponding PDFs describing the MC are, up to normalisation, $\mathcal{A}_{RS}(x)\epsilon(x)$, etc.

The PDF describing the combined model samples is
$p_{model}(x) = \mathcal{I}_{RS}\mathcal{A}_{RS}(x) + \mathcal{I}_{WS}\mathcal{A}_{WS}(x) + \mathcal{I}_{phsp}\mathcal{A}_{phsp}(x)$,
where the $\mathcal{I}_i$ are numerical factors that can be found from the relative statistics of the MC samples.
The PDF describing the combined MC samples is $p_{model}(x)\epsilon(x)$, up to normalisation.

Below is a toy example of the procedure used to find the efficiency.

First we need to generate model and MC samples:
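
The code cell for this step is not shown here, so the following is only a minimal sketch of how it might look. The three toy models (two Gaussians and a flat phase-space distribution), the linear `efficiency` function, the sample sizes and the helper `accept_reject` are all assumptions made for illustration; they are not the notebook's own definitions. The MC samples are emulated by generating points from each model and keeping each point with probability $\epsilon(x)$, so that they follow $\mathcal{A}_i(x)\epsilon(x)$.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only so the sketch is reproducible
DOMAIN = (-3.0, 7.0)


def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


# Hypothetical toy models; the notebook's real PDFs are not shown
MODELS = {
    "RS": lambda x: gaussian(x, 1.0, 1.0),
    "WS": lambda x: gaussian(x, 3.0, 1.0),
    "phsp": lambda x: np.full_like(x, 0.1),  # flat: 1 / (domain width)
}


def efficiency(x):
    """Assumed toy efficiency: rises linearly from 0.2 to 1.0 across DOMAIN."""
    return 0.2 + 0.8 * (x - DOMAIN[0]) / (DOMAIN[1] - DOMAIN[0])


def accept_reject(n, pdf, y_max=0.5):
    """Vectorised accept-reject: draw candidates in batches until n are accepted.

    y_max must be at least the maximum of pdf on DOMAIN.
    """
    accepted = np.empty(0)
    while accepted.size < n:
        x = rng.uniform(DOMAIN[0], DOMAIN[1], size=n)
        y = rng.uniform(0.0, y_max, size=n)
        accepted = np.concatenate([accepted, x[y < pdf(x)]])
    return accepted[:n]


# Hypothetical numbers of events generated for each distribution
n_generated = {"RS": 40_000, "WS": 20_000, "phsp": 10_000}

# Model samples: drawn straight from each model PDF
model_samples = {k: accept_reject(n, MODELS[k]) for k, n in n_generated.items()}

# MC samples: an independent draw from each model, with each point kept
# with probability efficiency(x), so the survivors follow A_i(x) * eps(x)
mc_samples = {}
for k, n in n_generated.items():
    generated = accept_reject(n, MODELS[k])
    mc_samples[k] = generated[rng.random(n) < efficiency(generated)]
```

Because the assumed efficiency rises with $x$, each MC sample is smaller than its model sample and is shifted towards larger $x$.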

Next we need to find the weights $\mathcal{I}_i$ describing the combined model:
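
The code cell for this step is not shown here, so the following is only a sketch of one possible reading of "relative statistics". If MC sample $i$ contains $N_i$ events after the efficiency and has average efficiency $\bar\epsilon_i = \int \mathcal{A}_i(x)\epsilon(x)\,dx$, the concatenated MC samples follow $\epsilon(x)\sum_i (N_i/\bar\epsilon_i)\mathcal{A}_i(x)$ up to normalisation. Matching this to $p_{model}(x)\epsilon(x)$ suggests $\mathcal{I}_i \propto N_i/\bar\epsilon_i$, which is the number of events generated before the efficiency was applied. All counts below are hypothetical.

```python
# Hypothetical bookkeeping for the three MC samples (not real LHCb numbers)
n_generated = {"RS": 40_000, "WS": 20_000, "phsp": 10_000}  # before efficiency
n_accepted = {"RS": 20_800, "WS": 13_600, "phsp": 6_000}    # after efficiency

# Average efficiency of each sample
avg_eff = {k: n_accepted[k] / n_generated[k] for k in n_generated}

# I_i proportional to N_i / (average efficiency), normalised to sum to 1
raw = {k: n_accepted[k] / avg_eff[k] for k in n_generated}
total = sum(raw.values())
weights = {k: v / total for k, v in raw.items()}

# For comparison: plain fractions of the accepted MC statistics. These agree
# with `weights` only if all three average efficiencies are equal.
total_accepted = sum(n_accepted.values())
naive_fractions = {k: n / total_accepted for k, n in n_accepted.items()}
```

With these numbers the weights come out in the ratio 4 : 2 : 1, the same as the generated counts, whereas the plain accepted fractions are slightly different because the three average efficiencies differ.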

We can then construct the model PDF. This, when multiplied by the efficiency, should also describe the MC sample.
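
The code cell for this step is not shown here; a minimal sketch of building $p_{model}(x)$ as a weighted sum might look like the following. The toy PDFs, the weights in `WEIGHTS` and the linear `efficiency` are assumptions chosen for illustration, not the notebook's own values.

```python
import numpy as np

DOMAIN = (-3.0, 7.0)


def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


# Hypothetical toy models and weights I_i (the weights must sum to 1)
MODELS = {
    "RS": lambda x: gaussian(x, 1.0, 1.0),
    "WS": lambda x: gaussian(x, 3.0, 1.0),
    "phsp": lambda x: np.full_like(x, 0.1),  # flat: 1 / (domain width)
}
WEIGHTS = {"RS": 4 / 7, "WS": 2 / 7, "phsp": 1 / 7}


def efficiency(x):
    """Assumed toy efficiency: rises linearly from 0.2 to 1.0 across DOMAIN."""
    return 0.2 + 0.8 * (x - DOMAIN[0]) / (DOMAIN[1] - DOMAIN[0])


def p_model(x):
    """Combined model PDF: sum over i of I_i * A_i(x)."""
    x = np.asarray(x, dtype=float)
    return sum(WEIGHTS[k] * MODELS[k](x) for k in MODELS)


def p_mc_unnormalised(x):
    """Shape expected for the combined MC sample: p_model(x) * eps(x)."""
    return p_model(x) * efficiency(x)


# Sanity check: p_model should integrate to ~1 over the domain
# (trapezoid rule written out by hand)
x = np.linspace(DOMAIN[0], DOMAIN[1], 2001)
y = p_model(x)
integral = np.sum(0.5 * (y[1:] + y[:-1])) * (x[1] - x[0])
```

Since each component is normalised on the domain (up to negligible Gaussian tails) and the weights sum to one, `integral` should be very close to 1; `p_mc_unnormalised` is deliberately left unnormalised.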

We can recover the efficiency function by taking a sample from the combined model PDF and comparing it to the combined MC samples:
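
The code cell for this step is not shown here. As a hedged sketch, one simple way to make the comparison is to histogram both combined samples with the same binning and take the bin-by-bin ratio. The toy models, the linear `efficiency`, the sample sizes and the binning are all assumptions; for brevity the points are drawn with NumPy's built-in generators rather than with the accept-reject `sample` function defined earlier.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only so the sketch is reproducible
DOMAIN = (-3.0, 7.0)


def efficiency(x):
    """Assumed toy efficiency: rises linearly from 0.2 to 1.0 across DOMAIN."""
    return 0.2 + 0.8 * (x - DOMAIN[0]) / (DOMAIN[1] - DOMAIN[0])


def draw(name, n):
    """Draw n points from a hypothetical toy model (not the notebook's own)."""
    if name == "RS":
        return rng.normal(1.0, 1.0, n)
    if name == "WS":
        return rng.normal(3.0, 1.0, n)
    return rng.uniform(DOMAIN[0], DOMAIN[1], n)  # phsp: flat


# Hypothetical generated statistics; their ratios play the role of the I_i
n_generated = {"RS": 400_000, "WS": 200_000, "phsp": 100_000}

# Combined MC sample: generate from each model, keep with probability eps(x)
generated = np.concatenate([draw(k, n) for k, n in n_generated.items()])
mc = generated[rng.random(generated.size) < efficiency(generated)]

# Independent combined model sample, drawn in the same proportions
model = np.concatenate([draw(k, n) for k, n in n_generated.items()])

# Histogram both samples with identical bins; points in the Gaussian tails
# outside DOMAIN fall outside the bins and are ignored
bins = np.linspace(DOMAIN[0], DOMAIN[1], 21)
mc_counts, _ = np.histogram(mc, bins=bins)
model_counts, _ = np.histogram(model, bins=bins)

# Bin-by-bin ratio, scaled by the total generated and model statistics
eff_estimate = (mc_counts / generated.size) / (model_counts / model.size)
centres = 0.5 * (bins[1:] + bins[:-1])
```

Plotting `eff_estimate` against `centres` should trace the assumed straight line within statistical fluctuations. The flat phsp component is what keeps every bin populated, so the ratio stays well defined across the whole domain.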