# Integration and Sampling

In this notebook we first investigate how to sample random numbers from non-uniform distributions, and then use random number sampling to calculate integrals.

## Requirements

We need a random number generator. We could use one of the RNGs implemented in [`rng.ipynb`](rng.ipynb), but instead we will use the default `numpy` RNG. We also need the `math` module and `matplotlib`.

In [None]:
# Import the `numpy` and `math` modules.
import numpy as np
import math

# Import the `matplotlib` module.
import matplotlib.pyplot as plt

# Create an RNG, with a seed of 10.
rng = np.random.default_rng(10)

## Introduction

Typical events produced at the Large Hadron Collider (LHC) from colliding protons contain $\mathcal{O}(100)$ or more particles. When calculating a cross-section for a two-to-two process we typically only need to integrate over two variables, $\theta$ and $\phi$. A two-to-$n$ process requires integrating over $3n - 4$ variables, so a typical LHC event would require integrating over $\mathcal{O}(300)$ variables. This is numerically challenging at best, and with current technology it is simply not possible. To calculate LHC events, we can instead factorise the problem into more manageable parts using probabilistic methods. Even so, calculating a perturbative cross-section for a $4$-body final state requires integrating over $8$ variables, which is a challenging numerical integration. The bottom line is that performing high-dimensional integrals quickly and efficiently is a core problem in particle physics, and a numerically challenging one.

However, before we tackle integration with Monte Carlo (MC) methods, we first need to discuss how to sample distributions efficiently. In the [`rng.ipynb`](mc/rng.ipynb) notebook, we worked hard to make a good generator for uniformly distributed random variates. In practice, however, the probability distributions of interest are not uniform. Fortunately, uniform random variates can either be transformed into a different distribution or used as part of an accept/reject algorithm that converges to the desired probability distribution. Random variates, uniform or not, are also a primary part of the Monte Carlo integration method, so it is worthwhile to know how to transform a uniform distribution into a more complicated one.

In this notebook, we only consider continuous distributions, but everything that we say can be applied, with some modification, to discrete distributions.
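
As a rough illustration of the discrete case, one common approach is to accumulate the probabilities into a running sum and locate a uniform random number within it. The sketch below assumes a hypothetical helper, `sample_discrete`, a made-up three-outcome distribution, and its own generator `rng_demo`; none of these appear elsewhere in this notebook.

```python
import numpy as np

def sample_discrete(rng, probs, size):
    """Draw `size` outcome indices with relative weights `probs`."""
    probs = np.asarray(probs, dtype=float)
    # Running sum of the weights, the discrete analogue of F(x).
    cdf = np.cumsum(probs)
    # Uniform numbers scaled to the total weight, so `probs` need not sum to one.
    r = rng.uniform(size=size) * cdf[-1]
    # Index of the first running-sum entry that exceeds r.
    idx = np.searchsorted(cdf, r, side="right")
    # Guard against floating-point round-off at the upper edge.
    return np.minimum(idx, len(probs) - 1)

# Separate generator so the notebook's `rng` state is left untouched.
rng_demo = np.random.default_rng(10)
draws = sample_discrete(rng_demo, [0.2, 0.5, 0.3], 100_000)
freqs = np.bincount(draws, minlength=3) / draws.size
print(freqs)
```

The printed frequencies should lie close to the input weights, up to statistical fluctuations.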

## Analytic Sampling

Analytic, or inverse cumulative distribution function (CDF), sampling allows us to transform a uniform distribution into our target distribution, $f(x)$. However, this is not possible for every $f(x)$. To sample $f(x)$, the following conditions must generally be fulfilled.

1. The sampling range of $f(x)$ is bounded, and over this range $f(x)$ is non-negative.

$$
f(x) \geq 0 \text{ for } x_{\min} < x < x_{\max}
$$

2. The integral of $f(x)$ can be calculated.

$$
F(x) = \int \text{d}x\, f(x)
$$

3. The integral $F(x)$ can be inverted; we label the inverse $F^{-1}(x)$.
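
As a quick illustration of these conditions (an example chosen here, not one of the exercises), take $f(x) = 1/x$ on $0 < x_{\min} < x < x_{\max}$. It is positive over the range, its integral is known, and that integral can be inverted.

$$
F(x) = \ln x, \qquad F^{-1}(y) = e^{y}
$$

By contrast, the integral of a Gaussian can only be written in terms of the error function, which has no closed-form inverse in elementary functions, so analytic sampling is not directly applicable there.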

With these three conditions met, we can sample from $f(x)$ as follows. First, consider integrating $f(x)$ from $x_{\min}$ to $x$, as shown in the figure below.

![Schematic of analytic sampling.](figures/sampleAnalytic)

We then draw a uniform random number $R$ between $0$ and $1$ and require that the fraction of the total area lying below $x$ equals $R$, which gives us the following relation.

$$
\int_{x_{\min}}^x \text{d}x'\, f(x') = R \int_{x_{\min}}^{x_{\max}} \text{d}{x'}\, f(x')
$$

We then perform the integration, where $F(x)$ is the indefinite integral of $f(x)$.

$$
F(x) - F(x_{\min}) = R(F(x_{\max}) - F(x_{\min}))
$$

We can then write $F(x_{\max}) - F(x_{\min})$ as $A$, the area under $f(x)$ between $x_{\min}$ and $x_{\max}$.

$$
F(x) - F(x_{\min}) = R A
$$

We then solve for $x$.

$$
x = F^{-1}(F(x_{\min}) + R A)
$$

So, we can uniformly sample $R$ and then use the final relation to transform this into $x$, as sampled from $f(x)$.
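
As a concrete check of the final relation, the sketch below samples $f(x) = e^{-x}$ on $[0, 2]$. Here $F(x) = -e^{-x}$ and $F^{-1}(y) = -\ln(-y)$; note that $F$ is the indefinite integral, so it need not be positive or normalised. The function name `sample_exponential`, the limits, and the separate generator `rng_demo` are illustrative choices, not part of the exercise classes that follow.

```python
import numpy as np

def sample_exponential(rng, xmin, xmax, size):
    """Draw `size` values from f(x) = exp(-x) restricted to [xmin, xmax]."""

    def F(x):
        # Indefinite integral of f(x) = exp(-x).
        return -np.exp(-x)

    def F_inv(y):
        # Inverse of F, valid for y < 0.
        return -np.log(-y)

    # A = F(xmax) - F(xmin), the area under f(x) between the limits.
    area = F(xmax) - F(xmin)
    # Uniform random numbers R in [0, 1).
    r = rng.uniform(size=size)
    # x = F^-1(F(xmin) + R A)
    return F_inv(F(xmin) + r * area)

# Separate generator so the notebook's `rng` state is left untouched.
rng_demo = np.random.default_rng(10)
xs = sample_exponential(rng_demo, 0.0, 2.0, 100_000)
print(xs.min(), xs.max(), xs.mean())
```

All sampled values should fall inside the limits, and the sample mean should come out close to the analytic value $(1 - 3e^{-2})/(1 - e^{-2}) \approx 0.687$.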

### Exercise: generic sampler

Before we try to generate any specific distributions using this method, let us first set up a generic sampler class which uses the steps above.

In [None]:
### START_EXERCISE
class SampleAnalytic:
    """
    Base class to analytically sample a distribution from uniformly
    distributed random numbers.
    """

    def __init__(self, rng, xmin, xmax):
        """
        Initialize the sampler, given the limits on f(x).

        rng:  uniform random number generator, should have method `uniform()`.
        xmin: lower bound of the sampling region.
        xmax: upper bound of the sampling region.
        """
        self.rng = rng
        self.xmin = xmin
        self.xmax = xmax
        self.F_xmin = self.F(xmin)
        self.area = self.F(xmax) - self.F(xmin)

    def f(self, x):
        """
        Return the function being sampled, f(x). This method is not necessary,
        but very useful for importance sampling and checking the distribution.

        x: value to calculate f(x) for.
        """
        # Implement f(x) here.
        return 0.0

    def F(self, x):
        """
        Returns F(x), the indefinite integral for f(x).

        x: value to calculate the indefinite integral for f(x).
        """
        # Implement F(x) here.
        return 0.0

    def F_inv(self, f):
        """
        Returns the inverse of F(x).

        f: the value of F(x) for which to calculate the inverse.
        """
        # Implement F^-1(x) here.
        return 0.0

    def __call__(self):
        """
        Return the sampled value.
        """
        # Define the function from above that transforms a uniformly sampled
        # random number to the desired distribution.
        return 0.0

In [None]:
class SampleLinear(SampleAnalytic):
    """
    Class to analytically sample a linear function.
    """

    def __init__(self, rng, xmin, xmax, m, b):
        """
        Initialize the sampler, given the limits on f(x) and the linear
        parameters.

        f(x) = mx + b

        rng:  uniform random number generator, should have method `uniform()`.
        xmin: lower bound of the sampling region.
        xmax: upper bound of the sampling region.
        m:    slope of the linear distribution.
        b:    intercept of the linear distribution.
        """
        # Set the linear parameters. This must be done before the base class
        # is initialized.
        # Initialize the base class.

    def f(self, x):
        """
        Return the function being sampled, f(x).

        x: value to calculate f(x) for.
        """
        return 0.0

    def F(self, x):
        """
        Returns F(x), the indefinite integral for f(x).

        x: value to calculate the indefinite integral for f(x).
        """
        return 0.0

    def F_inv(self, f):
        """
        Returns the inverse of F(x).

        f: the value of F(x) for which to calculate the inverse.
        """
        # Handle the special case of no slope.
        return 0.0

In [None]:
# Create the sampler.

# Call the `plot_sampler` method.

In [None]:
class SampleCauchy(SampleAnalytic):
    """
    Class to analytically sample a Cauchy function.
    """

    def __init__(self, rng, xmin, xmax, x0, gamma):
        """
        Initialize the sampler, given the limits on f(x) and the Cauchy
        parameters.

        f(x) = 1/pi * gamma/((x - x0)^2 + gamma^2)

        rng:   uniform random number generator, should have method `uniform()`.
        xmin:  lower bound of the sampling region.
        xmax:  upper bound of the sampling region.
        x0:    location parameter.
        gamma: scale parameter.
        """
        # Set the parameters.
        # Initialize the base class.
        super().__init__(rng, xmin, xmax)

    def f(self, x):
        """
        Return the function being sampled, f(x).

        x: value to calculate f(x) for.
        """
        return 0.0

    def F(self, x):
        """
        Returns F(x), the indefinite integral for f(x).

        x: value to calculate the indefinite integral for f(x).
        """
        return 0.0

    def F_inv(self, f):
        """
        Returns the inverse of F(x).

        f: the value of F(x) for which to calculate the inverse.
        """
        return 0.0

In [None]:
# Create the sampler.

# Plot the comparison.