Matthew Sett
<br>
Date: Feb. 1, 2023
<br>
PHYS 2030 W23

# <center><font color=#46769B>Exercise 10: Beta decay</font></center>

## <font color=#46769B>Introduction</font>

While neutrons found in nuclei are generally stable, a "free" neutron is not.
With a half-life for decay of around 10 minutes, a lone neutron will eventually decay through the process of nuclear $\beta$ decay.
In this process, the neutron ($n$) is converted into a proton ($p$), emitting an electron and a neutrino. A lone proton, of course, is stable, making up the nucleus of a hydrogen atom.
Schematically, the reaction is

$$n \to  p + e^- + \bar{\nu}_e \, ,$$

where $e^-$ is an electron and $\bar{\nu}_e$ is a *neutrino*. The neutrino is a nearly massless particle that interacts very feebly. (To be more precise, $\bar{\nu}_e$ is an [electron antineutrino](https://en.wikipedia.org/wiki/Electron_neutrino). This feebleness is why we do not readily perceive neutrinos in our day-to-day lives, despite the fact that the Sun is bombarding us with trillions of neutrinos every second.) 

Here we will concern ourselves with the readily-observable electron that is emitted. The energy of the electron is described by a continuous distribution, known as the $\beta$-decay spectrum.<font color=red>$^1$</font>
The $\beta$-decay spectrum tells us that the probability density for observing an electron with a given energy $E$ is

$$P(E) = \left\{ \begin{array}{cl} A E \sqrt{E^2 - E_m^2} (E_{\rm max} - E)^2 & {\rm for} \; E_m \le E \le E_{\rm max} \\
0 & {\rm otherwise} \end{array} \right. \, , \qquad (1)$$

where the minimum electron energy is given by its rest mass energy $E_m = 0.511 \; {\rm MeV}$ and the maximum available energy is $E_{\rm max} = 1.292 \; {\rm MeV}$.<font color=red>$^2$</font>  
$A = 17.661$ is a normalizing constant.
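As a quick numerical check of Eq. (1), the sketch below codes $P(E)$ and integrates it over $[E_m, E_{\rm max}]$ with a simple trapezoid rule; the result should be close to 1 if $A$ normalizes the spectrum. The function name `P`, the grid size, and the hand-written trapezoid sum are illustrative choices, not part of the exercise statement.

```python
import numpy as np

A = 17.661    # normalizing constant from the text
Em = 0.511    # electron rest-mass energy (MeV)
Emax = 1.292  # maximum available energy (MeV)

def P(E):
    """Beta-decay spectrum of Eq. (1); zero outside [Em, Emax]."""
    E = np.asarray(E, dtype=float)
    inside = (E >= Em) & (E <= Emax)
    # Clip the radicand so np.sqrt never sees a negative number
    radicand = np.clip(E**2 - Em**2, 0.0, None)
    return np.where(inside, A * E * np.sqrt(radicand) * (Emax - E)**2, 0.0)

# Trapezoid-rule integral of P(E) over its support
E_grid = np.linspace(Em, Emax, 100001)
P_grid = P(E_grid)
norm = np.sum(0.5 * (P_grid[1:] + P_grid[:-1]) * np.diff(E_grid))
print(f"Integral of P(E) = {norm:.4f}")
```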

## <font color=#46769B>How good is a proposal distribution?</font>

We will use __importance sampling__ to study the PDF $P(E)$ in Eq. (1) by sampling from a __proposal distribution__ $Q(E)$. We consider two choices for $Q(E)$ and we will assess which one is better. How is this done?

Suppose we want to calculate the following (true) mean of a function, with respect to $P(E)$:

$$\overline{f(E)}_P = \int_{-\infty}^{+\infty} dE \, P(E) \, f(E) \, , \qquad (2) $$ 

e.g., if we want the mean energy, we would simply take $f(E) = E$. Importance sampling says we rewrite this integral as

$$\overline{f(E)}_P = \int_{-\infty}^{+\infty} dE \, \frac{P(E) f(E)}{Q(E)} \, Q(E) = \overline{\frac{P(E) f(E)}{Q(E)}}_Q\, . $$ 

Then we evaluate this integral by sampling $E$ from $Q(E)$, and then computing the mean

$$\left\langle \frac{P(E) f(E)}{Q(E)} \right\rangle = \frac{1}{N} \sum_{i=0}^{N-1} \frac{P(E_i)}{Q(E_i)} f(E_i) = \langle f(E) \rangle_w \, ,$$

which is the formula for the weighted mean, with weights $w_i = P(E_i)/Q(E_i)$.
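
To see the weighted mean at work on a case with a known answer, here is a minimal sketch using a toy target $P(x) = 2x$ on $[0,1]$ (not the $\beta$ spectrum), a uniform proposal $Q(x) = 1$, and $f(x) = x$, for which the exact mean is $2/3$. The toy target, the seed, and all variable names are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
N = 10**5

# Sample from the proposal Q(x) = 1 on [0, 1]
x = rng.uniform(0.0, 1.0, N)

# Weights w = P(x)/Q(x) for the toy target P(x) = 2x
w = 2.0 * x / 1.0

# Weighted mean of f(x) = x, i.e. the sample mean of w_i * f(x_i)
f = x
mean_w = np.sum(w * f) / N

print(f"weighted mean = {mean_w:.4f}, exact = {2/3:.4f}")
```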

Now, we want our proposal distribution $Q(E)$ to give us the best approximation for the true mean. What $Q(E)$ gives us the best result? It turns out that we want $Q(E)$ to be largest where the integrand in Eq. (2), $P(E) f(E)$, is largest. (Consider $f(E)$ to be a nonnegative function, for simplicity, though this is not a requirement.) Intuitively, this is easy to understand, especially if you think of each sample $E_i$ as a *measurement*. 
- If most of our measurements are where the integrand $f(E) P(E)$ gives the most contribution to the integral in (2), we will get accurate results.
- If most of our measurements are where the integrand $f(E) P(E)$ does not contribute much to the integral, we will get less accurate results. Only a few samples, with large weights, will contribute, and effectively it's as if we had chosen a much smaller value of $N$.

That is, we want $Q(E)$ to be large where $P(E) f(E)$ is large, and small where $P(E) f(E)$ is small. In fact, the optimal choice is for $Q(E)$ to be *proportional* to $P(E) f(E)$:

$$Q(E) =  Z P(E) f(E) \qquad (3)$$

where $Z$ is the normalizing constant required for $Q(E)$ to be a valid PDF that is normalized to 1. (We will not prove this; a proof can be found [here](https://stats.stackexchange.com/questions/324668/how-is-this-minimum-variance-worked-out-for-this-importance-sampling-estimator).) Note that:

- The best choice of $Q(E)$ depends not just on $P(E)$, but also on the function $f(E)$ we are calculating the mean of.
- In practice, taking $Q(E)$ as in Eq. (3) is probably not an option. Since we assumed we couldn't sample from $P(E)$ directly, it is probably not possible to sample from Eq. (3) either!

This gives us an idea for how to quantify the goodness of $Q(E)$: calculate the variance of $P(E)f(E)/Q(E)$

$$\Delta\left(\frac{P(E)f(E)}{Q(E)}\right)^2 \qquad (4)$$

For the optimal choice of $Q(E)$ in (3), we have $P(E)f(E)/Q(E) = 1/Z$ for every sample, and therefore there is zero variance. The larger Eq. (4), the worse our proposal distribution.
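
The zero-variance claim can be checked on a toy problem where the optimal proposal is easy to sample. The sketch below assumes a toy target $P(x) = 2x$ on $[0,1]$ with $f(x) = x$, so that $P(x) f(x) = 2x^2$, $Z = 3/2$, and the optimal proposal is $Q(x) = 3x^2$, sampled by inverting its CDF $x^3$. None of this is the $\beta$ spectrum; the setup and names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
N = 10**5

# Inverse-CDF sampling from Q(x) = 3x^2: the CDF is x^3, so x = u^(1/3).
# 1 - random() lies in (0, 1], which avoids x = 0 and a 0/0 ratio.
u = 1.0 - rng.random(N)
x = u ** (1.0 / 3.0)

P = 2.0 * x        # toy target evaluated at the samples
Q = 3.0 * x**2     # optimal proposal for f(x) = x
f = x

ratio = P * f / Q  # equals 2/3 = 1/Z for every sample
print(f"mean = {ratio.mean():.6f}, variance = {np.var(ratio):.2e}")
```

Every sample returns the same value $1/Z = 2/3$, which is also the exact mean, so the variance vanishes up to floating-point rounding.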

Eq. (4) is easily calculated once you have your samples `E` and your weights `w`. For the case where $f(E) = E$, we just need to evaluate

```py
np.var(w*E)  # with numpy imported as np, as in Part (a)
```

In general, if you had a different function $f(E)$, you would calculate

```py
np.var(w*f(E))  # with numpy imported as np, as in Part (a)
```

The smaller this number, the better our proposal distribution.
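
As an illustration of this comparison, the sketch below scores two proposals on a toy problem with known answers: target $P(x) = 2x$ on $[0,1]$ and $f(x) = x$. The first proposal is uniform, $Q_1(x) = 1$; the second is $Q_2(x) = 2x$, i.e., the target itself. The exact variances are $16/45 \approx 0.356$ and $1/18 \approx 0.056$, respectively. The toy setup, seed, and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
N = 10**5

def f(x):
    return x

# Proposal 1: uniform on [0, 1], Q1(x) = 1
x1 = rng.uniform(0.0, 1.0, N)
w1 = 2.0 * x1 / 1.0

# Proposal 2: Q2(x) = 2x, sampled by inverse CDF (the CDF is x^2)
x2 = np.sqrt(1.0 - rng.random(N))  # 1 - random() lies in (0, 1]
w2 = 2.0 * x2 / (2.0 * x2)         # P/Q = 1 for every sample

var1 = np.var(w1 * f(x1))
var2 = np.var(w2 * f(x2))
print(f"uniform proposal : var = {var1:.4f}")
print(f"Q = P proposal   : var = {var2:.4f}")
```

Since the standard error of the estimated mean scales as $\sqrt{{\rm var}/N}$, the second proposal reaches the same accuracy with roughly six times fewer samples.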

Required reading:
- *Lesson 4: Importance Sampling*

Our goals for this notebook are:
- Use importance sampling to describe a target distribution.
- Quantify a good vs. bad proposal distribution.

### <font color=#46769B>Footnotes</font>

<font color=red>$^1$</font> Historically, the continuous energy distribution of electrons in $\beta$-decay was a crucial puzzle in early particle physics. Before neutrinos were known, it was curious that the electron carried only a *fraction* of the total available energy, seemingly violating conservation of energy. To rescue this cherished principle, Pauli postulated a new ["little neutral" particle](https://www.symmetrymagazine.org/article/march-2007/neutrino-invention) responsible for carrying away that missing energy unseen. 
It took several decades before neutrinos were observed directly, finally confirming Pauli's hypothesis. Fast forward to now, neutrino studies and [neutrino factories](https://www.dunescience.org/) have become an integral part of our efforts to understand the fundamental building blocks of nature.


<font color=red>$^2$</font> We express energy in units of mega-electron-volts, where $1 \; {\rm MeV} = 10^6$ electron-volts $\approx 1.6 \times 10^{-13}\; {\rm Joules}$.







## <font color=#46769B>Part (a)</font>

Take a uniform proposal distribution $Q(E)$ that is constant in the domain $[E_m,E_{\rm max}]$ and zero outside this domain. (Be sure your distribution is normalized correctly.) Perform the following tasks:

- Generate $N = 10^5$ samples for $E$ from $Q(E)$.
- Calculate the weights $w = P(E)/Q(E)$ from your samples.
- To check everything, make a plot that shows:
    - Unweighted histogram of your samples for $E$ and a plot of $Q(E)$, which should agree.
    - Weighted histogram of your samples for $E$ and a plot of $P(E)$, which should agree.
    - Include a legend on your plot, and choose an appropriate number of bins and opacity (`alpha`) for your histogram.
- Calculate $\langle E \rangle$ and $\Delta E/\sqrt{N}$ with respect to $P(E)$, i.e., calculated as *weighted* means.
- Calculate the variance of $\frac{P(E)}{Q(E)} E$, as described in the text around Eq. (4).
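
One possible way to set up the numerical (non-plotting) tasks is sketched below. It assumes the weighted mean is taken as the sample mean of $w_i E_i$ and that $\Delta E$ is the weighted standard deviation $\sqrt{\langle E^2 \rangle_w - \langle E \rangle_w^2}$; check these conventions against Lesson 4 before relying on them. The function names `P` and `Q` and the seed are illustrative choices.

```python
import numpy as np

A, Em, Emax = 17.661, 0.511, 1.292

def P(E):
    """Target spectrum of Eq. (1); zero outside [Em, Emax]."""
    inside = (E >= Em) & (E <= Emax)
    radicand = np.clip(E**2 - Em**2, 0.0, None)
    return np.where(inside, A * E * np.sqrt(radicand) * (Emax - E)**2, 0.0)

def Q(E):
    """Uniform proposal, normalized to 1 on [Em, Emax]."""
    inside = (E >= Em) & (E <= Emax)
    return np.where(inside, 1.0 / (Emax - Em), 0.0)

rng = np.random.default_rng(3)  # arbitrary seed
N = 10**5
E = rng.uniform(Em, Emax, N)    # samples from Q
w = P(E) / Q(E)                 # importance weights

E_mean = np.mean(w * E)            # weighted <E>
E2_mean = np.mean(w * E**2)        # weighted <E^2>
dE = np.sqrt(E2_mean - E_mean**2)  # weighted standard deviation (assumed convention)
print(f"<E> = {E_mean:.4f} MeV, dE/sqrt(N) = {dE / np.sqrt(N):.5f} MeV")
print(f"var(w*E) = {np.var(w * E):.4f}")
```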

```py
import numpy as np
import matplotlib.pyplot as plt

# Define constants

A = 17.661
Em = 0.511
Emax = 1.292

# Your code here
```



## <font color=#46769B>Part (b)</font>

Repeat the steps in Part (a) with a normal proposal distribution $Q(E) = \mathcal{N}(\mu,\sigma)$. 

For starters, repeat these steps for the following values which yield "bad" proposal distributions: 
- $\mu = 1$ and $\sigma=100$
- $\mu = 1$ and $\sigma = 0.1$
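
To get a feel for why a very wide normal is a poor proposal, the sketch below draws from $\mathcal{N}(1, 100)$ and counts how many samples land in $[E_m, E_{\rm max}]$, the only region where $P(E)$, and hence the weight, is nonzero. The normal PDF is written out by hand, and the estimate `expected` treats $Q(E)$ as constant across the window, which is a good approximation only because $\sigma$ is so large. The seed and names are illustrative.

```python
import numpy as np

Em, Emax = 0.511, 1.292
mu, sigma = 1.0, 100.0

def Q(E):
    """Normal PDF with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((E - mu) / sigma)**2) / (sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(4)  # arbitrary seed
N = 10**5
E = rng.normal(mu, sigma, N)    # samples from Q

# Only samples inside [Em, Emax] can have a nonzero weight P(E)/Q(E)
inside = (E >= Em) & (E <= Emax)
frac = inside.mean()

# Q is nearly flat over the window, so the expected fraction is about Q * width
expected = Q(0.5 * (Em + Emax)) * (Emax - Em)
print(f"fraction inside window: {frac:.4f} (expected about {expected:.4f})")
```

With only about 0.3% of the samples contributing, the estimate behaves as if $N$ were a few hundred rather than $10^5$.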

Next, repeat everything again, adjusting your values of $\mu$ and $\sigma$ until you find a better proposal distribution than you found in Part (a). 

```py
# Your code here
```
