# Calculating statistical uncertainty for Compton scattering

This notebook will guide you through the calculation of the statistical uncertainty in your experimental estimate of the Compton wavelength shift. 

Step 1. Import numpy and pyplot from matplotlib. Also call the command that ensures matplotlib plots happen inline, within the current window.

In [None]:
#you can just run this cell
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In the Compton scattering experiment, you have measured four different count rates: $R_0, R_1, R_2$ and the background count rate $R_b$.

Each of these measurements is a sample from a statistical process. If you measured them again, the values would vary, not because the counting device malfunctioned but because the number of counts detected in a given time is itself random. The Poisson distribution is an appropriate model for how these count rates vary statistically.

A Poisson distribution with a population mean of $\mu$ has a standard deviation of $\sigma=\sqrt{\mu}$. For the count rate $R_0$, this means its Poisson distribution has a population mean of $\mu_{R_0}$ and a standard deviation of $\sqrt{\mu_{R_0}}$.
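As a quick numerical illustration of the relation $\sigma=\sqrt{\mu}$, the sketch below draws simulated counts from a Poisson distribution and compares the sample standard deviation with the square root of the mean. The value of `mu`, the sample size and the random seed are assumptions chosen only for illustration; they are not measurements from the experiment.

```python
import numpy as np

# Hypothetical population mean (counts per counting interval), for illustration only
mu = 25.0

# Seeded generator so that the run is repeatable
rng = np.random.default_rng(seed=0)

# Simulate many repeated measurements of the same Poisson process
samples = rng.poisson(lam=mu, size=100_000)

sample_mean = samples.mean()
sample_std = samples.std(ddof=1)  # ddof=1 gives the sample standard deviation

print("sample mean           =", sample_mean)
print("sample std            =", sample_std)
print("sqrt(population mean) =", np.sqrt(mu))
```

The sample standard deviation should come out close to, but not exactly equal to, $\sqrt{\mu}$, because it is itself estimated from a finite sample.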

Now your measurement of the average count rate over a time interval, $\bar{R_0}$ for example, can be thought of as a sample mean of its Poisson distribution. It is an estimate of the 'true' mean, or population mean, of the distribution, but the two are not exactly the same because $\bar{R_0}$ was determined from a finite number of measurements of $R_0$ over a short time interval. There is therefore statistical uncertainty in $\bar{R_0}$ as an estimate of $\mu_{R_0}$. This uncertainty is known in statistics as the "standard error".

In general, the smaller the sample size used to calculate the sample mean, the larger its statistical uncertainty (standard error) as an estimate of the population mean. The statistical uncertainty is also larger if the underlying Poisson distribution has a larger standard deviation. Standard error (SE) is calculated by dividing the standard deviation of the Poisson distribution by the square root of the number of samples used to calculate the sample mean. For $SE_{R_0}$ this translates to

$SE_{R_0}=\sqrt{\dfrac{\mu_{R_0}}{\Delta t_{R_0}}}\approx\sqrt{\dfrac{\bar{R_0}}{\Delta t_{R_0}}}.$

We don't actually know the population mean $\mu_{R_0}$ (and therefore the population standard deviation $\sqrt{\mu_{R_0}}$), so we approximate it with the sample mean $\bar{R_0}$ in the above equation. SE decreases with the length of time over which we measured the count rate, because a longer measurement corresponds to a larger sample size.

The same is true of the statistical uncertainty (standard error) in each of your measured count rates $R_1, R_2$ and $R_b$.
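The standard error formula above can be sketched in a few lines. This is only an illustration: the function name `standard_error_of_rate`, the example rate and the example time intervals are hypothetical, and the rate is assumed to be in counts per second with the time interval in seconds.

```python
import numpy as np

def standard_error_of_rate(count_rate, time_interval):
    """Standard error of a mean count rate: sqrt(count_rate / time_interval)."""
    return np.sqrt(count_rate / time_interval)

# Hypothetical mean count rate in counts per second (not a measured value)
example_rate = 50.0

# Hypothetical measurement times in seconds, to show how SE falls with time
for example_time in (10.0, 100.0, 1000.0):
    se = standard_error_of_rate(example_rate, example_time)
    print("delta_t = {:7.1f} s  ->  SE = {:.4f} counts/s".format(example_time, se))
```

Each tenfold increase in the measurement time reduces the standard error by a factor of $\sqrt{10}$.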

Step 2. Define a function "calculate_SE" that will take the "count_rate" and the "time_interval" as arguments and return the calculated statistical uncertainty (standard error) in the sample count rate.

In [None]:
def calculate_SE(count_rate, time_interval):
    #Fill in the calculation of the standard error and return it

Enter your measured values of count rate $\bar{R_0}, \bar{R_1}, \bar{R_2}$ and $\bar{R_b}$ and the time intervals over which you measured them.

In [None]:
R_0=
delta_T_0=
R_1=
delta_T_1=
R_2=
delta_T_2=
R_b=
delta_T_b=


Step 3. Now use your calculate_SE function to calculate the uncertainty in each of your count rates. Print the values to screen so you can write them into your lab book.

In [None]:
SE_R0=
print()
SE_R1=
print()
SE_R2=
print()
SE_Rb=
print()

The statistical uncertainties in the count rates $\bar{R_0}, \bar{R_1}, \bar{R_2}$ and $\bar{R_b}$ impact the statistical uncertainty of your experimental estimate of the x-ray wavelength shift $\Delta\lambda$. 

Before we get further into calculating that statistical uncertainty, we would like an explicit expression for $\Delta\lambda$ in terms of $R_0, R_1, R_2$ and $R_b$. This is a bit tricky and messy. For those who like details, here it is!

Since:

$T_1=\frac{(R_1-R_b)}{(R_0-R_b)}\\
\ln{T_1}=\ln{(R_1-R_b)}-\ln{(R_0-R_b)}\\
\text{Similarly,}\\ \ln{T_2}=\ln{(R_2-R_b)}-\ln{(R_0-R_b)}.$

We also know that:

$T_1=\exp{(-a(\frac{\lambda_1}{100})^n)}\\
\ln{T_1}=-a(\frac{\lambda_1}{100})^n\\
\text{so } \lambda_1=100(\frac{-\ln{T_1}}{a})^{1/n}.$


Similarly:

$\lambda_2=100(\frac{-\ln{T_2}}{a})^{1/n}.$

Now: 

$\Delta\lambda=\lambda_2-\lambda_1\\
\Delta\lambda=100a^{-1/n}((-\ln{T_2})^{1/n}-(-\ln{T_1})^{1/n}).$

We already have an expression for $\ln{T_1}$ and $\ln{T_2}$ in terms of $R_0, R_1, R_2, R_b$, so we can substitute that in.

$\Delta\lambda=100a^{-1/n}((\ln{(R_0-R_b)}-\ln{(R_2-R_b)})^{1/n}-(\ln{(R_0-R_b)}-\ln{(R_1-R_b)})^{1/n})\\
\text{where } a=7.6 \text{ and } n=2.75.$
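The explicit expression can be cross-checked against the two-step route through $T_1$, $T_2$, $\lambda_1$ and $\lambda_2$. The sketch below does this for a set of made-up count rates; the function name `wavelength_shift` and all four rate values are hypothetical placeholders, not experimental data.

```python
import numpy as np

def wavelength_shift(R_0, R_1, R_2, R_b, a=7.6, n=2.75):
    """Wavelength shift written directly in terms of the four count rates."""
    u_1 = np.log(R_0 - R_b) - np.log(R_1 - R_b)  # equals -ln(T_1)
    u_2 = np.log(R_0 - R_b) - np.log(R_2 - R_b)  # equals -ln(T_2)
    return 100 * a ** (-1 / n) * (u_2 ** (1 / n) - u_1 ** (1 / n))

# Hypothetical count rates, chosen only so that the formula can be evaluated
R_0, R_1, R_2, R_b = 100.0, 40.0, 30.0, 2.0
a, n = 7.6, 2.75

# Two-step route: transmissions first, then the two wavelengths
T_1 = (R_1 - R_b) / (R_0 - R_b)
T_2 = (R_2 - R_b) / (R_0 - R_b)
lambda_1 = 100 * (-np.log(T_1) / a) ** (1 / n)
lambda_2 = 100 * (-np.log(T_2) / a) ** (1 / n)

direct = wavelength_shift(R_0, R_1, R_2, R_b)
two_step = lambda_2 - lambda_1

print("direct formula :", direct)
print("via T_1 and T_2:", two_step)
```

Both routes should agree to within floating-point rounding, which is a useful sanity check before moving on to the uncertainty calculation.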



Step 4. In the next cell, calculate the wavelength shift $\Delta\lambda$ using the last formula given (directly above this cell) and your variables R_0, R_1, R_2 and R_b. Then print it to screen and check it against the earlier calculation you made in your lab book.

In [None]:
a=7.6
n=2.75
#Complete the calculating of Deltalambda using the above equation
Deltalambda=
#complete print statement to print value to screen
print()

Looking at the equation above for $\Delta\lambda$, you can see that it is a function of four variables: $R_0, R_1, R_2$ and $R_b$. The general rule for calculating the statistical uncertainty in a quantity $q$ that depends on independent random variables $x,y,\text{ and }z$ is:

$\sigma_q=\sqrt{(\frac{\partial q}{\partial x}\sigma_x)^2+(\frac{\partial q}{\partial y}\sigma_y)^2+(\frac{\partial q}{\partial z}\sigma_z)^2}\\
\text{where } \frac{\partial q}{\partial x} \text{ is the partial derivative of } q \text{ with respect to }x.$
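As a small illustration of the general rule above, consider $q=x-y$, for which the partial derivatives are $+1$ and $-1$. The sketch below combines the terms in quadrature; the helper name `propagate_uncertainty` and the values $\sigma_x=3$, $\sigma_y=4$ are assumptions made purely for the example.

```python
import numpy as np

def propagate_uncertainty(partials, sigmas):
    """Combine (partial derivative * sigma) terms in quadrature."""
    partials = np.asarray(partials, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    return np.sqrt(np.sum((partials * sigmas) ** 2))

# q = x - y, so dq/dx = +1 and dq/dy = -1
# sigma_x = 3 and sigma_y = 4 are hypothetical uncertainties
sigma_q = propagate_uncertainty([1.0, -1.0], [3.0, 4.0])
print(sigma_q)  # 5.0
```

Note that the sign of a partial derivative does not matter, because each term is squared before the terms are added.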

So applying this rule to $\Delta\lambda$ we get:

$\sigma_{\Delta\lambda}=\sqrt{(\frac{\partial \Delta\lambda}{\partial R_0}SE_{R_0})^2+(\frac{\partial \Delta\lambda}{\partial R_1}SE_{R_1})^2+(\frac{\partial \Delta\lambda}{\partial R_2}SE_{R_2})^2+(\frac{\partial \Delta\lambda}{\partial R_b}SE_{R_b})^2}$,

where the standard errors you calculated earlier represent the uncertainties in our sample means $\bar{R_0}$, etc.


Taking the partial derivatives of the equation for $\Delta\lambda$ is tedious. Let's skip to the results and get on with the calculation!

$\frac{\partial \Delta\lambda}{\partial R_0}=\frac{100a^{-1/n}}{n(R_0-R_b)}((\ln{(R_0-R_b)}-\ln{(R_2-R_b)})^{(1/n-1)}-(\ln{(R_0-R_b)}-\ln{(R_1-R_b)})^{(1/n-1)})\\
\frac{\partial \Delta\lambda}{\partial R_1}=\frac{100a^{-1/n}}{n(R_1-R_b)}(\ln{(R_0-R_b)}-\ln{(R_1-R_b)})^{(1/n-1)}\\
\frac{\partial \Delta\lambda}{\partial R_2}=\frac{100a^{-1/n}}{-n(R_2-R_b)}(\ln{(R_0-R_b)}-\ln{(R_2-R_b)})^{(1/n-1)}\\
\frac{\partial \Delta\lambda}{\partial R_b}=\frac{100a^{-1/n}}{n}\left(\left(\frac{1}{R_2-R_b}-\frac{1}{R_0-R_b}\right)\left(\ln\left(R_0-R_b\right)-\ln\left(R_2-R_b\right)\right)^{1/n-1}-\left(\frac{1}{R_1-R_b}-\frac{1}{R_0-R_b}\right)\left(\ln\left(R_0-R_b\right)-\ln\left(R_1-R_b\right)\right)^{1/n-1}\right)$
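One way to gain confidence in expressions like these is to compare them with central finite differences of $\Delta\lambda$ itself. The sketch below does that for a set of made-up count rates. The function names, the step size `h` and the rate values are all hypothetical, and this is only a numerical sanity check rather than part of the required calculation.

```python
import numpy as np

a, n = 7.6, 2.75

def wavelength_shift(R_0, R_1, R_2, R_b):
    """Wavelength shift in terms of the four count rates."""
    u_1 = np.log(R_0 - R_b) - np.log(R_1 - R_b)
    u_2 = np.log(R_0 - R_b) - np.log(R_2 - R_b)
    return 100 * a ** (-1 / n) * (u_2 ** (1 / n) - u_1 ** (1 / n))

def analytic_partials(R_0, R_1, R_2, R_b):
    """The four partial derivatives, transcribed from the formulas above."""
    u_1 = np.log(R_0 - R_b) - np.log(R_1 - R_b)
    u_2 = np.log(R_0 - R_b) - np.log(R_2 - R_b)
    prefactor = 100 * a ** (-1 / n) / n
    p = 1 / n - 1
    d_R0 = prefactor / (R_0 - R_b) * (u_2 ** p - u_1 ** p)
    d_R1 = prefactor / (R_1 - R_b) * u_1 ** p
    d_R2 = -prefactor / (R_2 - R_b) * u_2 ** p
    d_Rb = prefactor * ((1 / (R_2 - R_b) - 1 / (R_0 - R_b)) * u_2 ** p
                        - (1 / (R_1 - R_b) - 1 / (R_0 - R_b)) * u_1 ** p)
    return np.array([d_R0, d_R1, d_R2, d_Rb])

def numerical_partials(rates, h=1e-5):
    """Central finite-difference estimate of each partial derivative."""
    rates = np.asarray(rates, dtype=float)
    grads = np.zeros(rates.size)
    for i in range(rates.size):
        step = np.zeros(rates.size)
        step[i] = h
        grads[i] = (wavelength_shift(*(rates + step))
                    - wavelength_shift(*(rates - step))) / (2 * h)
    return grads

# Hypothetical count rates in the order R_0, R_1, R_2, R_b (not measured values)
rates = [100.0, 40.0, 30.0, 2.0]

analytic = analytic_partials(*rates)
numeric = numerical_partials(rates)
print("analytic:", analytic)
print("numeric :", numeric)
```

If the two rows agree to several significant figures, the transcription of the formulas into code is very likely correct.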

Step 5. In the next cell, carefully translate these formulas for the partial derivatives into code to calculate the value of each from your experimental measurements. (The last one has been done for you already!)

In [None]:
#Calculating partial derivative of wavelength shift with respect to (wrt) R_0
partial_wrt_R_0=
#Calculating partial derivative of wavelength shift with respect to R_1
partial_wrt_R_1=
#Calculating partial derivative of wavelength shift with respect to R_2
partial_wrt_R_2=
#Calculating partial derivative of wavelength shift with respect to R_b
partial_wrt_R_b=(100*(a**(-1/n))/(n))*((1/(R_2-R_b) - 1/(R_0-R_b))*(np.log(R_0-R_b)-np.log(R_2-R_b))**(1/n - 1) - (1/(R_1-R_b) - 1/(R_0-R_b))*(np.log(R_0-R_b)-np.log(R_1-R_b))**(1/n - 1))

Step 6. Now that you have calculated the partial derivative of the wavelength shift with respect to each of your count rate measurements, write a function that calculates the contribution of one measurement to the statistical uncertainty of the wavelength shift. The function will need to take the partial derivative for that variable and its SE value, and return the corresponding squared term from the expression for $\sigma_{\Delta\lambda}$.


In [None]:
def calculateUncertaintyContribution(partialDeriv, SE):
    #Fill in the calculation for one squared term contribution to the total uncertainty


Step 7. Now calculate the uncertainty contribution (uc) of each of your count rates using your function and print them to screen so you can write them in your lab book. We've given the example for the uncertainty contribution of $\bar{R_0}$.


In [None]:
uc_R_0=calculateUncertaintyContribution(partial_wrt_R_0,SE_R0)
print()
uc_R_1=
print()
uc_R_2=
print()
uc_R_b=
print()

Step 8. Note in your lab books which measurement(s) contributed the biggest uncertainty term(s)?
Hence, if you had to improve the uncertainty in wavelength shift, which terms would you measure with greater accuracy (certainty) and how would you do this?
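One possible way to compare the terms is to express each squared contribution as a fraction of their sum. The sketch below uses made-up contribution values purely as placeholders; replace them with your own uc values. The dictionary name and the numbers are hypothetical.

```python
# Hypothetical squared contributions (placeholders, not real results)
contributions = {"R_0": 1.0e-4, "R_1": 9.0e-4, "R_2": 1.6e-3, "R_b": 4.0e-5}

# The total variance is the sum of the squared terms
total_variance = sum(contributions.values())

# Fraction of the total variance carried by each measurement
fractions = {name: value / total_variance for name, value in contributions.items()}

# List the terms from largest to smallest
for name, fraction in sorted(fractions.items(), key=lambda item: item[1], reverse=True):
    print("{}: {:.1%} of total variance".format(name, fraction))

largest = max(contributions, key=contributions.get)
print("largest contribution comes from", largest)
```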

Finally, run the last cell to calculate the total statistical uncertainty in the wavelength shift and write it in your lab book.

In [None]:
sigma_deltaLambda=np.sqrt(uc_R_0+uc_R_1+uc_R_2+uc_R_b)
print("sigma_Deltalambda={}".format(sigma_deltaLambda))