# Systematic Uncertainties and Their Propagation

<CENTER><img src="../../images/ATLASOD.gif" style="width:50%"></CENTER>

This notebook uses ATLAS Open Data https://opendata.atlas.cern to teach you the concepts of systematic uncertainties and how they propagate!

ATLAS Open Data provides open access to proton-proton collision data at the LHC for educational purposes. ATLAS Open Data resources are ideal for high-school, undergraduate and postgraduate students.

Notebooks are web applications that allow you to create and share documents that can contain, for example:
1. live code
2. visualisations
3. narrative text

## The Uncertainty of Measurements

Some numerical statements are exact, such as the number of books on a desk. However, all *measurements*, no matter how carefully they are taken, have some degree of uncertainty that can come from a variety of sources. The process of evaluating uncertainties and identifying sources of error is called **error analysis**. The goal of error analysis is to properly estimate the uncertainties in measurements and to reduce them as much as possible.

### The Importance of Knowing the Uncertainty

The uncertainty of a measurement is just as important as the measured value itself because it tells us how well the measurement was made. If the uncertainty is not reported, we may be misled or unable to draw any valid conclusions.

### Reporting Measurements

Every measurement should be reported as a measured value with its uncertainty and appropriate unit:

$$ \text{measurement} = \text{(measured value $\pm$ uncertainty) units} $$

or

$$ x = x_\text{best} \pm \delta x, $$

where $x_\text{best}$ represents the best estimate of the measurement of some quantity $x$ and $\delta x$ is the associated uncertainty of the best estimate. Sometimes the *relative uncertainty* is used:

$$ \text{Relative uncertainty} = \left| \frac{\text{uncertainty}}{\text{measured value}} \right| = \left| \frac{\delta x}{x_\text{best}} \right|. $$
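As a quick illustration of the reporting convention above, here is a minimal sketch that formats a measurement and computes its relative uncertainty. The helper name `relative_uncertainty` and the length measurement are made up for this example.

```python
def relative_uncertainty(x_best, delta_x):
    """Return |delta_x / x_best| for a measurement x_best ± delta_x."""
    if x_best == 0:
        raise ValueError("relative uncertainty is undefined for x_best = 0")
    return abs(delta_x / x_best)

# Hypothetical measurement: a length of (2.50 ± 0.05) m
length, delta_length = 2.50, 0.05
rel = relative_uncertainty(length, delta_length)

print(f"length = ({length:.2f} ± {delta_length:.2f}) m")
print(f"relative uncertainty = {rel:.1%}")
```

The relative uncertainty is dimensionless, which makes it convenient for comparing the quality of measurements that have different units.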

## Random and Systematic Errors

We may classify errors in measurement as either *random* or *systematic*, depending on how the measurement was obtained.

* **Random Errors:** Statistical fluctuations (in either direction) in the measurement due to the precision limitations of the measuring device. These types of errors can be detected statistically and can be reduced by taking a large number of measurements. 

* **Systematic Errors:** Reproducible inaccuracies that are consistently in the same direction. These types of errors are difficult to detect and cannot be reduced by increasing the number of measurements. 
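The difference between the two kinds of error can be seen in a small simulation. The sketch below assumes Gaussian random scatter and a constant systematic shift; the numbers (`TRUE_VALUE`, `OFFSET`, `NOISE`) are invented for illustration. It uses the standard error of the mean, $\sigma/\sqrt{N}$, as the statistical uncertainty on the average.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

TRUE_VALUE = 10.0   # hypothetical "ideal" value
OFFSET = 0.3        # hypothetical systematic shift, e.g. a miscalibrated scale
NOISE = 0.5         # hypothetical random scatter of a single reading

def simulate(n):
    """Return (mean, standard error of the mean) for n simulated readings."""
    readings = TRUE_VALUE + OFFSET + rng.normal(0.0, NOISE, size=n)
    return readings.mean(), readings.std(ddof=1) / np.sqrt(n)

results = {n: simulate(n) for n in (10, 1_000, 100_000)}
for n, (mean, sem) in results.items():
    print(f"N = {n:>6}: mean = {mean:.3f} ± {sem:.3f}, "
          f"offset from true value = {mean - TRUE_VALUE:+.3f}")
```

As the number of readings grows, the statistical uncertainty on the mean shrinks, but the mean settles on the shifted value rather than the true one: taking more data does not remove the systematic offset.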

Below is a list of some common sources of errors:

* **Incomplete Definition** (systematic or random). The quantity to be measured may not be clearly defined, leading to different measurements by different people. For example, if two different people measure the length of a string, they may get different answers since each person may stretch the string with a different tension when making the measurement.

* **Environmental Factors** (systematic or random). The surroundings may cause a measurement to fluctuate or shift.

* **Calibration** (systematic). If a measuring instrument was not calibrated correctly (or at all) before making measurements, then all the measurements will be off by the same amount.

In High Energy Physics (HEP) experiments, systematic errors can occur from a variety of sources, including event generation, calibration, collider and detector simulation, and particle reconstruction and identification.

## Accuracy and Precision

For a single measurement, **accuracy** tells you how close your measurement is to an ideal, theoretical, or accepted value (assuming one exists). For a group of measurements, it is how close the *average* is to the ideal value. It is often reported quantitatively by the **relative error:**

$$ \text{Relative error} = \frac{\text{measured value} - \text{ideal value}}{\text{ideal value}}. $$

A positive sign for relative error indicates that the measured value was higher than the ideal value, and a negative sign indicates that it was lower. Often the relative error is multiplied by 100 to give a percentage. Poor accuracy is usually an indication of large *systematic errors*.


For a group of measurements, **precision** tells you how close your observed values are to one another. In other words, it is the degree of consistency, reliability, reproducibility, and agreement among independent measurements of the same quantity. It is often reported quantitatively by the **standard deviation**:

$$ \sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - \bar{x})^2 }, $$

where $N$ is the number of measurements made, $x_i$ is the $i$th measured value, and $\bar{x}$ is the average of all the measured values. The standard deviation quantifies the *spread* of the measured values. A low standard deviation means a small spread in measurements (high precision), and a high standard deviation means a large spread in measurements (low precision). Poor precision is usually an indication of large *random errors*.
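Both quantities are straightforward to compute with NumPy. The following sketch uses a made-up set of five measurements of the gravitational acceleration and takes 9.81 m/s² as the accepted value; note that `ddof=1` is needed to get the $1/(N-1)$ normalisation used in the formula above.

```python
import numpy as np

# Hypothetical repeated measurements of g, in m/s^2
measurements = np.array([9.82, 9.86, 9.84, 9.88, 9.80])
ideal_value = 9.81  # accepted value used for comparison

mean = measurements.mean()

# Accuracy: relative error of the average with respect to the ideal value
relative_error = (mean - ideal_value) / ideal_value

# Precision: sample standard deviation with 1/(N-1) normalisation
sigma = measurements.std(ddof=1)

print(f"mean           = {mean:.3f} m/s^2")
print(f"relative error = {relative_error:+.2%}")
print(f"std. deviation = {sigma:.3f} m/s^2")
```

By default `np.std` divides by $N$ rather than $N-1$, so omitting `ddof=1` would slightly underestimate the spread for small samples.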

### Summary: Target Practice

To summarize the concepts we have discussed so far, consider the four target practice experiments shown below. Here each experiment involves a series of shots fired at a target, with the "ideal value" being the center of the target.

<div>
<img src="attachment:target_practice.png" width="500"/>
</div>

**Figure 1.** Summarizing systematic and random errors, accuracy, and precision using a target practice analogy.

There are four cases to examine:

**(a)** None of the shots are close to the center of the target, so the accuracy is low. The systematic errors are high since the shots are all systematically off-centered in the same direction, in this case toward the upper right. On the other hand, the precision is high because all of the shots are close to one another. This means that the random errors are low.

**(b)** All of the shots made it to the center of the target, so the accuracy is high and the systematic errors are low. Furthermore, the shots are all close to one another, so the precision is also high and the random errors are low. This is the best case scenario, and it is what we strive for in any experiment.

**(c)** None of the shots are close to the center of the target, so the accuracy is low and systematic errors high. Worse still, the shots are not close to each other, so the precision is also low and random errors high. This is the worst case scenario, and it is what we try to avoid in any experiment.

**(d)** The shots are either fairly close to or at the center of the target, so the accuracy is high and systematic errors low. However, the shots are not very close to each other, so the precision is low and random errors high.

Although this target practice analogy summarizes the concepts nicely, it is misleading in one important aspect. Since we are given the position of the target for each experiment, we are able to easily tell how accurate the shots were. Knowing the position of the target amounts to knowing the ideal value of a measured quantity, and in the vast majority of real measurements, we do *not* know this value. 

To appreciate the difficulty of not knowing the ideal value of a measured quantity, consider the target practice analogy again, but without the positions of the targets. Although we can still easily identify the precision and random errors of the shots, there is no way of knowing the accuracy or systematic errors of the shots.

<div>
<img src="attachment:target_practice_improved.png" width="500"/>
</div>

**Figure 2.** The same four target practice experiments but without the position of the target. This represents the true nature of most experiments, in which we do not know the "ideal value" of what we are measuring. 

## The Propagation of Uncertainties

Often, we are unable to directly measure the quantity of interest and instead have to *calculate* it from quantities that *can* be directly measured. The uncertainties of the individual measurements combine to form an uncertainty in the calculated quantity, which is determined by the procedure of **propagation of uncertainties** (the uncertainties *propagate* through the calculation).

Suppose $z = z(x_1, x_2, \ldots, x_n)$ is any function of the quantities $x_1, x_2, \ldots, x_n$, whose uncertainties $\delta x_1, \delta x_2, \ldots, \delta x_n$ are known. If these uncertainties are independent of one another, the uncertainty of $z$ is given by the general rule:

$$ \delta z = \sqrt{ \left(\frac{\partial z}{\partial x_1}\delta x_1 \right)^2 + \left(\frac{\partial z}{\partial x_2}\delta x_2 \right)^2 + \cdots + \left(\frac{\partial z}{\partial x_n}\delta x_n \right)^2 }. $$

This is sometimes written as

$$ \delta z = \frac{\partial z}{\partial x_1}\delta x_1 \oplus \frac{\partial z}{\partial x_2}\delta x_2 \oplus \cdots \oplus \frac{\partial z}{\partial x_n}\delta x_n, $$

where $\oplus$ denotes *addition in quadrature* (that is, you square each term, sum the squares, and then take the square root). From this general rule, we can find expressions for particular forms of $z$.
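The general rule can also be applied numerically when the partial derivatives are tedious to work out by hand. The sketch below is one possible approach, not a standard library routine: the function name `propagate` and the speed example are hypothetical, the partial derivatives are approximated with central finite differences, and the input uncertainties are assumed to be independent.

```python
import numpy as np

def propagate(func, values, uncertainties, rel_step=1e-6):
    """Estimate delta z for z = func(*values) by adding the terms
    (dz/dx_i * delta x_i) in quadrature.

    Partial derivatives are approximated with central finite differences.
    """
    values = np.asarray(values, dtype=float)
    uncertainties = np.asarray(uncertainties, dtype=float)
    terms = []
    for i in range(len(values)):
        step = rel_step * max(abs(values[i]), 1.0)
        up, down = values.copy(), values.copy()
        up[i] += step
        down[i] -= step
        dz_dxi = (func(*up) - func(*down)) / (2 * step)
        terms.append(dz_dxi * uncertainties[i])
    return float(np.sqrt(np.sum(np.square(terms))))

# Hypothetical example: speed v = d / t
d, delta_d = 100.0, 0.5   # distance in metres
t, delta_t = 9.8, 0.1     # time in seconds

v = d / t
delta_v = propagate(lambda d, t: d / t, [d, t], [delta_d, delta_t])
print(f"v = ({v:.2f} ± {delta_v:.2f}) m/s")
```

For simple functions the analytic special cases are quicker and exact; a numerical approach like this is mainly useful as a cross-check or for functions with many inputs.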

### Sums and Differences of Measured Quantities

If $z = x_1 \pm x_2 \pm \cdots \pm x_n$, then the uncertainty in $z$ is given by

$$ \delta z = \sqrt{ (\delta x_1)^2 + (\delta x_2)^2 + \cdots + (\delta x_n)^2 }. $$
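As a small worked example of the sum and difference rule, the sketch below combines two made-up length measurements. The helper name `add_in_quadrature` is an illustrative choice.

```python
import math

def add_in_quadrature(*uncertainties):
    """Square each uncertainty, sum the squares, and take the square root."""
    return math.sqrt(sum(u ** 2 for u in uncertainties))

# Hypothetical lengths of two rods, in cm
a, delta_a = 12.3, 0.3
b, delta_b = 4.1, 0.4

# The same absolute uncertainty applies to the sum and to the difference
delta_z = add_in_quadrature(delta_a, delta_b)

print(f"a + b = ({a + b:.1f} ± {delta_z:.1f}) cm")
print(f"a - b = ({a - b:.1f} ± {delta_z:.1f}) cm")
```

Note that the difference has the same absolute uncertainty as the sum even though its value is smaller, so subtracting two similar quantities can produce a large relative uncertainty.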

### Products and Quotients of Measured Quantities

If $z = x_1 \times x_2 \times \cdots \times x_n$ or $z = x_1 \div x_2 \div \cdots \div x_n$, the relative uncertainty in $z$ is given by

$$ \frac{\delta z}{|z|} = \sqrt{ \left(\frac{\delta x_1}{x_1} \right)^2 + \left(\frac{\delta x_2}{x_2} \right)^2 + \cdots + \left(\frac{\delta x_n}{x_n} \right)^2 }. $$
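The product and quotient rule can be illustrated with a density calculated from a mass and a volume. The values and the helper name `relative_uncertainty_of_product` below are invented for this sketch.

```python
import math

def relative_uncertainty_of_product(values, uncertainties):
    """Add the relative uncertainties of the factors in quadrature."""
    return math.sqrt(sum((dx / x) ** 2 for x, dx in zip(values, uncertainties)))

# Hypothetical measurements
m, delta_m = 250.0, 5.0   # mass in g (2% relative uncertainty)
V, delta_V = 100.0, 3.0   # volume in cm^3 (3% relative uncertainty)

rho = m / V
rel_rho = relative_uncertainty_of_product([m, V], [delta_m, delta_V])
delta_rho = abs(rho) * rel_rho

print(f"relative uncertainty = {rel_rho:.1%}")
print(f"rho = ({rho:.2f} ± {delta_rho:.2f}) g/cm^3")
```

Because the terms add in quadrature, the combined relative uncertainty is smaller than the simple sum of 2% and 3%.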

### Measured Quantity Times Exact Number

If $z = kx$, where $k$ is known exactly, then the uncertainty in $z$ is given by

$$ \delta z = |k| \delta x. $$

This is generally written as a relative uncertainty, in which we divide both sides by $|z|$:

$$ \frac{\delta z}{|z|} = \frac{\delta x}{|x|}. $$

### Measured Quantity Raised to an Exact Power

If $z = x^n$ and $n$ is an exact number, then 

$$ \delta z = |n| \, |x|^{n-1} \, \delta x. $$

This is generally written as a relative uncertainty, in which we divide both sides by $|z|$:

$$ \frac{\delta z}{|z|} = |n| \frac{\delta x}{|x|}. $$
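The last two rules often appear together. For instance, the area of a circle, $A = \pi r^2$, is an exact constant times a measured quantity raised to an exact power, so its relative uncertainty is twice that of the radius. The sketch below uses a made-up radius and a hypothetical helper named `power_rule`.

```python
import math

def power_rule(x, delta_x, n, k=1.0):
    """Return (z, delta_z) for z = k * x**n, where k and n are exact."""
    z = k * x ** n
    # The exact factor k does not change the relative uncertainty
    delta_z = abs(z) * abs(n) * delta_x / abs(x)
    return z, delta_z

# Hypothetical radius measurement, in cm
r, delta_r = 2.00, 0.05

area, delta_area = power_rule(r, delta_r, n=2, k=math.pi)

print(f"relative uncertainty on r = {delta_r / r:.1%}")
print(f"relative uncertainty on A = {delta_area / area:.1%}")
print(f"A = ({area:.2f} ± {delta_area:.2f}) cm^2")
```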