# Review of errors

In a previous lab we discussed four general sources of error and how random errors can be reduced by averaging. In this lab we continue that discussion, exploring alternative ways to express errors and how to estimate how large an effect an error will have on a computation.


## accuracy and precision

First, let's establish some shared terminology. The distinction between random and systematic errors is often phrased in terms of how accurate a measurement is and how precise a measurement is. Professional scientists and engineers use the words "accuracy" and "precision" to refer to two different, related concepts. *Accuracy* refers to how close the measurement is to the true value. *Precision* refers to how close repeated measurements are to each other. This is summarized in the following image -- in this metaphor the arrows are our measurements, and the target is the true value.

<img src=accuracy_and_precision.png width=600>

In the terms described above, a measurement is accurate if $|\varepsilon|$ is small. A measurement is precise if it is close to the average value. It follows that the accuracy can only be as good as the precision, so we often do want to have good precision. However, as the picture shows, it is possible (and indeed, common) to have lots of precision but not much accuracy. In that case, the precision isn't really doing you any good. The real goal is accuracy. Don't lose sight of that!

# Two ways of reporting errors

So far we have been talking about the error $\varepsilon$ directly. This is also called the *absolute error*. This term has nothing to do with absolute value. Absolute error can be positive or negative. The units of absolute error are the same as the units of the measured quantity. A typical ruler has markings down to the 1/16 inch, so interpolating by eye you can make sure the error is less than 1/32 inch. This can be very good for measuring, say, how long your pencil is, since a typical pencil is around 7 inches long. The pencil on my desk is 7 1/8 inches long, $\pm$1/32 inch. I use the symbol $\pm$ to indicate that I don't know whether my guess is too high or too low. All I can say for sure is
$$
7\frac{1}{8} -\frac{1}{32}< t < 7\frac{1}{8}+\frac{1}{32}.
$$

A ruler like this would not be as appropriate for measuring how wide the pencil is -- it can only tell me it is 1/4 inch wide, $\pm$1/32 inch. The absolute error is the same as before, but it is now a much larger fraction of the quantity being measured. To express this idea mathematically, we simply divide the error by the true value -- or if we don't know the true value, by our best guess. We write this as
$$
\eta = \frac{\varepsilon}{t} \approx \frac{\varepsilon}{x}, 
$$
using the Greek letter eta ($\eta$) which also makes an "e" sound, though it looks sort of like a lowercase n. This quantity is called the *relative error*. Unlike the absolute error, the relative error is dimensionless -- it would be the same if we expressed the answer in inches or centimeters or smoots. Relative error is usually expressed as a percentage. For example, the length of my pencil is 7 1/8 inches $\pm$ 0.44%. The width of my pencil is 1/4 inch $\pm$ 12.5%. This shows how a ruler is not appropriate for measuring the width of a pencil, but it's fine for measuring the length -- the relative error is significantly different.
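
The two percentages quoted for the pencil can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `relative_error` is an illustrative choice, and exact fractions are used so that no roundoff creeps into the check.

```python
from fractions import Fraction

def relative_error(abs_error, value):
    """Relative error: absolute error divided by the best-guess value."""
    return abs_error / value

ruler_error = Fraction(1, 32)                 # inches, read off the ruler by eye
pencil_length = Fraction(7) + Fraction(1, 8)  # inches
pencil_width = Fraction(1, 4)                 # inches

length_rel = relative_error(ruler_error, pencil_length)
width_rel = relative_error(ruler_error, pencil_width)

# Format as percentages, as relative errors are usually reported
print(f"length: {float(length_rel):.2%}")
print(f"width:  {float(width_rel):.2%}")
```

The same absolute error of 1/32 inch gives a relative error of well under 1% for the length and more than 10% for the width.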

We often want an equation relating $t$ and $x$ using the relative error instead of the absolute error. This is easily achieved by factoring:

\begin{align*}
x 
&= t + \varepsilon\\
&= t\left(1+\frac{\varepsilon}{t}\right)\\
&=t(1+\eta)
\end{align*}

### you try:

For each of the quantities expressed below, identify whether it is an absolute error or a relative error. Then, convert it to the other type of error. Make sure you are using appropriate units.

1. 135 pounds $\pm$ 3 pounds
2. 15 PSI $\pm$ 30%
3. 22 hours $\pm$ 0.5%
4. 3 weeks $\pm$ 4 days

## combined errors and significant figures

Most calculations we do involve more than one number. Each number has its own error associated with it. As an example, suppose I measure the length of one pencil to be $x_1$, and another to be $x_2$ inches long. How long will they be together, put end to end? $y = x_1+x_2$, of course. How will the errors relate to each other? They will also add together. We can see this in a formula:

\begin{align*}
(x_1+\varepsilon_1)+(x_2+\varepsilon_2) &= y+(\varepsilon_1+\varepsilon_2)\\
&=y+\varepsilon_{total}
\end{align*}

The answer is of the form $y+\varepsilon_{total}$, where the total error is $\varepsilon_{total}=\varepsilon_1+\varepsilon_2$. So, when adding quantities with error, the absolute errors should add together. 
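For example, if one pencil measures 7 1/8 inches $\pm$ 1/32 inch and a second measures 6 1/2 inches $\pm$ 1/32 inch, then end to end they measure 13 5/8 inches $\pm$ 1/16 inch.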

What about if the numbers are multiplied? For example, if you know the length and width of a rectangular sheet of paper, how do you find the error associated with calculating the area? Let's try the formula again:

$$
(x_1+\varepsilon_1)(x_2+\varepsilon_2) = x_1x_2 +x_1\varepsilon_2 + x_2\varepsilon_1 + \varepsilon_1\varepsilon_2.
$$

That is not such a tidy formula. There are a couple of things we can do to make it easy on ourselves, though. For one thing, in any reasonable computation, the error is much smaller than the actual quantity being measured. It follows that the product of two errors should be much, much smaller than the product of the two quantities. Symbolically, we can write this as $\varepsilon_1\varepsilon_2\ll x_1x_2$. So, we should be safe ignoring that term in the equation:

$$
(x_1+\varepsilon_1)(x_2+\varepsilon_2) \approx x_1x_2 +x_1\varepsilon_2 + x_2\varepsilon_1.
$$

We can do better yet. Let's do just a bit of factoring:

\begin{align*}
x_1x_2 +x_1\varepsilon_2 + x_2\varepsilon_1 &= x_1x_2\left(1+\frac{\varepsilon_2}{x_2}+\frac{\varepsilon_1}{x_1}\right)\\
&=x_1x_2(1+\eta_2+\eta_1)\\
&=x_1x_2(1+\eta_{total})
\end{align*}

There is our rule! Instead of the *absolute* errors adding together, the *relative* errors add when multiplying. This is another reason why relative error is so popular: it is easier to propagate a relative error through a product.

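For example, suppose a sheet of paper measures 8 1/2 inches by 11 inches, each $\pm$ 1/32 inch. The relative errors are about 0.37% and 0.28%, so the area is 93.5 square inches $\pm$ 0.65%, or about $\pm$ 0.61 square inches.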
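
It is worth checking numerically how little is lost by dropping the $\varepsilon_1\varepsilon_2$ term. The following is a minimal sketch: the function name `product_with_error` and the measurements (a sheet 8.5 by 11 inches, each off by 1/32 inch) are assumptions chosen for illustration.

```python
def product_with_error(x1, e1, x2, e2):
    """Return (product, sum of relative errors, exact relative error)."""
    product = x1 * x2
    approx_rel = e1 / x1 + e2 / x2          # rule: relative errors add
    exact_rel = ((x1 + e1) * (x2 + e2) - product) / product
    return product, approx_rel, exact_rel

# Hypothetical measurements: 8.5 in by 11 in, each too large by 1/32 in
area, approx_rel, exact_rel = product_with_error(8.5, 1 / 32, 11.0, 1 / 32)

print(f"area = {area} square inches")
print(f"sum of relative errors = {approx_rel:.4%}")
print(f"exact relative error   = {exact_rel:.4%}")
print(f"neglected term         = {exact_rel - approx_rel:.1e}")
```

The difference between the two relative errors is exactly the dropped term $\varepsilon_1\varepsilon_2/(x_1x_2)$, which here is hundreds of times smaller than the terms that were kept.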

It should be noted that not all numbers carry errors with them. Consider, for example, the formula for the surface area of a sphere, $A=4\pi r^2$. Presumably when you measure the radius, that will involve some error. Also, since $\pi$ is irrational, we can only approximate it, so there would be error associated with it as well. However, 4 is a pure number. So there would only be two potential sources of error in computing the surface area of a sphere: error in your approximation for $\pi$ and error in your measured radius.

TODO: write exercises

# The derivative rule for error propagation

We have shared two rules for combining errors, when adding and when multiplying. There is just one more tool you need to be able to propagate errors through a great variety of computations. That is a general rule for how to propagate an error through a general function.

## errors in linear functions

Let's do the simplest function we can think of first. Suppose you know the absolute error in $x$ is $\varepsilon=0.1$ and we wish to know the error in computing $y=mx+b$. Let's assume for now that $m$ is positive. The largest $x$ can possibly be is $x+0.1$, so the largest $y$ can be is $m (x+0.1)+b= mx+b+0.1m$. The smallest $x$ can be is $x-0.1$, so the smallest $y$ can be is $m (x-0.1) +b = mx+b-0.1m$. That is, we could write 
$$
y= mx+b \pm 0.1m.
$$
The absolute error was multiplied by the slope of the line; the intercept had nothing to do with it.

## exercise
Explain what, if anything, would be different if $m$ were negative.

# errors in general functions

If our computations are of any value at all, we have to assume that the input error is pretty small, compared with the other numbers in the computation. Therefore, if a function is differentiable, a linear approximation should be pretty good. That is,

$$
f(x+\varepsilon) \approx f(x) + f'(x)\varepsilon.
$$

This lets us propagate errors through any function we can differentiate! Let's do an example: if we measure $x=3\pm0.05$, what is $y=x^2+6x-3$ with the error? Our best estimate is $3^2+6\times 3-3=24$. The derivative is $f'(x)=2x+6$, so the error is $f'(3)\times 0.05=12\times 0.05=0.6$. So, we should report the answer as $y=24\pm 0.6$.
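
To see how good the linear approximation is in this example, we can compare its prediction with the actual change in $y$ at the two ends of the interval. This is a minimal sketch; the names `f`, `f_prime`, `high` and `low` are illustrative choices.

```python
def f(x):
    return x**2 + 6*x - 3

def f_prime(x):
    return 2*x + 6

x, eps = 3.0, 0.05

estimate = f(x)
linear_err = f_prime(x) * eps        # error predicted by the linear approximation

# Actual deviation of y at the two ends of the interval [x - eps, x + eps]
high = f(x + eps) - estimate
low = estimate - f(x - eps)

print(f"y = {estimate} +/- {linear_err:.4f}")
print(f"actual deviations: +{high:.4f} / -{low:.4f}")
```

The actual deviations are 0.6025 above and 0.5975 below, so the predicted 0.6 is off by less than half a percent of the error itself.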

## exercise

For each function and measurement below, report the value of the result with its error.

 1. $x=\pi/6\pm 0.2$, $y=\sin(x)$
 2. $x=2\pm 0.05$, $y=e^x$
 3. $x=3\pm 0.01$, $y=x^2$
 4. $x=3\pm 1\%$, $y=x^2$

# A caution about small derivatives

Here is an example which shows the limitations of this strategy. What if $x=0\pm 0.1$, and $y=\cos(x)$? This approximation would tell you the error should be $0.1f'(0)=-0.1\sin(0)=0$. Is it accurate to say the error is zero in this case? That was a rhetorical question. The answer is no. We know the value of $x$ is somewhere between $-0.1$ and $0.1$, and so $y$ could be as low as $\cos(0.1)\approx 0.995$. It would be better to say the error is about $0.005$. Very small indeed, but not zero. In a later lab (once you have learned about Taylor series) we will discuss how to treat errors in this case. In the meantime, beware! This approximation does not work when $f'(x)=0$ (or even if $f'(x)$ is very small). We will also share a general method you can use even if you don't know the derivative, provided you have a good computer.
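
The failure is easy to reproduce on a computer. The sketch below compares the derivative rule with a brute-force scan of the interval; the scan and the sample count `n = 1000` are illustrative assumptions, not necessarily the general method promised for the later lab.

```python
import math

x, eps = 0.0, 0.1

linear_err = abs(-math.sin(x)) * eps      # derivative rule: |f'(x)| * eps

# Sample the interval [x - eps, x + eps] and record the largest deviation from cos(x)
n = 1000
samples = [x - eps + 2 * eps * k / n for k in range(n + 1)]
actual_err = max(abs(math.cos(s) - math.cos(x)) for s in samples)

print(f"derivative rule: {linear_err}")
print(f"sampled error:   {actual_err:.6f}")
```

The derivative rule reports exactly zero, while the scan finds the deviation of about 0.005 described above.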

## exercise

For the function $f(x)=x^3-3x$, give the value and error associated with $y=f(x)$ when $x=1\pm 0.05$ using the approximation method given above. Explain why it is not a good estimate, and give a better estimate.

# Putting it all together

Let's tackle a question we brought up in a previous lab: what is the volume of the earth? We will assume the earth is a sphere, which introduces a modeling error of around 0.005%. Wikipedia lists the earth's radius as 6371.0 km, which we can interpret as $6371\pm 0.05$ km. Recall the formula for the volume of a sphere:
$$
V=\frac{4}{3}\pi r^3.
$$
This lets us compute the answer:

In [1]:
import math

r = 6371
v = (4/3)*math.pi*r**3
print(f'{v:,}')

1,083,206,916,845.7535


There is our answer: one trillion, eighty-three billion, two hundred six million, nine hundred sixteen thousand, eight hundred forty-five point seven five three five cubic kilometers. Is that the best way to represent it? It's certainly a mouthful. Are we justified in reporting so many figures? What is the error?

We can use our linear approximation rule to find the error in computing $r^3$:

In [2]:
cube_error = 3 * r**2 * 0.05
print(f'{cube_error:,}')

6,088,446.15


That's the absolute error. Since $r^3$ will be multiplied by the other factors in the formula, we should find the relative error:

In [3]:
cube_relative_error = cube_error / r**3
print(cube_relative_error)

2.3544184586407157e-05
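
As a cross-check, the same relative error follows directly from the derivative rule, without computing the absolute error first. Writing $\varepsilon=0.05$ km for the absolute error in $r$:

$$
\frac{3r^2\varepsilon}{r^3} = 3\,\frac{\varepsilon}{r} = 3\times\frac{0.05}{6371} \approx 2.35\times 10^{-5}.
$$

In other words, to first order, cubing a quantity triples its relative error.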


What about $\pi$? We are using a floating-point approximation for $\pi$, so we only have so many digits:

In [4]:
print(math.pi)

3.141592653589793


That gives us roundoff error, on the order of $10^{-16}$. We should add the relative errors, since $\pi$ is multiplied in:

In [5]:
cube_relative_error + 10**-16

2.3544184586507158e-05

Notice this has almost no effect on the value (only the eleventh significant digit changes), because it's so small. This is how we know we have more than enough digits of $\pi$.

4 and 3 are pure numbers, so they have no error. Their relative errors would be added in, since they are multiplied, but those errors are zero. We also need to account for the modeling error of 0.005% from treating the earth as a sphere, so we add that in here. Our final relative error is 

In [6]:
total_relative_error = cube_relative_error + 0.00005
total_relative_error

7.354418458640715e-05

or roughly 0.007%. The absolute error is

In [7]:
abs_err = v*total_relative_error
print(f'{abs_err:,}')

79,663,569.43777709


or about 80 million cubic kilometers. Any digits beyond that are completely extra, not telling us anything useful. We should report the volume of the earth as $1,083,200,000,000 \pm 80,000,000$ cubic kilometers.

How did we do? Wikipedia gives the volume of Earth as $1.08321\times 10^{12}\ \text{km}^3$, which falls comfortably within our error range.

## exercise

Repeat the above analysis for the surface area of the earth. Compare your answer to the one [given on Wikipedia](https://en.wikipedia.org/wiki/Earth).