# Tutorial 1: Numbers

## Review from Lecture

It seems obvious that real numbers $\mathbb{R}$ are a key element of computation. But there are some subtle aspects to numbers that it's worth thinking about. We think of numbers along a line like this:

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/d/d7/Real_number_line.svg/1000px-Real_number_line.svg.png" width="800" height="500" />


You are told that "almost all" of the numbers on this line are irrational. 
That means if you throw a dart at the line you should never "hit" a rational number. 
The irrationals fill the entire line. 

But there is a paradox:

*No one has ever met a true irrational number in person. 
We hear a lot of name dropping. 
People say things like, "I know $\pi$ and $e$." 
They are big celebrities in some circles. 
But no one's ever really seen them in their total infinity. 
Only fleeting glimpses as they run down the street and jump into a limo.*

My personal view: 
Irrational numbers are a convenient fiction. 
They are "defined" as the completion of the rationals under limits of Cauchy sequences. 
What?

Say you have a sequence of rationals (i.e. "fractions" or ratios of integers), 
$r_{0}, r_{1}, r_{2}, \ldots, r_{n}, r_{n+1}, \ldots$. 
And say you have a way of comparing the distance between any two numbers,

$$\large | r_{n} - r_{m} |. $$

Now pick the smallest number you want, say 1/1,000,000,000, or 1/1,000,000,000,000, or $10^{-16}$. 
You can always find some (big) number $N$ so that  

$$\large | r_{n} - r_{m} |  < 10^{-16}.$$

*for every possible $m,n > N$*.
That is, past a certain point ($N$), any two terms ($r_m,r_n$) are within this distance of each other -- not just adjacent ones $r_{n+1},r_n$.

And if someone had said $10^{-\mathrm{googolplex}}$, we would have been able to find an $N$ for that too. 
Distance is nothing spooky. 
It's just subtraction that school kids do. 

We call this kind of sequence of rational numbers a ***Cauchy sequence***. 
It looks like it's going somewhere.
But at every step of the way, it's just a bunch of rational numbers. 
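
To make the definition concrete, here is a minimal sketch using Python's `fractions.Fraction`, so that every term stays an exact rational. The sequence used, the partial sums $r_n = \sum_{k=0}^{n} 1/k!$ (which head towards $e$), is chosen here purely for illustration and is not the sequence studied in the tasks; `partial_sum`, `tolerance`, and `N` are names made up for this example. It only spot-checks ten terms past $N$, so it illustrates the Cauchy property rather than proving it.

```python
from fractions import Fraction
from math import factorial

def partial_sum(n):
    """Exact rational r_n = sum_{k=0}^{n} 1/k! (illustrative sequence only)."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

tolerance = Fraction(1, 10**16)   # the "smallest number you want"
N = 20                            # a candidate cut-off for this tolerance

# Compare every pair among the ten terms r_21, ..., r_30.
terms = [partial_sum(n) for n in range(N + 1, N + 11)]
largest_gap = max(abs(a - b) for a in terms for b in terms)

print(terms[0])                   # an ordinary fraction, just with big integers
print(largest_gap < tolerance)    # True
```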

The thing about these kinds of sequences is that there may not be a rational number at the end of it. 
The definition is to just make a bigger set of numbers that includes these limits and go on our way as if nothing was ever awkward.

*The point is that real numbers are* ***algorithms***. 
At least the only ones we can ever actually talk about. 
If you can't compute it, then it doesn't exist. 
Most hypothetical "real" numbers that we can pretend to think about are sequences that we could never actually generate with any algorithm. 
We are stuck with the ***common-sense*** computational approach. 

### Here's an example of one such sequence.

Take $r_{0} = 1$.  At every step of the sequence, define 

$$\large r_{n+1} = \frac{r_{n}+2}{r_{n}+1}$$

Nothing but a bunch of rationals all the way... It's also possible to prove that the terms get closer and closer together. 

## `inf` and `nan`


**LECTURER'S PERSONAL NOTE:** 

It is my personal belief that a lot of students think of "Take the limit as $n\to \infty$" as very conceptually similar to "plug in the value $n=\infty$." 
At least my friends and I thought this way as students far into our education. 
There is good reason for this sometimes. 
I still do it when I'm in a hurry. 
And it works on paper sometimes. 
But this is the beauty of computers. 
They *will just not let you "plug in $\infty$"*. 
OK, they do allow it sometimes. 
They sometimes have a number called `inf`. 
And you can use this productively sometimes. 
But you sure better know what you are doing! 
If you use `inf`, you also better get used to the idea of `nan`, which means "not a number". 
See the examples below.

Computers make you realise that everything useful is finite. 
And the infinite is just the idea that you can keep doing something as long as you want. 
And maybe you can make a good guess where the result is going. 
But you'll *never* actually get to the end. 

In [None]:
import numpy as np
print(1/np.inf)
print(1+np.inf)
print(1-np.inf)
print(np.inf*np.inf)
print(np.inf+np.inf)
print(np.inf/np.inf)
print(np.inf-np.inf)
print(1 + np.nan)
print(np.inf*np.nan)

`inf` can get into a calculation and things can still turn out ok. But used the wrong way, `inf` can turn into its evil partner, `nan`. Anything `nan` touches turns to `nan`.
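
Because `nan` spreads silently, it helps to know how to detect it. The sketch below relies on two facts: `nan` is the only floating-point value that is not equal to itself, and NumPy provides `np.isnan` and `np.isfinite` for testing. The variable names are just for illustration.

```python
import numpy as np

bad = np.inf - np.inf        # inf used the wrong way gives nan
print(bad == bad)            # False: nan is not even equal to itself
print(np.isnan(bad))         # True: the reliable way to test for nan
print(np.isfinite(np.inf))   # False: inf is a float, but not a finite one

values = np.array([1.0, np.inf, bad])
print(np.isfinite(values))   # [ True False False]
```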

**BACK TO OUR REGULARLY SCHEDULED PROGRAM**

**TASK 0**: 
Analytically determine the limiting value of the above recursive sequence $r_{n}$, where

$$\large r_{n+1} = \frac{r_{n}+2}{r_{n}+1} \qquad \mathrm{and} \qquad r_0 = 1$$

The limiting value is where the sequence doesn't change anymore; 
$$
\large r_{n+1} = r_{n}
$$
Practise writing your answer using Markdown in the space below.

Markdown here:

For example 
$$
\large \lim_{n \to \infty} r_{n} \ = \ ??? 
$$

**TASK 1**: Write a *recursive* function that generates the $n$th term in this sequence. Show that the terms get closer to each other. Remember: it's not a recursive function unless you see the name of the function *inside* the definition of the function.

In [None]:
# Define function here
def 

In [None]:
# Test function here

Warning: you will have to reboot the kernel if you type np.inf into your function. Why?

**TASK 2:** The original recursion formula is 1st-order *nonlinear*. We can also use a transformation to turn it into a 2nd-order *linear* recursion formula. 

Each term $r_{n}$ of the sequence is a *rational number* (i.e., a fraction). This means we can write $r_{n}$ as a ratio of integers. 

Find a sequence of integers, $i_{n}$, such that 

$$\large r_{n} = 1+ \frac{i_{n-1}}{i_{n}} \quad ?$$

That is, find coefficients $A$ and $B$ so that  

$$\large i_{n+1} = A\, i_{n} + B\, i_{n-1}.$$

What value do you need to assume for $i_{-1}$? What about $i_{0}$? 

Does this sequence look similar to another (perhaps simpler) sequence you've seen in your education so far? There is a good reason for that. We'll learn more about why a little later.

**TASK 2** Markdown here:

<br>

**TASK 3:** write a recursive function that generates the sequence $i_{n}$. Take the ratio in TASK 2 and show the result is the same as the function output in TASK 1.

In [None]:
# Define function here
def

In [None]:
# Test function here

<br>

## The decimal system

When you look at the output of your $r_{n}$ function, you'll notice that you get the answer in perfectly sensible decimal numbers. This is just one convenient way of representing sequences of rational numbers with higher and higher precision.

$$
\large 
x = \lim_{D\to \infty}\ \sum_{i=-D}^{D} d_i 10^{i},
$$

with 

$$
\large d_{i} \in \{0,1,2,3,4,5,6,7,8,9\}.
$$

Notice that I took the limit as the range of terms $[-D,+D]$ goes to $\infty$. 
This is because we can't really ever get everything in one place. 

The decimal system is just one more way of looking at sequences of rational numbers. 
And we know there are some drawbacks to this.  
Numbers like $1/3$ have decimal representations that go on repeating forever. 
If you want to represent $1/3$ exactly with finitely many digits, you need a different base, such as base 3. 

We still need to calculate with numbers like $\pi$ and $1/3$, 
but we cannot fit them in our finite computers. 
This leads to one of the two main sources of error in computations. 
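
As a small illustration of that truncation, the sketch below uses Python's standard `decimal` module to keep only a fixed number of decimal digits of $1/3$. The helper name `one_third` is made up for this example. However many digits are kept, multiplying back by 3 does not return exactly 1.

```python
from decimal import Decimal, localcontext

def one_third(digits):
    """Return 1/3 rounded to `digits` significant decimal digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        return Decimal(1) / Decimal(3)

for digits in (5, 10, 20):
    third = one_third(digits)
    # Columns: digits kept, the stored value, and 3 times the stored value.
    print(digits, third, 3 * third)
```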


## The binary system 
Virtually every computer uses binary to store numbers. 
A binary number system uses only two values, canonically 0 and 1, in each digit, 
as opposed to the ten we use in decimal. 
Very briefly, here is a short table showing the conversion of the integers from 0 to 10 from decimal to binary:

|Decimal|Binary|
|-------|------|
|00|0000|
|01|0001|
|02|0010|
|03|0011|
|04|0100|
|05|0101|
|06|0110|
|07|0111|
|08|1000|
|09|1001|
|10|1010|

Each digit of the binary number is called a "bit" ("binary digit"). 
Thus, the bit is the smallest unit of computer memory. One "byte" is 8 bits. 

Just as an arbitrary number can be written in decimal, an arbitrary number can be written in binary as 

$$\large x = \lim_{B\to\infty}\  \sum_{i=-B}^{B} b_i 2^{i},$$

where now

$$\large b_{i} \in \{0,1\}.$$
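
The digits $b_i$ with negative index can be generated by repeated doubling: multiply the fractional part by 2, and the integer part that pops out is the next bit. Here is a minimal sketch of that idea; `fractional_bits` is a hypothetical helper, and `fractions.Fraction` is used so that the arithmetic is exact.

```python
from fractions import Fraction

def fractional_bits(x, count):
    """Return the first `count` binary digits b_{-1}, b_{-2}, ... of 0 <= x < 1."""
    bits = []
    for _ in range(count):
        x *= 2
        bit = int(x >= 1)   # the digit that moved in front of the binary point
        bits.append(bit)
        x -= bit            # keep only the fractional part
    return bits

print(fractional_bits(Fraction(1, 10), 12))  # [0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1]
print(fractional_bits(Fraction(5, 8), 4))    # [1, 0, 1, 0]
```

The first call suggests that $1/10$ has a repeating binary expansion, much as $1/3$ repeats in decimal, while $5/8 = 2^{-1} + 2^{-3}$ terminates.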


The decimal and binary systems (and all other integer-base systems) have the nice property that they are (almost) "unique". 
That is, if two binary or decimal expansions differ, then they (almost always) represent different numbers.

There is one important exception to this. In decimal, the number 

$$\large u =  0.9999999999999999999\ (\mathrm{repeating})$$

is not its own number. This number is equal to $u=1$. In binary the same is true for

$$\large u = 0.11111111111111111 \ (\mathrm{repeating}).$$

There are a lot of clever ways to prove these. We saw a hint about this in Lecture 01.  

**TASK 04:** Use the following (finite) geometric series formula to prove $u=1$ from above:

$$\large (q-1) \sum_{i=1}^{n} q^{-i} \ = \ 1 - q^{-n}$$


**TASK 04** Markdown here: 

<br>

### Experimenting with binary:

`numpy` provides a function (`np.binary_repr()`) to convert *integers* into their binary representations. 
Let's practice a bit by using it. 
Try changing $n$ below. 
Try a bunch of numbers to get a feel for how binary works. 
You might try putting `np.binary_repr()` into a loop.

In [None]:
import numpy as np

In [None]:
n = 602176634

print(np.binary_repr(n))

**TASK 05:** 
Insert a new cell below. Use the **+** button on the top menu bar, or press 'b' in command mode. (Cells have a blue outline in command mode; press Esc to enter it, and press 'h' there to see all the shortcuts.) 

See what happens when you try to give a floating point number (e.g. 43.2) to `np.binary_repr()`. Read the output carefully. 

<br>

## Floating Point Numbers

Throughout this course, we will need to work with computer representations of real numbers: 
we'll be adding, subtracting, dividing, and multiplying them in order to solve complex problems regarding the world. 

For example, the observable physical universe covers a huge range of scales. One of the smallest measured quantities is the size of a proton.

$$
\large L_{\text{proton}} \approx 10^{-15}\ \mathrm{m}
$$

In the other direction, the size of the observable universe is

$$
\large L_{\text{universe}} \approx 10^{27}\ \mathrm{m}
$$

The ratio of these scales is 

$$
\large \frac{L_{\text{universe}}}{L_{\text{proton}}} \ \approx \ 10^{42} \ = \ 1\,000\,000\,000\,000\,000\,000\,000\,000\,000\,000\,000\,000\,000\,000
$$

This is a number that is probably impossible to comprehend. But computers can cope with numbers of this size quite easily. In fact, *we* can compute numbers of this size quite easily ***provided we don't care 100% about accuracy***.

### Scientific notation.

A computer could keep track of all of the above 43 digits. But this would be (A) slow and (B) unnecessary. We've already used the solution. It's that we keep track of only the leading digits and the power of ten. 

In order to cover quantities over a large ***dynamic range***, we typically use scientific notation in our paper-and-pencil work, and computers do very much the same thing. The only tricky part is that the number, which we like to represent in decimal form, is stored in binary. 

Let's analyze scientific notation. 
Take an interesting small number. 


For example, the charge of a single electron is (*exactly*) 

$$
\large q \ \equiv \  - 1.602176634 \times 10^{-19}\,  \text{Coulomb}
$$

This is *exact* because a group of serious people got together and *defined* the charge of the electron this way. It really defines the Coulomb unit. As an interesting historical fact, the charge is negative because Benjamin Franklin defined the convention based on a guess. A lot in physics would be easier if he'd done it the other way.

We can write this schematically as 

$$
\large 
\frac{q}{\text{Coulomb}}\  = \  (-1)^S\ M\ \times 10^E,
$$

where 
$$
\large S \ = \ 1 \   = \  \textit{sign}
$$

and

$$
\large M  \ = \ 1.602176634 \   = \  \textit{mantissa} \quad \text{a.k.a.} \quad \textit{significand} 
$$

<br>

$$
\large E  \ = \ -19 \   = \  \textit{exponent} 
$$
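
The same decomposition can be read off in Python. This is only a sketch using the standard `decimal` module, where the value is entered as a string so that no binary rounding takes place; the variable names mirror the symbols $S$, $M$, $E$ above.

```python
from decimal import Decimal

q = Decimal("-1.602176634E-19")   # charge of the electron in Coulomb

S = q.as_tuple().sign             # 0 for positive, 1 for negative
E = q.adjusted()                  # power of ten of the leading digit
M = q.copy_abs().scaleb(-E)       # shift the digits so that 1 <= M < 10

print(S, M, E)                                 # 1 1.602176634 -19
print((-1) ** S * M * Decimal(10) ** E == q)   # True
```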
 


Of course, the computer stores the number $q$ in binary, and it doesn't keep track of the units (that's your job). 

So this number must be converted to something the computer can understand. 


We'll use the notation $N_{10}$ to mean a number $N$ in base 10, and $N_{2}$ to mean a number in base 2. 
For example 

$$
\large 10_{10} = 1010_{2}
$$

## IEEE 754

https://en.wikipedia.org/wiki/IEEE_floating_point

There are many ways of storing floating point numbers on a computer; 
we will only describe one: IEEE 754. 
This is the most widely used standard for representing these numbers. 
An IEEE 754 number is represented in the following way:

$$
\large x = (-1)^S\ 1.F\ \times 2^{E-B},
$$

where $S$ is the sign, $F$ is the ***fractional part*** of the mantissa (note that there is a 1 in front), $E$ is the exponent as stored in memory (so the actual exponent of the number is $E-B$), and $B$ is called the **bias**.

### 64-bit numbers. 

IEEE 754 is a *standard* that allows storing numbers the same way on different computers. On most modern computer chips the CPU has what are called 64-bit registers. These are 64 (very fast) slots, each of which can store either a $0$ or a $1$; something like the following 

    |1|1|0|0|0|0|1|0|0|0|1|0|0|0|0|1|1|1|1|0|1|0|1|0|1|0|0|1|0|1|0|1|
    |0|0|1|0|0|1|0|1|0|1|1|1|1|0|0|1|1|1|1|1|1|0|1|1|1|1|1|0|0|1|0|0|
    
    
Of course, the 0s and 1s are actually little "on" or "off" transistors, but it doesn't matter what they are. They could be happy or sad unicorns. We could still compute with them. 

If we have 64 bits to store a number, then how should we break things up? For 64-bit IEEE 754 numbers the split is 

    Sign:         1 bit
    
    Significand: 52 bits stored (53 bits of precision, counting the implicit leading 1)
    
    Exponent:    11 bits
    
    
    
    
Note that the exponent stored in memory, $E$, has no sign bit and is thus **never negative**. 
This explains the need for the bias: if we want to represent actual negative exponents, we need 

$$
\large E - B < 0  \quad \implies \quad    B>0.
$$

A floating-point number standard that can't hold the charge of an electron is not very useful. 
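
To see these pieces for a real number, the sketch below unpacks a Python `float` into its raw 64 bits with the standard `struct` module. It assumes the IEEE 754 binary64 layout (1 sign bit, then 11 exponent bits, then 52 stored fraction bits, with bias $B = 1023$) and only handles ordinary "normal" numbers, not zero, `inf`, `nan`, or subnormals. The function name `decompose` is made up for this example.

```python
import struct

def decompose(x):
    """Split a float into (sign bit, stored exponent, stored fraction bits)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))  # raw 64-bit pattern
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF        # 11 exponent bits
    fraction = bits & ((1 << 52) - 1)      # 52 fraction bits
    return sign, exponent, fraction

q = -1.602176634e-19                       # electron charge, units dropped
B = 1023                                   # bias assumed for binary64
S, E, F = decompose(q)

# Rebuild the number from x = (-1)^S * 1.F * 2^(E - B).
rebuilt = (-1) ** S * (1 + F / 2**52) * 2.0 ** (E - B)

print(S, E - B)        # 1 -63
print(rebuilt == q)    # True
```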

## 32- vs 64-bit. 


More than about 15 years ago, 32-bit registers were the default for CPUs. In the old system, when people said `float` they always meant 32-bit `float`. And when they meant 64-bit, they said `double`. That is because these numbers took two registers. 

Now almost all modern hardware has 64-bit registers. Some computing languages still use the old naming. 

* **`float` in Python means 64-bit**

* **`float32` (in NumPy) is the old "single"**

* **`float128` now requires two registers, and could be called "double".**


Some numerical routines allow `float128` and some don't. But using it is slower, and almost always not needed. 
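
A quick way to see the difference in practice is to store the same value in both formats. The sketch below uses NumPy's `np.float32` and `np.float64` scalar types together with `sys.float_info`; the variable names are only for illustration.

```python
import sys
import numpy as np

single = np.float32(0.1)
double = np.float64(0.1)

print(float(single))              # 0.10000000149011612
print(float(double))              # 0.1
print(sys.float_info.mant_dig)    # 53: bits of precision in a Python float

# A nudge of 1e-8 is lost in 32-bit arithmetic but kept in 64-bit arithmetic.
print(np.float32(1) + np.float32(1e-8) == np.float32(1))   # True
print(1.0 + 1e-8 == 1.0)                                   # False
```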

## Experimenting with binary numbers

Python's `int()` converts a binary number written as a string (that is, in quotes) and prefaced with '0b' to a decimal integer.

Play around with these expressions and try to figure out what's happening.

In [None]:
int('0b11111111',base=2) 

In [None]:
int('0b00000000',base=2)

In [None]:
int('0b11111111111111111111111111111111111111111111111111111',base=2)*2**(-53)

**TASK 06:** Using the above information about the number of bits for each part of a 64-bit floating-point number, work out the following in the *decimal representation*: (A) the largest (positive & negative) numbers you can represent; (B) the smallest non-zero (positive & negative) numbers you can represent; and (C) the number of decimal digits of accuracy you can expect. 

## Floating-point errors.

There is a lot to say about this topic. Whole books have been written about it. 

Suppose you are solving something like the quadratic equation

$$
\large a\, x^{2} + b\, x + c \ = \ 0 \quad \quad \implies \quad \quad x \ = \ \frac{-b \pm \sqrt{b^{2} - 4 a c }}{2 a}
$$

But if $a \approx 0$, then we can neglect the leading-order coefficient and look at 

$$
\large b\, x + c \ \approx \ 0 \quad \quad \implies \quad \quad x \ \approx \ -\frac{c}{b} \quad \quad \text{(this is regular-sized)}
$$

The other solution (for small $a$) is 

$$
\large x \ \approx \ -\frac{b}{a} \quad \quad \text{(this is} \textit{ large}).
$$
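
One way to see where these two approximations come from (a sketch, assuming $b > 0$ and $|4ac| \ll b^{2}$) is to expand the square root to first order:

$$
\large \sqrt{b^{2} - 4 a c} \ = \ b\, \sqrt{1 - \frac{4 a c}{b^{2}}} \ \approx \ b - \frac{2 a c}{b}.
$$

Substituting this into the two roots of the quadratic formula gives

$$
\large x_{+} \ \approx \ \frac{-b + b - 2ac/b}{2a} \ = \ -\frac{c}{b}, \qquad x_{-} \ \approx \ \frac{-b - b + 2ac/b}{2a} \ = \ -\frac{b}{a} + \frac{c}{b} \ \approx \ -\frac{b}{a}.
$$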



**TASK 07**: Write a function that takes $a,b,c$ as arguments and gives back the answer *using the standard quadratic formula*.  Assume $a \ne 0$. 

In [None]:
def quadratic_formula(a,b,c): 
    
    # Your code here
    
    return x_plus, x_minus

***Test this with values you know work to make sure it's giving the right answers.***

**TASK 08**: Using parameters $b=1$ and $c=-1$, look at the solutions of the quadratic formula for 

$$
\large a \ = \ 2^{-n} \quad \text{where} \quad n \ge 1.
$$

Is there a big problem for one of the roots for some sufficiently large value of $n$? If so, why?


<br>

## Bit-wise operators

A binary representation allows us to define different operators on numbers. We already know about `+`, `-`, `*`, etc. 

We can also define `&` (logical `and`), and `|` (logical `or`), and others. How?

`True` and `False` satisfy ***logical*** operations:

In [None]:
L = [True,False]

for i in range(2):
    for j in range(2):
        print(" %5s & %5s = %5s" % (L[i], L[j], L[i] & L[j]) )
        print(" %5s | %5s = %5s" % (L[i], L[j], L[i] | L[j]) )
        print(' --------------------- ')

These are the rules of `and` and `or` in Boolean logic. Of course this works the same way if we use `1 = True`, and `0 = False`

In [None]:
L = [1,0]

for i in range(2):
    for j in range(2):
        print(" %i & %i = %i" % (L[i], L[j], L[i] & L[j]) )
        print(" %i | %i = %i" % (L[i], L[j], L[i] | L[j]) )
        print(' ---------- ')

But numbers are just strings of 0s and 1s. E.g., 

$$ 5 = 2^2 + 2^{0} \quad \longrightarrow \quad [\, \ldots,\, 0,\, 0,\, 1,\, 0 ,\, 1, \, 0 ,\, 0 ,\ldots \,] $$
$$ 6 = 2^2 + 2^{1} \quad \longrightarrow \quad [\, \ldots,\, 0,\, 0,\, 1,\, 1 ,\, 0, \, 0 ,\, 0 ,\ldots \,] $$

We can use `&` and `|` on each pair of 0s and 1s. And this will represent a new number. 

In [None]:
print(5&6)
print(5|6)
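
To make "each pair of 0s and 1s" explicit, here is a minimal sketch that writes both numbers as binary strings, combines them digit by digit, and converts back. The helper `and_or_by_digits` is hypothetical and is only meant for non-negative integers; in real code you would simply use `&` and `|`.

```python
def and_or_by_digits(a, b):
    """Digit-by-digit AND and OR of two non-negative integers."""
    width = max(a.bit_length(), b.bit_length(), 1)
    bits_a = format(a, f"0{width}b")   # zero-padded binary strings
    bits_b = format(b, f"0{width}b")
    and_bits = "".join("1" if x == "1" and y == "1" else "0"
                       for x, y in zip(bits_a, bits_b))
    or_bits = "".join("1" if x == "1" or y == "1" else "0"
                      for x, y in zip(bits_a, bits_b))
    print(bits_a, bits_b, and_bits, or_bits)
    return int(and_bits, 2), int(or_bits, 2)

print(and_or_by_digits(5, 6))   # prints 101 110 100 111, then (4, 7)
```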

**TASK 09:** Convert the numbers `17` and `47` (for example) into binary by hand. Apply the `&` and/or `|` operations to the binary sequences and reconstruct decimal representations of the results. Do this for a few other numbers until you understand how it works. 

In [None]:
print(41 & 17)
print(41 | 17)

Why do these operations make sense even if the numbers have different numbers of digits and/or binary digits? 

In [None]:
print(1234567 & 987)
print(1234567 | 987)

Honestly, bit-wise operations do not arise very often in everyday applications. Serious low-level programmers use them because computers do binary operations ***very fast***. 

But we have already seen a mathematical application of `&` and `|`. We used these operations to find exact solutions to the Tower of Hanoi. In that case, we represented the state of the system as a binary number. Each move gave a new binary number based on the logical operations. You can read about the details on Wikipedia. But the point is that sometimes there are some clever uses of binary that are not just about data on a computer. 
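
For the curious, here is a sketch of one closed form that is often quoted for this puzzle (it appears in Wikipedia's Tower of Hanoi article): with pegs numbered 0, 1, 2, move number $m$ goes from peg `(m & (m - 1)) % 3` to peg `((m | (m - 1)) + 1) % 3`. Whether this matches the exact formulation used in lecture is an assumption; the function names below are made up for this example, and the simulation simply checks that every move is legal.

```python
def hanoi_moves(n_disks):
    """Yield (source, target) peg indices for moves m = 1 .. 2**n_disks - 1."""
    for m in range(1, 2**n_disks):
        yield (m & (m - 1)) % 3, ((m | (m - 1)) + 1) % 3

def simulate(n_disks):
    """Play the moves on three pegs, checking that each one is legal."""
    pegs = [list(range(n_disks, 0, -1)), [], []]   # largest disk at the bottom
    for source, target in hanoi_moves(n_disks):
        disk = pegs[source].pop()
        assert not pegs[target] or pegs[target][-1] > disk, "illegal move"
        pegs[target].append(disk)
    return pegs

print(simulate(3))   # [[], [], [3, 2, 1]]
```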