# Floats ≠ $\mathbb R$

Computers represent numbers using binary digits, the language of 0s and 1s. This brief note explains how this is done, and why floats can only be regarded as approximations of real numbers.

Our normal number system is represented in **base ten**, with the usual digits as follows:

\begin{align}
437 &= 400 + 30 + 7 \\
   &= 4(100) + 3(10) + 7(1) \\
   &= 4(10^2) + 3(10^1) + 7(10^0)
\end{align}

Binary numbers are represented in **base two** like this:
\begin{align}
0 &= 0 \\
1 &= 1 \\
2 &= 1 0 \\
3 &= 1 1 \\
4 &= 1 0 0 \\
5 &= 1 0 1 \\
6 &= 1 1 0 \\
7 &= 1 1 1 \\
8 &= 1 0 0 0 \\
9 &= 1 0 0 1
\end{align}
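
Each binary digit weights a power of two, just as each decimal digit weights a power of ten. As an illustration, the last row of the table can be expanded in the same way as the base-ten example (the subscript $2$ is notation added here to mark a binary numeral):

\begin{align}
1001_2 &= 1(2^3) + 0(2^2) + 0(2^1) + 1(2^0) \\
   &= 8 + 0 + 0 + 1 \\
   &= 9
\end{align}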

Given a number $x$ in decimal form, we can find its binary form by hand. We repeatedly subtract the largest power of two that does not exceed what remains of $x$, until nothing is left. By keeping track of which powers were used and which were not, we obtain the binary form.

For example, 237 in binary is found as follows:
\begin{align}
237 - 128 &= 109 \\
109 - 64 &= 45 \\
45 - 32 &= 13 \\
13 - 8 &= 5 \\
5 - 4 &= 1 \\
1 - 1 &= 0
\end{align}

and these are the powers of two that were used:
$$
\begin{matrix}
2^8 & 2^7 & 2^6 & 2^5 & 2^4 & 2^3 & 2^2 & 2^1 & 2^0 \\
256 & 128 & 64 & 32 & 16 & 8 & 4 & 2 & 1 \\
\hline
  & 1 & 1 & 1 & 0 & 1 & 1 & 0 & 1
\end{matrix}
$$
The binary form of $237$ is $11101101$.
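
The hand method above can be sketched in code. This is only an illustration under stated assumptions: the function name `greedy_binary` is hypothetical (it is not part of the functions defined later in this note), and it only handles positive integers.

```python
def greedy_binary(x):
    """Subtract the largest powers of two from a positive integer x.

    Returns the powers that were used and the resulting bit string.
    (Hypothetical helper, written only to mirror the hand calculation.)
    """
    if x <= 0:
        raise ValueError("x must be a positive integer")

    # Find the largest power of two that does not exceed x.
    power = 1
    while power * 2 <= x:
        power *= 2

    used, bits = [], ''
    while power >= 1:
        if power <= x:
            # This power fits: subtract it and record a 1.
            used.append(power)
            x -= power
            bits += '1'
        else:
            # This power does not fit: record a 0.
            bits += '0'
        power //= 2
    return used, bits


print(greedy_binary(237))  # ([128, 64, 32, 8, 4, 1], '11101101')
```

The list of powers matches the subtractions carried out by hand, and the bit string matches the bottom row of the table.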

## A decimal $\to$ binary function

Suppose we have to convert $19$ to its binary representation. From above we know it should be:
$x = 19 = 1(2^4) + 0(2^3) + 0(2^2) + 1(2^1) + 1(2^0) = 10011.$

This is what the function needs to do:
1. Take the remainder on division by 2 (`x%2`); this gives us the last binary bit.
2. Then integer-divide by 2 (`x//2`); this shifts all the bits one place to the right, dropping the last bit.
    * `x//2` $= 1(2^3) + 0(2^2) + 0(2^1) + 1(2^0) = 1001$

Repeating this process until the quotient reaches zero gives us the rest of the bits, from right to left.
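
To make the two steps concrete, here is a small trace for $x = 19$. It is a sketch only: `trace_bits` is a hypothetical name, it assumes a positive integer, and it uses Python's built-in `divmod` to get the quotient and remainder in one call.

```python
def trace_bits(x):
    """Print each remainder/quotient step for a positive integer x."""
    bits = ''
    while x > 0:
        # The quotient shifts the bits right; the remainder is the last bit.
        x, bit = divmod(x, 2)
        bits = str(bit) + bits
        print(f'remainder {bit}, quotient {x}, bits so far {bits}')
    return bits


trace_bits(19)
# remainder 1, quotient 9, bits so far 1
# remainder 1, quotient 4, bits so far 11
# remainder 0, quotient 2, bits so far 011
# remainder 0, quotient 1, bits so far 0011
# remainder 1, quotient 0, bits so far 10011
```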

> **Note:** The function should be able to handle negative integers.

In [1]:
def decimalToBinary(x):
    '''
    Given an integer x, prints its binary representation.
    '''
    original = x  # keep the input; x itself is consumed by the loop below
    result = ''
    if x < 0:
        isNeg = True
        x = abs(x)
    else:
        isNeg = False
    
    if x == 0:
        result = '0'
    while x > 0:
        result = str(x % 2) + result  # the remainder is the next bit
        x = x // 2                    # shift the remaining bits right
    if isNeg:
        result = '-' + result
    print(f'The binary representation of {original} is {result}.')

In [2]:
decimalToBinary(19)

The binary representation of 19 is 10011.


In [3]:
decimalToBinary(237)

The binary representation of 237 is 11101101.


## What about Fractions?

In [4]:
decimalToBinary(3/8)

The binary representation of 0.375 is 0.375.


Clearly this is wrong: the function simply echoed $0.375$ back, when what we want is its binary representation. The remainder-and-divide loop only works for whole numbers, so the function needs to be updated.

For $x = 3/8$ we know that:
$$
3/8 = 0.375 = 3(10^{-1}) + 7(10^{-2}) + 5(10^{-3}).
$$

To get the binary form we do the following:

1. Multiply $x$ by a power of $2$ large enough to convert it into a whole number.
2. Convert this whole number into binary.
3. Divide the result by the power of $2$ used in step 1. (The exponent tells you how many places to shift the binary point to the left.)

For our example:
1. $0.375 (2^3) = 3$ (decimal form)
2. $3$ in binary is $11$
3. Dividing by $2^3$ shifts the binary point three places to the left, from $11$ to $0.011$, which is $0.375$ in binary form
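
As a check, the result can be expanded in negative powers of two, mirroring the base-ten expansion of $0.375$ (the subscript $2$ is notation added here to mark a binary numeral):

\begin{align}
0.011_2 &= 0(2^{-1}) + 1(2^{-2}) + 1(2^{-3}) \\
   &= 0 + 0.25 + 0.125 \\
   &= 0.375
\end{align}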

> **Note:** If there is no integer $p$ such that $x\times(2^p)$ is a whole number, then the internal representation will always be an approximation. This is why testing floats for equality with `==` is unreliable, and can lead to very bizarre behaviour.

In [5]:
x = 0.1 + 0.2
if x == 0.3:
    print('Duh...')
else:
    print('WTF...?')

WTF...?
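
One way to see what went wrong is to look at the values that are actually stored. The sketch below uses the standard-library `decimal` and `fractions` modules, which convert a float to its exact value; it assumes the usual IEEE 754 double-precision floats that CPython uses on common platforms.

```python
from decimal import Decimal
from fractions import Fraction

# The sum is not the float nearest to 0.3.
print(0.1 + 0.2)  # 0.30000000000000004

# The exact value stored for the literal 0.1.
print(Decimal(0.1))  # 0.1000000000000000055511151231257827021181583404541015625

# The same value as a ratio; the denominator is a power of two (2**55).
print(Fraction(0.1))  # 3602879701896397/36028797018963968
```

The stored value is a whole number divided by a power of two, so it can only approximate one tenth.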


> When dealing with floats, instead of `x == y` use `abs(x - y) <` $\epsilon$, where $\epsilon$ is some very small tolerance level.
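
A minimal sketch of the tolerance test, applied to the example above. The tolerance `eps = 1e-9` is an arbitrary choice made for this illustration; a suitable value depends on the size of the numbers being compared. The standard library also offers `math.isclose`, which by default uses a relative tolerance of `1e-09`.

```python
import math

x = 0.1 + 0.2
eps = 1e-9  # assumed tolerance, chosen only for this example

print(x == 0.3)              # False
print(abs(x - 0.3) < eps)    # True
print(math.isclose(x, 0.3))  # True
```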

## Back to the function...

In [6]:
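def fractionToBinary(x):
    '''
    Given a decimal number between 0 and 1, prints its binary representation.
    '''
    result = ''
    # Step 1: find the smallest p such that x * 2**p is a whole number.
    p = 0
    while ((2**p)*x) % 1 != 0:
        p += 1

    # Step 2: convert that whole number to binary, exactly as for integers.
    num = int(x*(2**p))
    if num == 0:
        result = '0'
    while num > 0:
        result = str(num%2) + result
        num = num//2

    # Step 3: pad with leading zeros so there are p digits after the point.
    for i in range(p - len(result)):
        result = '0' + result
    print(f'The binary representation of the decimal {x} is 0.{result}.')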

In [7]:
fractionToBinary(3/8)

The binary representation of the decimal 0.375 is 0.011.
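
The same loop also shows how many binary places a stored float needs. The sketch below is a variant that returns its result instead of printing it; `fraction_bits` is a hypothetical name, and the figure of 55 places for `0.1` assumes IEEE 754 double-precision floats. Note that this is the length of the stored approximation: the true binary expansion of one tenth repeats forever.

```python
def fraction_bits(x):
    """Return (p, bits) such that x == int(bits, 2) / 2**p, for 0 <= x < 1.

    (Hypothetical helper; same idea as fractionToBinary, but it returns
    the digits after the binary point instead of printing them.)
    """
    # Smallest p such that x * 2**p is a whole number.
    p = 0
    while (x * 2**p) % 1 != 0:
        p += 1
    num = int(x * 2**p)
    # Binary digits of num, padded with leading zeros to p places.
    bits = format(num, 'b').zfill(p) if p > 0 else '0'
    return p, bits


print(fraction_bits(0.375))  # (3, '011')

p, bits = fraction_bits(0.1)
print(p)  # 55
print('0.' + bits)
```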


---