# Floating Point

The following examples walk through the steps required to convert a real number to the binary representation stored in hardware. The [IEEE 754](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_math.html) standard defines how floating point numbers are stored. The two most common sizes it defines are `32-bit` (single precision) and `64-bit` (double precision). These explorations use the following definition, where `S` is the sign bit, `M` is the significand, and `E` is the exponent:
$$(-1)^{S} \times M \times 2^{E}$$
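The binary value is stored in the following fields:
* `S` for the sign bit
* `exp` for the exponent bits: $E + bias$
* `frac` for the fractional bits: the bits of `M` after the binary point (for a normalized number the leading `1` is implicit and is not stored)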

|Definition|exp|frac|bias|
|----------|---|----|----|
|32-bit    | 8 | 23 |127 |
|64-bit    | 11| 52 |1023|
| Tiny     | 4 | 3  | 7  |
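
To tie the table back to the formula, here is a minimal sketch of a decoder. It is not part of the original notebook: `decode_fields` is a hypothetical helper, and it assumes a normalized number, so the significand is `M = 1 + frac / 2**frac_width`. It also assumes the bias follows the pattern `2**(exp_width - 1) - 1`, which matches all three rows above.

```python
def decode_fields(sign, exp, frac, exp_width, frac_width):
    """Value of a normalized number from its integer field values (hypothetical helper)."""
    bias = 2 ** (exp_width - 1) - 1      # 127, 1023 and 7 for the rows in the table
    E = exp - bias                       # undo the bias to recover the true exponent
    M = 1 + frac / 2 ** frac_width       # implicit leading 1 plus the stored fraction
    return (-1) ** sign * M * 2 ** E

# Tiny format with exp = 0b1001 and frac = 0b111: 1.111 (binary) times 2**2
decode_fields(0, 0b1001, 0b111, exp_width=4, frac_width=3)   # 7.5
```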

Here are the steps required to fully complete converting a real number to the binary form:

1. Write fixed point binary representation
   1. Convert the fractional portion via multiplication
1. Normalize the binary number to the form `1.xxx × 2^E`
1. Apply bias to the exponent to find `exp`
1. Convert exp to binary
1. Store values, including the sign bit

|S|exp|frac|
|-|---|----|
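
The steps above can be sketched as one function. This is an illustrative sketch rather than the notebook's own method: `float_fields` is a hypothetical name, extra fraction bits are simply truncated instead of rounded, and the all-zeros and all-ones `exp` patterns are assumed to be reserved (as they are in IEEE 754), so only normalized values are handled.

```python
import math

def float_fields(value, exp_width, frac_width):
    """Return (S, exp, frac) as bit strings for a normalized value (hypothetical helper)."""
    if value == 0 or not math.isfinite(value):
        raise ValueError('only finite, non-zero values are handled in this sketch')
    bias = 2 ** (exp_width - 1) - 1
    sign = '1' if value < 0 else '0'
    magnitude = abs(value)

    # Normalize: scale by powers of two until 1 <= magnitude < 2, tracking E
    E = 0
    while magnitude >= 2:
        magnitude /= 2
        E += 1
    while magnitude < 1:
        magnitude *= 2
        E -= 1

    # Convert the part after the leading 1 via repeated multiplication by 2
    remainder = magnitude - 1
    frac = ''
    for _ in range(frac_width):
        remainder *= 2
        bit = int(remainder)
        frac += str(bit)
        remainder -= bit          # bits beyond frac_width are dropped (truncation)

    # Apply the bias and check that exp fits in the field
    exp = E + bias
    if not 0 < exp < 2 ** exp_width - 1:
        raise ValueError('exponent {} does not fit in {} bits'.format(E, exp_width))
    return sign, format(exp, '0{}b'.format(exp_width)), frac

float_fields(7.875, exp_width=4, frac_width=3)   # ('0', '1001', '111')
```

Scaling by 2 is exact in binary floating point, so the normalization loops do not introduce any error of their own.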

In [None]:
from decimal import Decimal
import pandas as pd
from IPython.display import display, Math

bias_32 = 127
bias_64 = 1023
bias_tiny = 7

A number to start with: `2080.0`. In this case, there is no fractional portion to consider.

In [None]:
bin(2080)

The normalized binary representation is then: `1.00000100000`

What is the exponent?

In [None]:
# Drop the '0b' prefix and the leading 1; the remaining digit count is the exponent
E = len(bin(2080)) - 3
E

Apply the bias:

In [None]:
exp_bits_32 = E + bias_32
exp_bits_32

In [None]:
exp_bits_64 = E + bias_64
exp_bits_64

In [None]:
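# frac holds the bits after the leading 1 of 1.00000100000, zero-padded on the right
pd.DataFrame.from_dict([{'Definition': '32-bits', 'exp': bin(exp_bits_32)[2:], 'frac': bin(2080)[3:].ljust(23, '0')},
                        {'Definition': '64-bits', 'exp': bin(exp_bits_64)[2:], 'frac': bin(2080)[3:].ljust(52, '0')}
                       ])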
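
As a cross-check for the 32-bit case, Python's standard `struct` module can expose the bits that are actually stored for `2080.0`. This sketch is an addition to the notebook; the variable names are illustrative.

```python
import struct

# Pack 2080.0 as a big-endian 32-bit float, then reinterpret the same 4 bytes as an unsigned integer
(raw,) = struct.unpack('>I', struct.pack('>f', 2080.0))
bits = format(raw, '032b')

# Split into the 1-bit sign, 8-bit exp and 23-bit frac fields
sign, exp, frac = bits[0], bits[1:9], bits[9:]
sign, exp, frac   # ('0', '10001010', '00000100000000000000000')
```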

Note that with `E` = 11, this number cannot be stored in the tiny representation: `exp` would be 11 + 7 = 18, which does not fit in 4 bits (maximum 15).
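
The limit can be worked out directly from the field width. This is a small sketch with illustrative variable names; it assumes every 4-bit `exp` pattern is usable, whereas a format that reserves the all-ones pattern (as IEEE 754 does for infinity and NaN) would top out one exponent lower.

```python
exp_width_tiny = 4
bias_tiny = 7

max_exp_field = 2 ** exp_width_tiny - 1   # 15, the largest value 4 bits can hold
max_E = max_exp_field - bias_tiny         # 8, the largest exponent under the assumption above
needed_exp = 11 + bias_tiny               # 18, what 2080.0 would require

needed_exp > max_exp_field                # True: the exponent overflows the field
```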

Now for an example with a fractional portion: `7.875`. The fractional part was chosen to be an exact sum of negative powers of two ($0.875 = 2^{-1} + 2^{-2} + 2^{-3}$), so the conversion terminates after a few steps.

In [None]:
bin(7)

In [None]:
bit_remainder = 7.875 % 1
bit_remainder

In [None]:
power = -1
frac_bits = ''
while bit_remainder != 0:
    bit_remainder_prev = bit_remainder
    bit_remainder *= 2
    next_bit = int(bit_remainder)  # the integer part (0 or 1) is the next fractional bit
    frac_bits += str(next_bit)
    display(Math(r'2^{{ {} }} : {} \times 2 = {}'.format(power, bit_remainder_prev, bit_remainder)))
    bit_remainder -= next_bit      # keep only the fractional part for the next step
    power -= 1

Now the full fixed point bit representation can be assembled:

In [None]:
frac_bits = bin(7)[2:] + '.' + frac_bits
frac_bits

Moving the binary point two places to the left gives the normalized value `1.11111`, so `E` is just 2.
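
One way to double-check the normalized form is Python's built-in `float.hex()`, which prints a 64-bit float as a hexadecimal significand and a power-of-two exponent. The sketch below is an addition with illustrative variable names.

```python
value = 7.875
hex_form = value.hex()                      # '0x1.f800000000000p+2'

mantissa_hex, exponent = hex_form.split('p')
E_check = int(exponent)                     # 2, matching E above

# The 13 hex digits after the point are the 52 stored fraction bits
frac_check = format(int(mantissa_hex.split('.')[1], 16), '052b')
frac_check[:8]                              # '11111000'
```

The leading `1.` in the hex form is the implicit bit, and `f8` is `11111000` in binary, which agrees with `1.11111`.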

In [None]:
E = 2
E

Apply the bias:

In [None]:
exp_bits_32 = E + bias_32
exp_bits_32

In [None]:
exp_bits_64 = E + bias_64
exp_bits_64

In [None]:
exp_bits_tiny = E + bias_tiny
exp_bits_tiny

In [None]:
# Drop the binary point and the implicit leading 1, then zero-pad to the widest fraction field (52 bits)
frac_bits = frac_bits.replace('.', '')[1:].ljust(52, '0')
frac_bits

In [None]:
pd.DataFrame.from_dict([{'Definition': '32-bits', 'exp': bin(exp_bits_32)[2:], 'frac': frac_bits[:23]},
                        {'Definition': '64-bits', 'exp': bin(exp_bits_64)[2:], 'frac': frac_bits[:52]},
                        {'Definition': 'Tiny', 'exp': bin(exp_bits_tiny)[2:], 'frac': frac_bits[:3]}
                       ])

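The tiny representation shows that a precision loss has occurred: the normalized value `1.11111` has five fraction bits, but only the first three fit (here the extra bits are simply truncated). To quantify the loss, the stored number can be calculated.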

In [None]:
E = exp_bits_tiny - bias_tiny
E

The normalized value in binary would now be:

In [None]:
'1.' + frac_bits[:3]

Shifting the binary point right by `E` = 2 places gives `111.1`, which converts back to decimal as:

In [None]:
int('111', 2) + 2**-1
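
The size of the loss follows from the spacing between neighbouring tiny values at this exponent. The sketch below is an addition with illustrative variable names. It compares truncation, as used in this walkthrough, with rounding to the nearest representable value, which is the default rounding mode in IEEE 754; whether the tiny teaching format rounds the same way is an assumption.

```python
original = 7.875
E = 2
frac_width = 3

# Neighbouring values with exponent E differ by one unit in the last fraction bit
spacing = 2.0 ** (E - frac_width)                  # 0.5

truncated = (original // spacing) * spacing        # 7.5, extra bits dropped
rounded = round(original / spacing) * spacing      # 8.0, nearest representable value

original - truncated, rounded - original           # (0.375, 0.125)
```

Note that `8.0` normalizes to `1.000` with `E` = 3, so rounding up here also changes the exponent.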