# Representing data types

Let's take a look at natural numbers, negative integers, and real numbers.

## Representing natural numbers

* Fundamentally, computers can handle only 0s and 1s.

* Here, to **handle** means to store in memory, to operate on, and to output.

* We call the smallest unit that can handle 0 or 1 a **bit**.

* When you hear that a personal computer is, for example, 32-bit or 64-bit, recall this.

* It means the number of bits that the computer can process at once.

* To handle larger natural numbers, we need more bits.

* The following table shows decimal numbers in binary and hexadecimal format.

| Decimal | Binary | no. bits | no. bytes | Hexadecimal |
|:--------------:|:------------:|:------------:|:------------:|:-------------------:|
|0|0|1|1|0|
|1|1|1|1|1|
|2|10|2|1|2|
|3|11|2|1|3|
|4|100|3|1|4|
|5|101|3|1|5|
|6|110|3|1|6|
|7|111|3|1|7|
|8|1000|4|1|8|
|9|1001|4|1|9|
|10|1010|4|1|A|
|11|1011|4|1|B|
|12|1100|4|1|C|
|13|1101|4|1|D|
|14|1110|4|1|E|
|15|1111|4|1|F|
|16|10000|5|1|10|
|17|10001|5|1|11|
|18|10010|5|1|12|
|19|10011|5|1|13|
|20|10100|5|1|14|
|127|1111111|7|1|7F|
|128|10000000|8|1|80|
|255|11111111|8|1|FF|
|256|100000000|9|2|100|
|32767|111111111111111|15|2|7FFF|
|32768|1000000000000000|16|2|8000|
|65535|1111111111111111|16|2|FFFF|
|65536|10000000000000000|17|3|10000|
|4294967295|11111111111111111111111111111111|32|4|FFFFFFFF|
|4294967296|100000000000000000000000000000000|33|5|100000000|
|9223372036854775807|111111111111111111111111111111111111111111111111111111111111111|63|8|7FFFFFFFFFFFFFFF|
|9223372036854775808|1000000000000000000000000000000000000000000000000000000000000000|64|8|8000000000000000|
|18446744073709551615|1111111111111111111111111111111111111111111111111111111111111111|64|8|FFFFFFFFFFFFFFFF|
|18446744073709551616|10000000000000000000000000000000000000000000000000000000000000000|65|9|10000000000000000|

(To make the table above, we can use the following code.)

In [None]:
# Import functions related to math
import math

# Decimal number loop
for i in list(range(0, 21)) + [127, 128, 255, 256, 32767, 32768, 65535, 65536, 2**32-1, 2**32, 2**63-1, 2**63, 2**64-1, 2**64, ]:

    # Decimal format
    d_str = str(i)

    # Binary format
    b_str = f"{i:b}"

    # Number of bits
    n_bits = len(b_str)
    n_bits_str = str(n_bits)

    # Hexadecimal format
    h_str = f"{i:X}"

    # Number of bytes
    # Try `help(math.ceil)` to check what it does
    n_bytes = math.ceil(len(h_str) * 0.5)
    n_bytes_str = str(n_bytes)

    # Indicate a row of the table
    print('|'.join(['', d_str, b_str, n_bits_str, n_bytes_str, h_str, '']))
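As a cross-check on the "no. bits" and "no. bytes" columns, Python's built-in `int.bit_length()` gives the same counts without building strings. The sketch below is an illustration only: the helper names `bits_needed` and `bytes_needed` are made up here, and the `max(1, ...)` guard is an assumption chosen to match the table's convention that 0 takes one bit.

```python
def bits_needed(value: int) -> int:
    """Number of binary digits needed to write a natural number."""
    # (0).bit_length() is 0, so clamp to 1 to match the table's row for 0
    return max(1, value.bit_length())


def bytes_needed(value: int) -> int:
    """Number of whole 8-bit bytes needed to hold the value."""
    # Integer ceiling division: (bits + 7) // 8
    return (bits_needed(value) + 7) // 8


for value in [0, 1, 255, 256, 65535, 65536, 2**32 - 1, 2**32]:
    print(value, bits_needed(value), bytes_needed(value))
```

Each time the value crosses a power of two, one more bit is needed, and each time the bit count passes a multiple of 8, one more byte is needed.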


* We can see that one hexadecimal digit represents four binary digits.

* Because of this, we often write a binary number in groups of four digits; for example `1101 0101`.

* We call a collection of 8 bits a **byte**.

* Q: One digit of which base would represent three binary digits?
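The four-digit grouping mentioned above can be tried directly: Python's format specification accepts `_` as a grouping option for the `b` and `X` presentation types and inserts an underscore every four digits. A minimal sketch, using the sample pattern `1101 0101` (213 in decimal):

```python
value = 0b1101_0101  # the example bit pattern, 213 in decimal

# `_` groups binary digits in fours; replace it with a space for display
grouped = f"{value:_b}".replace("_", " ")
hex_digits = f"{value:X}"

print(grouped)     # 1101 0101
print(hex_digits)  # D5

# Each 4-bit group maps to exactly one hexadecimal digit
for group, digit in zip(grouped.split(), hex_digits):
    print(group, "->", digit, "=", int(group, 2))
```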

## Representing negative integers

* The example above implies that computers are designed to handle 0 and positive integers.

* To represent and handle negative integers, computers store a negative integer as its **2's complement**.

### An 8-bit example

* To simplify, we will look at the 8-bit case first.

* The positive integer 7 is 00000111 as an 8-bit binary number.

* To find the 2's complement of the binary number 00000111, change every 0 to 1 and every 1 to 0, and then add 1.

* The 2's complement of the binary number 00000111 is 11111001.

* Adding these two numbers gives the following.

In [None]:
a = int('00000111', base=2)
b = int('11111001', base=2)

c = a+b
print(f'c = {c:b}(binary)')
print(f'c = {c:d}(decimal)')
print(f'c = {c:x}(hexadecimal)')



* The result above is 256 in decimal, or `1 0000 0000` in binary, whose lower 8 bits are all zeros. In an 8-bit operation the ninth bit is discarded, so we regard this result as zero.

* The following table compares 8-bit bit patterns with their `unsigned int8_t` and `signed int8_t` values (these correspond to the C types `uint8_t` and `int8_t`).



In [None]:
# https://stackoverflow.com/questions/35160256/how-do-i-output-lists-as-a-table-in-jupyter-notebook
# http://nbviewer.jupyter.org/github/ipython/ipython/blob/4.0.x/examples/IPython%20Kernel/Rich%20Output.ipynb

import IPython.display  # explicitly load the submodule used below

n = 8

table = [ f''' {n} bit bit pattern | `unsigned int{n}_t` | `signed int{n}_t` 
:-----------------:|:--------:|:------:''']

for i in range(0, 3):
    table.append(f'{i:0{n}b} | {i} | {i}')

table.append(f' ... | ... | ... ')

for i in range(2**(n-1)-2, 2**(n-1)-1+1):
    table.append(f'{i:0{n}b} | {i} | {i}')

for i in range(2**(n-1), 2**(n-1)+2+1):
    table.append(f'{i:0{n}b} | {i} | {i-(2**n)}')

table.append(f' ... | ... | ... ')

for i in range((2**n)-2, (2**n)-1+1):
    table.append(f'{i:0{n}b} | {i} | {i-(2**n)}')

IPython.display.display(IPython.display.Markdown('\n'.join(table)))
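The "invert and add one" rule from the 8-bit example can be written as a short function. This is only a sketch: `twos_complement` is a hypothetical helper name, and the explicit mask stands in for the fixed register width, since Python integers are not limited to 8 bits.

```python
def twos_complement(value: int, n_bits: int = 8) -> int:
    """Return the n-bit two's complement bit pattern of `value`."""
    mask = (1 << n_bits) - 1      # 0b1111_1111 for 8 bits
    inverted = value ^ mask       # exchange every 0 and 1
    return (inverted + 1) & mask  # add one, keep the lower n bits


a = 0b0000_0111  # 7
b = twos_complement(a)

print(f"{b:08b}")               # 11111001
print(f"{(a + b) & 0xFF:08b}")  # 00000000, the carry out of the top bit is dropped
```

Applying the function twice returns the original pattern, which is one way to see that the pattern for -7 negates back to 7.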



### A 16-bit example

* The following shows the same calculation with 16 bits.

In [None]:
a = int('0000''0000''0000''0111', base=2)
b = int('1111''1111''1111''1001', base=2)

c = a+b
print(f'c = {c:b}(binary)')
print(f'c = {c:d}(decimal)')
print(f'c = {c:x}(hexadecimal)')



* The result above is 65536 in decimal, or `1 0000 0000 0000 0000` in binary, whose lower 16 bits are all zeros. In a 16-bit operation, we regard this result as zero.

* The following table compares 16-bit bit patterns with their `unsigned int16_t` and `signed int16_t` values.



In [None]:
# https://stackoverflow.com/questions/35160256/how-do-i-output-lists-as-a-table-in-jupyter-notebook
# http://nbviewer.jupyter.org/github/ipython/ipython/blob/4.0.x/examples/IPython%20Kernel/Rich%20Output.ipynb

import IPython.display  # explicitly load the submodule used below

n = 16

table = [ f''' {n} bit bit pattern | `unsigned int{n}_t` | `signed int{n}_t` 
:-----------------:|:--------:|:------:''']

for i in range(0, 3):
    table.append(f'{i:0{n}b} | {i} | {i}')

table.append(f' ... | ... | ... ')

for i in range(2**(n-1)-2, 2**(n-1)-1+1):
    table.append(f'{i:0{n}b} | {i} | {i}')

for i in range(2**(n-1), 2**(n-1)+2+1):
    table.append(f'{i:0{n}b} | {i} | {i-(2**n)}')

table.append(f' ... | ... | ... ')

for i in range((2**n)-2, (2**n)-1+1):
    table.append(f'{i:0{n}b} | {i} | {i-(2**n)}')

IPython.display.display(IPython.display.Markdown('\n'.join(table)))
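The same 16-bit pattern can be cross-checked with the standard `int.to_bytes` and `int.from_bytes` methods, whose `signed=True` argument selects two's complement. The byte order `"big"` is just a choice made for this sketch so that the bytes read left to right.

```python
# Encode -7 as a 2-byte (16-bit) two's complement value
raw = (-7).to_bytes(2, byteorder="big", signed=True)
print(raw.hex())  # fff9

# Read the same two bytes back, once as unsigned and once as signed
as_unsigned = int.from_bytes(raw, byteorder="big", signed=False)
as_signed = int.from_bytes(raw, byteorder="big", signed=True)

print(f"{as_unsigned:016b}", as_unsigned, as_signed)
```

The two readings differ by exactly $2^{16}$, which matches the `i-(2**n)` expression used to build the table.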



### Summary

* Computers handle negative integers as their 2's complement, which we can find by exchanging the 0's and 1's in the binary representation of the integer's absolute value and then adding one.
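Going the other way, a bit pattern can be read back as a signed value by checking its top bit, which is the rule the table code above applies to the upper half of the patterns (`i-(2**n)`). A minimal sketch: `to_signed` is a hypothetical name, and `k & mask` relies on Python treating negative integers as if they had infinitely many leading 1 bits.

```python
def to_signed(pattern: int, n_bits: int) -> int:
    """Interpret an n-bit pattern as a two's complement integer."""
    sign_bit = 1 << (n_bits - 1)
    # Patterns with the top bit set represent pattern - 2**n_bits
    return pattern - (1 << n_bits) if pattern & sign_bit else pattern


n_bits = 8
mask = (1 << n_bits) - 1

# Round trip over every value an 8-bit signed integer can hold
for k in range(-(2 ** (n_bits - 1)), 2 ** (n_bits - 1)):
    pattern = k & mask  # n-bit two's complement pattern of k
    assert to_signed(pattern, n_bits) == k

print(to_signed(0b1111_1001, 8))  # -7
print(to_signed(0b1000_0000, 8))  # -128
print(to_signed(0b0111_1111, 8))  # 127
```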

## Representing real numbers

### Fixed point

* We can represent a number as an integer that is multiplied by a fixed scale factor.

* For example, we can store all lengths as integers in cm units and multiply by 0.01 to find the values in m units.

* However, in this way, it is not easy to express lengths in mm units, because they are finer than the chosen scale.
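To make the trade-off concrete, here is a minimal fixed-point sketch that stores lengths as whole centimetres. The names `SCALE_M`, `to_fixed` and `from_fixed` are illustrative assumptions, not part of any library.

```python
SCALE_M = 0.01  # one stored count = 0.01 m = 1 cm (assumed for this sketch)


def to_fixed(length_m: float) -> int:
    """Store a length in metres as a whole number of centimetres."""
    return round(length_m / SCALE_M)


def from_fixed(counts: int) -> float:
    """Recover the length in metres by multiplying by the fixed scale."""
    return counts * SCALE_M


stored = to_fixed(2.3456)  # 234.56 cm is rounded to 235 counts
print(stored)              # 235
print(from_fixed(stored))  # about 2.35 m, a rounding error of 0.44 cm

# Anything finer than the scale, such as 3 mm, cannot be told apart from 0
print(to_fixed(0.003))     # 0
```

Choosing a smaller scale (for example 0.001 m) would recover millimetres, at the cost of needing larger integers for the same range of lengths.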

### Floating point

* In short, we can also represent a real number using a significand and an exponent.

* For example, $2.3456 \times 10^0$ m would be $2.3456 \times 10^2$ in cm, and $2.3456 \times 10^3$ in mm.

* An engineering calculator may indicate $2.3456 \times 10^3$ as `2.3456E3`. Here, we can see that `2.3456` is the significand (also mantissa, coefficient, argument or fraction) and `3` is the exponent.

* Even if the significand does not change, the location of the decimal point changes when the exponent changes.

* Conversely, $2.3456 \times 10^0$ mm would be $2.3456 \times 10^{-1}$ in cm, and $2.3456 \times 10^{-3}$ in m.

* Computers store these in binary. For more information, please refer to [IEEE 754, Wikipedia](https://en.wikipedia.org/wiki/IEEE_754).

* Usually we use 4-byte (32-bit) single precision or 8-byte (64-bit) double precision, each of which consists of a sign ($\pm$), an exponent, and a significand.

* The following table shows the breakdown of the 32 bits of single precision. Here, `e` and `s` indicate exponent and significand bits, respectively.

1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32
:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:
$\pm$ | `e` | `e` | `e` | `e` | `e` | `e` | `e` | `e` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s`

In [None]:
# number of bits
n = 32
ne = 8
ns = n - 1 - ne

print(
    '\n'.join(
        [
            ' | '.join(str(k) for k in range(1, n+1)),
            '|'.join(':---:' for k in range(1, n+1)),
            ' | '.join([r'$\pm$'] + ['`e`']*ne + ['`s`']*ns),  # raw string: '\p' is not a valid escape
        ],
    )
)
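To see these fields for an actual number, the standard `struct` module can pack a Python float into the 4 bytes of a single-precision value, and the three fields can then be cut out with shifts and masks. This is a sketch under the IEEE 754 single-precision layout shown above (1 sign bit, 8 exponent bits, 23 significand bits, exponent stored with a bias of 127); `float32_fields` is a hypothetical helper name.

```python
import struct


def float32_fields(x: float):
    """Split x, rounded to single precision, into (sign, exponent, fraction)."""
    # '>f' packs as big-endian binary32; '>I' reads the same 4 bytes as an integer
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31               # 1 bit
    exponent = (bits >> 23) & 0xFF  # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF      # 23 bits
    return sign, exponent, fraction


for x in [1.0, -2.5, 0.15625]:
    sign, exponent, fraction = float32_fields(x)
    print(f"{x:>8}: {sign} {exponent:08b} {fraction:023b}")
```

For example, $-2.5 = -1.25 \times 2^{1}$, so its sign bit is 1 and its stored exponent is $1 + 127 = 128$.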


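* Please note that the exponent is stored with a bias of 127: a stored value $e$ stands for $2^{e-127}$. The stored values $0$ and $2^{8}-1=255$ are reserved (for zero and subnormal numbers, and for infinity and NaN, respectively), so normal numbers use $1$ to $254$, that is, $2^{-126}$ to $2^{127}$.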

* Following table shows the breakout of 64 bits of double precision.

1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64
:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:
$\pm$ | `e` | `e` | `e` | `e` | `e` | `e` | `e` | `e` | `e` | `e` | `e` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s` | `s`

In [None]:
# number of bits
n = 64
ne = 11
ns = n - 1 - ne

print(
    '\n'.join(
        [
            ' | '.join(str(k) for k in range(1, n+1)),
            '|'.join(':---:' for k in range(1, n+1)),
            ' | '.join([r'$\pm$'] + ['`e`']*ne + ['`s`']*ns),  # raw string: '\p' is not a valid escape
        ],
    )
)


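* Please note that the exponent is stored with a bias of 1023: a stored value $e$ stands for $2^{e-1023}$. The stored values $0$ and $2^{11}-1=2047$ are reserved (for zero and subnormal numbers, and for infinity and NaN, respectively), so normal numbers use $1$ to $2046$, that is, $2^{-1022}$ to $2^{1023}$.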
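The double-precision fields can be inspected in the same way, and the value can be rebuilt from them as $(-1)^{\text{sign}} \times (1 + \text{fraction}/2^{52}) \times 2^{\text{exponent}-1023}$, where the leading 1 is the implicit bit that IEEE 754 does not store. The sketch below assumes a normal number (stored exponent between 1 and 2046); `float64_fields` and `rebuild` are hypothetical names.

```python
import math
import struct


def float64_fields(x: float):
    """Split a Python float (binary64) into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)  # 52 bits
    return sign, exponent, fraction


def rebuild(sign: int, exponent: int, fraction: int) -> float:
    """Recombine the fields of a normal number (1 <= exponent <= 2046)."""
    significand = 1 + fraction / 2**52  # implicit leading 1
    return (-1) ** sign * math.ldexp(significand, exponent - 1023)


for x in [2.3456, -0.1, 1.0e300]:
    fields = float64_fields(x)
    print(x, fields, rebuild(*fields) == x)
```

Zero, subnormal numbers, infinity and NaN use the reserved exponent values and would need separate handling that this sketch leaves out.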