# Bits, Bytes, and Numbers


In order to do math on a computer, you should have some idea of how computers represent numbers and do computations.

Unlike in statically typed languages (e.g. C), in Python you don't have to worry too much about types if you're just scripting.  However, you will have to think about them for most algorithms.

In [1]:
x = 5 # an integer type
print(type(x))
x = 5.0 # float type
print(type(x))

<class 'int'>
<class 'float'>


In [2]:
print(4/3)  # division of ints -> float
print(4//3) # integer division -> int

1.3333333333333333
1
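
One detail of `//` worth checking for yourself: it floors the result (rounds toward negative infinity) rather than truncating toward zero, and it returns a float if either operand is a float. A minimal sketch, with variable names chosen only for illustration:

```python
# Floor division rounds toward negative infinity, not toward zero.
floor_pos = 7 // 2        # 3
floor_neg = -7 // 2       # -4, because floor(-3.5) == -4
trunc_neg = int(-7 / 2)   # -3, because int() truncates toward zero

# divmod returns the quotient and the remainder together.
quotient, remainder = divmod(7, 2)

# If either operand is a float, // still floors but returns a float.
float_floor = 4.0 // 3

print(floor_pos, floor_neg, trunc_neg)
print(quotient, remainder)
print(float_floor, type(float_floor))
```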


## Bits and Bytes

A bit is a 0/1 value, and a byte is 8 bits.  Most modern computers are 64-bit architectures, on which numbers are typically stored in 64 bits by default (for example, Python's `float` and NumPy's default types).  On 32-bit architectures, some defaults may be 32 bits instead - beware!

You can represent strings of bits using the `0b` prefix.  By default, these will be interpreted as integers written in base-2.  For example, on a 32-bit system,
```
0b101 = 00000000000000000000000000000101 (base 2) = 5 (base 10)
```

In [3]:
x = 0b101
print(x)

5
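
Going the other way, from an integer back to its bit string, can be sketched with the built-ins `bin`, `format`, and `int`. The 32-digit width below is chosen only to mirror the 32-bit example above:

```python
x = 0b101

# bin() gives the base-2 string of an integer, including the 0b prefix.
bits = bin(x)

# A format spec can zero-pad to a fixed width, here 32 binary digits.
padded = format(x, "032b")

# int() with an explicit base parses a string of binary digits.
parsed = int("101", 2)

print(bits)
print(padded)
print(parsed)
```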


It is often easier to deal with hexadecimal (base-16), denoted with the `0x` prefix. It uses `0`-`9` to represent values 0-9, and `A`-`F` to represent values 10-15. For example, the bits 11111111 are written in hexadecimal as FF.

In [4]:
x = 0xF
print(x)

15


In [5]:
0xF == 0b1111 == 15

True

You can also use octal (base 8) if you'd like, using the `0o` prefix:

In [6]:
x = 0o10
print(x)

8


Hexadecimal is often used because it breaks up bits into blocks of 4 ($16 = 2^4$).  So a 64-bit type can be written as a length-16 string in hexadecimal.
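
The correspondence between hexadecimal digits and blocks of 4 bits can be checked directly. A small sketch, where the values `2**64 - 1` and `0xA5` are arbitrary choices for illustration:

```python
x = 2**64 - 1   # all 64 bits set

hex_digits = format(x, "016x")   # 16 hex digits, zero-padded
bit_digits = format(x, "064b")   # 64 binary digits, zero-padded

# Each hex digit corresponds to one group of 4 bits.
y = 0xA5
nibbles = [format(y, "08b")[i:i + 4] for i in range(0, 8, 4)]

print(hex_digits, len(hex_digits))
print(len(bit_digits))
print(nibbles)   # 0xA is 1010 and 0x5 is 0101
```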

## Integers

Integers are represented in base-2 using bits.  Most modern computers are 64-bit architectures, and the built-in integer types of languages like C++ have a fixed width, typically at most 64 bits.  However, Python integers have arbitrary precision (limited only by available memory), so you don't run into overflow errors:

In [7]:
print(2**65) # ** for exponentiation

36893488147419103232


In [8]:
import sys
sys.getsizeof(2**65) # size in bytes

36

However, when we call code written in C/C++ or Fortran, such as NumPy (we will see this package later), you can run into overflow issues:

In [13]:
import numpy as np
x = np.int64(2)
x ** 63

-9223372036854775808

In [10]:
x**65

0
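
Both results are consistent with two's-complement wraparound: a signed 64-bit integer holds values from $-2^{63}$ to $2^{63}-1$, and anything outside that range is reduced modulo $2^{64}$. The sketch below models this with ordinary Python integers. `wrap_int64` is a hypothetical helper written for illustration, not a NumPy function.

```python
def wrap_int64(n):
    """Reduce a Python int to the value a signed 64-bit integer would hold.

    Hypothetical helper for illustration: it models two's-complement
    wraparound and is not part of NumPy.
    """
    return (n + 2**63) % 2**64 - 2**63


print(wrap_int64(2**62))   # fits, so it is unchanged
print(wrap_int64(2**63))   # one past the largest value, wraps to the smallest
print(wrap_int64(2**65))   # a multiple of 2**64, wraps to 0
```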

## Bitwise operations

You can perform operations on bit strings in Python:

In [23]:
print("This is {:04x}".format(0x1100 & 0x0101)) # bitwise and
print("{:04x}".format(0x1100 | 0x0101)) # bitwise or
print("{:04x}".format(0x1100 ^ 0x0101)) # bitwise xor (addition modulo 2)
print("{:04x}".format(~0x2)) # bitwise not: ~x == -x - 1

This is 0100
1101
1001
-003
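
The `-003` output arises because Python integers are not fixed-width, so `~x` is defined as `-x - 1` rather than as a flip of a fixed number of bits. To see the pattern a fixed-width register would hold, you can mask the result. A sketch, assuming a 16-bit width chosen only for illustration:

```python
x = 0x2

# On Python's unbounded integers, ~x equals -x - 1.
negated = ~x

# Masking with sixteen 1-bits keeps only the low 16 bits of the result.
mask16 = 0xFFFF
negated16 = ~x & mask16

print(negated)
print("{:04x}".format(negated16))
```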


## Floating Point Numbers

Real numbers are represented as floating point numbers on a computer.  Almost all real numbers must be approximated, which means you can't always ask for exact equality:

In [24]:
1.0/3.0

0.3333333333333333

In [25]:
1.2 - 1.0

0.19999999999999996

In [18]:
1.2 - 1.0 == 0.2

False
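
A common alternative to `==` is to compare up to a tolerance. The sketch below uses the standard-library `math.isclose`. The manual check uses the same relative tolerance as its default (`rel_tol=1e-09`), and is one reasonable choice rather than the only one:

```python
import math

a = 1.2 - 1.0
b = 0.2

exact = (a == b)             # exact comparison fails
close = math.isclose(a, b)   # relative tolerance, default rel_tol=1e-09

# The same idea written out by hand.
manual = abs(a - b) <= 1e-9 * max(abs(a), abs(b))

print(exact, close, manual)
```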

The relative size of this approximation error is bounded by the **machine precision**, typically denoted $\epsilon$: the gap between 1.0 and the next larger representable number.

In [19]:
import numpy as np
print(np.finfo(np.float16).eps) # 16-bit float
print(np.finfo(np.float32).eps) # 32-bit float
print(np.finfo(np.float64).eps) # 64-bit float
print(np.finfo(np.longdouble).eps) # extended precision (np.float128 is not defined on every platform)

0.000977
1.1920929e-07
2.220446049250313e-16
1.084202172485504434e-19
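
One way to see what $\epsilon$ means for 64-bit floats is to add it to 1.0: $\epsilon$ is the spacing between 1.0 and the next representable number, so anything much smaller is rounded away. A sketch using the standard-library `sys.float_info`, assuming Python's `float` is an IEEE 754 double (true on essentially all current platforms):

```python
import sys

eps = sys.float_info.epsilon   # epsilon for Python's 64-bit float

print(eps == 2.0**-52)
print(1.0 + eps > 1.0)        # eps is large enough to change 1.0
print(1.0 + eps / 2 == 1.0)   # half of eps is rounded away
```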


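32-bit floating point numbers correspond to a `float` in C, and are also known as **single precision** numbers.  64-bit floating point numbers correspond to a `double` in C, and are also known as **double precision** numbers.  16-bit floats are half-precision, and IEEE 128-bit floats are quad-precision.  Note that NumPy's `float128` is the platform's `long double`; on x86 it is typically 80-bit extended precision padded to 128 bits, which is why its $\epsilon$ above is about $2^{-63}$ rather than the $2^{-112}$ of true quad precision.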

Double precision is the standard for numerical codes.  Quad- (or higher) precision is sometimes useful.  A big trend in deep learning is to use lower-precision formats.

Floating point numbers are numbers written in scientific notation (in base-2).  E.g. $1.1_2 \times 2^{10_2} = 1.5 \times 2^2 = 6$, where the left-hand side is written in base 2 and the rest in base 10.  They contain a sign bit, a set of bits for the fractional digits (called the significand or mantissa), and a set of bits for the exponent.
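
To make the three fields concrete, the sketch below pulls them out of the 64-bit pattern of `6.0` using the standard-library `struct` module. It assumes the IEEE 754 double layout (1 sign bit, 11 exponent bits stored with a bias of 1023, 52 fraction bits), which is an assumption about the platform rather than something stated above.

```python
import struct

x = 6.0

# Reinterpret the 8 bytes of the float as a 64-bit unsigned integer.
(bits,) = struct.unpack(">Q", struct.pack(">d", x))

sign = bits >> 63                   # 1 bit
exponent = (bits >> 52) & 0x7FF     # 11 bits, stored with a bias of 1023
fraction = bits & ((1 << 52) - 1)   # 52 bits after the implicit leading 1

# Valid for normal (non-subnormal) numbers only.
significand = 1 + fraction / 2**52

print(sign, exponent - 1023, significand)
print(x.hex())   # hexadecimal significand and base-2 exponent
```

The decoded fields satisfy `(-1)**sign * significand * 2**(exponent - 1023) == x`.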

For further reading on potential considerations with floating point numbers, see [the Python documentation](https://docs.python.org/3/tutorial/floatingpoint.html).

### Print Formatting

It is often convenient to format floating point numbers for printing without showing full precision.  An explanation of available options can be found in the [format specification mini-language documentation](https://docs.python.org/3/library/string.html#format-specification-mini-language).  We'll cover a few examples.

When formatting a floating point number with `format`, you put the format specification in the curly braces `{}`, as in `"{:width.precision}"`.

The *width* denotes the minimum total field width, including the sign and the decimal point.

The *precision* denotes how many digits should be displayed after the decimal point.

In [20]:
pi = np.pi
print("Example of pi {:e}".format(pi))      # exponential notation
print("{:.2e}".format(pi))    # exponential, precision 2
print("{:f}".format(pi))      # fixed-point notation
print("{:.2f}".format(pi))    # fixed-point, precision 2
print("{:6.2f}".format(pi))   # minimum field width of 6
print("{:+6.2f}".format(-pi))  # minimum field width of 6, explicit sign
print("{:06.2f}".format(pi))  # 0-pad to fill width
print("{: .2f}".format(pi))   # use space for sign alignment
print("{: .2f}".format(-pi))  # use space for sign alignment

Example of pi 3.141593e+00
3.14e+00
3.141593
3.14
  3.14
 -3.14
003.14
 3.14
-3.14
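
The same format specifications also work inside f-strings, which many find easier to read than `.format`. A short sketch, using `math.pi` so the block is self-contained:

```python
import math

pi = math.pi

fixed = f"{pi:.2f}"     # fixed-point, precision 2
padded = f"{pi:8.3f}"   # minimum field width of 8, precision 3
sci = f"{pi:.3e}"       # exponential, precision 3

print(fixed)
print(padded)
print(sci)
```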


## Exercise
1. What is $\log_2(\epsilon)$ for 32- and 64-bit floats?  (hint: use `np.log2`)
2. What is the largest exponent you can have for 32- and 64-bit floats? Keep in mind that you should have an equal number of positive and negative exponents. Design an experiment to check your answer.

Hint: `np.finfo(np.float32).max`

In [None]:
# Your code here


In [None]:
# Part 1
print(np.log2(np.finfo(np.float32).eps))
print(np.log2(np.finfo(np.float64).eps))

In [None]:
# Part 2: 2**127 fits in a float32, but 2**128 overflows to inf
print(np.log2(np.finfo(np.float32).max))
print(np.float32(2.0**127))
print(np.float32(2.0**128))