# Binary representation

<div style="display:none">
$\newcommand{\nat}[1]{\overline{#1}}$
$\newcommand{\word}{\mathcal{W}}$
$\newcommand{\bits}[1]{\text{0b}{#1}}$
$\newcommand{\lift}[1]{\overline{#1}}$
$\newcommand{\bor}{\mathrel{|}}$
$\newcommand{\band}{\mathrel{\&}}$
$\newcommand{\bxor}{\mathrel{\wedge}}$
$\newcommand{\bnot}{\mathrel{\sim}}$
$\newcommand{\lsl}{\mathrel{<<}}$
$\newcommand{\lsr}{\mathrel{>>}}$
</div>

The bit is the simplest unit in a digital numbering system: it can take only two values, usually designated by the digits $0$ and $1$. A bit can represent either a logical alternative, expressed as *false* and *true*, or a digit of the binary system. The word *bit* is a contraction of the English words *binary digit*. It was popularized by Claude Shannon.

The bit is at the core of digital systems because there are many technical means of encoding binary information, e.g., magnetic polarization (for storage), electric current or voltage, and light intensity (for transmission). Therefore, at a core level, all digital data is just a stream of $0$s and $1$s.

However, a computer does not inherently know what an integer, a string, a function, etc. are; rather, we (as computer scientists) must decide how to represent all such concepts by encoding them as sequences of $0$s and $1$s.
(These encodings may change over time and between systems, but two systems that use the same encoding are able to exchange and process data in common.)

Bits are rarely encountered in isolation. Instead, they are packed into units of 8 bits, called *bytes*. A byte is the smallest chunk of data a computer can process; it can encode up to $2^8 = 256$ values, whose binary codes are:

$$00000000 \quad 00000001 \quad 00000010 \quad \cdots \quad 11111110 \quad 11111111$$

The canonical interpretation of a byte is the natural number that is represented by the standard positional interpretation as a binary number. That is, if I have a byte $B$ that consists of the bits $b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0$, then its associated natural number $\nat{B}$ is:

$$
\nat{B} = \sum_{0 \leq i < 8} 2^i \cdot b_i
$$

For example, if $B = 10101010$, then $\nat{B} = 170$.
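
As a quick check of this formula, the sketch below computes the canonical value of a byte given as a string of bits. The helper name `nat_of_bits` is ours and is introduced only for illustration.

```python
def nat_of_bits(bits):
    """Canonical value of a bit string written MSB first (e.g. '10101010')."""
    # The last character is b_0, the one before is b_1, and so on:
    # reversing the string makes index i match the weight 2**i.
    return sum(2**i * int(b) for i, b in enumerate(reversed(bits)))

print(nat_of_bits('10101010'))  # 170
```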

Most of the time, bytes are aggregated into a larger structure called a machine **word**. A machine word (or word) is the basic unit manipulated by a microprocessor: most of the operations of a microprocessor work on machine words. On modern architectures, words are usually 64 bits long, i.e. 8 bytes long, and can encode up to $2^{64}$ different values. The canonical interpretation of words is the same as the one for bytes: if $w = b_{n-1} b_{n-2} \ldots b_1 b_0$ is an $n$-bit word, then its associated natural number $\nat{w}$ is:

$$
\nat{w} = \sum_{0 \leq i < n} 2^i \cdot b_i
$$

In a word, we distinguish two bits: the most significant bit (MSB) $b_{n-1}$ and the least significant bit (LSB) $b_0$. The MSB is the bit, in a given binary representation, having the greatest weight (i.e. power of $2$) or position (the one on the left in the usual positional notation). Likewise, the LSB is the bit having the lowest weight or position (the one on the right in the usual positional notation). Moreover, for a given word $w$, we write $[w]_i$ for its $i$-th bit, counting from the LSB up to the MSB, i.e. $[w]_0$ is the LSB of $w$ and $[w]_{n-1}$ is its MSB.
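
To make the notation $[w]_i$ concrete, here is a minimal sketch that extracts the $i$-th bit of a non-negative integer using only arithmetic on its canonical value. The name `bit_at` and the choice of an 8-bit word are assumptions made for the example.

```python
def bit_at(w, i):
    """Return [w]_i, the bit of weight 2**i in the non-negative integer w."""
    # Dividing by 2**i drops the i lowest bits; the remainder modulo 2
    # is then the bit that has become the LSB.
    return (w // 2**i) % 2

w = 170  # canonical value of the byte 10101010
n = 8    # word size assumed for this example

print(bit_at(w, 0))                      # LSB: 0
print(bit_at(w, n - 1))                  # MSB: 1
print([bit_at(w, i) for i in range(n)])  # [0, 1, 0, 1, 0, 1, 0, 1]
```

Note that the list is printed from the LSB to the MSB, i.e. in the reverse of the usual positional notation.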

We denote by $\word_n$ the set of $n$-bit words. We write $\word_\infty$ for infinitely long words that have only a finite number of non-zero bits, i.e. words that are led by an infinite number of $0$s that are left implicit. This is no different from the decimal notation, where the leading $0$s are left implicit (i.e. $14 = \cdots 0 \cdots 014$). Last, we write $\word$ instead of $\word_n$ when the size is left implicit.

## Bit representation

In Python, it is not possible to manipulate bytes or machine words directly. This must be done via values of type `int`: a non-negative integer represents the machine word whose canonical value (as defined above) is that integer. Thus, if we are interested in the byte $10101010$, whose value as a natural number is $170$, we have to manipulate the integer $170$ directly.

**Note**: in Python, the type `int` stands for the type of arbitrarily long integers; the only limit is your computer's memory. This is quite unusual: most programming languages use machine words as their core `int` type and all arithmetic is done modulo $2^n$, where $n$ is the word size of the underlying architecture. As a consequence, in Python, the size of words is arbitrary too. All the sections hereafter can be understood regardless of whether we are using fixed-size machine words or arbitrarily long words, with the notable exceptions of right shifts and bitwise negation.
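
The difference can be observed directly. The sketch below contrasts Python's unbounded arithmetic with arithmetic modulo $2^n$, which we emulate by reducing the result by hand. The name `add_word` and the choice $n = 8$ are ours.

```python
def add_word(x, y, n=8):
    """Add x and y as n-bit machine words, i.e. modulo 2**n (emulation)."""
    return (x + y) % 2**n

print(200 + 100)           # 300: Python integers grow as needed
print(add_word(200, 100))  # 44: on 8 bits, 300 wraps around to 300 - 256
print(2**64 + 1)           # 18446744073709551617: does not fit in a 64-bit word
```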

Python offers the `bin` function, which produces a Python string (i.e., a value of type `str`) made up of the characters `0` and `1` and containing the bit representation of its argument. E.g., when applied to $170$, the function outputs the string `0b10101010`, as expected, since the canonical value associated with the byte $10101010$ is indeed $170$. (Note that the string is prefixed with `0b` to indicate that it represents a binary value and not the decimal number `10101010`.) Also note that Python understands literals (i.e. values that you write in your programs) in binary format, i.e. prefixed with `0b`, as exemplified below:

In [1]:
print(bin(170))    # Get the binary representation of 170
print(0b10101010)  # A python literal expressed using its binary form

0b10101010
170


Dually, you can convert a string containing the binary representation of an integer back to that integer using the `int` function. (Note the second argument of `int`: without it, the string would be interpreted in decimal notation.)

In [2]:
print(int('10101010'))
print(int('10101010', 2))

10101010
170


Below, I define `binpad`, a variant of `bin` that adds leading `0`s to a binary representation so that it contains at least `n` digits. This function will be used later and will help with readability.

In [3]:
def binpad(x, n = 0):
    assert(0 <= x)
    return '0b' + f'{x:0b}'.rjust(n, '0')
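
A few sample calls may help to see what `binpad` returns. The definition is repeated only so that the snippet runs on its own; the sample values are ours.

```python
def binpad(x, n=0):
    # Same definition as above, repeated so that this snippet is self-contained
    assert 0 <= x
    return '0b' + f'{x:0b}'.rjust(n, '0')

print(binpad(5))             # 0b101
print(binpad(5, 8))          # 0b00000101
print(binpad(170, 4))        # 0b10101010 (n is a minimum, nothing is truncated)
print(int(binpad(5, 8), 2))  # 5: the padded string still converts back
```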

## Manipulating binary representations

### Bitwise lifting

The simplest bit manipulation functions are the bitwise lifting ones. Assume that we have a function $f : \{0,1\}^k \to \{0,1\}$ for some fixed $k$. The lifting of $f$ to words, written $\lift{f} : \word^k \to \word$, is defined as the function that applies $f$ independently to all the bits of its $k$ input words:

$$
[\lift{f}(w^1,\ldots,w^k)]_i = f([w^1]_i,\ldots,[w^k]_i)
$$

When clear from the context, we write $f$ for $\lift{f}$. In the unary case ($k = 1$), we obtain the following word transformer:

<figure>
<img src="files/bitwise.png" style="height:100px;"></img>
</figure>

and the following one in the binary case ($k = 2$):

<figure>
<img src="files/bitwise2.png" style="height:130px;"></img>
</figure>

For $f$, there are 4 standard operators:
  - **or** (written $\bor$ in infix form), where $b_1 \bor b_2$ is equal to $1$ iff at least one of $b_1$ or $b_2$ is equal to $1$;
  - **and** (written $\band$ in infix form), where $b_1 \band b_2$ is equal to $1$ iff both $b_1$ and $b_2$ are equal to $1$;
  - **exclusive or** or **xor** (written $\bxor$ in infix form), where $b_1 \bxor b_2$ is equal to $1$ iff *exactly* one of $b_1$ or $b_2$ is equal to $1$ (i.e. $1 \bxor 1 = 0$); and
  - **not** (written $\bnot$ in prefix form), where $\bnot 1 = 0$ and $\bnot 0 = 1$.

We sum up the definitions of these 4 operators in the table below (the last column depends only on $b_2$, hence its empty cells):

| $b_1$ | $b_2$ | $b_1 \bor b_2$ | $b_1 \band b_2$ | $b_1 \bxor b_2$ | $\bnot b_2$ |
|:-----:|:-----:|:--------------:|:---------------:|:---------------:|:-----------:|
|  $0$  |  $0$  |       $0$      |       $0$       |       $0$       |     $1$     |
|  $0$  |  $1$  |       $1$      |       $0$       |       $1$       |     $0$     |
|  $1$  |  $0$  |       $1$      |       $0$       |       $1$       |             |
|  $1$  |  $1$  |       $1$      |       $1$       |       $0$       |             |
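
The definition of $\lift{f}$ can be transcribed almost literally. The sketch below lifts a function on bits to non-negative integers by applying it position by position. The names `lift`, `word_or`, `word_and`, `word_xor` and `word_not`, as well as the explicit width parameter `n`, are assumptions made for the example; the bit-level functions are written so as to match the table above.

```python
def lift(f, n):
    """Lift a function on bits to a function on n-bit words (non-negative ints)."""
    def lifted(*words):
        result = 0
        for i in range(n):
            # [w]_i for each input word w, obtained arithmetically
            bits = [(w // 2**i) % 2 for w in words]
            # the i-th bit of the result is f applied to the i-th bits of the inputs
            result += f(*bits) * 2**i
        return result
    return lifted

word_or  = lift(lambda b1, b2: max(b1, b2), 8)
word_and = lift(lambda b1, b2: b1 * b2, 8)
word_xor = lift(lambda b1, b2: (b1 + b2) % 2, 8)
word_not = lift(lambda b: 1 - b, 8)

i, j = 0b10101010, 0b00000111
print(format(word_or(i, j), '08b'))   # 10101111
print(format(word_and(i, j), '08b'))  # 00000010
print(format(word_xor(i, j), '08b'))  # 10101101
print(format(word_not(i), '08b'))     # 01010101
```

This is far slower than the built-in operators and is only meant to mirror the mathematical definition.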

The respective Python operators are `|`, `&`, `^` and `~`. We give below a few examples of evaluating bitwise operators in Python:

In [4]:
i = 0b10101010
j = 0b00000111

print("i     =", binpad(i    , 8))
print("j     =", binpad(j    , 8))
print("i | j =", binpad(i | j, 8))
print("i & j =", binpad(i & j, 8))
print("i ^ j =", binpad(i ^ j, 8))

i     = 0b10101010
j     = 0b00000111
i | j = 0b10101111
i & j = 0b00000010
i ^ j = 0b10101101


The output for the bitwise negation is a bit surprising. For instance, if we bitwise negate `0b10101010`, we obtain `-0b10101011` instead of `0b01010101`. We will see later why Python behaves this way, when we come to the representation of signed integers.

In [5]:
print(bin(~ 0b10101010)) # One might expect 0b01010101

-0b10101011
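
If what we want is the negation of an 8-bit word, one common workaround (an assumption about the intended use, not something Python does for us) is to keep only the 8 low-order bits of the result with `&`. The helper name `not_word` is ours.

```python
def not_word(w, n=8):
    """Bitwise negation of w seen as an n-bit word."""
    mask = 2**n - 1   # n bits set to 1, e.g. 0b11111111 for n = 8
    return ~w & mask  # discard everything above bit n-1

print(bin(not_word(0b10101010)))  # 0b1010101
```

The result is the expected `0b01010101`; `bin` simply does not print the leading `0`.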


### Left shift

The logical left shift (`w << n`) is the operation that shifts all the bits of its operand `w` to the left (i.e. toward the MSB), inserting `0`s on the right. This operation is parameterized by the number `n` of bits that are shifted.

$$
\DeclareMathOperator{\lsb}{lsb}
[w \lsl n]_i = \left\{
  \begin{aligned}[]
    [w]_{i-n} && \text{if $i \geq n$} \\
    0 && \text{otherwise}
  \end{aligned} \right.
$$

As a picture, this gives the following word transformer:

<figure>
<img src="files/lsl.png" style="height:170px;"></img>
</figure>

It is easy to check that left-shifting $w$ by $n$ amounts to multiplying (the canonical value of) $w$ by $2^n$:

$$
\begin{aligned}
  2^n \cdot w
    &= 2^n \sum_i [w]_i \cdot 2^i
     = \sum_i [w]_i \cdot 2^{i+n} \\
    &= \sum_{i \geq n} [w]_{i-n} \cdot 2^i
     = \sum_{i} [w \lsl n]_i \cdot 2^i
     = w \lsl n .
\end{aligned}
$$
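
The identity can also be checked experimentally on Python's arbitrarily long integers. The sketch below is only a sanity check over small values; the helper name `shift_matches_product` and the ranges tested are ours.

```python
def shift_matches_product(w, n):
    """True when shifting w left by n agrees with multiplying w by 2**n."""
    return (w << n) == w * 2**n

w = 0b10101010  # 170
print(w << 1, w << 2, w << 3)  # 340 680 1360
print(all(shift_matches_product(v, n) for v in range(256) for n in range(16)))  # True
```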

### Right shift

The logical right shift (`w >> n`) is the operation that shifts all the bits of its operand `w` to the right (i.e. toward the LSB), inserting `0`s on the left. This operation is parameterized by the number `n` of bits that are shifted.

$$
\DeclareMathOperator{\rsb}{rsb}
[w \lsr n]_i = [w]_{i+n}
$$

It is easy to check that right-shifting $w$ by $n$ amounts to dividing (the canonical value of) $w$ by $2^n$, rounding down:

$$
\begin{aligned}
  \left\lfloor \frac{w}{2^n} \right\rfloor
    &= \left\lfloor \frac{\sum_i [w]_i \cdot 2^i}{2^n} \right\rfloor
     = \sum_{i \geq n} [w]_i \cdot 2^{i-n} \\
    &= \sum_i [w]_{i+n} \cdot 2^i
     = \sum_i [w \lsr n]_i \cdot 2^i
     = w \lsr n .
\end{aligned}
$$
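
As for the left shift, a small experiment confirms the identity on non-negative integers (negative integers are not covered by this check). The helper name `shift_matches_quotient` and the ranges tested are ours.

```python
def shift_matches_quotient(w, n):
    """True when shifting w right by n agrees with the floor of w / 2**n."""
    return (w >> n) == w // 2**n

w = 0b10101010  # 170
print(w >> 1, w >> 2, w >> 3)  # 85 42 21
print(all(shift_matches_quotient(v, n) for v in range(256) for n in range(16)))  # True
```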

## Bit hacking

In this section, we are going to see how we can manipulate the bits of a word one by one. For any position $i$, we are interested in being able to set the $i$-th bit (i.e. to set its value to $1$), to clear the $i$-th bit (i.e. to set its value to $0$) and to toggle the $i$-th bit of a given word.

First, note that if we have a single bit $b$, we can force it to $0$ (using $b \band 0$), force it to $1$ (using $b \bor 1$) and toggle its value (using $b \bxor 1$). We now want to lift these operations to words.

### Setting a bit

Say we want to set the $i$-th bit of a word $w$. We just saw that we can do this by $\bor$-ing the $i$-th bit of $w$ with $1$:

<figure>
<img src="files/bitset1.png" style="height:150px;"></img>
</figure>

Now, we need to find values for the `?` in $m$ s.t. $[w \bor m]_j = [w]_j$ for any $j \ne i$. Since $0$ is a neutral element for $\bor$ (you can check in the table above), it is sufficient to set all the `?` to $0$:

<figure>
<img src="files/bitset2.png" style="height:150px;"></img>
</figure>

We see that the final result $w \bor m$ is exactly $w$ where the $i$-th bit has been forced to $1$. We now need a way to express the mask $m$, i.e. the word where all the bits are $0$ but the $i$-th one. One way to achieve this is to take the word `0b1` (i.e. the word with only its LSB set to $1$) and to shift it to the left by $i$:

<figure>
<img src="files/bitset-mask.png" style="height:100px;"></img>
</figure>

To sum up, the word $w \bor (\bits{1} \lsl i)$ is exactly the word $w$ where the $i$-th bit has been forced to $1$.

### Toggling a bit

Toggling a bit is not much different, since xoring a bit with $1$ flips it and $0$ is a neutral element for $\bxor$: the word $w \bxor (\bits{1} \lsl i)$ is then exactly the word $w$ where the $i$-th bit has been toggled.

### Clearing a bit

For clearing a bit, we follow a similar reasoning. Say we want to clear the $i$-th bit of a word $w$. We saw at the beginning of this section that we can do this by $\band$-ing the $i$-th bit of $w$ with $0$:

<figure>
<img src="files/bitclear1.png" style="height:150px;"></img>
</figure>

Now, we need to find values for the `?` in $m$ s.t. $[w \band m]_j = [w]_j$ for any $j \ne i$. Since $1$ is a neutral element for $\band$ (you can check in the table above), it is sufficient to set all the `?` to $1$:

<figure>
<img src="files/bitclear2.png" style="height:150px;"></img>
</figure>

We see that the final result $w \band m$ is exactly $w$ where the $i$-th bit has been forced to $0$. We now need a way to express the mask $m$, i.e. the word where all the bits are $1$ but the $i$-th one. One way to achieve this is to take the word `0b1` (i.e. the word with only its LSB set to $1$), to shift it to the left by $i$ and to (bitwise) negate it:

<figure>
<img src="files/bitclear-mask.png" style="height:150px;"></img>
</figure>

To sum up, the word $w \band (\bnot (\bits{1} \lsl i))$ is exactly the word $w$ where the $i$-th bit has been forced to $0$.
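
In Python, the mask `~(0b1 << i)` is a negative integer (recall the surprising output of bitwise negation seen earlier), yet `&`-ing with it clears the bit as intended. The sketch below compares it with an explicit 8-bit mask; the 8-bit width and the variable names are assumptions made for the example.

```python
i = 2
w = 0b10101110

python_mask = ~(0b1 << i)            # what Python computes for the negated mask
byte_mask = 0b11111111 ^ (0b1 << i)  # the same mask restricted to 8 bits

print(python_mask)           # -5
print(bin(byte_mask))        # 0b11111011
print(bin(w & python_mask))  # 0b10101010
print(bin(w & byte_mask))    # 0b10101010
```

Both masks give the same result here because `w` fits in 8 bits.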

### In Python

Finally, we give below the implementation of the bit manipulation functions:

In [6]:
def bit_set(w, i):
    return w | (0b1 << i)

def bit_clear(w, i):
    return w & (~ (0b1 << i))

def bit_toggle(w, i):
    return w ^ (0b1 << i)

i = 0b10101010
print(binpad(bit_set   (i, 1), 8))
print(binpad(bit_set   (i, 2), 8))
print(binpad(bit_clear (i, 1), 8))
print(binpad(bit_clear (i, 2), 8))
print(binpad(bit_toggle(i, 1), 8))
print(binpad(bit_toggle(i, 2), 8))

0b10101010
0b10101110
0b10101000
0b10101010
0b10101000
0b10101110


## 2-complement representation

We here assume words of fixed size $n$. To represent signed integers, the $2$-complement representation is most often used: a non-negative integer $k$ is represented in base $2$, as seen above, using $n-1$ bits (hence, $k$ must be strictly smaller than $2^{n-1}$), while the (strictly) negative integer $-k$ is represented by the word whose canonical value is $2^n - k$. In the latter case, we need $2^n - k$ to be greater than or equal to $2^{n-1}$ (otherwise, the word would also represent the non-negative integer $2^n-k$); hence, we need $1 \leq k \leq 2^{n-1}$. Consequently, with words of size $n$, the $2$-complement representation can encode the values in the range $[-2^{n-1} \cdots 2^{n-1}-1]$. This interval is not symmetrical because there is only one representation of $0$.

For example, for $n=4$, `0b0101` has canonical value $2^0 + 2^2 = 5 < 2^{n-1}$, and so represents the non-negative number $5$ in the $2$-complement representation. On the other hand, `0b1010` has canonical value $2^3 + 2^1 = 10 \geq 2^{n-1}$, and so is the $2$-complement representation of $-k$ s.t. $2^n-k = 10$, i.e. of $-6$.
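
The two directions of this encoding can be sketched as follows. The function names `to_twos_complement` and `from_twos_complement` are ours, and words are modelled as non-negative Python integers below $2^n$.

```python
def to_twos_complement(k, n):
    """n-bit word (as a non-negative integer) representing the signed integer k."""
    assert -2**(n - 1) <= k < 2**(n - 1), "k does not fit in n bits"
    # A negative integer -k' is stored as 2**n - k', i.e. 2**n + k here
    return k if k >= 0 else 2**n + k

def from_twos_complement(w, n):
    """Signed integer represented by the n-bit word w."""
    assert 0 <= w < 2**n
    # Words below 2**(n-1) are non-negative; the others encode w - 2**n
    return w if w < 2**(n - 1) else w - 2**n

print(format(to_twos_complement(5, 4), '04b'))   # 0101
print(format(to_twos_complement(-6, 4), '04b'))  # 1010
print(from_twos_complement(0b1010, 4))           # -6
```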

If you look carefully at the examples above, there is something remarkable: the MSB of the $2$-complement representation of a non-negative integer is $0$, whereas the MSB of the $2$-complement representation of a negative integer is $1$. (You can also try to prove that the MSB is always $0$ (resp. $1$) for non-negative (resp. negative) numbers.) This is why, in the $2$-complement representation, we call it the *sign bit*.

There is a simple way to calculate the $2$-complement of an integer: just invert all its bits and add $1$ to the result. Indeed:

$$
\begin{aligned}
  2^n - k
    &= \left( 1 + \sum_{i < n} 2^i \right) - \sum_{i < n} [k]_i \cdot 2^i
     = 1 + \sum_{i < n} (1 - [k]_i) \cdot 2^i \\
    &= 1 + \sum_{i < n} [\bnot k]_i \cdot 2^i
     = 1 + (\bnot k) .
\end{aligned}
$$
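
The "invert all bits and add $1$" recipe can be tried out on fixed-size words by masking by hand. This is a sketch under our own conventions: the name `negate_word` is ours and words are modelled as non-negative Python integers below $2^n$.

```python
def negate_word(k, n):
    """n-bit word representing -k, computed as (invert the n bits) + 1."""
    mask = 2**n - 1               # n bits set to 1
    inverted = ~k & mask          # invert the n low-order bits of k
    return (inverted + 1) & mask  # add 1, staying within n bits

print(format(negate_word(6, 4), '04b'))                           # 1010
print(all(negate_word(k, 8) == 2**8 - k for k in range(1, 129)))  # True
```

The first line recovers the word `0b1010` that represents $-6$ for $n = 4$.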

The identity $2^n - k = 1 + (\bnot k)$ leads to the following equation: $k + \bnot k = -1$. This last equality explains why Python represents `~ 0b10101010` as `-0b10101011`. Indeed, since integers in Python are of arbitrary size, when we apply bitwise not to `0b10101010`, we obtain an integer whose bit representation is the complement of the bits of the original number. This means that all the most significant bits are `1`s... and there is an infinite number of them! Fortunately, we can use the equation $k + \bnot k = -1$ of the $2$-complement representation and define $\bnot k$ as $-(k+1)$, and this is how Python defines bitwise negation. For that purpose, instead of using the MSB to store the sign bit, Python stores the sign separately and defines $\bnot k$ as the opposite of $k+1$. If we come back to our example, `~ 0b10101010` is the opposite of `0b10101010 + 1`, i.e. the opposite of `0b10101011`, i.e. `-0b10101011` (here, the sign `-` indicates that the sign bit is on).
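
Both forms of the equation are easy to observe in the interpreter. The helper name `python_not` and the sample values are ours; the helper only restates $-(k+1)$ so that it can be compared with the built-in `~`.

```python
def python_not(k):
    """Bitwise negation expressed arithmetically, as -(k + 1)."""
    return -(k + 1)

for k in (0, 1, 0b10101010, 2**70):
    print(k + ~k, ~k == python_not(k))  # -1 True on every line

print(bin(~0b10101010))  # -0b10101011
```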