# 1. Introduction to Information Theory

## 1.1 Error-correcting codes for the binary symmetric channel

### 1.1.1 The Binary Symmetric Channel
The binary symmetric channel is a model describing a communication channel in which each bit is transmitted correctly with probability $\left(1-f\right)$ and incorrectly with probability $f$. This means 
$$P\left(y=0|x=0\right)=1-f$$
$$P\left(y=0|x=1\right)=f$$
$$P\left(y=1|x=0\right)=f$$
$$P\left(y=1|x=1\right)=1-f$$
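The four probabilities above can be collected in a $2\times 2$ transition matrix. The following is a minimal sketch of that idea, not part of the `entropy_lab` package; the helper names `bsc_transition_matrix` and `bsc_sample` are hypothetical and introduced only for illustration.

```python
import numpy as np


def bsc_transition_matrix(f):
    """Return the matrix P with P[x, y] = P(y | x) for flip probability f."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    return np.array([[1 - f, f],
                     [f, 1 - f]])


def bsc_sample(bits, f, rng):
    """Draw one channel output per input bit (hypothetical helper)."""
    bits = np.asarray(bits)
    P = bsc_transition_matrix(f)
    # For each input x, the output is 1 with probability P[x, 1].
    u = rng.random(bits.shape)
    return (u < P[bits, 1]).astype(int)


rng = np.random.default_rng(0)
x = np.array([0, 1, 1, 0])
y = bsc_sample(x, 0.1, rng)
```

Each row of the matrix sums to 1, since every input bit must produce either a 0 or a 1 at the output.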
When a message is sent through such a channel, some of its bits may be flipped, so the received message can differ from the original and, if enough bits are corrupted, become unreadable. There are two ways to address this:

1. A physical solution: improving the physical characteristics of the communication channel (not treated here)
2. A 'system' solution: information theory and coding theory

A simple way to make the transmission of a message more reliable is to add redundancy. The question is how to do it.

**Repetition codes**

This is the most straightforward approach: each source bit is repeated a fixed number of times. For a source sequence $\mathbb{s} = \{0\}$, the transmitted sequence is $\mathbb{t} = \{0, 0, 0\}$ for the repetition code $R_3$.

In Python, such a function can look like this:

In [2]:
import numpy as np

def repetition(bits, n):
    bits = np.array(bits)
    if not np.all(np.isin(bits, [0, 1])):
        raise ValueError("All elements in bits must be 0 or 1")
    repeated = np.repeat(bits, n)
    return repeated

s = [0, 0, 1, 0, 1, 1, 0]
repetition(s, 3)

array([0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0])

Using the `coding` module, the same encoding can be written as:

In [6]:
from entropy_lab.coding.code import Code
from entropy_lab.coding.encoder import Encoder

s = [0, 0, 1, 0, 1, 1, 0]
n_repetition = 3
code = Code(s)
encoder = Encoder()
repetition_code = encoder.repetition(code, n_repetition)
repetition_code.code_array

array([0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0])

Once the repetition code is generated (we call it the transmitted sequence $\mathbb{t}$), it travels through a channel. In this example the channel is the binary symmetric channel with $f = 0.1$; in other words, each bit in the sequence has a 10% probability of being flipped to the opposite bit during transmission. We can simulate this with the following program:

In [8]:
import numpy as np

noise_level = 0.1
# generate random flips: True when a flip occurs
flips = np.random.rand(*repetition_code.code_array.shape) < noise_level
# XOR with flips to invert bits
repetition_code.code_array ^ flips.astype(repetition_code.code_array.dtype)


array([0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0])

We can also use the custom-made `BSC` class to simulate such a channel:

In [9]:
from entropy_lab.systems.binary_symmetric_channel import BSC

noise_level = 0.1
channel = BSC()
received_code = channel.transmit(repetition_code, noise_level)
received_code.code_array

array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0])

The received code now differs from the transmitted one in a few positions. To recover the original message, we can use a simple algorithm called the majority vote: for each block of repeated bits (3 per block in our case), we keep the bit value that occurs most often in the block. We can do the following:

In [10]:
# reshape into blocks of size n_repetition
blocks = received_code.code_array.reshape(-1, n_repetition)
# sum bits in each block
ones_count = blocks.sum(axis=1)
# majority vote
decoded = (ones_count > n_repetition / 2).astype(int)
decoded

array([0, 0, 1, 0, 1, 1, 0])
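Majority vote recovers a source bit only if at most one of its three copies was flipped. As a short worked derivation, assuming the flips are independent (as in the binary symmetric channel), a block of $R_3$ is decoded wrongly when exactly two or all three of its bits are flipped:

$$P_b = 3f^2\left(1-f\right) + f^3$$

For $f = 0.1$ this gives

$$P_b = 3\left(0.01\right)\left(0.9\right) + 0.001 = 0.028$$

so the per-bit error probability drops from $0.1$ to $0.028$, at the cost of sending three bits for every source bit.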

We can now compare the original code with the decoded code to check whether they are the same:

In [11]:
code.code_array == decoded

array([ True,  True,  True,  True,  True,  True,  True])

All seven bits were recovered correctly in this run. The method is not perfect, but the more repetitions we add, the less likely a decoding error becomes. For instance, using the `Decoder` class with 10 repetitions:

In [12]:
from entropy_lab.coding.code import Code
from entropy_lab.coding.encoder import Encoder
from entropy_lab.coding.decoder import Decoder
from entropy_lab.systems.binary_symmetric_channel import BSC

s = [0, 0, 1, 0, 1, 1, 0]
n_repetition = 10
noise_level = 0.1

code = Code(s)
encoder = Encoder()
decoder = Decoder()
channel = BSC()

repetition_code = encoder.repetition(code, n_repetition)
received_code = channel.transmit(repetition_code, noise_level)
decoded_code = decoder.majority_vote(received_code, n_repetition)

s == decoded_code.code_array



array([ True,  True,  True,  True,  True,  True,  True])
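How much each extra repetition helps can be estimated numerically. The sketch below is independent of `entropy_lab` and uses hypothetical helper names (`block_error_probability`, `simulated_error_rate`). It assumes independent bit flips and an odd number of repetitions, and compares the exact binomial expression with a Monte Carlo estimate.

```python
from math import comb

import numpy as np


def block_error_probability(n, f):
    """Probability that majority vote decodes one block of odd length n wrongly."""
    if n % 2 == 0:
        raise ValueError("n must be odd so that ties cannot occur")
    # The vote fails when more than half of the n copies are flipped.
    k_min = (n + 1) // 2
    return sum(comb(n, k) * f**k * (1 - f) ** (n - k) for k in range(k_min, n + 1))


def simulated_error_rate(n, f, n_bits=100_000, seed=0):
    """Monte Carlo estimate of the same probability (illustrative helper)."""
    rng = np.random.default_rng(seed)
    source = rng.integers(0, 2, size=n_bits)
    sent = np.repeat(source, n)
    # Flip each transmitted bit independently with probability f.
    flips = rng.random(sent.shape) < f
    received = sent ^ flips.astype(sent.dtype)
    # Majority vote on each block of n bits.
    decoded = (received.reshape(-1, n).sum(axis=1) > n / 2).astype(int)
    return float(np.mean(decoded != source))


for n in (1, 3, 5, 7):
    print(n, block_error_probability(n, 0.1), simulated_error_rate(n, 0.1))
```

Odd values of `n` are assumed because an even block length, such as 10, allows ties; with the `>` comparison used in the hand-written decoder above, a tied block decodes to 0. The gain in reliability also has a cost: a code that repeats each bit `n` times transmits only one source bit per `n` channel bits.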

**An example with an image**

To illustrate the power of this simple approach, we will send a black-and-white image through this channel and see whether it can be recovered using only the repetition code and the majority vote.
