# Noisy Typewriter Channel
## Task
Consider the _noisy typewriter channel_, mapping
$$
 \mathcal{A}_Z = \{A, B, C, \dots , Y, Z, -\}
$$
with $|\mathcal{A}_Z| = 27$, into $\mathcal{A}_Y = \mathcal{A}_Z$, where each letter is mapped with equal probabilities into the preceding, the following or the same letter (p. 41-44 of the notes). Design an efficient code by which to reliably send symbols from $\mathcal{A}_X = \mathcal{A}_Z$ through the channel (i.e., you should be able to send and retrieve a text using the 27 symbols with no error).

## Introduction, the noisy typewriter channel
In the noisy typewriter channel symbols from the English alphabet plus the space $-$ symbol are mapped onto the same alphabet, with the following probabilistic rule: each letter is mapped with equal probability onto the same letter, the preceding letter or the following letter. In other words, we have:
$$
P(Y=\texttt{A}|Z=\texttt{B})=1/3, \quad P(Y=\texttt{B}|Z=\texttt{B})=1/3, \quad P(Y=\texttt{C}|Z=\texttt{B})=1/3 
$$
$$
P(Y=\texttt{B}|Z=\texttt{C})=1/3, \quad P(Y=\texttt{C}|Z=\texttt{C})=1/3, \quad P(Y=\texttt{D}|Z=\texttt{C})=1/3 
$$
Note that the preceding symbol of $\texttt{A}$ is $-$ and the following symbol of $-$ is $\texttt{A}$. The action of the channel can be represented graphically as follows:

![ntw1](images/noisyt1.png)
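
The transition rule above can also be written out as a $27\times 27$ matrix. The following is a minimal sketch, not part of the original notebook: the names `ALPHABET` and `transition_matrix` are illustrative assumptions, and exact fractions are used only to make the row sums easy to check.

```python
import string
from fractions import Fraction

# Alphabet as in the text: 26 letters plus '-' for the space (27 symbols).
ALPHABET = list(string.ascii_uppercase) + ['-']
N = len(ALPHABET)

def transition_matrix():
    """Return P[z][y] = P(Y = ALPHABET[y] | Z = ALPHABET[z]) as exact fractions."""
    P = [[Fraction(0)] * N for _ in range(N)]
    for z in range(N):
        for shift in (-1, 0, 1):
            # The modulo makes the alphabet circular: '-' precedes 'A'.
            P[z][(z + shift) % N] = Fraction(1, 3)
    return P

P = transition_matrix()
b = ALPHABET.index('B')
# Non-zero entries of the row for Z = 'B'
print({ALPHABET[y]: str(p) for y, p in enumerate(P[b]) if p > 0})
# Every row should be a probability distribution
print(all(sum(row) == 1 for row in P))
```

Each row has exactly three non-zero entries, which is the property the non-confusable code below relies on.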

For this channel, we can immediately derive a $(\log 9,1)$ block code. In this code, we use the codewords:
$$
S = \{\texttt{B}, \texttt{E}, \texttt{H}, \dots\}
$$
with $2^K=|S|$ and $K=\log 9$ (non-integer). This code uses one out of every three letters as a codeword. The result is a set of _non-confusable codewords_, i.e. codewords $z_i$ that the channel maps onto disjoint subsets of $\mathcal{A}_Y$.

![ntw2](images/noisyt2.png)

This implies that:
$$
\forall \mathbf{y} \in \mathcal{A}_y^{(n)}, \quad \exists \text{ unique } \textit{i},  P_{Z}(z_i|\mathbf{y}) > 0
$$
which ensures that $z_i$ can be decoded with no error from $\mathbf{y}$. This allows us to reliably send information at a rate $\log |S|/n$. For the noisy typewriter channel, the rate is $R=\log 9=2\log 3$, which can be shown to equal the channel capacity. Note that if we have a set of non-confusable codewords, $n_Y = n_Z n_{Y|Z}$ and 
$$
n_{Z|Y} = \frac{n_Z n_{Y|Z}}{n_Y} = 1 \quad \Longleftrightarrow \quad H[Z|Y] = H[Z]+H[Y|Z]-H[Y] = 0 
$$

For the noisy typewriter example, we can create a code of length $n=1$ by selecting $A_X=\{\texttt{B},\texttt{E},\texttt{H},\dots\}$ (every third letter). These codewords are _non-confusable_: no two of them can lead to the same received symbol under noise, so decoding the output is trivial. We have $n=1$ and $K=\log 9$, so $R=\log 9$, which corresponds to the channel capacity $C$.
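
The non-confusability claim can be checked directly by listing the possible outputs of each selected letter. This is a small sketch under the channel definition given above. The helper name `outputs` is an assumption made for illustration, and the rate is computed in bits (base-2 logarithm).

```python
import math
import string

ALPHABET = list(string.ascii_uppercase) + ['-']
N = len(ALPHABET)

def outputs(letter):
    """Set of letters the channel can produce from `letter`."""
    i = ALPHABET.index(letter)
    return {ALPHABET[(i + s) % N] for s in (-1, 0, 1)}

# Every third letter, starting from 'B'.
S = ALPHABET[1::3]

# Non-confusable: the output sets are pairwise disjoint ...
out_sets = [outputs(z) for z in S]
union = set().union(*out_sets)
disjoint = sum(len(o) for o in out_sets) == len(union)
# ... and for this particular choice they also cover the whole output alphabet.
covers = union == set(ALPHABET)

rate_bits = math.log2(len(S))  # n = 1, so R = log2|S|
print(S)
print('disjoint:', disjoint, '| covers alphabet:', covers, '| R =', round(rate_bits, 4), 'bits')
```

Because the nine output sets partition the 27 received symbols, every received symbol identifies exactly one codeword.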

## Shannon's noisy channel coding theorem
Given a channel, for large enough $n$ there exists an $(nR,n)$ block code (with rate $R$) that can be decoded with arbitrarily small error probability whenever $R<C$. This general result rests upon the fact that sequences become typical in the limit $n \to \infty$.

# Solution 1
In this exercise we want to reliably send symbols from $\mathcal{A}_X = \mathcal{A}_Z$ through the channel, using all 27 symbols with no error. We can build an efficient and reliable code from a subset of non-confusable inputs, e.g.
$$
S = \{\texttt{B}, \texttt{E}, \texttt{H}, \texttt{K}, \texttt{N}, \texttt{Q}, \texttt{T}, \texttt{W}, \texttt{Z}\}
$$
with $|S| = 9$. Since there are 81 sequences of 2 symbols from $S$, $|S^{(2)}| = 81$ and $81 > 27$, we can just select any subset of 27 pairs of symbols from $S$, i.e.
$$
S' \subset S^{(2)}
$$
with $|S'| = 27$, to reliably encode the input.

For instance, we can divide the 27 letters into 3 groups of 9 letters:
$$
g_1 = \{\texttt{A}, \texttt{B}, \dots, \texttt{I}\}, \quad g_2 = \{\texttt{J}, \texttt{K}, \dots, \texttt{R}\}, \quad g_3=\{\texttt{S}, \texttt{T}, \dots, -\} 
$$
and use the first letter of the code to encode for the group, i.e.
$$
g_1 \rightarrow \texttt{B}, \quad g_2 \rightarrow \texttt{E}, \quad g_3 \rightarrow \texttt{H}
$$
and the second letter of the code to encode for the specific letter within the
group:
$$
\begin{align*}
    \{\texttt{A}, \texttt{B}, \dots, \texttt{I}\} &\rightarrow \{\texttt{B}, \texttt{E}, \dots, \texttt{Z}\} \\
    \{\texttt{J}, \texttt{K}, \dots, \texttt{R}\} &\rightarrow \{\texttt{B}, \texttt{E}, \dots, \texttt{Z}\} \\
    \{\texttt{S}, \texttt{T}, \dots, \texttt{-}\} &\rightarrow \{\texttt{B}, \texttt{E}, \dots, \texttt{Z}\} 
\end{align*}
$$

arriving at the two-letter code $E: \mathcal{A}_X \rightarrow S^{(2)}$.

Notice that the average length of this code is 2, irrespective of the input distribution. The rate is:
$$
    \mathcal{R} = \frac{K}{2} = \frac{\log{27}}{2} = \frac{3\log 3}{2}
$$

## Implementation

In [1]:
import numpy as np
import string

In [2]:
list_letters = list(string.ascii_uppercase)+['-']
non_confusable_kw = [list_letters[i] for i in np.arange(1, 27, 3)]

print('There are %i symbols in the alphabet:\n' %len(list_letters), list_letters)
print('\nNon-confusable keywords:', non_confusable_kw)


def noisy_typewriter(letter_in):
    '''
    letter_in:  input letter, must be a string
    letter_out: output letter, after passing through the noisy channel
    '''
    
    # convert to upper
    # letter_in = letter_in.upper() # not needed, input of the encoding is always uppercase
      
    # channel action  
    idx_in = list_letters.index(letter_in)
    
    channel_action = np.random.choice([-1, 0, 1])
    idx_out = idx_in + channel_action
          
    idx_out = (idx_out+27)%27
    letter_out = list_letters[idx_out]
        
    return letter_out
    
    
def encoding_1(letter_in):
    '''
    letter_in: input letter, must be a string
    code_out:  output code of the encoding, to be passed through the noisy channel
    '''
    
    group_letters = ['B', 'E', 'H']
    
    # convert to upper
    letter_in = letter_in.upper()
        
    # encoding
    idx_in = list_letters.index(letter_in)
    
    if (0 <= idx_in < 9):
        group = 1        
    elif (9 <= idx_in < 18):
        group = 2
    elif (18 <= idx_in < 27):
        group = 3
    else:
        raise Exception('Something wrong in idx_in')

    idx_out = idx_in - (group-1)*9
    code_out = [group_letters[group-1], non_confusable_kw[idx_out]]
    
    return code_out
    

def decoding_1(code_in):
    '''
    code_in: input code, must be a list of strings, passed through the channel
    letter_out: decoding output, single letter
    '''
        
    group_index = list_letters.index(code_in[0])//3    
    nonconf_index = list_letters.index(code_in[1])//3   
     
    idx_out = group_index*9 + nonconf_index
    
    letter_out = list_letters[idx_out]
    
    return letter_out
    
    
def simulation_1(letter_in):
    enc = encoding_1(letter_in)
    y = []
    
    for char in enc:
        y.append(noisy_typewriter(char))

    dec = decoding_1(y)
    
    return letter_in, enc, y, dec

There are 27 symbols in the alphabet:
 ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '-']

Non-confusable keywords: ['B', 'E', 'H', 'K', 'N', 'Q', 'T', 'W', 'Z']


In [3]:
# test the channel
tests = ['A', 'D', '-']
test_results = {}

for test in tests:
    outputs = []
    
    for i in range(100):
        output_test = noisy_typewriter(test)
        outputs.append(output_test)

    test_results[test] = list(set(outputs))
        
print('Noisy Typewriter Channel: \n')
for key in test_results:
    print(key, '->', sorted(test_results[key]))

Noisy Typewriter Channel: 

A -> ['-', 'A', 'B']
D -> ['C', 'D', 'E']
- -> ['-', 'A', 'Z']


In [4]:
# test the encoding
print('Encoding 1: \n')

enc_list = []
for letter in list_letters:
    enc = encoding_1(letter)
    enc_list.append(enc)
    
    print(letter, '->', enc)

Encoding 1: 

A -> ['B', 'B']
B -> ['B', 'E']
C -> ['B', 'H']
D -> ['B', 'K']
E -> ['B', 'N']
F -> ['B', 'Q']
G -> ['B', 'T']
H -> ['B', 'W']
I -> ['B', 'Z']
J -> ['E', 'B']
K -> ['E', 'E']
L -> ['E', 'H']
M -> ['E', 'K']
N -> ['E', 'N']
O -> ['E', 'Q']
P -> ['E', 'T']
Q -> ['E', 'W']
R -> ['E', 'Z']
S -> ['H', 'B']
T -> ['H', 'E']
U -> ['H', 'H']
V -> ['H', 'K']
W -> ['H', 'N']
X -> ['H', 'Q']
Y -> ['H', 'T']
Z -> ['H', 'W']
- -> ['H', 'Z']


In [5]:
# test the decoding, perfect case, no channel action
print('Decoding 1, only encoded sequences: \n')
for enc in enc_list:
    dec = decoding_1(enc)
    
    print(enc, '->', dec)
    
# test the decoding, general case
print('\nDecoding 1, any sequence: \n')
tests = [['A', 'A'], ['A', 'E'], ['H', '-']]
for test in tests:
    dec = decoding_1(test)
    
    print(test, '->', dec)


Decoding 1, only encoded sequences: 

['B', 'B'] -> A
['B', 'E'] -> B
['B', 'H'] -> C
['B', 'K'] -> D
['B', 'N'] -> E
['B', 'Q'] -> F
['B', 'T'] -> G
['B', 'W'] -> H
['B', 'Z'] -> I
['E', 'B'] -> J
['E', 'E'] -> K
['E', 'H'] -> L
['E', 'K'] -> M
['E', 'N'] -> N
['E', 'Q'] -> O
['E', 'T'] -> P
['E', 'W'] -> Q
['E', 'Z'] -> R
['H', 'B'] -> S
['H', 'E'] -> T
['H', 'H'] -> U
['H', 'K'] -> V
['H', 'N'] -> W
['H', 'Q'] -> X
['H', 'T'] -> Y
['H', 'W'] -> Z
['H', 'Z'] -> -

Decoding 1, any sequence: 

['A', 'A'] -> A
['A', 'E'] -> B
['H', '-'] -> -


In [6]:
# perform some simulations of the complete process
to_transmit = ['A', 'B', 'C', 'D', '-']

print('Simulation with encoding 1:\n')
for letter in to_transmit:
    sim = simulation_1(letter)
    print(sim[0], '->', sim[1], '--- Noisy Channel --->', sim[2], '->', sim[3])


Simulation with encoding 1:

A -> ['B', 'B'] --- Noisy Channel ---> ['C', 'C'] -> A
B -> ['B', 'E'] --- Noisy Channel ---> ['B', 'E'] -> B
C -> ['B', 'H'] --- Noisy Channel ---> ['A', 'G'] -> C
D -> ['B', 'K'] --- Noisy Channel ---> ['C', 'L'] -> D
- -> ['H', 'Z'] --- Noisy Channel ---> ['H', 'Z'] -> -
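
The random simulation above only samples a few noise realisations. Since each codeword has just $3^2 = 9$ possible channel outputs, the zero-error property can also be checked exhaustively. The sketch below is self-contained and re-implements the same mapping in compact form. The names `encode_1`, `decode_1` and `all_channel_outputs` are illustrative and are not the notebook's functions.

```python
import itertools
import string

ALPHABET = list(string.ascii_uppercase) + ['-']
S = ALPHABET[1::3]  # B, E, H, ..., Z

def encode_1(letter):
    # Same mapping as the table above: group letter, then position in the group.
    group, pos = divmod(ALPHABET.index(letter), 9)
    return [S[group], S[pos]]

def decode_1(received):
    # Integer division by 3 maps each received letter back to its codeword index.
    group, pos = (ALPHABET.index(c) // 3 for c in received)
    return ALPHABET[group * 9 + pos]

def all_channel_outputs(code):
    """Yield every sequence the channel can produce from `code`."""
    for shifts in itertools.product((-1, 0, 1), repeat=len(code)):
        yield [ALPHABET[(ALPHABET.index(c) + s) % 27] for c, s in zip(code, shifts)]

errors = sum(
    decode_1(y) != letter
    for letter in ALPHABET
    for y in all_channel_outputs(encode_1(letter))
)
print('decoding errors over', 27 * 9, 'cases:', errors)
```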


# Solution 2
In the previous solution the rate $\mathcal{R}$ was equal to $\frac{3}{2}\log 3$. However, notice that the channel capacity is:
$$
    C = \max_Z I[Z:Y] = 2 \log 3 > R = \frac{3}{2}\log 3
$$
which is more than the rate obtained with solution 1. Since a source letter carries $\log 27 = 3\log 3$ and each channel use can convey at most $2\log 3$, using the channel at its capacity requires only $3\log 3/(2\log 3) = 1.5$ channel letters per source letter (not 2). We know that we can achieve the capacity by encoding sequences of messages $\mathcal{A}^{(m)}_X$ into non-confusable sequences of input strings $\mathcal{A}^{(n)}_Z$, and in the limit $n\to \infty$, we should communicate information at the channel capacity. 
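
The value $C = 2\log 3$ can be checked numerically. For this channel $H[Y|Z] = \log 3$ for every input, and $H[Y] \le \log 27$ with equality when $Y$ is uniform, which a uniform input achieves. The sketch below evaluates $I[Z:Y] = H[Y] - H[Y|Z]$ for a uniform input, in bits. The function name is an assumption made for illustration.

```python
import math

N = 27  # alphabet size

def mutual_information_uniform():
    """I[Z:Y] in bits for Z uniform over the 27 inputs."""
    p_z = 1 / N
    # P(y) = sum_z P(z) P(y|z): each y is reachable from exactly 3 inputs.
    p_y = [0.0] * N
    for z in range(N):
        for s in (-1, 0, 1):
            p_y[(z + s) % N] += p_z / 3
    h_y = -sum(p * math.log2(p) for p in p_y)
    h_y_given_z = math.log2(3)  # three equally likely outputs for every input
    return h_y - h_y_given_z

print(round(mutual_information_uniform(), 4), round(2 * math.log2(3), 4))
```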

In this simple case it is sufficient to consider $m = 2$, $n = 3$. For $n = 3$, we can consider the set of sequences of 3 non-confusable input symbols:
$$
    z^{(3)}\in S^{(3)}
$$
with $|S^{(3)}| = 9^3 = 729$. These sequences are sufficient to reliably encode the $27^2 = 729$ pairs $x^{(2)} \in \mathcal{A}^{(2)}_X$. In other words, we can map a pair of letters $x^{(2)}\in \mathcal{A}^{(2)}_X$ from the 27-letter alphabet into a triple of letters from $S^{(3)}$, e.g.
$$
AA\to BBB, \quad AB\to BBE, \quad AC\to BBH, \quad \dots \quad -Y\to ZZT, \quad -Z\to ZZW, \quad --\to ZZZ
$$
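
One way to obtain such a mapping is to read the pair as a two-digit number in base 27 and rewrite it with three digits in base 9, each digit selecting a letter of $S$. The following is a minimal sketch of that idea. The function name `pair_to_triple` is hypothetical, and any other bijection between the 729 pairs and the 729 triples would serve equally well.

```python
import string

ALPHABET = list(string.ascii_uppercase) + ['-']
S = ALPHABET[1::3]  # B, E, H, ..., Z

def pair_to_triple(pair):
    """Map a pair of letters to three letters of S via a base-27 -> base-9 change."""
    index = ALPHABET.index(pair[0]) * 27 + ALPHABET.index(pair[1])  # 0 .. 728
    digits = []
    for _ in range(3):
        index, d = divmod(index, 9)
        digits.append(d)
    # Digits were produced least-significant first, so reverse them.
    return ''.join(S[d] for d in reversed(digits))

for pair in ('AA', 'AB', 'AC', '-Y', '-Z', '--'):
    print(pair, '->', pair_to_triple(pair))
```

For example, the pair `-Y` has index $26\cdot 27 + 24 = 726 = 8\cdot 81 + 8\cdot 9 + 6$, which gives the digits $(8, 8, 6)$ and hence `ZZT`.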

This is a $(K,n)$ block code with $n = 3$ and $K = \log 729 = 6 \log 3$. Hence the rate is:
$$
    R = \frac{K}{n} = 2\log 3 = C
$$
achieving the channel capacity. In this code, we use only 1.5 letters (on average) to encode a letter.
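
For a quick numerical comparison of the two solutions, the rates can be evaluated in bits per channel use. This is only an illustrative sketch. The helper `rate_bits` is an assumed name, and base-2 logarithms are a choice made here, since the text leaves the base unspecified.

```python
import math

def rate_bits(num_messages, n):
    """Rate R = K / n in bits per channel use, with K = log2(num_messages)."""
    return math.log2(num_messages) / n

capacity = math.log2(9)
r1 = rate_bits(27, 2)       # Solution 1: one letter per 2 channel uses
r2 = rate_bits(27 ** 2, 3)  # Solution 2: two letters per 3 channel uses

print(f'R1 = {r1:.4f}, R2 = {r2:.4f}, C = {capacity:.4f}')
print('channel uses per letter:', 2 / 1, 'vs', 3 / 2)
```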


## Implementation 

In [7]:
def divisione_con_resto(a, b):
    # "division with remainder": returns (quotient, remainder), same as divmod(a, b)
    return a//b, a%b

In [8]:
def encoding_2(letters_in):
    '''
    letters_in: list of 2 input letters
    code_out:  output code of the encoding, to be passed through the noisy channel; it is a list of 3 non-confusable keywords
    '''
    
    # convert to upper
    letters_in = [l.upper() for l in letters_in]
        
    # encoding
    idx_in = list_letters.index(letters_in[0])*27+list_letters.index(letters_in[1])
    
    idx_in, digit_3 = divisione_con_resto(idx_in, 9)
    idx_in, digit_2 = divisione_con_resto(idx_in, 9)
    idx_in, digit_1 = divisione_con_resto(idx_in, 9)
    
    if idx_in != 0:
        raise Exception('Something wrong in idx_in')

    code_out = []
    code_out.append(non_confusable_kw[digit_1])
    code_out.append(non_confusable_kw[digit_2])
    code_out.append(non_confusable_kw[digit_3])

    return code_out
    

def decoding_2(code_in):
    '''
    code_in: input code, must be a list of 3 elements, passed through the channel
    letter_out: decoding output, list of 2 letters
    '''
    
    digit_1_in = list_letters.index(code_in[0])//3   
    digit_2_in = list_letters.index(code_in[1])//3    
    digit_3_in = list_letters.index(code_in[2])//3    
    
    idx_out = digit_3_in*1 + digit_2_in*9 + digit_1_in*9**2
    
    idx_out, digit_2_out = divisione_con_resto(idx_out, 27)
    idx_out, digit_1_out = divisione_con_resto(idx_out, 27)
    
    if idx_out != 0:
        raise Exception('Something wrong in idx_out')
     
    letter_out = []
    letter_out.append(list_letters[digit_1_out])
    letter_out.append(list_letters[digit_2_out])
    
    return letter_out


def simulation_2(letters_in):
    enc = encoding_2(letters_in)
    y = []
    
    for char in enc:
        y.append(noisy_typewriter(char))

    dec = decoding_2(y)
    
    return letters_in, enc, y, dec

In [9]:
# test the encoding
print('Encoding 2: \n')

tests = [['A', 'A'], ['A', 'B'], ['A', 'C'], ['-', 'Y'], ['-', 'Z'], ['-', '-']]
enc_list = []
for t in tests:
    enc = encoding_2(t)
    enc_list.append(enc)
        
    print(t, '->', enc)

Encoding 2: 

['A', 'A'] -> ['B', 'B', 'B']
['A', 'B'] -> ['B', 'B', 'E']
['A', 'C'] -> ['B', 'B', 'H']
['-', 'Y'] -> ['Z', 'Z', 'T']
['-', 'Z'] -> ['Z', 'Z', 'W']
['-', '-'] -> ['Z', 'Z', 'Z']


In [10]:
# test the decoding, perfect case, no channel action
print('Decoding 2, only encoded sequences: \n')
for enc in enc_list:
    dec = decoding_2(enc)
    
    print(enc, '->', dec)
    
# test the decoding, general case
print('\nDecoding 2, any sequence: \n')
tests = [['A', 'A', 'B'], ['A', 'B', 'F'], ['-', 'Z', '-']]
for test in tests:
    dec = decoding_2(test)  
    print(test, '->', dec)

Decoding 2, only encoded sequences: 

['B', 'B', 'B'] -> ['A', 'A']
['B', 'B', 'E'] -> ['A', 'B']
['B', 'B', 'H'] -> ['A', 'C']
['Z', 'Z', 'T'] -> ['-', 'Y']
['Z', 'Z', 'W'] -> ['-', 'Z']
['Z', 'Z', 'Z'] -> ['-', '-']

Decoding 2, any sequence: 

['A', 'A', 'B'] -> ['A', 'A']
['A', 'B', 'F'] -> ['A', 'B']
['-', 'Z', '-'] -> ['-', '-']


In [11]:
# perform some simulations of the complete process
to_transmit = [['A', 'A'], ['A', 'B'], ['A', 'C'], ['-', 'Y'], ['-', 'Z'], ['-', '-']]

print('Simulation with encoding 2:\n')
for letters in to_transmit:
    sim = simulation_2(letters)
    print(sim[0], '->', sim[1], '--- Noisy Channel --->', sim[2], '->', sim[3])

Simulation with encoding 2:

['A', 'A'] -> ['B', 'B', 'B'] --- Noisy Channel ---> ['B', 'A', 'A'] -> ['A', 'A']
['A', 'B'] -> ['B', 'B', 'E'] --- Noisy Channel ---> ['A', 'A', 'E'] -> ['A', 'B']
['A', 'C'] -> ['B', 'B', 'H'] --- Noisy Channel ---> ['C', 'A', 'H'] -> ['A', 'C']
['-', 'Y'] -> ['Z', 'Z', 'T'] --- Noisy Channel ---> ['-', 'Z', 'S'] -> ['-', 'Y']
['-', 'Z'] -> ['Z', 'Z', 'W'] --- Noisy Channel ---> ['-', 'Y', 'V'] -> ['-', 'Z']
['-', '-'] -> ['Z', 'Z', 'Z'] --- Noisy Channel ---> ['-', '-', 'Z'] -> ['-', '-']
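
As a complement to the sampled simulation, the code can be verified exhaustively: there are 729 input pairs and $3^3 = 27$ noise patterns per codeword. The sketch below is self-contained and uses compact, illustrative re-implementations (`encode_2`, `decode_2`, `all_channel_outputs`) rather than the notebook's functions. It checks that all codewords are distinct and that every possible channel output decodes to the original pair.

```python
import itertools
import string

ALPHABET = list(string.ascii_uppercase) + ['-']
S = ALPHABET[1::3]  # B, E, H, ..., Z

def encode_2(pair):
    # Pair index in base 27, rewritten as three base-9 digits.
    index = ALPHABET.index(pair[0]) * 27 + ALPHABET.index(pair[1])
    d1, rest = divmod(index, 81)
    d2, d3 = divmod(rest, 9)
    return [S[d1], S[d2], S[d3]]

def decode_2(received):
    d1, d2, d3 = (ALPHABET.index(c) // 3 for c in received)
    first, second = divmod(d1 * 81 + d2 * 9 + d3, 27)
    return [ALPHABET[first], ALPHABET[second]]

def all_channel_outputs(code):
    """Yield every sequence the channel can produce from `code`."""
    for shifts in itertools.product((-1, 0, 1), repeat=len(code)):
        yield [ALPHABET[(ALPHABET.index(c) + s) % 27] for c, s in zip(code, shifts)]

codes = set()
errors = 0
for pair in itertools.product(ALPHABET, repeat=2):
    code = encode_2(pair)
    codes.add(tuple(code))
    errors += sum(decode_2(y) != list(pair) for y in all_channel_outputs(code))

print('distinct codewords:', len(codes), '| decoding errors:', errors)
```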
