# Block Codes, Weight, and Distance
`ECFF` `1992` Pretzel, Oliver. _Error-Correcting Codes and Finite Fields_. Oxford University Press Applied Mathematics and Computing Science Series.
```{contents}
```

---
---
---

How can the performance of a code over a given channel be assessed? The basis for this assessment is the Hamming distance, which is the number of places in which two words differ.

The worst-case error-processing performance of a code is determined by the minimum distance between code words. Elementary probability theory can then be used to assess the performance of the code.

---
---
---

## Block Codes

```txt
                                                         +--------------------------+
                                                         |                          |
                                                         |  Random error generator  |
                                                         |                          |
                                                         +--------------------------+
                                                                      |                                                            +-----------+
                                                                      |                                                            |           |  Message
           +----------------+           +-----------+                 |                      +-------------------+---------------->|  Decoder  |----------->
  Message  |                |  Message  |           |  Signal         v   Distorted signal   |                   |  code words u'  |           |  words x'
---------->|  Preprocessor  |---------->|  Encoder  |--------------->|+|-------------------->|  Error processor  |                 +-----------+
  string   |                |  words x  |           |  code words u       received words v   |                   |
           +----------------+           +-----------+                                        +-------------------+----------------------------------------->
                                                                                                                    v + error signal
```

### Definition: block

<fieldset style="border: 0.5px solid #0096FF; border-radius: 5px; margin: 0px 0px 15px 0px; padding: 15px 20px 0px 20px;">

The preprocessor divides the input message of arbitrary length into blocks (words) of a fixed number $m$ of symbols, and the encoder then translates each word into a code word of fixed length $n$. Such a code is called a block code.

<div class="full-width" style="color: #0096FF;">

<b>Definition: $A$-word</b>

If $A$ is an alphabet then an $A$-word of length $n$ is a sequence of $n$ symbols from $A$.

1. The set of $A$-words of length $n$ is denoted by $A^n$.

2. If $A$ has $q$ symbols then there are $q$ choices for the symbol in each place in an $A$-word of length $n$ and so the total number of such words is $q^n$.

3. Thus $|A^n| = |A|^n = q^n$.

</div>
</fieldset>
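
The count $|A^n| = q^n$ can be checked by enumerating the words directly. The following is a minimal sketch; the function name `words` and the choice of a binary alphabet with $n = 3$ are illustrative assumptions, not part of the text.

```python
from itertools import product

def words(alphabet, n):
    """Return every word of length n over the alphabet, each as a tuple."""
    return list(product(alphabet, repeat=n))

A = (0, 1)         # assumed example alphabet, so q = 2
n = 3              # assumed example length
A_n = words(A, n)

# There are q choices in each of the n places, hence q**n words in total.
print(len(A_n) == len(A) ** n)
```

Representing words as tuples keeps them hashable, so a code can later be stored as a set of words.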

### Definition: block code

<fieldset style="border: 0.5px solid #0096FF; border-radius: 5px; margin: 0px 0px 15px 0px; padding: 15px 20px 0px 20px;">

<div class="full-width" style="color: #0096FF;">

<b>Definition: Block Code</b>

An $(n, m)$-block code $C$ over the alphabet $A$ of size $q$ consists of a set of precisely $q^m$ code words in $A^n$.

$n$ is called the code's <b><i>block length</i></b>.

$m$ is called the code's <b><i>rank</i></b>.

$m/n$ is called the code's <b><i>rate</i></b>. Since $C$ has $q^m$ code words and $|A^n| = q^n$, we have $m \le n$ and so $m/n \le 1$.

</div>


We require precisely $q^m$ code words to ensure that an encoder exists. There are $q^m$ possible message words and each must correspond to a distinct code word. Any further code words would never be used and may as well be discarded.

For example, a binary code of rank $m$ must have $2^m$ code words.

</fieldset>

### Definition: encoder

<fieldset style="border: 0.5px solid #0096FF; border-radius: 5px; margin: 0px 0px 15px 0px; padding: 15px 20px 0px 20px;">

<div class="full-width" style="color: #0096FF;">

<b>Definition: Encoder</b>

An <b><i>encoder</i></b> $E$ for $C$ is a map from $A^m$ to $C$.

$\boxed{E : A^m \to C \quad\quad x \mapsto u}$

It translates any $A$-word $x$ of length $m$ into a code word $u = E(x)$. The encoder is a bijection: each message word corresponds to exactly one code word, and each code word represents exactly one message word.

</div>

Often, the encoder preserves the message word $x$ as the first part of the code word $u = E(x)$. Such an encoder is called standard or systematic, in which case the code word is divided into message symbols and check symbols and the decoder simply strips the check symbols.

</fieldset>
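
As an illustration of a systematic encoder, the sketch below builds a small binary code by appending a single parity check symbol to each message word. The code itself (rank $m = 2$, block length $n = 3$) and the name `encode` are assumptions chosen for the example; the text does not single out this code.

```python
from itertools import product

A = (0, 1)   # binary alphabet, so q = 2
m, n = 2, 3  # rank and block length of this illustrative code

def encode(x):
    """Systematic encoder: keep the message word, append one check symbol."""
    check = sum(x) % 2          # parity of the message symbols
    return tuple(x) + (check,)

# The code C is the image of the encoder over all q**m message words.
C = {encode(x) for x in product(A, repeat=m)}

print(sorted(C))   # the code words of C
print(m / n)       # the rate of the code
```

Because the first $m$ symbols of each code word are the message word itself, distinct message words always give distinct code words, so this encoder is a bijection onto $C$.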

### Definition: decoder

<fieldset style="border: 0.5px solid #0096FF; border-radius: 5px; margin: 0px 0px 15px 0px; padding: 15px 20px 0px 20px;">

<div class="full-width" style="color: #0096FF;">

<b>Definition: Decoder</b>

The corresponding <b><i>decoder</i></b> $D$ is the inverse map of $E$.

$\boxed{D : C \to A^m \quad\quad u \mapsto x}$

It takes every code word $u = E(x)$ back to $x$.

</div>
</fieldset>
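
The decoder can be sketched as a lookup table that inverts the encoder. The parity check encoder used here is an assumed example, and the names `encode`, `decode`, and `table` are hypothetical.

```python
from itertools import product

m = 2  # rank of the assumed example code

def encode(x):
    """Assumed systematic encoder: message word plus one parity check symbol."""
    return tuple(x) + (sum(x) % 2,)

# Invert the encoder by tabulating it over all message words.
table = {encode(x): x for x in product((0, 1), repeat=m)}

def decode(u):
    """Inverse map of encode; it is defined on code words only."""
    return table[u]

# Every message word survives an encode/decode round trip.
print(all(decode(encode(x)) == x for x in product((0, 1), repeat=m)))
```

For a systematic encoder the table is not strictly needed, since `decode(u)` agrees with `u[:m]` on every code word. The table form works for any bijective encoder and makes it explicit that the decoder is undefined on words outside the code.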

---

<div style="font-size: 18px">

In a <b><i>binary symmetric channel</i></b> the error-processing capabilities of the coding system <u>do not depend on the encoder and decoder but only on the set of code words</u> because these are all that the channel sees; the choice of encoder and decoder is thus only a matter of practical convenience.

Most of coding theory is concerned with (1) the construction of codes $C$ and (2) efficient error processors.

</div>

---

### Definition: received word

<fieldset style="border: 0.5px solid #0096FF; border-radius: 5px; margin: 0px 0px 15px 0px; padding: 15px 20px 0px 20px;">

When errors occur in transmission, the receiver reads a word $v$ although the transmitter sent a word $u$.

<div class="full-width" style="color: #0096FF;">

<b>Definition: Received Word</b>

If $u = (u_1, u_2, \dotsc, u_n)$ and $v = (v_1, v_2, \dotsc, v_n)$ are words in $A^n$ then we refer to $u_j$ as the entry of $u$ in place $j$, and we say that $v$ differs from $u$ in place $j$ if $u_j \ne v_j$.

The word $v$ to be analyzed by the error processor is called the <b><i>received word</i></b>.

</div>

(In this context the words <i>position</i> and <i>location</i> are synonyms for the word <i>place</i>.)

</fieldset>

### Definition: error of weight $k$

<fieldset style="border: 0.5px solid #0096FF; border-radius: 5px; margin: 0px 0px 15px 0px; padding: 15px 20px 0px 20px;">

<div class="full-width" style="color: #0096FF;">

<b>Definition: Error of weight $k$</b>

If the received word $v$ differs from the transmitted word $u$ in $k$ places, we say that an <b><i>error of weight $k$</i></b> has occurred (or that $k$ errors have occurred).

</div>

</fieldset>

<fieldset style="border: 0.5px solid #50C878; border-radius: 5px; margin: 0px 0px 15px 0px; padding: 15px 20px 0px 20px;">

<div class="full-width" style="color: #50C878;">

Suppose $u$ is transmitted and $v$ is received.

$
\begin{aligned}
u &= (1, \textcolor{red}{0}, 0, 1, \textcolor{red}{1}, 0) \\
v &= (1, \textcolor{red}{1}, 0, 1, \textcolor{red}{0}, 0) \\
\end  {aligned}
$

Then an error of weight $2$ has occurred.

</div>

</fieldset>
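
The example above can be reproduced with a short sketch that lists the places in which the two words differ. The function name `error_places` is a hypothetical helper, and places are numbered from $1$ to match the notation $u_1, \dotsc, u_n$.

```python
def error_places(u, v):
    """Return the 1-based places in which the received word v differs from u."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return [j for j, (a, b) in enumerate(zip(u, v), start=1) if a != b]

u = (1, 0, 0, 1, 1, 0)  # transmitted word
v = (1, 1, 0, 1, 0, 0)  # received word

places = error_places(u, v)
print(places)       # the places in error
print(len(places))  # the weight of the error
```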

### Definition: Hamming distance and weight

<fieldset style="border: 0.5px solid #0096FF; border-radius: 5px; margin: 0px 0px 15px 0px; padding: 15px 20px 0px 20px;">

It is useful to formalize the notion of differing places between words by regarding the number of places in which two words differ as a distance between them.

<div class="full-width" style="color: #0096FF;">

<b>Definition: Hamming distance and weight</b>

The <b><i>Hamming distance</i></b> $d(u, v)$ between two words $u$ and $v$ is the number of entries in which they differ.

The <b><i>Hamming weight</i></b> $\text{wt}(u)$ of $u$ is the number of nonzero entries in $u$.

If $\textbf{0} = (0, \dotsc, 0)$ then $\text{wt}(u) = d(u, \textbf{0})$.

</div>

</fieldset>
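
Both definitions translate directly into code. The sketch below assumes words are tuples whose alphabet contains the symbol `0`; the names `hamming_distance` and `hamming_weight` are illustrative.

```python
def hamming_distance(u, v):
    """Number of places in which the words u and v differ."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return sum(1 for a, b in zip(u, v) if a != b)

def hamming_weight(u):
    """Number of nonzero entries of u."""
    return sum(1 for a in u if a != 0)

u = (1, 0, 0, 1, 1, 0)
v = (1, 1, 0, 1, 0, 0)
zero = (0,) * len(u)

print(hamming_distance(u, v))
# The weight of a word equals its distance from the zero word.
print(hamming_weight(u) == hamming_distance(u, zero))
```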

### Definition: distance function

<fieldset style="border: 0.5px solid #0096FF; border-radius: 5px; margin: 0px 0px 15px 0px; padding: 15px 20px 0px 20px;">

The Hamming distance satisfies the properties of a distance function.

<div class="full-width" style="color: #0096FF;">

<b>Definition: Distance Function</b>

A function $f(x, y)$ on pairs of elements of a set $X$ is a distance function if it satisfies the following conditions.

1. $f(x, y)$ is always a non-negative real number.
2. $f(x, y) = 0$ iff $x = y$.
3. $f(x, y) = f(y, x)$
4. Triangle Inequality: For any three elements $x, y, z \in X$ it is the case that $f(x, z) \le f(x, y) + f(y, z)$

</div>

<b>Proof</b>

Conditions 1, 2, and 3 follow directly from the definition of the Hamming distance. It remains to prove condition 4, the triangle inequality.

Let

$
\begin{aligned}
x &= (x_1, \dotsc, x_n) \\
y &= (y_1, \dotsc, y_n) \\
z &= (z_1, \dotsc, z_n) \\
\end  {aligned}
$

Then $d(x, z)$ is the number of places in which $x$ and $z$ differ. If we denote the set of these places by $U$ then

$d(x, z) = |U| = |\{ i \mid x_i \ne z_i \}|$

Let

$
\begin{aligned}
S &= \{ i \mid x_i \ne z_i \land x_i =   y_i \} \\
T &= \{ i \mid x_i \ne z_i \land x_i \ne y_i \} \\
\end  {aligned}
$

Then $U$ is the disjoint union of $S$ and $T$. Thus

$d(x, z) = |S| + |T|$

By the definition of $T$, $x$ differs from $y$ in every place in $T$. Thus

$|T| \le d(x, y)$

On the other hand if $i \in S$ then $y_i = x_i \ne z_i$. Thus

$|S| \le d(y, z)$

Combining the two bounds gives

$d(x, z) = |S| + |T| \le d(x, y) + d(y, z)$

$\blacksquare$

</fieldset>
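
The steps of the proof can be checked exhaustively on a small case. The sketch below is a sanity check, not a substitute for the proof; it assumes a ternary alphabet and words of length $3$, uses 0-based places, and the names `d` and `split_places` are hypothetical.

```python
from itertools import product

def d(u, v):
    """Hamming distance between two words of equal length."""
    return sum(1 for a, b in zip(u, v) if a != b)

def split_places(x, y, z):
    """Split the places where x and z differ into the sets S and T of the proof."""
    U = [i for i in range(len(x)) if x[i] != z[i]]
    S = {i for i in U if x[i] == y[i]}
    T = {i for i in U if x[i] != y[i]}
    return S, T

all_words = list(product((0, 1, 2), repeat=3))  # assumed small test space

for x, y, z in product(all_words, repeat=3):
    S, T = split_places(x, y, z)
    assert len(S) + len(T) == d(x, z)    # U is the disjoint union of S and T
    assert len(T) <= d(x, y)             # x differs from y on all of T
    assert len(S) <= d(y, z)             # y differs from z on all of S
    assert d(x, z) <= d(x, y) + d(y, z)  # the triangle inequality

print("all", len(all_words) ** 3, "triples satisfy the triangle inequality")
```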

---
---
---

## Shannon's Theorem

Shannon's theorem concerns the theoretical optimum for average coding performance.

---
---
---