# Slepian-Wolf Example

In [1]:
import numpy as np
rng = np.random.default_rng()


Consider two correlated binary vectors $\bm{X}_1$ and $\bm{X}_2$ containing $7$ bits each.

- The bits $X_{1,i}$ in $\bm{X}_1$ are independent and identically distributed (i.i.d.) with ${P(X_{1,i} = 1) = \frac12} \ \forall i\in\{1,\dots,7\}$.
- Given $\bm{X}_1$, the vector $\bm{X}_2$ is defined as ${\bm{X}_2 = \bm{X}_1 \oplus \bm{I}}$, where $\bm{I}$ is independent of $\bm{X}_1$ and uniformly distributed over the $8$ binary vectors of length $7$ that contain at most a single $1$.

The operator $\oplus$ denotes binary addition (addition modulo $2$) with $0\oplus0 = 1\oplus1 = 0$ and $0\oplus1=1\oplus0=1$. The `numpy` equivalent is the `^` (XOR) operator.
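As a quick illustration of how `^` acts element-wise on `numpy` bit vectors, here is a minimal sketch. The arrays `a` and `b` are illustrative values chosen to cover all four input combinations; they are not part of the setup above.

```python
import numpy as np

# Illustrative bit vectors covering all four input combinations
a = np.array([0, 0, 1, 1])
b = np.array([0, 1, 0, 1])

# Element-wise binary addition (XOR)
xor_result = a ^ b
print(xor_result)  # [0 1 1 0]
```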

Transmitter TX1 transmits $\bm{X}_1$ to a receiver RX via a noiseless and interference-free channel at rate $R_1$.
Simultaneously, TX2 transmits $\bm{X}_2$ to RX at rate $R_2$.
Note that $\bm{I}$ is not directly available to TX1, TX2 or RX.

In [2]:
def sample_x1(num_samples: int) -> np.ndarray:
    # Draw num_samples vectors of 7 i.i.d. uniform bits
    return rng.integers(0, 2, (num_samples, 7))

def sample_x2(x1: np.ndarray) -> np.ndarray:
    num_samples = len(x1)
    # Position of the single 1 in I; the value 7 encodes the all-zero vector
    i_pos = rng.integers(0, 8, num_samples)
    i = np.zeros_like(x1)
    for sample_idx, one_pos in enumerate(i_pos):
        if one_pos < 7:
            i[sample_idx, one_pos] = 1
    return x1 ^ i

num_samples = 4
x1 = sample_x1(num_samples)
x2 = sample_x2(x1)
print(f'{num_samples} samples of x_1:')
print(x1)
print()
print(f'{num_samples} samples of x_2:')
print(x2)

4 samples of x_1:
[[1 1 1 0 0 0 1]
 [0 1 1 1 0 0 1]
 [0 0 0 0 1 1 1]
 [0 1 0 0 1 0 0]]

4 samples of x_2:
[[1 1 1 0 0 0 1]
 [1 1 1 1 0 0 1]
 [0 0 1 0 1 1 1]
 [1 1 0 0 1 0 0]]


First, let TX1 and TX2 transmit $\bm{X}_1$ and $\bm{X}_2$ assuming they are independent.
Each transmitter then needs the full rate $R_1 = R_2 = H(\bm{X}_1) = H(\bm{X}_2) = 7$ bit.
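The value $H(\bm{X}_2) = 7$ bit can be checked by enumeration: $\bm{X}_1$ is uniform and $\bm{I}$ is drawn independently of it, so $\bm{X}_2 = \bm{X}_1 \oplus \bm{I}$ is uniform over all $128$ vectors as well. The following is a minimal sketch of that check; the helper names `all_x1`, `possible_i` and `x2_index` are introduced here for illustration only.

```python
import itertools
import numpy as np

# All 128 equally likely values of X_1
all_x1 = np.array(list(itertools.product([0, 1], repeat=7)))
# All 8 equally likely values of I: the 7 unit vectors followed by the all-zero vector
possible_i = np.eye(8, 7, dtype=int)

# Every (x1, i) pair has the same probability, so counting pairs per x2 gives P(X_2)
x2_all = (all_x1[:, None, :] ^ possible_i[None, :, :]).reshape(-1, 7)
x2_index = x2_all @ (2 ** np.arange(7)[::-1])  # interpret each x2 as an integer
counts = np.bincount(x2_index, minlength=128)
p_x2 = counts / counts.sum()

# Shannon entropy in bit
h_x2 = -np.sum(p_x2 * np.log2(p_x2))
print(h_x2)  # approximately 7.0
```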

In [3]:
num_samples = 100
x1_batch = sample_x1(num_samples)
x2_batch = sample_x2(x1_batch)

# TX1
tx1_batch = x1_batch

# TX2
tx2_batch = x2_batch

# Rates

r1 = tx1_batch.size / num_samples
r2 = tx2_batch.size / num_samples
print(f'R_1 = {r1}, R_2 = {r2}')

# RX

x1_hat_batch = tx1_batch
x2_hat_batch = tx2_batch

# Count transmission errors

tx1_errors = np.count_nonzero(x1_batch != x1_hat_batch)
tx2_errors = np.count_nonzero(x2_batch != x2_hat_batch)
print('Bit error counters:')
print(f'  TX1: {tx1_errors} Errors')
print(f'  TX2: {tx2_errors} Errors')

R_1 = 7.0, R_2 = 7.0
Bit error counters:
  TX1: 0 Errors
  TX2: 0 Errors


As $\bm{X}_1$ and $\bm{X}_2$ are correlated, the transmission rates can be reduced if TX2 knows both $\bm{X}_1$ and $\bm{X}_2$ and only transmits the difference $\bm{I} = \bm{X}_1 \oplus \bm{X}_2$.
In this case we can achieve the rates $R_1 = H(\bm{X}_1) = 7$ bit and $R_2 = H(\bm{X}_2|\bm{X}_1) = H(\bm{I}) = 3$ bit.
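The value $H(\bm{I}) = 3$ bit follows because $\bm{I}$ is uniform over $8$ vectors, and $H(\bm{X}_2|\bm{X}_1) = H(\bm{I})$ because, for a known $\bm{X}_1$, the vectors $\bm{X}_2$ and $\bm{I}$ determine each other. A minimal sketch of the entropy computation, with `possible_i` and `p_i` as illustrative names:

```python
import numpy as np

# All 8 equally likely values of I: the 7 unit vectors followed by the all-zero vector
possible_i = np.eye(8, 7, dtype=int)
p_i = np.full(len(possible_i), 1 / len(possible_i))

# Shannon entropy in bit
h_i = -np.sum(p_i * np.log2(p_i))
print(h_i)  # approximately 3.0
```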

In [4]:
num_samples = 100
x1_batch = sample_x1(num_samples)
x2_batch = sample_x2(x1_batch)

def one_hot_to_bit_vector(one_hot: np.ndarray) -> np.ndarray:
    # The all-zero vector is mapped to the otherwise unused pattern 111
    if np.all(one_hot == 0):
        return np.array([1, 1, 1], dtype=int)
    # Otherwise, encode the position of the single 1 as a 3-bit number (MSB first)
    position = np.argmax(one_hot)
    bit_encoding = {0: [0, 0, 0], 1: [0, 0, 1], 2: [0, 1, 0], 3: [0, 1, 1], 4: [1, 0, 0], 5: [1, 0, 1], 6: [1, 1, 0]}
    return np.array(bit_encoding[position], dtype=int)

def bit_vector_to_one_hot(bit_vector: np.ndarray) -> np.ndarray:
    # Decode the 3-bit number (MSB first); the value 7 denotes the all-zero vector
    position = np.sum(bit_vector * 2**np.arange(3)[::-1])
    zeros = np.zeros(7, dtype=int)
    if position < 7:
        zeros[position] = 1
    return zeros


# TX1
tx1_batch = x1_batch

# TX2
i_batch = x1_batch ^ x2_batch
tx2_batch = np.array([one_hot_to_bit_vector(i) for i in i_batch])

# Rates

r1 = tx1_batch.size / num_samples
r2 = tx2_batch.size / num_samples
print(f'R_1 = {r1}, R_2 = {r2}')

# RX

x1_hat_batch = tx1_batch
i_hat_batch = np.array([bit_vector_to_one_hot(tx2) for tx2 in tx2_batch])
x2_hat_batch = i_hat_batch ^ x1_hat_batch

# Count transmission errors

tx1_errors = np.count_nonzero(x1_batch != x1_hat_batch)
tx2_errors = np.count_nonzero(x2_batch != x2_hat_batch)
print('Bit error counters:')
print(f'  TX1: {tx1_errors} Errors')
print(f'  TX2: {tx2_errors} Errors')

R_1 = 7.0, R_2 = 3.0
Bit error counters:
  TX1: 0 Errors
  TX2: 0 Errors


According to the Slepian-Wolf Theorem, we can achieve the rate pair $R_1 = H(\bm{X}_1)$ and $R_2 = H(\bm{X}_2|\bm{X}_1) < H(\bm{X}_2)$ even if $\bm{X}_1$ is *not* known to TX2.
In the special case constructed here we can elegantly demonstrate this using a $(7,4)$ Hamming code.

The following construction is taken from https://en.wikipedia.org/wiki/Distributed_source_coding#Asymmetric_case

>### Background: The Hamming Code
>
> The $(7,4)$ Hamming code is defined as the set of seven-bit vectors $\bm{x} = (x_1\ x_2\ \dots\ x_7)$ for which
>$$
>x_1 \oplus x_2 \oplus x_4 \oplus x_5 = s_1,\\
>x_1 \oplus x_3 \oplus x_4 \oplus x_6 = s_2,\\
>x_2 \oplus x_3 \oplus x_4 \oplus x_7 = s_3,
>$$
>hold with $s_1 = s_2 = s_3 = 0$. Bit vectors contained in the code are called codewords. The $(7,4)$ Hamming code has a minimum Hamming distance of $d_\mathrm{min} = 3$, which means that any two distinct codewords differ in at least $3$ bits.
>
>For any given bit vector $\bm{x}$, the **syndrome** is defined as $\bm{s} = (s_1\ s_2\ s_3)$. (Codewords have the syndrome $\bm{s} = (0\ 0\ 0)$ by definition.)
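The two properties stated in the background box (there are $2^4 = 16$ codewords, and $d_\mathrm{min} = 3$) can be checked by brute force. The sketch below assumes a parity-check matrix `H` whose rows match the three parity checks of this notebook's `syndrome` function; `H`, `all_vectors` and `codewords` are illustrative names.

```python
import itertools
import numpy as np

# Assumed parity-check matrix: one row per syndrome bit,
# matching the parity checks used by the notebook's syndrome function
H = np.array([
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
])

# Enumerate all 128 seven-bit vectors and keep those with the all-zero syndrome
all_vectors = np.array(list(itertools.product([0, 1], repeat=7)))
syndromes = (all_vectors @ H.T) % 2
codewords = all_vectors[np.all(syndromes == 0, axis=1)]
print(len(codewords))  # 16

# Minimum Hamming distance over all pairs of distinct codewords
d_min = min(
    int(np.sum(c1 != c2))
    for c1, c2 in itertools.combinations(codewords, 2)
)
print(d_min)  # 3
```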

In [5]:
def syndrome(bit_vector: np.ndarray) -> np.ndarray:
    x = bit_vector
    return np.array([
        x[0] ^ x[1] ^ x[3] ^ x[4],
        x[0] ^ x[2] ^ x[3] ^ x[5],
        x[1] ^ x[2] ^ x[3] ^ x[6]
    ])

Now, consider the following transmission scheme:

- TX1 transmits $\bm{X}_1\quad\implies R_1 = 7$ bit
- TX2 transmits the syndrome $\bm{S}_2$ of $\bm{X}_2\quad\implies R_2 = 3$ bit

Note that no knowledge of $\bm{X}_1$ or $\bm{I}$ is required to compute the syndrome.

RX must then find possible values for $\bm{I}$ such that $\hat{\bm{X}}_2 = \bm{X}_1 \oplus \bm{I}$ has the syndrome $\bm{S}_2$.
Errors occur if multiple options for $\hat{\bm{X}}_2$ lead to the same syndrome.
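Because the syndrome is linear over binary addition, $\texttt{syndrome}(\bm{X}_1 \oplus \bm{I}) = \texttt{syndrome}(\bm{X}_1) \oplus \texttt{syndrome}(\bm{I})$. RX could therefore also compute $\texttt{syndrome}(\bm{X}_1) \oplus \bm{S}_2$ and look up the matching $\bm{I}$ directly. The following is a sketch of this alternative decoder, not the search used in this notebook; the matrix `H`, the table `lookup` and the example vector `x1` are assumptions made for illustration.

```python
import numpy as np

# Assumed parity-check matrix matching the notebook's syndrome function
H = np.array([
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
])

def syndrome(bit_vector: np.ndarray) -> np.ndarray:
    return (H @ bit_vector) % 2

# All 8 candidates for I: the 7 unit vectors followed by the all-zero vector
possible_i = np.eye(8, 7, dtype=int)
# Lookup table from the syndrome of I to I itself
lookup = {tuple(syndrome(i)): i for i in possible_i}
print(len(lookup))  # 8, so all candidates have distinct syndromes

# Example with an arbitrary x1 and a single flipped bit
x1 = np.array([1, 0, 1, 1, 0, 0, 1])
i = possible_i[2]
x2 = x1 ^ i
s2 = syndrome(x2)  # the 3 bits sent by TX2

# RX: recover I from syndrome(x1) XOR s2, then reconstruct x2
i_hat = lookup[tuple(syndrome(x1) ^ s2)]
x2_hat = x1 ^ i_hat
print(np.array_equal(x2_hat, x2))  # True
```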

In [6]:
num_samples = 100
x1_batch = sample_x1(num_samples)
x2_batch = sample_x2(x1_batch)


# TX1
tx1_batch = x1_batch

# TX2
tx2_batch = np.array([syndrome(x2) for x2 in x2_batch])

# Rates

r1 = tx1_batch.size / num_samples
r2 = tx2_batch.size / num_samples
print(f'R_1 = {r1}, R_2 = {r2}')

# RX

x1_hat_batch = tx1_batch
# All 8 candidates for I: the 7 unit vectors followed by the all-zero vector
possible_i = np.eye(8, 7, dtype=int)

x2_hat_batch = []
for x1_hat, received_syndrome in zip(x1_hat_batch, tx2_batch):
    candidate_syndromes = np.array([syndrome(x1_hat ^ i) for i in possible_i])
    # Choose the candidate whose syndrome has minimum distance to the received syndrome
    best_possible_i_idx = np.argmin(np.sum((candidate_syndromes - received_syndrome)**2, axis=1))
    i_hat = possible_i[best_possible_i_idx]

    x2_hat = i_hat ^ x1_hat
    x2_hat_batch.append(x2_hat)
x2_hat_batch = np.array(x2_hat_batch)

# Count transmission errors

tx1_errors = np.count_nonzero(x1_batch != x1_hat_batch)
tx2_errors = np.count_nonzero(x2_batch != x2_hat_batch)
print('Bit error counters:')
print(f'  TX1: {tx1_errors} Errors')
print(f'  TX2: {tx2_errors} Errors')

R_1 = 7.0, R_2 = 3.0
Bit error counters:
  TX1: 0 Errors
  TX2: 0 Errors


The simulation above shows no errors for the proposed scheme, which indicates that the rate pair $R_1 = 7$ bit and $R_2 = 3$ bit is achievable even if TX1 only has access to $\bm{X}_1$ and TX2 only has access to $\bm{X}_2$.

**But why does it work?**

We must show that there is never ambiguity between multiple reconstructions of $\bm{X}_2$. This can be shown by contradiction:

- Assume ambiguity between two distinct reconstructions of $\bm{X}_2$, call them $\bm{b}_1$ and $\bm{b}_2$
- $\implies$ $\texttt{syndrome}(\bm{b}_1) = \texttt{syndrome}(\bm{b}_2) = \bm{S}_2$
- Define $\bm{c}_1 := \bm{b}_1 \oplus \bm{b}_1 = \bm{0}$ and $\bm{c}_2 := \bm{b}_1 \oplus \bm{b}_2$
- Claim: $\bm{c}_1$ and $\bm{c}_2$ are both codewords of the $(7,4)$ Hamming code
    - Inserting $\bm{c}_1 = \bm{0}$ into the definition of the Hamming code immediately shows it is a codeword
    - As both $\bm{b}_1$ and $\bm{b}_2$ have the same syndrome, we can show that the syndrome of $\bm{c}_2$ is $\bm{0}$
        - Illustration for the first syndrome entry:
$$
b_{1,1} \oplus b_{1,2} \oplus b_{1,4} \oplus b_{1,5} = S_{2,1} = b_{2,1} \oplus b_{2,2} \oplus b_{2,4} \oplus b_{2,5} \\[.5em]
b_{1,1} \oplus b_{1,2} \oplus b_{1,4} \oplus b_{1,5}\ \ \oplus\ \ b_{2,1} \oplus b_{2,2} \oplus b_{2,4} \oplus b_{2,5} =b_{2,1} \oplus b_{2,2} \oplus b_{2,4} \oplus b_{2,5}\ \ \oplus\ \ b_{2,1} \oplus b_{2,2} \oplus b_{2,4} \oplus b_{2,5} \\[.5em]
(b_{1,1} \oplus b_{2,1}) \oplus (b_{1,2} \oplus b_{2,2}) \oplus (b_{1,4} \oplus b_{2,4}) \oplus (b_{1,5} \oplus b_{2,5}) = (b_{2,1} \oplus b_{2,1}) \oplus (b_{2,2} \oplus b_{2,2}) \oplus (b_{2,4}  \oplus b_{2,4}) \oplus (b_{2,5}\oplus b_{2,5}) \\[.5em]
c_{2,1} \oplus c_{2,2} \oplus c_{2,4} \oplus c_{2,5} = 0
$$
- As they are valid reconstructions of $\bm{X}_2$, both $\bm{b}_1$ and $\bm{b}_2$ differ in at most $1$ bit from $\bm{X}_1$
- $\implies$ $\bm{b}_1$ and $\bm{b}_2$ differ in at most $2$ bits
- $\implies$ $\bm{c}_1$ and $\bm{c}_2$ differ in at most $2$ bits
- $\implies$ Contradiction: $\bm{c}_1 \neq \bm{c}_2$ because $\bm{b}_1 \neq \bm{b}_2$, and any two distinct codewords in the $(7,4)$ Hamming code differ in at least $3$ bits
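
The absence of ambiguity can also be confirmed exhaustively rather than on random samples: for each of the $128$ possible values of $\bm{X}_1$, the $8$ candidate reconstructions $\bm{X}_1 \oplus \bm{I}$ must have $8$ distinct syndromes. The following is a minimal sketch of that check, assuming a parity-check matrix `H` that matches the notebook's `syndrome` function; `all_x1` and `num_ambiguous` are illustrative names.

```python
import itertools
import numpy as np

# Assumed parity-check matrix matching the notebook's syndrome function
H = np.array([
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
])

# All 128 values of X_1 and all 8 candidates for I
all_x1 = np.array(list(itertools.product([0, 1], repeat=7)))
possible_i = np.eye(8, 7, dtype=int)

num_ambiguous = 0
for x1 in all_x1:
    candidates = x1 ^ possible_i  # the 8 possible values of X_2 for this x1
    candidate_syndromes = (candidates @ H.T) % 2
    # Ambiguity would mean that two candidates share a syndrome
    if len({tuple(s) for s in candidate_syndromes}) < len(possible_i):
        num_ambiguous += 1

print(num_ambiguous)  # 0
```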