# Connect: Reed-Solomon Error Correcting Codes

**Module 02** | Real-World Connections

*Polynomial evaluation and interpolation over finite fields -- Module 02 concepts -- build the error correction behind QR codes, CDs, and deep-space communication.*

## Introduction

**Reed-Solomon codes** (1960) are among the most widely deployed error-correcting codes:

- **QR codes**: the black-and-white squares on everything from boarding passes to restaurant menus
- **CDs and DVDs**: withstand scratches by encoding data with redundancy
- **Deep-space communication**: NASA's Voyager probes use Reed-Solomon to transmit data across billions of miles
- **Blockchain**: data availability sampling in Ethereum uses Reed-Solomon

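The core idea is surprisingly simple: **encode a message as a polynomial, evaluate it at
extra points for redundancy, and use Lagrange interpolation to recover the original
polynomial even if some evaluations are lost.** Correcting values that were silently
corrupted builds on the same idea but needs a decoder that first locates the bad symbols;
this notebook works with lost symbols (erasures).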

Every concept here comes directly from Module 02.

## The Idea: Polynomials as Messages

A message with $k$ symbols becomes the $k$ coefficients of a polynomial of degree at most
$k - 1$ over a finite field. We evaluate this polynomial at $n > k$ distinct field elements,
so $n$ cannot exceed the field size. The $n$ evaluation results are the **codeword**.
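
The encoding step can be sketched in plain Python without Sage. This is a minimal illustration, assuming the prime field GF(7) (integers mod 7) and the sample message `[1, 2, 3]`; the helper names `evaluate` and `rs_encode` are made up for this sketch.

```python
P = 7  # prime modulus, so the integers mod P form the field GF(7)

def evaluate(coeffs, x, p):
    """Evaluate a polynomial (coefficients, lowest degree first) at x mod p
    using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def rs_encode(message, eval_points, p):
    """Treat the message symbols as coefficients and evaluate at each point."""
    return [evaluate(message, x, p) for x in eval_points]

message = [1, 2, 3]                 # m(x) = 1 + 2x + 3x^2
eval_points = [1, 2, 3, 4, 5, 6]    # distinct nonzero elements of GF(7)
codeword = rs_encode(message, eval_points, P)
print(codeword)  # [6, 3, 6, 1, 2, 2]
```

Horner's rule keeps every intermediate value reduced mod 7, so the arithmetic never leaves the field.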

Since a degree-$(k-1)$ polynomial is uniquely determined by any $k$ of its evaluations
(Lagrange interpolation), we can lose up to $n - k$ evaluation points and still recover
the original polynomial (and hence the message).
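
The uniqueness claim follows from a short root-counting argument. As a sketch, take two hypothetical candidates $p$ and $q$, both of degree at most $k - 1$, that agree at $k$ distinct points $x_1, \dots, x_k$, and look at their difference $d$:

$$d(x) = p(x) - q(x), \qquad \deg d \le k - 1, \qquad d(x_1) = \cdots = d(x_k) = 0 \;\Longrightarrow\; d(x) = 0$$

A nonzero polynomial over a field has at most as many roots as its degree, so $d$ must be the zero polynomial and $p = q$.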

We work over GF(7) to keep numbers small and visible.

In [None]:
# === Step 1: Encode a message as a polynomial ===
F = GF(7)
R.<x> = PolynomialRing(F)

# Message: [1, 2, 3] -- three symbols from GF(7)
message = [F(1), F(2), F(3)]
k = len(message)  # message length = polynomial degree + 1

# Encode as polynomial: m(x) = 1 + 2x + 3x^2
m_poly = sum(c * x^i for i, c in enumerate(message))
print(f'Message: {[Integer(c) for c in message]}')
print(f'Message polynomial: m(x) = {m_poly}')
print(f'Degree: {m_poly.degree()} (= k - 1 = {k} - 1)')
print()

# Evaluate at n = 6 points (all nonzero elements of GF(7))
n = 6  # codeword length
eval_points = [F(i) for i in range(1, n + 1)]
codeword = [m_poly(pt) for pt in eval_points]

print(f'Evaluation points: {[Integer(p) for p in eval_points]}')
print(f'Codeword (evaluations):')
for pt, val in zip(eval_points, codeword):
    print(f'  m({Integer(pt)}) = {Integer(val)}')

print(f'\nCodeword: {[Integer(v) for v in codeword]}')
print(f'\nWe encoded {k} message symbols into {n} codeword symbols.')
print(f'Redundancy: {n - k} extra symbols. Can tolerate {n - k} erasures.')

## Step 2: Corrupt the Codeword

Suppose one of the evaluations is damaged during transmission (a scratch on a CD, a
burst of noise on a channel) and the receiver can tell which one. That is an **erasure**:
a missing symbol at a known position. We simulate it by discarding one value.
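
How much damage a code can absorb depends on whether the damaged positions are known. The sketch below is illustrative only (the helper `correction_capacity` is a made-up name): it applies the standard Reed-Solomon guarantee that decoding succeeds whenever twice the number of errors (wrong symbols at unknown positions) plus the number of erasures (lost symbols at known positions) is at most $n - k$.

```python
def correction_capacity(n, k):
    """Return (max_erasures, max_errors) for an [n, k] Reed-Solomon code.

    Decoding is guaranteed whenever 2 * errors + erasures <= n - k.
    """
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    redundancy = n - k
    return redundancy, redundancy // 2

# Codeword of the running example (n = 6, k = 3); None marks an erased symbol.
received = [6, 3, None, 1, 2, 2]
erased_positions = [i for i, value in enumerate(received) if value is None]

max_erasures, max_errors = correction_capacity(n=len(received), k=3)
print(f"erased positions: {erased_positions}")
print(f"tolerates up to {max_erasures} erasures or {max_errors} error(s)")
```

Plain interpolation, as used in the rest of this notebook, handles the erasure case only; correcting errors at unknown positions takes a dedicated decoder.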

In [None]:
# === Step 2: Corrupt one evaluation ===
F = GF(7)
R.<x> = PolynomialRing(F)

message = [F(1), F(2), F(3)]
m_poly = sum(c * x^i for i, c in enumerate(message))
eval_points = [F(i) for i in range(1, 7)]
codeword = [m_poly(pt) for pt in eval_points]

# Simulate erasure: lose the value at evaluation point 3
erased_index = 2  # index of the lost point (0-based)
erased_point = eval_points[erased_index]

print('Original codeword:')
for i, (pt, val) in enumerate(zip(eval_points, codeword)):
    marker = '  <-- ERASED' if i == erased_index else ''
    print(f'  m({Integer(pt)}) = {Integer(val)}{marker}')

# Surviving points
surviving_points = [pt for i, pt in enumerate(eval_points) if i != erased_index]
surviving_values = [val for i, val in enumerate(codeword) if i != erased_index]

print(f'\nSurviving points: {[Integer(p) for p in surviving_points]}')
print(f'Surviving values: {[Integer(v) for v in surviving_values]}')
print(f'Number of surviving points: {len(surviving_points)} (need at least k = {len(message)})')

## Step 3: Recover via Lagrange Interpolation

From Module 02, we know that a polynomial of degree $k - 1$ is uniquely determined by
$k$ evaluation points. We have $n - 1 = 5$ surviving points, which is more than enough
to recover the degree-2 polynomial ($k = 3$).

**Lagrange interpolation** reconstructs the polynomial from its evaluations:

$$p(x) = \sum_{i=1}^{k} y_i \prod_{j \neq i} \frac{x - x_j}{x_i - x_j}$$
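
The formula translates almost line by line into plain Python. The following is a sketch rather than a production decoder, assuming a prime modulus (here 7) and Python 3.8+ for `pow(..., -1, p)`; the function names are illustrative. It also checks that every choice of 3 symbols from the 6-symbol codeword of the running example yields the same message.

```python
from itertools import combinations

P = 7  # prime modulus, so the integers mod P form the field GF(7)

def poly_mul_linear(poly, root, p):
    """Multiply poly (coefficients, lowest degree first) by (x - root) mod p."""
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i] = (out[i] - root * c) % p
        out[i + 1] = (out[i + 1] + c) % p
    return out

def lagrange_interpolate(points, p):
    """Coefficients (lowest degree first) of the unique polynomial of degree
    < len(points) through the given (x, y) pairs, with distinct x values."""
    coeffs = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        basis = [1]   # numerator: product of (x - xj) for j != i
        denom = 1     # denominator: product of (xi - xj) for j != i
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis = poly_mul_linear(basis, xj, p)
                denom = (denom * (xi - xj)) % p
        scale = (yi * pow(denom, -1, p)) % p   # division = multiply by inverse
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * c) % p
    return coeffs

# (point, value) pairs for m(x) = 1 + 2x + 3x^2 over GF(7)
codeword = [(1, 6), (2, 3), (3, 6), (4, 1), (5, 2), (6, 2)]

print(lagrange_interpolate([codeword[0], codeword[1], codeword[3]], P))  # [1, 2, 3]

# Any 3 of the 6 symbols recover the same message.
all_agree = all(lagrange_interpolate(list(subset), P) == [1, 2, 3]
                for subset in combinations(codeword, 3))
print(all_agree)  # True
```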

In [None]:
# === Step 3: Lagrange interpolation to recover the polynomial ===
F = GF(7)
R.<x> = PolynomialRing(F)

message = [F(1), F(2), F(3)]
m_poly = sum(c * x^i for i, c in enumerate(message))
eval_points = [F(i) for i in range(1, 7)]
codeword = [m_poly(pt) for pt in eval_points]

# Remove erased point
erased_index = 2
surviving_points = [pt for i, pt in enumerate(eval_points) if i != erased_index]
surviving_values = [val for i, val in enumerate(codeword) if i != erased_index]

# We only need k = 3 points to recover a degree-2 polynomial.
# Use the first 3 surviving points.
k = len(message)
interp_points = surviving_points[:k]
interp_values = surviving_values[:k]

print(f'Using {k} points for interpolation:')
for pt, val in zip(interp_points, interp_values):
    print(f'  ({Integer(pt)}, {Integer(val)})')
print()

# Lagrange interpolation
recovered = R.lagrange_polynomial(list(zip(interp_points, interp_values)))

print(f'Recovered polynomial: {recovered}')
print(f'Original polynomial:  {m_poly}')
print(f'Match: {recovered == m_poly}')
print()

# Extract the message
recovered_msg = [Integer(recovered[i]) for i in range(k)]
print(f'Recovered message: {recovered_msg}')
print(f'Original message:  {[Integer(c) for c in message]}')
print(f'\nMessage recovered perfectly despite losing one codeword symbol!')

In [None]:
# === Can we survive more erasures? ===
F = GF(7)
R.<x> = PolynomialRing(F)

message = [F(1), F(2), F(3)]
k = len(message)
m_poly = sum(c * x^i for i, c in enumerate(message))
eval_points = [F(i) for i in range(1, 7)]
codeword = [m_poly(pt) for pt in eval_points]

# Erase 3 out of 6 points (the maximum we can handle)
erased_indices = {0, 2, 4}  # Erase points at positions 0, 2, 4

print('Codeword with 3 erasures (half the symbols lost!):')
for i, (pt, val) in enumerate(zip(eval_points, codeword)):
    status = 'ERASED' if i in erased_indices else f'{Integer(val)}'
    print(f'  m({Integer(pt)}) = {status}')

surviving = [(pt, val) for i, (pt, val) in enumerate(zip(eval_points, codeword))
             if i not in erased_indices]

print(f'\nSurviving: {len(surviving)} points (exactly k = {k})')

# Interpolate from exactly k points
recovered = R.lagrange_polynomial(surviving)
print(f'Recovered polynomial: {recovered}')
print(f'Original polynomial:  {m_poly}')
print(f'Match: {recovered == m_poly}')
print()
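n = len(codeword)  # avoid hard-coding the codeword length
print(f'With n = {n} codeword symbols and k = {k} message symbols,')
print(f'we can tolerate up to n - k = {n - k} erasures.')
# float(...) keeps the :.0f format working whether the ratio is a Sage rational or a Python number
print(f'That is {n - k} out of {n} = {float(100 * (n - k) / n):.0f}% data loss tolerance!')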

In [None]:
# === Lagrange interpolation by hand (one basis polynomial at a time) ===
F = GF(7)
R.<x> = PolynomialRing(F)

# The points used in Step 3: (1, 6), (2, 3), (4, 1)
# (values of m(x) = 1 + 2x + 3x^2 over GF(7): m(1) = 6, m(2) = 3, m(4) = 1)
points = [(F(1), F(6)), (F(2), F(3)), (F(4), F(1))]

print('Lagrange interpolation step by step over GF(7):')
print(f'Points: {[(Integer(p), Integer(v)) for p, v in points]}')
print()

# Build each Lagrange basis polynomial
total = R(0)
for i, (xi, yi) in enumerate(points):
    # L_i(x) = product of (x - xj) / (xi - xj) for j != i
    Li = R(1)
    for j, (xj, yj) in enumerate(points):
        if j != i:
            Li *= (x - xj) / (xi - xj)
    print(f'  L_{i}(x) = {Li}')
    print(f'  y_{i} * L_{i}(x) = {yi} * ({Li}) = {yi * Li}')
    total += yi * Li
    print()

print(f'Sum: p(x) = {total}')
print(f'\nAll the division (computing 1/(xi - xj)) happens in GF(7).')
print(f'This requires GF(7) to be a FIELD -- every nonzero element needs an inverse.')
print(f'Over Z/12Z (not a field), interpolation can fail: 4 - 2 = 2 has no inverse mod 12.')
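
The difference between a field and a ring such as Z/12Z can be checked directly. This is a small plain-Python sketch (Python 3.8+ for the modular inverse via `pow`); `units` is a helper name invented for the example.

```python
from math import gcd

def units(n):
    """Elements of Z/nZ with a multiplicative inverse (those coprime to n)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

print(units(7))    # [1, 2, 3, 4, 5, 6] -- every nonzero element
print(units(12))   # [1, 5, 7, 11]

# Interpolating through x = 2 and x = 4 needs the inverse of 4 - 2.
print(pow(4 - 2, -1, 7))   # 4, because 2 * 4 = 8 = 1 (mod 7)
try:
    pow(4 - 2, -1, 12)
except ValueError as exc:
    print(f"mod 12: {exc}")
```

Modulo 7 every difference of distinct points is invertible; modulo 12 most are not, so a Lagrange basis polynomial may simply not exist.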

## Concept Map: Module 02 Concepts in Reed-Solomon

| Module 02 Concept | Reed-Solomon Application |
|---|---|
| Polynomial ring $\mathbb{F}_p[x]$ | Messages encoded as polynomials |
| Polynomial evaluation | Encoding: evaluate message polynomial at $n$ points |
| Lagrange interpolation | Decoding: recover polynomial from $k$ evaluations |
| Finite field (not just ring) | Interpolation requires division -- needs a field |
| Irreducible polynomial | Larger Reed-Solomon codes use GF($2^8$) built via quotient ring |
| Degree of polynomial | A polynomial of degree at most $k-1$ is determined by any $k$ distinct evaluations |

Reed-Solomon codes are Module 02 applied to reliability engineering.

## Where Reed-Solomon Uses Larger Fields

In practice, Reed-Solomon codes most often work over GF($2^8$) = GF(256) so that each
field element is exactly one byte. The field is built as:

$$\text{GF}(256) = \mathbb{F}_2[x] / (\text{irreducible of degree } 8)$$

This is the same quotient ring construction from Module 02. QR codes, CDs, and
space probes all rely on polynomial arithmetic in GF($2^8$) -- the same field, up to
isomorphism, that powers AES, although each standard fixes its own irreducible polynomial.
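
Byte-level multiplication in this quotient ring can be sketched with shifts and XORs. The default modulus `0x11D` ($x^8 + x^4 + x^3 + x^2 + 1$) is an assumption chosen for illustration; it is the polynomial commonly used for QR-code Reed-Solomon, while AES uses `0x11B`. The field Sage builds with `GF(2^8)` has its own modulus, which `K.modulus()` reports.

```python
def gf256_mul(a, b, modulus=0x11D):
    """Multiply two bytes as elements of GF(2^8) = F_2[x] / (modulus).

    Bit i of a byte is the coefficient of x^i. Addition is XOR; whenever a
    shift produces an x^8 term, it is reduced by XOR-ing in the modulus.
    """
    result = 0
    while b:
        if b & 1:          # current bit of b set: add the shifted copy of a
            result ^= a
        a <<= 1            # multiply a by x
        if a & 0x100:      # degree reached 8: reduce modulo the polynomial
            a ^= modulus
        b >>= 1
    return result

print(hex(gf256_mul(0x02, 0x80)))                  # 0x1d: x * x^7 = x^8, reduced
print(hex(gf256_mul(0x57, 0x83, modulus=0x11B)))   # 0xc1: worked example from the AES spec
```

Both moduli define isomorphic fields, but the byte values of products differ, so encoder and decoder must agree on the polynomial.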

In [None]:
# === Quick demo: Reed-Solomon over GF(2^8) ===
K.<a> = GF(2^8)
R.<x> = PolynomialRing(K)

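# Encode a 3-byte message
# (fetch_int / integer_representation may emit deprecation warnings on recent Sage
#  releases, which offer K.from_integer(...) and .to_integer() instead)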
msg_bytes = [K.fetch_int(0x48), K.fetch_int(0x65), K.fetch_int(0x6C)]  # 'H', 'e', 'l'
m_poly = sum(c * x^i for i, c in enumerate(msg_bytes))
print(f'Message bytes: {[hex(c.integer_representation()) for c in msg_bytes]}')
print(f'Message polynomial: {m_poly}')
print()

# Evaluate at 6 points
eval_pts = [K.fetch_int(i) for i in range(1, 7)]
codeword = [m_poly(pt) for pt in eval_pts]
print('Codeword (6 evaluations in GF(256)):')
for pt, val in zip(eval_pts, codeword):
    print(f'  m(0x{pt.integer_representation():02X}) = 0x{val.integer_representation():02X}')

# Erase the last 3 points: keep only the first 3 evaluations, then recover
surviving = list(zip(eval_pts, codeword))[:3]
recovered = R.lagrange_polynomial(surviving)
recovered_bytes = [hex(recovered[i].integer_representation()) for i in range(3)]
print(f'\nRecovered from 3 of 6 points: {recovered_bytes}')
print(f'Original:                     {[hex(c.integer_representation()) for c in msg_bytes]}')
print(f'Match: {recovered == m_poly}')

## Summary

Reed-Solomon error-correcting codes turn Module 02's polynomial algebra into a
practical tool for reliable data transmission and storage.

- **Encoding**: A $k$-symbol message becomes the coefficients of a degree-$(k-1)$ polynomial, evaluated at $n > k$ field points.
- **Redundancy**: The $n - k$ extra evaluations carry no new information -- we can lose any $n - k$ symbols (at known positions) and still recover.
- **Decoding**: Lagrange interpolation (from Module 02) reconstructs the polynomial from any $k$ surviving evaluations.
- **Field arithmetic is essential**: Interpolation requires division, which requires a field (not just a ring). With a composite modulus, some differences $x_i - x_j$ have no inverse, so interpolation can fail.
- **Real-world scale**: Production codes use GF($2^8$), built as a polynomial quotient ring -- the same construction from Module 02.

From QR codes on your coffee cup to data from the edge of the solar system, Reed-Solomon
codes rely on the polynomial algebra you learned in Module 02.

---

*Back to [Module 02: Rings, Fields, and Polynomials](../README.md)*