# Guide to Unique Decoding of Reed-Solomon Codes

This notebook explains and implements the **Welch-Berlekamp algorithm** for unique decoding of Reed-Solomon codes. We start by defining the unique decoding problem, develop the intuition behind the algorithm from a geometric perspective, and finally build and demonstrate a working Python (SageMath) implementation.

## The Unique Decoding Problem

A Reed-Solomon code is created by taking a message, treating it as the coefficients of a **message polynomial** $P(X)$ of degree less than $k$ over a finite field, and then evaluating that polynomial at $n$ distinct points $(\alpha_1, \dots, \alpha_n)$ of the field to create the codeword $(P(\alpha_1), \dots, P(\alpha_n))$.
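As a minimal sketch of this encoding step, the snippet below evaluates a message polynomial at $n$ distinct points. It assumes the prime field $GF(257)$ represented by plain Python integers reduced mod 257; the names `poly_eval` and `rs_encode` are illustrative helpers, not part of any library.

```python
P_MOD = 257  # assumed prime field size for this illustration

def poly_eval(coeffs, x, p=P_MOD):
    """Evaluate a polynomial (lowest-degree coefficient first) at x via Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result

def rs_encode(message_coeffs, alphas, p=P_MOD):
    """Return the codeword (P(alpha_1), ..., P(alpha_n))."""
    if len(set(a % p for a in alphas)) != len(alphas):
        raise ValueError("evaluation points must be distinct")
    return [poly_eval(message_coeffs, a, p) for a in alphas]

message = [97, 98, 99]    # P(X) = 97 + 98 X + 99 X^2, so k = 3
alphas = list(range(7))   # n = 7 distinct evaluation points
codeword = rs_encode(message, alphas)
print(codeword)  # [97, 37, 175, 254, 17, 235, 137]
```

The coefficients are stored lowest degree first, so the message symbols map directly onto list positions.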

When this codeword is transmitted, a noisy channel may introduce errors, resulting in a **received word** $y = (y_1, \dots, y_n)$. The number of errors, $e$, is the number of positions where the received word differs from the original codeword.

The goal of **unique decoding** is to recover the *one and only* original polynomial $P(X)$ from the noisy received word $y$. This is only guaranteed to be possible if the number of errors is strictly less than half the minimum distance of the code. A Reed-Solomon code has minimum distance $n-k+1$, so this condition is:

$$ e < \frac{n-k+1}{2} $$
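Because $e$ is an integer, the strict inequality is equivalent to $e \le \lfloor (n-k)/2 \rfloor$. The short sketch below computes that bound for a few parameter choices; `max_correctable_errors` is a hypothetical helper name used only for illustration.

```python
def max_correctable_errors(n, k):
    """Largest integer e with e < (n - k + 1) / 2, which equals floor((n - k) / 2)."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return (n - k) // 2

for n, k in [(7, 3), (8, 3), (14, 2)]:
    e = max_correctable_errors(n, k)
    print(f"n={n}, k={k}: minimum distance {n - k + 1}, up to {e} errors")
```

Note that going from $n=7$ to $n=8$ with $k=3$ raises the minimum distance from 5 to 6 but does not increase the number of correctable errors.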

This notebook will build the Welch-Berlekamp algorithm, an efficient method for solving this exact problem.


## A Geometric View of the Problem 

To better visualize the problem, we can perform a "syntactic shift" and think of the received word $y$ not as a vector, but as a collection of $n$ points in a 2D plane: $\{(\alpha_1, y_1), (\alpha_2, y_2), \dots, (\alpha_n, y_n)\}$.

<img src="./imgs/image_unique_decoding_1.png" width="600"/>

The image above shows an example of a received word with $n=14$ points and $k=2$. The original message polynomial was a line (a polynomial of degree $k-1=1$).

The original, uncorrupted codeword would consist of points that all lie exactly on the curve defined by the message polynomial $P(X)$. The effect of noise is to knock some of these points off the curve. The decoder's job, in this geometric view, is to find the unique curve of degree less than $k$ that passes through the maximum number of these received points, which means at least $n-e$ of them when $e$ errors occurred.

<img src="./imgs/image_unique_decoding_2.png"  width="600"/>

The above image illustrates this. The correct polynomial, $P(X)=X$, passes through the subset of "correct" points, while the "error" points lie scattered off the line.
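The geometric view can be sketched in code by counting how many received points lie on a candidate curve. The example below is illustrative only: it assumes arithmetic mod 257, and the three corrupted positions and their values are made up for this sketch rather than taken from the figures.

```python
P_MOD = 257  # assumed prime field size for this illustration

def poly_eval(coeffs, x, p=P_MOD):
    """Evaluate a polynomial (lowest-degree coefficient first) at x."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result

def agreements(coeffs, points, p=P_MOD):
    """Count the received points that lie on the curve y = P(x)."""
    return sum(1 for x, y in points if poly_eval(coeffs, x, p) == y % p)

# Hypothetical received word: 14 points on the line y = x, three of them corrupted.
received = [(x, x) for x in range(14)]
for x, bad_y in [(3, 10), (8, 1), (11, 4)]:
    received[x] = (x, bad_y)

line_x = [0, 1]          # P(X) = X
line_x_plus_1 = [1, 1]   # P(X) = X + 1, a wrong candidate
print(agreements(line_x, received))         # 11
print(agreements(line_x_plus_1, received))  # 0
```

With $n=14$ and $k=2$ up to 6 errors are correctable, so the line through 11 of the 14 points is the unique decoding result.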

## The Core Idea: Reverse Engineering a Solution 

The Welch-Berlekamp algorithm is designed using a "reverse engineering" approach. We start by assuming we magically know the solution—both the original polynomial $P(X)$ and the locations of the errors—and derive a mathematical property. Then, we use that property to build an algorithm that finds $P(X)$ without knowing the error locations beforehand.

### The Error-Locator Polynomial, E(X)

Let's define a special tool called the **Error-Locator Polynomial**, $E(X)$. This is a polynomial whose roots are the x-coordinates ($\alpha_i$) where an error occurred. In other words:

$$ E(\alpha_i) = 0 \quad \text{if} \quad y_i \neq P(\alpha_i) $$

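If there are $e$ errors, the simplest such polynomial is the monic polynomial of degree $e$ whose roots are exactly the error locations:

$$ E(X) = \prod_{i \,:\, y_i \neq P(\alpha_i)} (X - \alpha_i) $$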
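To make this concrete, the sketch below builds an error-locator polynomial as a product of linear factors, one per error location. It assumes arithmetic mod 257 and errors at $\alpha = 2$ and $\alpha = 5$; the helper names are hypothetical.

```python
P_MOD = 257  # assumed prime field size for this illustration

def poly_mul_linear(coeffs, root, p=P_MOD):
    """Multiply a polynomial (lowest degree first) by (X - root) mod p."""
    out = [0] * (len(coeffs) + 1)
    for i, c in enumerate(coeffs):
        out[i] = (out[i] - root * c) % p      # contribution of -root * c * X^i
        out[i + 1] = (out[i + 1] + c) % p     # contribution of c * X^(i+1)
    return out

def error_locator(error_positions, p=P_MOD):
    """Monic polynomial whose roots are exactly the given error locations."""
    E = [1]
    for a in error_positions:
        E = poly_mul_linear(E, a, p)
    return E

E = error_locator([2, 5])
print(E)  # [10, 250, 1], i.e. E(X) = X^2 + 250 X + 10 = (X - 2)(X - 5) mod 257
```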

### The Key Equation

With this definition, we can establish a key equation that holds true for **every single point**, whether it's an error or not:

$$ y_i E(\alpha_i) = P(\alpha_i) E(\alpha_i) \quad \text{for all } i=1, \dots, n $$


This powerful identity is easy to prove by considering two cases:
1.  **If an error occurred at $\alpha_i$**: By definition, $E(\alpha_i) = 0$. The equation becomes $y_i \cdot 0 = P(\alpha_i) \cdot 0$, which simplifies to $0=0$. The identity holds.
2.  **If no error occurred at $\alpha_i$**: In this case, we know $y_i = P(\alpha_i)$. If we multiply both sides of this by $E(\alpha_i)$, the equality is preserved. The identity also holds.
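The two cases can also be checked numerically. The sketch below assumes arithmetic mod 257, a message polynomial $P(X) = 99X^2 + 98X + 97$, and a received word with errors at $\alpha = 2$ and $\alpha = 5$, then confirms that the key equation holds at every point.

```python
P_MOD = 257  # assumed prime field size for this illustration

def poly_eval(coeffs, x, p=P_MOD):
    """Evaluate a polynomial (lowest-degree coefficient first) at x."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result

P = [97, 98, 99]   # message polynomial
E = [10, 250, 1]   # (X - 2)(X - 5) mod 257, the error locator
received = [(0, 97), (1, 37), (2, 99), (3, 254), (4, 17), (5, 42), (6, 137)]

holds = []
for alpha, y in received:
    lhs = (y * poly_eval(E, alpha)) % P_MOD
    rhs = (poly_eval(P, alpha) * poly_eval(E, alpha)) % P_MOD
    is_error = y != poly_eval(P, alpha)
    holds.append(lhs == rhs)
    print(f"alpha={alpha}: error={is_error}, y*E={lhs}, P*E={rhs}")

print(all(holds))  # True
```

At the two error positions both sides are zero; everywhere else they agree because $y_i = P(\alpha_i)$.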

This equation is the foundation of the algorithm. The only problem is that it involves the product of two unknowns, $P(X)$ and $E(X)$, making it a difficult quadratic problem to solve directly.

## The Welch-Berlekamp Algorithm

The genius of Welch-Berlekamp is a **linearization trick** that transforms the difficult quadratic problem into a simple system of linear equations that we know how to solve efficiently.

### The Linearization Trick

We define a new "numerator" polynomial, $N(X)$, as the product of our two unknowns:

$$ N(X) \triangleq P(X) \cdot E(X) $$

By substituting this into our key equation, we get a new equation where the unknowns—the coefficients of $N(X)$ and $E(X)$—appear linearly:

$$ N(\alpha_i) = y_i E(\alpha_i) \quad \text{for all } i=1, \dots, n $$


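Now, our goal is to find two polynomials $N(X)$ and $E(X)$, with $E(X) \neq 0$, that satisfy this linear system. Suppose at most $e$ errors occurred and we have any solution with $\deg N \le k+e-1$ and $\deg E \le e$. At every error-free position, $N(\alpha_i) = y_i E(\alpha_i) = P(\alpha_i) E(\alpha_i)$, so the polynomial $N(X) - P(X)E(X)$ has degree at most $k+e-1$ but at least $n-e$ roots. Since $n-e > k+e-1$ whenever $e < \frac{n-k+1}{2}$, it must be the zero polynomial, so $N(X) = P(X)E(X)$. We can therefore recover the original message polynomial by polynomial division: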

$$ P(X) = \frac{N(X)}{E(X)} $$
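As an illustration of this division step, the sketch below forms $N(X) = P(X) \cdot E(X)$ and then recovers $P(X)$ by polynomial long division. It assumes arithmetic mod the prime 257; `poly_mul` and `poly_divmod` are hypothetical helpers written for this example.

```python
P_MOD = 257  # assumed prime field size for this illustration

def poly_mul(a, b, p=P_MOD):
    """Multiply two polynomials (lowest-degree coefficient first) mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_divmod(num, den, p=P_MOD):
    """Long division mod a prime p; den must have a nonzero leading coefficient."""
    num = [c % p for c in num]
    inv_lead = pow(den[-1], p - 2, p)  # inverse via Fermat's little theorem
    quot = [0] * max(len(num) - len(den) + 1, 0)
    for shift in range(len(num) - len(den), -1, -1):
        factor = (num[shift + len(den) - 1] * inv_lead) % p
        quot[shift] = factor
        for j, d in enumerate(den):
            num[shift + j] = (num[shift + j] - factor * d) % p
    return quot, num[:len(den) - 1]

P = [97, 98, 99]   # message polynomial
E = [10, 250, 1]   # error locator (X - 2)(X - 5) mod 257
N = poly_mul(P, E)
quotient, remainder = poly_divmod(N, E)
print(N)          # [199, 44, 144, 176, 99]
print(quotient)   # [97, 98, 99]
print(remainder)  # [0, 0]
```

A nonzero remainder would signal that $E(X)$ does not divide $N(X)$, which is the failure case of the division step.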

This leads to the formal three-step algorithm.

### The Algorithm Steps

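1.  **Interpolation (Solve Linear System):** Find polynomials $N(X)$ (of degree at most $k+e-1$) and $E(X)$ (non-zero, of degree at most $e$) that satisfy the system of $n$ linear equations $N(\alpha_i) = y_i E(\alpha_i)$ for all received points $(\alpha_i, y_i)$. One convenient normalization, used in the implementation below, is to require $E(X)$ to be monic of degree exactly $e$, which turns the system into an inhomogeneous one with $k+2e$ unknowns.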

2.  **Division (Find P(X)):** If a solution is found and $E(X)$ divides $N(X)$, compute the candidate message polynomial by division: $P(X) = N(X) / E(X)$. If not, the decoding fails.

3.  **Verification:** Check if the recovered $P(X)$ is a valid solution by ensuring it is "close" to the received word $y$ (i.e., the number of disagreements is at most $e$). If it is, return $P(X)$.
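The interpolation step can be sketched without any computer algebra system. The example below is a dependency-free illustration, not the implementation used in this notebook: it assumes arithmetic mod the prime 257, normalizes $E(X)$ to be monic of degree $e$, and solves the resulting system with a small hand-written Gauss-Jordan routine. The names `build_system` and `solve_mod_p` are hypothetical.

```python
P_MOD = 257  # assumed prime field size for this illustration

def build_system(points, k, e, p=P_MOD):
    """One row per point: N(a) - y * E_low(a) = y * a^e, with E monic of degree e."""
    rows, rhs = [], []
    for a, y in points:
        n_part = [pow(a, j, p) for j in range(k + e)]
        e_part = [(-y * pow(a, j, p)) % p for j in range(e)]
        rows.append(n_part + e_part)
        rhs.append((y * pow(a, e, p)) % p)
    return rows, rhs

def solve_mod_p(rows, rhs, p=P_MOD):
    """Gauss-Jordan elimination mod a prime p; returns one solution or None."""
    m = [list(row) + [b] for row, b in zip(rows, rhs)]
    n_rows, n_cols = len(m), len(rows[0])
    pivot_cols, r = [], 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if pivot is None:
            continue  # free column: its unknown is left at 0
        m[r], m[pivot] = m[pivot], m[r]
        inv = pow(m[r][c], p - 2, p)  # inverse via Fermat's little theorem
        m[r] = [(v * inv) % p for v in m[r]]
        for i in range(n_rows):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [(vi - f * vr) % p for vi, vr in zip(m[i], m[r])]
        pivot_cols.append(c)
        r += 1
        if r == n_rows:
            break
    if any(row[-1] != 0 and not any(row[:-1]) for row in m):
        return None  # inconsistent system
    solution = [0] * n_cols
    for i, c in enumerate(pivot_cols):
        solution[c] = m[i][-1]
    return solution

received = [(0, 97), (1, 37), (2, 99), (3, 254), (4, 17), (5, 42), (6, 137)]
k, e = 3, 2
rows, rhs = build_system(received, k, e)
solution = solve_mod_p(rows, rhs)
N_coeffs = solution[:k + e]
E_coeffs = solution[k + e:] + [1]  # append the fixed leading coefficient of E
print(N_coeffs, E_coeffs)
```

Fixing the leading coefficient of $E(X)$ to 1 moves the term $y_i \alpha_i^e$ to the right-hand side, which is why the system has $k+2e$ unknowns rather than $k+2e+1$.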

## Code Implementation & Demonstration 


In [8]:
# Import SageMath (provides GF, PolynomialRing, MatrixSpace, vector)
from sage.all import *

# Define the finite field we will work in
F = GF(257)

# Helper function to convert a message to a polynomial
def msg_to_poly(msg_coeffs, R):
    return R(list(msg_coeffs))

# Helper function to convert a polynomial back to a message
def poly_to_msg(poly, k):
    coeffs = poly.coefficients(sparse=False)
    # Pad with zeros if necessary
    while len(coeffs) < k:
        coeffs.append(0)
    
    # Show the coefficients that are being mapped back to characters
    print(f"\nConverting polynomial coefficients {coeffs} back to message...")
    return "".join([chr(int(c)) for c in coeffs])

# Main Welch-Berlekamp Decoder Function
def welch_berlekamp_decode(points, n, k):
    print("\n--- Starting Welch-Berlekamp Decoder ---")
    
    # Largest integer e with e < (n - k + 1) / 2, i.e. floor((n - k) / 2)
    e = (n - k) // 2
    print(f"This [n={n}, k={k}] code can correct up to e={e} errors.")

    R_x = PolynomialRing(F, 'x')
    
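    # Step 1: Solve the Linear System for N(X) and E(X)
    num_N_coeffs = k + e  # N(X) has degree at most k+e-1
    num_E_coeffs = e      # E(X) is monic of degree e, so only its e lower coefficients are unknown
    num_vars = num_N_coeffs + num_E_coeffs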

    M = MatrixSpace(F, n, num_vars)
    matrix = M()
    b = vector(F, n)

    for i in range(n):
        alpha_i, y_i = points[i]
        
        # Build matrix row for N(X) and E(X) coefficients
        for j in range(num_N_coeffs):
            matrix[i, j] = alpha_i**j
        for j in range(num_E_coeffs):
            matrix[i, num_N_coeffs + j] = -y_i * (alpha_i**j)
        
        # Build the constant vector `b` from the E_e=1 term
        b[i] = y_i * (alpha_i**e)

    print(f"\nSolving a {n}x{num_vars} linear system for the coefficients...")
    
    try:
        solution = matrix.solve_right(b)
    except ValueError:
        print("Decoding Failed: The linear system has no solution.")
        return None

    N_coeffs = list(solution[:num_N_coeffs])
    E_coeffs = list(solution[num_N_coeffs:]) + [1] # Add the implicit e_e=1 coefficient
    
    N = R_x(N_coeffs)
    E = R_x(E_coeffs)
    
    print(f"Found polynomials:\n  N(X) = {N}\n  E(X) = {E}")
    
    # Step 2: Division to find P(X)
    if E == 0 or N % E != 0:
        print("\nDecoding Failed: E(X) does not divide N(X).")
        return None
        
    P_candidate = N // E
    print(f"\nFound candidate message polynomial P(X) = {P_candidate}")
    
    # Step 3: Verification
    encoded_candidate = [(alpha, P_candidate(alpha)) for alpha, y in points]
    disagreements = 0
    for i in range(n):
        if points[i][1] != encoded_candidate[i][1]:
            disagreements += 1
            
    print(f"Verification: Found {disagreements} disagreements with the received word.")
    
    if disagreements <= e:
        decoded_msg = poly_to_msg(P_candidate, k)
        print(f"\nSUCCESS! Decoded message: '{decoded_msg}'")
        return decoded_msg
    else:
        print("\nDecoding Failed: Candidate polynomial is too far from the received word.")
        return None

# --- Demonstration ---

# Parameters
msg = "abc"
k = len(msg)
n = 7

# Message -> Polynomial
R_x = PolynomialRing(F, 'x')
msg_coeffs = [ord(c) for c in msg]

# Show the message and its polynomial representation
print(f"Original message: '{msg}' (k={k})")
print(f"Message ASCII coefficients: {msg_coeffs}")
P = msg_to_poly(msg_coeffs, R_x)
print(f"Message polynomial P(X): {P}")

# Encoding
alphas = [F(i) for i in range(n)]
codeword = [(alpha, P(alpha)) for alpha in alphas]
print(f"\nEncoded codeword (n={n}): {codeword}")

# Introduce Errors
e_limit = (n - k) // 2  # largest e with e < (n - k + 1) / 2
num_errors = 2 # This is <= e_limit, so it's correctable
noisy_word = list(codeword)
noisy_word[2] = (alphas[2], F(99)) # Corrupt point 2
noisy_word[5] = (alphas[5], F(42)) # Corrupt point 5
print(f"Noisy word with {num_errors} errors: {noisy_word}")

# Decode
welch_berlekamp_decode(noisy_word, n, k)

Original message: 'abc' (k=3)
Message ASCII coefficients: [97, 98, 99]
Message polynomial P(X): 99*x^2 + 98*x + 97

Encoded codeword (n=7): [(0, 97), (1, 37), (2, 175), (3, 254), (4, 17), (5, 235), (6, 137)]
Noisy word with 2 errors: [(0, 97), (1, 37), (2, 99), (3, 254), (4, 17), (5, 42), (6, 137)]

--- Starting Welch-Berlekamp Decoder ---
This [n=7, k=3] code can correct up to e=2 errors.

Solving a 7x7 linear system for the coefficients...
Found polynomials:
  N(X) = 99*x^4 + 176*x^3 + 144*x^2 + 44*x + 199
  E(X) = x^2 + 250*x + 10

Found candidate message polynomial P(X) = 99*x^2 + 98*x + 97
Verification: Found 2 disagreements with the received word.

Converting polynomial coefficients [97, 98, 99] back to message...

SUCCESS! Decoded message: 'abc'


'abc'