# Break: Fermat Factorization When Primes Are Close

**Module 04** | Breaking Weak Parameters

*Factor $n$ when $|p - q|$ is small using Fermat's method.*

## Why This Matters

RSA's security rests on the difficulty of factoring $n = pq$. But **how** $p$ and $q$
are chosen matters enormously.

If $p$ and $q$ are close together --- say, chosen from the same narrow range, or
generated by a buggy random number generator --- then $n$ can be factored almost
instantly using a method Pierre de Fermat described in **1643**.

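This is not a theoretical curiosity. Flawed prime generation has broken real-world RSA keys:
- **Fermat attack on printer keys (2022, CVE-2022-26320)**: Hanno Böck found that some
  Canon and Fujifilm printers, which used the Rambus SafeZone crypto library, generated
  primes so close together that Fermat's method factors their keys.
- **ROCA vulnerability (2017)**: Infineon smart cards generated primes from a
  restricted set, making them factorable. That attack used Coppersmith's method rather
  than Fermat's, but the root cause is the same: primes with exploitable structure.
- Poorly seeded PRNGs in embedded devices can produce near-identical primes.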

## The Scenario: Close Primes

We generate $n = p \cdot q$ where $p$ and $q$ are deliberately chosen to be close
together. The attacker sees only $n$ and must factor it.

In [None]:
# === Generate n = p * q with close primes ===

# Start from a base value and find two primes near it
set_random_seed(42)
base = 2^80  # ~80-bit primes, so n ~ 160 bits

# Find p: first prime above base
p = next_prime(base + ZZ.random_element(1000))
# Find q: a prime very close to p
q = next_prime(p + ZZ.random_element(10, 5000))

n = p * q

print(f'p = {p}')
print(f'q = {q}')
print(f'n = p * q = {n}')
print()
print(f'p has {p.nbits()} bits')
print(f'q has {q.nbits()} bits')
print(f'n has {n.nbits()} bits')
print()
print(f'Gap: |p - q| = {abs(p - q)}')
print(f'sqrt(n)       = {isqrt(n)}')
print(f'|p - q| / sqrt(n) = {RR(abs(p-q)) / RR(isqrt(n)):.6e}')
print()
print('The primes are extremely close relative to their size.')
print('An attacker sees only n. Can they factor it?')

## Step 1: Fermat's Idea

Fermat observed that any odd composite $n$ can be written as a **difference of two squares**:

$$n = a^2 - b^2 = (a - b)(a + b)$$

where $a = (p + q)/2$ and $b = |p - q|/2$. Both are integers because $p$ and $q$ are odd.

The algorithm:
1. Start with $a = \lceil\sqrt{n}\rceil$
2. Compute $b^2 = a^2 - n$
3. If $b^2$ is a perfect square, then $n = (a - b)(a + b)$ and we're done
4. Otherwise, increment $a$ and repeat

When $p \approx q$, we have $a \approx \sqrt{n}$, so the algorithm terminates
after very few iterations.
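
To make the four steps concrete, here is a minimal sketch that traces them on the small textbook value $n = 5959 = 59 \cdot 101$. It is written in plain Python (using `math.isqrt`) rather than Sage, and `fermat_trace` is a hypothetical helper name used only for this illustration.

```python
import math

def fermat_trace(n):
    """Run Fermat's method on odd n > 1 and record every step (illustrative helper)."""
    a = math.isqrt(n)
    if a * a < n:
        a += 1  # round up to ceil(sqrt(n))
    steps = []
    while True:
        b_squared = a * a - n
        b = math.isqrt(b_squared)
        is_square = (b * b == b_squared)
        steps.append((a, b_squared, is_square))
        if is_square:
            return steps, (a - b, a + b)
        a += 1

steps, factors = fermat_trace(5959)
for a, b_squared, is_square in steps:
    print(f"a = {a}, a^2 - n = {b_squared}, perfect square: {is_square}")
print("factors:", factors)
```

The trace visits $a = 78, 79, 80$. The first two differences (125 and 282) are not squares, while $80^2 - 5959 = 441 = 21^2$, which gives the factors $80 - 21 = 59$ and $80 + 21 = 101$.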

In [None]:
# === Verify Fermat's representation ===

# The true values (attacker doesn't know these)
a_true = (p + q) // 2
b_true = abs(p - q) // 2  # half the gap; works whichever prime is larger

print(f'True factorization: n = {p} * {q}')
print()
print(f'a = (p + q) / 2 = {a_true}')
print(f'b = |p - q| / 2 = {b_true}')
print(f'a^2 - b^2 = {a_true^2 - b_true^2}')
print(f'n          = {n}')
print(f'Match: {a_true^2 - b_true^2 == n}')
print()
print(f'(a - b) = {a_true - b_true} = {"p" if a_true - b_true == p else "q"}')
print(f'(a + b) = {a_true + b_true} = {"q" if a_true + b_true == q else "p"}')
print()
print(f'ceil(sqrt(n)) = {isqrt(n) + (1 if isqrt(n)^2 < n else 0)}')
print(f'a_true        = {a_true}')
print(f'Distance from sqrt(n) to a: {a_true - isqrt(n)} iterations needed')

## Step 2: Implement Fermat's Factorization

In [None]:
# === Fermat's factorization algorithm ===

def fermat_factor(n, max_iterations=10^7):
    """Factor n using Fermat's method. Returns (p, q, iterations)."""
    a = isqrt(n)
    if a * a == n:
        return a, a, 0  # Perfect square
    a += 1  # Start with ceil(sqrt(n))
    
    for i in range(max_iterations):
        b_squared = a * a - n
        b = isqrt(b_squared)
        
        if b * b == b_squared:  # Is b^2 a perfect square?
            p_found = a + b
            q_found = a - b
            return p_found, q_found, i + 1
        
        a += 1
    
    return None, None, max_iterations  # Failed

# Run the attack!
import time
t0 = time.time()
p_found, q_found, iterations = fermat_factor(n)
t1 = time.time()

print(f'=== FERMAT FACTORIZATION ===')
print(f'Iterations: {iterations}')
print(f'Time: {(t1-t0)*1000:.2f} ms')
print()
print(f'Found: p = {p_found}')
print(f'Found: q = {q_found}')
print(f'p * q = n? {p_found * q_found == n}')
print()

# Sort so we can compare
factors_found = sorted([p_found, q_found])
factors_true = sorted([p, q])
print(f'Matches original primes: {factors_found == factors_true}')

## Step 3: How Many Iterations?

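The loop stops when $a$ reaches $(p + q)/2$, so the iteration count is the distance
from $\sqrt{n}$ to that value:

$$\frac{p + q}{2} - \sqrt{pq} = \frac{(\sqrt{p} - \sqrt{q})^2}{2} = \frac{(p - q)^2}{2(\sqrt{p} + \sqrt{q})^2}$$

When $p \approx q$, we have $(\sqrt{p} + \sqrt{q})^2 \approx 4\sqrt{n}$, which gives:

$$\text{iterations} \approx \frac{(p - q)^2}{8\sqrt{n}}$$

When $|p - q|$ is small, this is tiny. When $|p - q|$ is on the order of $\sqrt{n}$ (as
it should be for secure RSA), the count is on the order of $\sqrt{n}$, which is no
better than trial division.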
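
The iteration count can also be checked exactly. The loop starts just above $\sqrt{n}$ and stops at $a = (p + q)/2$, so for $n = pq$ with distinct odd primes it tries exactly $(p + q)/2 - \lfloor\sqrt{n}\rfloor$ values of $a$. The sketch below compares that closed form with a counted run. It is plain Python with small primes so it finishes quickly; `is_prime`, `next_prime_after`, and `fermat_iterations` are hypothetical helpers for this illustration, with `next_prime_after` standing in for Sage's `next_prime`.

```python
import math

def is_prime(m):
    """Trial-division primality test; adequate for the small values used here."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True

def next_prime_after(m):
    """Smallest prime strictly greater than m."""
    m += 1
    while not is_prime(m):
        m += 1
    return m

def fermat_iterations(n):
    """Count the values of a that Fermat's method tries on odd, non-square n."""
    a = math.isqrt(n) + 1  # equals ceil(sqrt(n)) because n is not a perfect square
    count = 1
    while True:
        b_squared = a * a - n
        b = math.isqrt(b_squared)
        if b * b == b_squared:
            return count
        a += 1
        count += 1

p = next_prime_after(10**6)
rows = []
for gap in (2000, 8000, 32000, 128000):
    q = next_prime_after(p + gap)
    n = p * q
    counted = fermat_iterations(n)
    closed_form = (p + q) // 2 - math.isqrt(n)
    rows.append((q - p, counted, closed_form))
    print(f"gap = {q - p}  counted = {counted}  closed form = {closed_form}")
```

Each gap here is about four times the previous one, so the counts should grow by a factor of roughly sixteen per row once they are well above 1.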

In [None]:
# === Cost analysis: iterations vs gap size ===

# Generate several n values with different gaps
print(f'{"Gap |p-q|":>20s}  {"Iterations":>12s}  {"Time (ms)":>12s}  {"Predicted":>12s}')

base_p = next_prime(2^60)
# With 60-bit primes, gaps below ~2^31 all finish in a single iteration,
# so use larger gaps to make the quadratic growth visible.
gaps = [2^32, 2^34, 2^36, 2^38, 2^40]

for gap in gaps:
    test_p = base_p
    test_q = next_prime(base_p + gap)
    test_n = test_p * test_q
    actual_gap = abs(test_p - test_q)
    
    t0 = time.time()
    _, _, iters = fermat_factor(test_n)
    t1 = time.time()
    
    # Estimate (p - q)^2 / (8 * sqrt(n)), valid when p and q are close
    predicted = Integer(actual_gap)^2 // (8 * isqrt(test_n))
    print(f'{actual_gap:>20d}  {iters:>12d}  {(t1-t0)*1000:>12.2f}  {predicted:>12d}')

print()
print('As the gap doubles, iterations roughly quadruple (quadratic in gap).')
print('For secure RSA, the gap should be ~ sqrt(n), making iterations ~ sqrt(n).')

In [None]:
# === Visualization: iterations vs gap ===

data_points = []
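base_p = next_prime(2^50)

# With 50-bit primes, gaps below ~2^26 all finish in one iteration,
# so sweep larger gaps to expose the quadratic growth.
for gap_exp in range(27, 37):
    gap = 2^gap_exp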
    test_q = next_prime(base_p + gap)
    test_n = base_p * test_q
    _, _, iters = fermat_factor(test_n, max_iterations=10^6)
    if iters < 10^6:
        data_points.append((gap, iters))

if data_points:
    P = list_plot(data_points, scale='loglog', plotjoined=True, marker='o',
                  axes_labels=['|p - q|', 'Iterations'],
                  title='Fermat Factorization: Iterations vs Prime Gap')
    show(P, figsize=6)

## The Fix: Ensure $|p - q|$ Is Large

The FIPS 186-4 standard for RSA key generation requires:

$$|p - q| > 2^{n_{\text{len}}/2 - 100}$$

where $n_{\text{len}}$ is the bit length of $n$. For a 2048-bit RSA key, this means:

$$|p - q| > 2^{924}$$

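With $\sqrt{n} \approx 2^{1024}$, Fermat's method then needs on the order of
$(p - q)^2 / \sqrt{n} > 2^{1848} / 2^{1024} = 2^{824}$ iterations (ignoring small
constant factors), which is completely infeasible.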

In practice, generating $p$ and $q$ independently and uniformly from the right-sized
range almost always satisfies this bound. The primes are only dangerously close if
the random number generator is broken.
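
As a rough illustration, the distance condition can be checked directly after generating a candidate pair. The sketch below is plain Python; `fips_gap_ok` is a hypothetical helper name, and it checks only the gap, not primality or any other FIPS 186-4 requirement. The sample values are placeholder integers of the right size, not actual primes.

```python
def fips_gap_ok(p, q, nlen):
    """Return True if |p - q| > 2^(nlen/2 - 100).

    nlen is the intended modulus length in bits (for example 2048).
    Only the distance condition is checked here.
    """
    if nlen % 2 != 0 or nlen // 2 <= 100:
        raise ValueError("nlen must be even and larger than 200 bits")
    return abs(p - q) > 2 ** (nlen // 2 - 100)

# Placeholder 1024-bit integers (not primes), used only to exercise the check.
p = 2**1023 + 2**1022
too_close = p + 2**900    # gap of 2^900 is below the 2^924 threshold
far_enough = p + 2**1000  # gap of 2^1000 is above the 2^924 threshold

print(fips_gap_ok(p, too_close, 2048))
print(fips_gap_ok(p, far_enough, 2048))
```

A key generator would typically reject the pair and draw a new $q$ when the check fails.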

In [None]:
# === Safe vs unsafe: compare factoring times ===

# UNSAFE: primes close together
p_unsafe = next_prime(2^80)
q_unsafe = next_prime(p_unsafe + 1000)
n_unsafe = p_unsafe * q_unsafe

t0 = time.time()
_, _, iters_unsafe = fermat_factor(n_unsafe)
t_unsafe = time.time() - t0

# SAFE: primes far apart (independently generated)
p_safe = random_prime(2^81, lbound=2^80)
q_safe = random_prime(2^81, lbound=2^80)
while abs(p_safe - q_safe) < 2^40:  # Ensure they're far apart
    q_safe = random_prime(2^81, lbound=2^80)
n_safe = p_safe * q_safe

t0 = time.time()
_, _, iters_safe = fermat_factor(n_safe, max_iterations=10^6)
t_safe = time.time() - t0

print('=== UNSAFE key (close primes) ===')
print(f'|p - q| = {abs(p_unsafe - q_unsafe)}')
print(f'Fermat iterations: {iters_unsafe}')
print(f'Time: {t_unsafe*1000:.2f} ms')
print()
print('=== SAFE key (independent primes) ===')
print(f'|p - q| = {abs(p_safe - q_safe)}')
print(f'Fermat iterations: {iters_safe} (hit limit = {iters_safe >= 10^6})')
print(f'Time: {t_safe*1000:.2f} ms (gave up after 10^6 iterations)')
print()
print('Properly generated primes make Fermat factorization hopeless.')

## Exercises

1. **Vary the gap**: Generate keys with gaps of $|p - q| = 2^k$ for $k = 5, 10, 15, 20, 25$.
   Plot the number of iterations. Does it match the predicted quadratic relationship?

2. **Larger keys**: Try Fermat factorization on 256-bit and 512-bit RSA keys with
   close primes. How does the constant factor scale with key size?

3. **Comparison with trial division**: For a given $n$ with close primes, compare
   the speed of Fermat's method vs. trial division (trying all primes up to $\sqrt{n}$).
   When is each method faster?

## Summary

| Component | Role in the Attack |
|-----------|-------------------|
| $n = a^2 - b^2$ | Fermat's difference-of-squares representation |
| $a = \lceil\sqrt{n}\rceil$ | Starting point, close to $(p+q)/2$ when primes are close |
| Perfect square test | Check if $a^2 - n$ is a perfect square at each step |
| Close primes | Make the starting $a$ very close to the true $(p+q)/2$ |

**Key takeaways:**
- Fermat's factorization method (1643) factors $n = pq$ in $O((p-q)^2 / \sqrt{n})$ steps.
- When $|p - q|$ is small, factoring is **trivial** --- a few iterations suffice.
- FIPS 186-4 mandates $|p - q| > 2^{n_{\text{len}}/2 - 100}$ to prevent this.
- Independent, uniform prime generation naturally produces primes that are far apart.
- The lesson: RSA security depends not just on key size, but on **how** the primes are generated.

---

*Back to [Module 04: Number Theory and RSA](../README.md)*