# Break: Smooth-Order Attack

**Module 01** | Breaking Weak Parameters

*When $p-1$ has only small prime factors, the discrete log problem becomes easy.*

## Why This Matters

Diffie-Hellman key exchange relies on the **discrete logarithm problem (DLP)** being hard:
given $g$, $p$, and $A = g^a \bmod p$, find $a$.
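
To make the baseline concrete, here is a minimal sketch of the naive attack in plain Python. The function name `brute_force_dlog` and the toy parameters ($p = 31$, $g = 3$) are illustrative assumptions, not part of the scenario below.

```python
def brute_force_dlog(g, A, p):
    """Return the smallest a >= 0 with g**a % p == A % p, or None if none exists."""
    value = 1  # g**0
    for a in range(p - 1):
        if value == A % p:
            return a
        value = (value * g) % p  # step from g**a to g**(a+1)
    return None

# Toy parameters (illustrative only): 3 is a primitive root mod 31
p_toy, g_toy = 31, 3
A_toy = pow(g_toy, 17, p_toy)
print(brute_force_dlog(g_toy, A_toy, p_toy))  # 17
```

The loop may run up to $p - 1$ times, which is why this approach is hopeless for cryptographic sizes of $p$.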

But hardness depends critically on the **structure of the group**. If $p - 1$ factors
into only small primes (we call $p - 1$ **smooth**), an attacker can:

1. Solve the DLP independently in each subgroup of small prime-power order (cheap!)
2. Combine the partial results using the **Chinese Remainder Theorem**

This is the **Pohlig-Hellman algorithm** (1978). It reduces one hard DLP to
several trivially small DLPs.

## The Scenario

Alice uses Diffie-Hellman with a deliberately bad prime where $p - 1$ is **smooth**
(has only small factors). She thinks her secret exponent is safe because $p$ is prime.

**Your job**: recover Alice's secret key using the Pohlig-Hellman attack.

In [None]:
# === The Setup: A prime with a dangerously smooth order ===

p = 2521  # A prime number
print(f'p = {p}')
print(f'Is p prime? {is_prime(p)}')
print()

# The critical weakness: look at the factorization of p - 1
print(f'p - 1 = {p - 1}')
print(f'p - 1 = {factor(p - 1)}')
print()
print('Every prime factor is at most 7, and the largest prime power is 9. This is VERY smooth.')

In [None]:
# === Alice's key pair ===

g = primitive_root(p)
print(f'Generator g = {g}  (a primitive root mod {p})')
print(f'Order of g = {Mod(g, p).multiplicative_order()}')
print()

# Alice's secret exponent
secret = 1337
A = power_mod(g, secret, p)

print(f"Alice's public key:  A = g^secret mod p = {A}")
print(f"Alice's secret key:  secret = {secret}  (we pretend we don't know this)")
print()
print(f'Attacker sees: g = {g}, p = {p}, A = {A}')
print(f'Attacker wants: secret such that g^secret ≡ A (mod p)')

## Step 1: Factor the Group Order

The group $(\mathbb{Z}/p\mathbb{Z})^*$ has order $p - 1 = 2520$.

$$2520 = 2^3 \cdot 3^2 \cdot 5 \cdot 7$$

The **largest prime power** is just $9 = 3^2$. This means we never need to brute-force
more than 9 possibilities in any subgroup. Compare this to brute-forcing all 2520!
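
For readers without Sage at hand, the factorization step can be sketched in plain Python with trial division. The helper name `factor_prime_powers` is hypothetical; it stands in for Sage's `factor` only for small inputs like this one.

```python
def factor_prime_powers(n):
    """Factor n by trial division; return a list of (prime, exponent) pairs."""
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            e = 0
            while n % d == 0:
                n //= d
                e += 1
            factors.append((d, e))
        d += 1
    if n > 1:  # whatever remains is a single prime factor
        factors.append((n, 1))
    return factors

group_order = 2520
pairs = factor_prime_powers(group_order)
print(pairs)                           # [(2, 3), (3, 2), (5, 1), (7, 1)]
print(sum(q**e for q, e in pairs))     # 29
```

Under this notebook's cost model (brute force in each prime-power subgroup), the attacker tries at most 29 candidates instead of 2520, roughly an 87x saving.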

In [None]:
# Factor the group order
order = p - 1
factored = list(factor(order))

print(f'Group order: {order}')
print(f'Factorization: {factored}')
print()

# The prime powers we'll work with
prime_powers = [(q, e, q^e) for q, e in factored]
for q, e, qe in prime_powers:
    print(f'  {q}^{e} = {qe}  →  brute force at most {qe} values')

print(f'\nTotal brute-force work: {sum(qe for _, _, qe in prime_powers)} operations')
print(f'Naive brute force:     {order} operations')
print(f'Speedup factor:        {order / sum(qe for _, _, qe in prime_powers):.1f}x')

## Step 2: Solve the DLP in Each Subgroup

For each prime power $q^e$ dividing $p - 1$, we **project** both $g$ and $A$ into
the subgroup of order $q^e$:

$$g_{q^e} = g^{(p-1)/q^e} \bmod p \qquad A_{q^e} = A^{(p-1)/q^e} \bmod p$$

Now $g_{q^e}$ has order $q^e$, and we need to find $x$ such that
$g_{q^e}^x \equiv A_{q^e} \pmod{p}$.

Since $q^e$ is small, we can just **brute force** this.
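
It may help to spell out why the projected problem reveals the secret modulo $q^e$. Writing $a$ for Alice's secret (so $A \equiv g^a$), a short derivation under the definitions above is:

$$A_{q^e} \equiv A^{(p-1)/q^e} \equiv \left(g^{a}\right)^{(p-1)/q^e} \equiv \left(g^{(p-1)/q^e}\right)^{a} \equiv g_{q^e}^{\,a} \pmod{p}$$

Because $g_{q^e}$ has order $q^e$, exponents of $g_{q^e}$ only matter modulo $q^e$, so $g_{q^e}^{\,a} \equiv g_{q^e}^{\,a \bmod q^e} \pmod{p}$. The unique $x$ in $\{0, \ldots, q^e - 1\}$ found by brute force is therefore $x = a \bmod q^e$.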

In [None]:
# Pohlig-Hellman: solve the DLP in each small subgroup
partial_logs = []

for q, e in factored:
    qe = q^e
    
    # Project into subgroup of order q^e
    g_qe = power_mod(g, (p - 1) // qe, p)
    A_qe = power_mod(A, (p - 1) // qe, p)
    
    print(f'--- Subgroup of order {q}^{e} = {qe} ---')
    print(f'  g_{qe} = g^{(p-1)//qe} mod p = {g_qe}')
    print(f'  A_{qe} = A^{(p-1)//qe} mod p = {A_qe}')
    
    # Brute force the small DLP
    for x in range(qe):
        if power_mod(g_qe, x, p) == A_qe:
            partial_logs.append((x, qe))
            print(f'  Found: secret ≡ {x} (mod {qe})')
            break
    print()

## Step 3: Combine with the Chinese Remainder Theorem

We now have a system of congruences:

$$\text{secret} \equiv x_1 \pmod{q_1^{e_1}}, \quad \text{secret} \equiv x_2 \pmod{q_2^{e_2}}, \quad \ldots$$

Since the moduli $q_i^{e_i}$ are pairwise coprime and their product is $p - 1$,
the CRT gives us a **unique** solution modulo $p - 1$.
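
As a rough illustration of what the CRT step does internally, here is a plain-Python sketch (Python 3.8+ for the modular inverse via `pow(m, -1, mi)`). The name `crt_combine` is hypothetical and chosen to avoid clashing with Sage's built-in `CRT`; the residues shown are simply $1337$ reduced modulo $8$, $9$, $5$, and $7$.

```python
def crt_combine(remainders, moduli):
    """Combine x ≡ r_i (mod m_i), for pairwise coprime m_i, into x mod prod(m_i)."""
    x, m = 0, 1  # invariant: x satisfies all congruences seen so far, modulo m
    for r, mi in zip(remainders, moduli):
        # Find t with x + m*t ≡ r (mod mi); m is invertible mod mi by coprimality
        t = ((r - x) * pow(m, -1, mi)) % mi
        x += m * t
        m *= mi
    return x

print(crt_combine([1, 5, 2, 0], [8, 9, 5, 7]))  # 1337
```

Each pass folds one more congruence into the running solution, so after the last modulus the result is determined modulo $8 \cdot 9 \cdot 5 \cdot 7 = 2520 = p - 1$.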

In [None]:
# Combine partial results with CRT
remainders = [x for x, qe in partial_logs]
moduli = [qe for x, qe in partial_logs]

print('System of congruences:')
for x, qe in partial_logs:
    print(f'  secret ≡ {x} (mod {qe})')

recovered = CRT(remainders, moduli)

print(f'\n=== CRT Solution ===')
print(f'Recovered secret: {recovered}')
print(f'Actual secret:    {secret}')
print(f'Match: {recovered == secret}')

In [None]:
# Verify: does g^recovered ≡ A (mod p)?
print(f'Verification: g^{recovered} mod p = {power_mod(g, recovered, p)}')
print(f'Alice\'s public key A          = {A}')
print(f'Match: {power_mod(g, recovered, p) == A}')

## Cost Comparison

How much work did the attacker actually do?

In [None]:
# Cost analysis
brute_force_cost = p - 1  # Worst case: try all exponents
pohlig_hellman_cost = sum(q^e for q, e in factored)  # Sum of subgroup sizes

print('=== Attack Cost Comparison ===')
print(f'Brute force (worst case):   {brute_force_cost} exponentiations')
print(f'Pohlig-Hellman:             {pohlig_hellman_cost} exponentiations')
print(f'Speedup:                    {brute_force_cost / pohlig_hellman_cost:.1f}x faster')
print()
print('For a real-world smooth prime with p ~ 2^1024:')
print('Brute force:        ~2^1024 operations (impossible)')
print('Pohlig-Hellman:     ~sum of small factors (trivial!)')
print()
print('The attack scales with the LARGEST prime factor of p-1,')
print('not with p itself. Smooth p-1 = broken DH.')

## The Fix: Safe Primes

A **safe prime** is a prime $p$ such that $q = (p-1)/2$ is also prime.
Then $p - 1 = 2q$, which has only two prime factors: $2$ and the large prime $q$.

The Pohlig-Hellman attack can only extract the secret modulo $2$ (trivial)
and modulo $q$ (requires solving a DLP of size $q$ -- essentially as hard as the
original problem).
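
The contrast can be checked numerically with a small plain-Python sketch. The helpers `is_prime_trial`, `is_safe_prime`, and `largest_prime_factor` are hypothetical names using trial division, which is only sensible for toy sizes like these; $2579$ is used here as an example safe prime of similar size to $2521$.

```python
def is_prime_trial(n):
    """Trial-division primality test (toy sizes only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_safe_prime(p):
    """True if p is prime and (p - 1) / 2 is also prime."""
    return is_prime_trial(p) and is_prime_trial((p - 1) // 2)

def largest_prime_factor(n):
    """Largest prime dividing n (n >= 2), by trial division."""
    largest, d = 1, 2
    while d * d <= n:
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    return max(largest, n) if n > 1 else largest

for p_test in (2521, 2579):
    print(p_test, is_safe_prime(p_test), largest_prime_factor(p_test - 1))
# 2521 False 7
# 2579 True 1289
```

For the smooth prime the hardest subproblem involves the prime $7$; for the safe prime it involves $1289$, about half of $p$.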

In [None]:
# Compare: a safe prime of about the same size
# Search for the first p >= 2521 such that (p-1)/2 is also prime
p_safe = None  # set by the search below
for candidate in range(2521, 3000):
    if is_prime(candidate) and is_prime((candidate - 1) // 2):
        p_safe = candidate
        break

print(f'Safe prime: p = {p_safe}')
print(f'p - 1 = {p_safe - 1} = {factor(p_safe - 1)}')
print()

q_safe = (p_safe - 1) // 2
print(f'Pohlig-Hellman subgroup sizes: 2 and {q_safe}')
print(f'Largest subgroup DLP: {q_safe} (almost as hard as the full DLP!)')
print()

# Compare attack costs
pohlig_safe_cost = 2 + q_safe
print(f'Smooth prime (p={p}):  Pohlig-Hellman cost = {pohlig_hellman_cost}')
print(f'Safe prime (p={p_safe}): Pohlig-Hellman cost = {pohlig_safe_cost}')
print(f'\nWith the safe prime, the attack gives you almost NOTHING.')
print(f'You learn 1 bit (secret mod 2) and still face a DLP of size {q_safe}.')

## Exercise: Try It Yourself

1. **Change the secret**: Set `secret` to a different value and re-run the attack.
   Does it still work? Why?

2. **Bigger smooth prime**: Try `p = 55441` where $p - 1 = 55440 = 2^4 \cdot 3^2 \cdot 5 \cdot 7 \cdot 11$.
   How does the attack cost change?

3. **Partially smooth**: What if $p - 1$ has one large factor and several small ones?
   How much does the attacker learn?

## Summary

| | Smooth $p-1$ | Safe prime $p = 2q+1$ |
|---|---|---|
| Factorization of $p-1$ | Many small primes | $2 \times$ (large prime) |
| Pohlig-Hellman cost (brute force per subgroup) | Sum of small prime powers | $\approx p/2$ |
| DLP difficulty | **Easy** | **Hard** |

**Key takeaways:**
- The DLP is only as hard as the **largest prime factor** of the group order.
- The **Pohlig-Hellman algorithm** decomposes the DLP into independent sub-problems.
- **Safe primes** ($p = 2q + 1$, $q$ prime) defend against this attack.
- Standardized DH groups (RFC 3526, RFC 7919) are built on safe primes for exactly this reason.

---

*Back to [Module 01: Modular Arithmetic & Groups](../README.md)*