# Notebook 06d: Group Structure and Order

**Module 06 -- Elliptic Curves**

---

**Motivating Question.** We know that $E(\mathbb{F}_p)$ is a finite group with $\approx p + 1$ points. But how many points *exactly*? And what does the group *look like* — is it always cyclic, or can it be a product of two cyclic groups? The answers come from two deep theorems: **Hasse's theorem** (which bounds the count) and the **structure theorem** (which says the group is either $\mathbb{Z}/n\mathbb{Z}$ or $\mathbb{Z}/n_1\mathbb{Z} \times \mathbb{Z}/n_2\mathbb{Z}$). Understanding these is essential for choosing secure curve parameters.

---

**Prerequisites.** You should be comfortable with:
- Elliptic curves over finite fields and point enumeration (Notebook 06c)
- Cyclic groups, subgroups, and Lagrange's theorem (Module 01)
- The order of an element in a group (Module 05)

**Learning objectives.** By the end of this notebook you will be able to:
1. State and verify Hasse's theorem for concrete curves.
2. Determine the group structure of $E(\mathbb{F}_p)$ using SageMath.
3. Compute point orders and find generators.
4. Explain why cryptographic curves need prime or near-prime group orders.
5. Describe the role of cofactors in curve selection.

## 1. Hasse's Theorem

**Theorem (Hasse, 1933).** For an elliptic curve $E$ over $\mathbb{F}_p$:

$$|\, |E(\mathbb{F}_p)| - (p + 1) \,| \leq 2\sqrt{p}$$

In other words, the number of points is $p + 1 - t$ where $t$ (the **trace of Frobenius**) satisfies $|t| \leq 2\sqrt{p}$.

This means the group order is always in the interval $[p + 1 - 2\sqrt{p}, \; p + 1 + 2\sqrt{p}]$.

| $p$ | $2\sqrt{p}$ | Range for $\lvert E \rvert$ |
|-----|------------|----------------|
| 23 | $\approx 9.6$ | $[15, 33]$ |
| 101 | $\approx 20.1$ | $[82, 122]$ |
| $\approx 2^{256}$ | $\approx 2^{129}$ | $\approx 2^{256} \pm 2^{129}$ |
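
The integer endpoints in the table can be reproduced without floating point. The sketch below is plain Python (it also runs inside SageMath); the helper name `hasse_interval` is ours, not part of any library. It uses the identity $\lfloor 2\sqrt{p} \rfloor = \lfloor \sqrt{4p} \rfloor$.

```python
from math import isqrt

def hasse_interval(p):
    """Smallest and largest integer group orders allowed by Hasse's bound."""
    # floor(2*sqrt(p)) == isqrt(4*p), computed exactly in integer arithmetic
    t_max = isqrt(4 * p)
    return p + 1 - t_max, p + 1 + t_max

for p in (23, 101, 1009):
    lo, hi = hasse_interval(p)
    print(f"p = {p}: {lo} <= |E| <= {hi}")
```

Working with `isqrt` avoids rounding problems that would appear for cryptographic-size primes, where `float` cannot represent $p$ exactly.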

In [None]:
# Verify Hasse's theorem for several curves
print(f"{'p':>6} {'a':>3} {'b':>3} {'|E|':>6} {'p+1':>6} {'t':>5} {'2√p':>7} {'Hasse?':>8}")
print("-" * 52)

test_cases = [
    (23, 1, 1), (23, 2, 3), (23, 0, 7),
    (101, 1, 1), (101, 3, 5), (101, 0, 1),
    (1009, 1, 1), (1009, 7, 11), (1009, -1, 0),
]

for p, a, b in test_cases:
    if (4*a^3 + 27*b^2) % p != 0:
        E = EllipticCurve(GF(p), [a, b])
        n = E.cardinality()
        t = p + 1 - n
        bound = 2 * sqrt(float(p))
        ok = abs(t) <= bound
        print(f"{p:>6} {a:>3} {b:>3} {n:>6} {p+1:>6} {t:>5} {bound:>7.1f} {'✓' if ok else '✗':>8}")

In [None]:
import matplotlib.pyplot as plt

# Visualise the trace of Frobenius for many curves over F_p
p = 127
traces = []

for a in range(p):
    for b in range(min(p, 20)):  # sample b values
        if (4*a^3 + 27*b^2) % p != 0:
            E = EllipticCurve(GF(p), [a, b])
            t = p + 1 - E.cardinality()
            traces.append(int(t))

fig, ax = plt.subplots(1, 1, figsize=(10, 5))
ax.hist(traces, bins=range(int(-2*sqrt(p))-2, int(2*sqrt(p))+3), 
        color='steelblue', edgecolor='black', alpha=0.7, density=True)
bound = 2*sqrt(float(p))
ax.axvline(x=bound, color='red', linestyle='--', label=f'$2\\sqrt{{p}} = {bound:.1f}$')
ax.axvline(x=-bound, color='red', linestyle='--')
ax.set_xlabel('Trace of Frobenius $t = p + 1 - |E|$', fontsize=12)
ax.set_ylabel('Density', fontsize=12)
ax.set_title(f'Distribution of traces for curves over $\\mathbb{{F}}_{{{p}}}$\n(Hasse: $|t| \\leq 2\\sqrt{{p}} \\approx {bound:.1f}$)', fontsize=13)
ax.legend(fontsize=11)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print(f"All {len(traces)} traces satisfy Hasse's bound: {all(abs(t) <= 2*sqrt(float(p)) for t in traces)}")

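The histogram approximates a **semicircle**: as $p$ grows, the normalized trace $t/(2\sqrt{p})$, taken over all curves over a fixed $\mathbb{F}_p$, becomes distributed with density $\frac{2}{\pi}\sqrt{1 - x^2}$ on $[-1, 1]$ (a result due to Birch, sometimes called the "vertical" Sato–Tate law). The **Sato–Tate conjecture** proper (now a theorem) is the companion statement for a *fixed* curve over $\mathbb{Q}$ without complex multiplication as $p$ varies. The broad peak around $t = 0$ means that group orders cluster around $p + 1$.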

> **Checkpoint 1.** For a 256-bit prime $p$, Hasse's theorem says $|E(\mathbb{F}_p)| = p + 1 - t$ with $|t| \leq 2\sqrt{p} \approx 2^{129}$. Since $p \approx 2^{256}$, the relative deviation $|t|/p$ is negligible — the group order is essentially $p$.
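
To see where the trace comes from, we can count points directly: each $x \in \mathbb{F}_p$ contributes $1 + \left(\frac{f(x)}{p}\right)$ affine points, where $f(x) = x^3 + ax + b$ and $\left(\frac{\cdot}{p}\right)$ is the Legendre symbol, so $t = -\sum_x \left(\frac{f(x)}{p}\right)$. The following is a minimal pure-Python sketch of that count (the function names are ours); it is only practical for small $p$.

```python
def legendre(z, p):
    """Legendre symbol (z/p) for an odd prime p, via Euler's criterion."""
    s = pow(z % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s  # s is 0, 1, or p - 1

def count_points(a, b, p):
    """|E(F_p)| for y^2 = x^3 + a*x + b, including the point at infinity."""
    # each x contributes 1 + (f(x)/p) affine points
    return 1 + sum(1 + legendre(x**3 + a * x + b, p) for x in range(p))

p, a, b = 23, 1, 1
n = count_points(a, b, p)
t = p + 1 - n
# Hasse's bound |t| <= 2*sqrt(p) is equivalent to t^2 <= 4p
print(f"|E| = {n}, t = {t}, Hasse holds: {t * t <= 4 * p}")
```

Comparing $t^2$ with $4p$ keeps the check in exact integer arithmetic.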

## 2. Group Structure: Cyclic or Not?

**Theorem (Structure of $E(\mathbb{F}_p)$).** The group $E(\mathbb{F}_p)$ is isomorphic to either:
- $\mathbb{Z}/n\mathbb{Z}$ (cyclic), or
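- $\mathbb{Z}/n_1\mathbb{Z} \times \mathbb{Z}/n_2\mathbb{Z}$ with $n_1 > 1$ and $n_1 \mid n_2$ (not cyclic).

In practice, **most** curves over large primes are cyclic. The non-cyclic case requires all $n_1^2$ points of order dividing $n_1$ to be defined over $\mathbb{F}_p$, which forces $n_1^2 \mid |E(\mathbb{F}_p)|$ and $n_1 \mid p - 1$.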

For cryptography, we prefer the **cyclic** case — it gives us the cleanest DLP setting.
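
The structure theorem already narrows things down a lot once $|E|$ is known. A standard refinement is that $n_1$ must also divide $p - 1$. Assuming that fact, the sketch below lists every pair $(n_1, n_2)$ compatible with a given $p$ and group order; `possible_structures` is a hypothetical helper name, and the code only enumerates candidates, it does not decide which one actually occurs.

```python
def possible_structures(p, n):
    """Pairs (n1, n2) with n1 * n2 == n, n1 | n2 and n1 | p - 1."""
    pairs = []
    n1 = 1
    while n1 * n1 <= n:
        # n1 | n2 with n1 * n2 == n is the same as n1^2 | n
        if n % (n1 * n1) == 0 and (p - 1) % n1 == 0:
            pairs.append((n1, n // n1))
        n1 += 1
    return pairs

print(possible_structures(23, 28))  # candidates for a curve with 28 points over F_23
print(possible_structures(7, 8))    # candidates for a curve with 8 points over F_7
print(possible_structures(23, 29))  # prime order leaves only the cyclic option
```

When the order is squarefree (in particular, prime), the only candidate is $(1, n)$, so the group is forced to be cyclic.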

In [None]:
# Examine the group structure of several curves
print(f"{'p':>6} {'(a,b)':>10} {'|E|':>6} {'Group structure':>30} {'Cyclic?':>8}")
print("-" * 65)

for p in [23, 67, 101, 127, 199, 251]:
    for a, b in [(1, 1), (0, 1), (-1, 0), (2, 3)]:
        if (4*a^3 + 27*b^2) % p != 0:
            E = EllipticCurve(GF(p), [a, b])
            G = E.abelian_group()
            n = E.cardinality()
            is_cyclic = len(G.invariants()) <= 1
            struct = f"Z/{G.invariants()[0]}" if len(G.invariants()) == 1 else f"Z/{G.invariants()[0]} × Z/{G.invariants()[1]}"
            print(f"{p:>6} {str((a,b)):>10} {n:>6} {struct:>30} {'Yes' if is_cyclic else 'No':>8}")
            break  # one curve per prime for readability

In [None]:
# Find a non-cyclic example
print("Searching for non-cyclic E(F_p)...\n")

found = False
for p in prime_range(10, 200):
    for a in range(p):
        for b in range(p):
            if (4*a^3 + 27*b^2) % p == 0:
                continue
            E = EllipticCurve(GF(p), [a, b])
            G = E.abelian_group()
            if len(G.invariants()) == 2:
                n1, n2 = G.invariants()
                print(f"Found! p = {p}, E: y^2 = x^3 + {a}x + {b}")
                print(f"|E| = {E.cardinality()}")
                print(f"Group structure: Z/{n1} × Z/{n2}")
                print(f"This is NOT cyclic — it has two independent generators.")
                found = True
                break
        if found:
            break
    if found:
        break

> **Misconception alert.** "Elliptic curve groups are always cyclic." Not true! While most curves over large primes are cyclic, the structure theorem allows $\mathbb{Z}/n_1 \times \mathbb{Z}/n_2$. For crypto, we work either in a cyclic group $E(\mathbb{F}_p)$ or in a subgroup of large prime order, which is cyclic automatically.

## 3. Point Orders and Lagrange's Theorem

By Lagrange's theorem, the order of any point $P$ must divide the group order $|E(\mathbb{F}_p)|$.

If $|E| = n$, then the possible point orders are the **divisors** of $n$. A point of order $n$ (a generator) exists if and only if the group is cyclic.
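
As a library-free illustration, the sketch below implements the affine group law by hand and tallies point orders on $y^2 = x^3 + x + 1$ over $\mathbb{F}_{23}$. The point at infinity is represented by `None`; the helper names `ec_add` and `point_order` are ours. It needs Python 3.8+ for modular inverses via `pow(x, -1, p)`.

```python
from collections import Counter

def ec_add(P, Q, a, p):
    """Add two points on y^2 = x^3 + a*x + b over F_p (None is the identity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = O, including doubling a point with y = 0
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p         # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def point_order(P, a, p):
    """Smallest k >= 1 with k*P = O, found by repeated addition."""
    k, Q = 1, P
    while Q is not None:
        Q = ec_add(Q, P, a, p)
        k += 1
    return k

p, a, b = 23, 1, 1
points = [None] + [(x, y) for x in range(p) for y in range(p)
                   if (y * y - (x**3 + a * x + b)) % p == 0]
n = len(points)
orders = Counter(point_order(P, a, p) for P in points)

print(f"|E| = {n}")
for o in sorted(orders):
    print(f"order {o:>2}: {orders[o]:>2} points, divides |E|: {n % o == 0}")
```

Every order that appears divides $|E|$, as Lagrange's theorem requires, and points of order $|E|$ exist exactly because this particular group is cyclic.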

In [None]:
# Examine point orders on a concrete curve
p = 101
E = EllipticCurve(GF(p), [1, 1])
n = E.cardinality()
print(f"Curve: y^2 = x^3 + x + 1 over F_{p}")
print(f"|E| = {n}")
print(f"Divisors of {n}: {divisors(n)}")
print(f"Group structure: {E.abelian_group()}")
print()

# Count how many points have each order
order_counts = {}
for pt in E.points():
    if pt == E(0):
        continue
    o = pt.order()
    order_counts[o] = order_counts.get(o, 0) + 1

print(f"{'Order':>8} {'# points':>10} {'Divides |E|?':>14}")
print("-" * 35)
for o in sorted(order_counts.keys()):
    print(f"{o:>8} {order_counts[o]:>10} {'Yes' if n % o == 0 else 'NO!':>14}")

In [None]:
# Visualise the order distribution
orders = sorted(order_counts.keys())
counts = [order_counts[o] for o in orders]

fig, ax = plt.subplots(1, 1, figsize=(10, 5))
ax.bar([str(o) for o in orders], counts, color='steelblue', edgecolor='black')
ax.set_xlabel('Point order', fontsize=12)
ax.set_ylabel('Number of points', fontsize=12)
ax.set_title(f'Distribution of point orders on $E(\\mathbb{{F}}_{{{p}}})$, $|E| = {n}$', fontsize=13)
ax.grid(True, alpha=0.3, axis='y')
plt.tight_layout()
plt.show()

# How many generators? (Points of order n exist only if the group is cyclic.)
gen_count = order_counts.get(n, 0)
print(f"Points of maximum order {n}: {gen_count}")
print(f"These are the generators: {gen_count} out of {n-1} non-identity points ({100*gen_count/(n-1):.1f}%)")
print(f"Euler's totient: φ({n}) = {euler_phi(n)} (matches the generator count when the group is cyclic)")

> **Checkpoint 2.** In a cyclic group of order $n$, the number of generators is $\varphi(n)$ (Euler's totient). If $n$ is prime, then $\varphi(n) = n - 1$, so *every* non-identity point is a generator! This is one reason we want $|E|$ to be prime.

---

> **Bridge from Module 01.** In Module 01, we learned that $(\mathbb{Z}/n\mathbb{Z})^*$ has order $\varphi(n)$ and is cyclic (a primitive root exists) exactly when $n$ is $1$, $2$, $4$, an odd prime power $p^k$, or $2p^k$. For other $n$ it can split into arbitrarily many cyclic factors. Elliptic curve groups are more constrained: $E(\mathbb{F}_p)$ is always "almost cyclic", with at most two factors.

## 4. Why Prime Order Matters

For cryptographic applications, we want the group (or at least a large subgroup) to have **prime order**. Why?

1. **Every non-identity point is a generator** — no small-subgroup attacks.
2. **Pohlig-Hellman cannot exploit the structure** — recall from Module 05 that Pohlig-Hellman reduces the DLP to sub-DLPs in prime-order subgroups. If the whole group has prime order, there are no smaller subgroups to exploit.
3. **Simpler parameter selection** — no need to worry about cofactors.
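
To make point 2 concrete, here is a deliberately rough cost model, not an implementation of Pohlig-Hellman: for each prime power $q^e$ exactly dividing $n$ we charge about $e \cdot \sqrt{q}$ group operations (one generic square-root DLP per digit), ignoring logarithmic factors and constants. The names `factorize` and `ph_cost` are illustrative.

```python
from math import isqrt

def factorize(n):
    """Trial-division factorization {prime: exponent}; fine for small n."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def ph_cost(n):
    """Rough operation count: e * (floor(sqrt(q)) + 1) summed over q^e || n."""
    return sum(e * (isqrt(q) + 1) for q, e in factorize(n).items())

for n in (1020, 1021, 1024):
    print(f"n = {n}: factors = {factorize(n)}, rough cost = {ph_cost(n)}")
```

The three orders have almost the same size, but only the prime order 1021 forces the attacker to pay the full $\sqrt{n}$; the cost is governed by the largest prime factor, not by $n$ itself.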

In [None]:
# Find curves with prime group order
print(f"{'p':>6} {'|E|':>8} {'Prime?':>8} {'Factorization':>30}")
print("-" * 55)

for p in prime_range(200, 300):
    E = EllipticCurve(GF(p), [1, 1])
    n = E.cardinality()
    is_prime = n.is_prime()
    fac = factor(n)
    marker = " ★" if is_prime else ""
    print(f"{p:>6} {n:>8} {'Yes' if is_prime else 'No':>8} {str(fac):>30}{marker}")

In [None]:
# Demonstrate why smooth order is dangerous: Pohlig-Hellman on a smooth-order curve
import time

# Fixed example curve; the printed factorization shows how smooth its order is
p_smooth = 1009
E_smooth = EllipticCurve(GF(p_smooth), [1, 1])
n_smooth = E_smooth.cardinality()
print(f"Smooth-order curve: |E(F_{p_smooth})| = {n_smooth} = {factor(n_smooth)}")

# gens()[0] has order n_smooth only when the group is cyclic, so use its actual order
G = E_smooth.gens()[0]
ord_G = G.order()
k_secret = randint(2, ord_G - 1)
Q = k_secret * G

start = time.time()
k_found = discrete_log(Q, G, ord_G, operation='+')
t_smooth = time.time() - start
print(f"DLP solved in {t_smooth*1000:.1f} ms (smooth order → Pohlig-Hellman is fast)")
print(f"Correct? {k_found == k_secret}")

# Compare with a prime-order curve: search over a until |E| is prime
p_prime = 1013
E_prime = None
for a in range(1, p_prime):
    if (4*a^3 + 27) % p_prime == 0:
        continue  # singular curve, skip
    E_test = EllipticCurve(GF(p_prime), [a, 1])
    if E_test.cardinality().is_prime():
        E_prime = E_test
        break
assert E_prime is not None, "no prime-order curve found"

n_prime = E_prime.cardinality()
print(f"\nPrime-order curve: |E(F_{p_prime})| = {n_prime} (prime)")

# With prime order, every non-identity point generates the whole group
G2 = E_prime.gens()[0]
k_secret2 = randint(2, n_prime - 1)
Q2 = k_secret2 * G2

start = time.time()
k_found2 = discrete_log(Q2, G2, n_prime, operation='+')
t_prime = time.time() - start
print(f"DLP solved in {t_prime*1000:.1f} ms")
print(f"Correct? {k_found2 == k_secret2}")
# At ~10-bit sizes both timings are tiny; the gap only becomes significant for much larger groups

## 5. Cofactors and Subgroups

Sometimes the group order is $n = h \cdot q$ where $q$ is a large prime and $h$ is a small integer called the **cofactor**. In this case, we work in the subgroup of order $q$ generated by a point $G$ with $\text{ord}(G) = q$.

| Curve | $\lvert E \rvert$ | Cofactor $h$ | Prime subgroup order $q$ |
|-------|-------|-------------|-------------------------|
| P-256 (NIST) | $\approx 2^{256}$ | 1 | $\lvert E \rvert$ (entire group) |
| Curve25519 | $\approx 2^{255}$ | 8 | $\lvert E \rvert / 8$ |
| secp256k1 (Bitcoin) | $\approx 2^{256}$ | 1 | $\lvert E \rvert$ |

A cofactor $h > 1$ means there are small-order subgroups, which can be exploited in certain protocols if not handled carefully.
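
For a real-world instance, the Curve25519 row can be checked with plain integers. The sketch below uses the published parameters (field prime $p = 2^{255} - 19$, prime subgroup order $q = 2^{252} + 27742317777372353535851937790883648493$, cofactor $8$); it takes those constants as given and does not recompute the group order or test $q$ for primality.

```python
p = 2**255 - 19                                        # field prime of Curve25519
q = 2**252 + 27742317777372353535851937790883648493   # published prime subgroup order
h = 8                                                  # cofactor

n = h * q            # full group order |E|
t = p + 1 - n        # trace of Frobenius

print(f"|E| has {n.bit_length()} bits, q has {q.bit_length()} bits")
print(f"|t| has {abs(t).bit_length()} bits")
print(f"Hasse bound t^2 <= 4p holds: {t * t <= 4 * p}")
```

The trace is about half the bit length of $p$, consistent with $|t| \leq 2\sqrt{p}$, and dividing out the cofactor costs only three bits of group size.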

In [None]:
# Demonstrate cofactors
p = 233
E = EllipticCurve(GF(p), [1, 1])
n = E.cardinality()
fac = factor(n)

print(f"E(F_{p}): |E| = {n} = {fac}")

# Find the largest prime factor
q = max(f[0] for f in fac)
h = n // q
print(f"Largest prime factor: q = {q}")
print(f"Cofactor: h = |E|/q = {h}")

# Find a generator of the order-q subgroup
while True:
    P = E.random_point()
    G = h * P  # multiply by cofactor to project into subgroup
    if G != E(0):
        break

print(f"\nRandom point P = {P}, order = {P.order()}")
print(f"G = h·P = {h}·P = {G}, order = {G.order()}")
print(f"G generates the prime-order subgroup of order {G.order()}")

> **Checkpoint 3.** To "project" a random point $P$ into the prime-order subgroup, we compute $G = hP$ where $h$ is the cofactor. Why does this work? Because $\text{ord}(P)$ divides $n = hq$, so $\text{ord}(hP)$ divides $q$. If $P$ does not have order dividing $h$, then $hP \neq \mathcal{O}$ and has order exactly $q$.
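
The argument in Checkpoint 3 only uses the fact that the group has order $n = hq$, so it can be checked in a toy model. The sketch below uses the additive group $\mathbb{Z}/28\mathbb{Z}$ (with $h = 4$, $q = 7$) as a stand-in for a cyclic curve group of order 28; this modelling choice is ours.

```python
from math import gcd

n, h, q = 28, 4, 7   # n = h * q, a cyclic group of order 28 written additively

def order(k):
    """Order of the element k in Z/nZ."""
    return n // gcd(k, n)

# "Multiply by the cofactor": k -> h*k mod n
projected = {k: (h * k) % n for k in range(n)}
orders_after = {order(g) for g in projected.values()}
killed = [k for k, g in projected.items() if g == 0]

print(f"orders after projection: {sorted(orders_after)}")
print(f"elements sent to the identity: {killed}")
print(f"their orders: {[order(k) for k in killed]}")
```

Every projected element has order 1 or $q$, and the elements that collapse to the identity are exactly those whose order divides $h$.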

> **Crypto foreshadowing.** In protocols like ECDH, if the cofactor $h > 1$, an attacker can send a point of small order (in the cofactor subgroup) and learn the secret key modulo that order, which leaks a few bits. This is called a **small-subgroup attack**. The standard defense is to either use $h = 1$ curves or to multiply received points by $h$ before use.
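
A toy version of the attack, again in the additive group $\mathbb{Z}/28\mathbb{Z}$ rather than on a real curve, shows what leaks. The secret value and variable names below are hypothetical, and the "victim" is assumed to return $k \cdot T$ directly, which real protocols do not do in this naive form.

```python
n, h = 28, 4          # toy cyclic group Z/28 with cofactor 4
T = n // h            # an element of order h (here 7)
secret = 22           # victim's private scalar (made up for the demo)

# Victim naively multiplies the attacker's element by the secret
response = (secret * T) % n

# Attacker tries every residue r mod h and matches the response
leaked = next(r for r in range(h) if (r * T) % n == response)
print(f"attacker learns secret mod {h} = {leaked} (true value: {secret % h})")

# Defense: clearing the cofactor first maps T to the identity
print(f"h * T mod n = {(h * T) % n}")
```

After cofactor clearing the malicious element becomes the identity, so the response carries no information about the secret.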

## 6. Counting Points Efficiently: Schoof's Algorithm

For small primes, we can count points by brute force. For cryptographic-size primes ($p \approx 2^{256}$), we need efficient algorithms.

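**Schoof's algorithm** (1985) counts $|E(\mathbb{F}_p)|$ in time polynomial in $\log p$: roughly $O((\log p)^8)$ bit operations with schoolbook arithmetic, or about $O((\log p)^5)$ with fast multiplication. The improvements by Elkies and Atkin (the SEA algorithm) make it faster still in practice.

SageMath relies on SEA (through PARI) when you call `E.cardinality()` on a curve over a large prime field; for small fields it may choose a simpler method.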
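
The core idea of Schoof's algorithm is to determine $t$ modulo many small primes $\ell$ and recombine with the Chinese Remainder Theorem. The case $\ell = 2$ is easy to state: for odd $p$, $|E|$ and $t$ have the same parity, and $|E|$ is even exactly when there is a point of order 2, that is, when $x^3 + ax + b$ has a root in $\mathbb{F}_p$. The sketch below checks this against brute force; the root search by looping over $x$ is a stand-in for the polynomial gcd a real implementation would use, and the function names are ours.

```python
def trace_mod_2(a, b, p):
    """t mod 2 for y^2 = x^3 + a*x + b over F_p (p an odd prime)."""
    has_root = any((x**3 + a * x + b) % p == 0 for x in range(p))
    return 0 if has_root else 1

def trace_brute(a, b, p):
    """Exact trace by brute-force counting, used only as a cross-check."""
    affine = sum(1 for x in range(p) for y in range(p)
                 if (y * y - (x**3 + a * x + b)) % p == 0)
    return p + 1 - (affine + 1)  # +1 for the point at infinity

p = 23
for a, b in [(1, 1), (2, 3), (0, 7)]:
    if (4 * a**3 + 27 * b**2) % p != 0:  # skip singular curves
        t = trace_brute(a, b, p)
        print(f"(a, b) = ({a}, {b}): t = {t}, t mod 2 = {t % 2}, "
              f"predicted = {trace_mod_2(a, b, p)}")
```

For larger $\ell$ the same question is answered with division polynomials instead of a root count, which is where the real work of the algorithm lies.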

In [None]:
# SageMath can count points on curves over large fields
import time

for bits in [32, 64, 128, 192, 256]:
    p = next_prime(2^bits)
    E = EllipticCurve(GF(p), [1, 1])
    
    start = time.time()
    n = E.cardinality()
    elapsed = time.time() - start
    
    t = p + 1 - n
    # float has no bit_length(); convert the trace to a Python int instead
    print(f"{bits:>4}-bit prime: |E| has {n.ndigits()} digits, "
          f"|t| = |p+1-|E|| has {int(abs(t)).bit_length()} bits, "
          f"time = {elapsed*1000:.0f} ms")

Even for 256-bit primes, point counting takes only seconds. This is what makes it practical to search for curves with desired properties (e.g., prime group order).

## 7. Exercises

### Exercise 1 (Worked): Verifying Hasse's Theorem

**Problem.** For $E: y^2 = x^3 + 2x + 3$ over $\mathbb{F}_{43}$:
1. Count the points (use SageMath).
2. Compute the trace of Frobenius $t = p + 1 - |E|$.
3. Verify $|t| \leq 2\sqrt{43} \approx 13.1$.

**Solution.**

In [None]:
# Exercise 1 — worked solution
p = 43
E = EllipticCurve(GF(p), [2, 3])
n = E.cardinality()
t = p + 1 - n
bound = 2 * sqrt(float(p))

print(f"Curve: y^2 = x^3 + 2x + 3 over F_{p}")
print(f"|E| = {n}")
print(f"t = p + 1 - |E| = {p} + 1 - {n} = {t}")
print(f"2√p = 2√{p} ≈ {bound:.2f}")
print(f"|t| = {abs(t)} ≤ {bound:.2f}? {abs(t) <= bound}")
print(f"\nGroup structure: {E.abelian_group()}")
print(f"Factorization of |E|: {factor(n)}")

### Exercise 2 (Guided): Finding a Prime-Order Curve

**Problem.** Search for an elliptic curve $y^2 = x^3 + ax + 1$ over $\mathbb{F}_{389}$ such that $|E|$ is prime.

*Steps:*
1. Loop over $a = 0, 1, 2, \ldots$, skipping any $a$ for which the discriminant condition fails ($4a^3 + 27 \equiv 0 \pmod{p}$).
2. Stop at the first curve whose order $|E|$ is prime.
3. Check that a random non-identity point has order $|E|$ (confirming it is a generator).

In [None]:
# Exercise 2 — fill in the TODOs
p = 389

# TODO 1: Loop to find a prime-order curve
# for a in range(p):
#     disc = (4*a^3 + 27) % p
#     if disc == 0:
#         continue
#     E = EllipticCurve(GF(p), [a, 1])
#     n = E.cardinality()
#     if ???:
#         print(f"Found! a = {a}, |E| = {n} (prime)")
#         break

# TODO 2: Verify a random point is a generator
# P = E.random_point()
# print(f"Random point P = {P}, order = {P.order()}")
# print(f"P is a generator? {P.order() == n}")

### Exercise 3 (Independent): Cofactor and Subgroup

**Problem.**
1. On $E: y^2 = x^3 + x + 1$ over $\mathbb{F}_{101}$, compute $|E|$ and factor it.
2. Find the cofactor $h$ (the ratio of $|E|$ to its largest prime factor).
3. Take a random point $P$ and compute $G = hP$. Verify that $G$ has order equal to the largest prime factor.
4. Why would an attacker prefer to attack the DLP in the full group rather than the prime-order subgroup? (Hint: think about Pohlig-Hellman.)

In [None]:
# Exercise 3 — write your solution here


## Summary

| Concept | Key Fact |
|---------|----------|
| **Hasse's theorem** | $\lvert E(\mathbb{F}_p) \rvert = p + 1 - t$ with $\lvert t \rvert \leq 2\sqrt{p}$ |
| **Group structure** | Always $\mathbb{Z}/n_1 \times \mathbb{Z}/n_2$ with $n_1 \mid n_2$ (cyclic when $n_1 = 1$, the usual case) |
| **Point orders** | Divide $\lvert E \rvert$ (Lagrange); generators have order $\lvert E \rvert$ |
| **Prime order** | Ideal for crypto: every non-identity point is a generator, and Pohlig-Hellman gives no speed-up |
| **Cofactor** | $h = \lvert E \rvert / q$ where $q$ is the largest prime factor; prefer $h = 1$ |
| **Schoof/SEA** | Count $\lvert E(\mathbb{F}_p) \rvert$ in polynomial time, even for 256-bit primes |

We now understand the group structure. In the next notebook, we study **scalar multiplication** — the efficient computation of $kP$ using double-and-add — which is the core operation in all EC cryptographic protocols.

---

**Next:** [06e — Scalar Multiplication](06e-scalar-multiplication.ipynb)