# Notebook 06e: Scalar Multiplication

**Module 06. Elliptic Curves**

---

**Motivating Question.** To use elliptic curves for cryptography, we need to compute $kP$ (the sum of $k$ copies of $P$) for *very large* $k$, say $k \approx 2^{256}$. Adding $P$ one step at a time would take about $2^{256}$ point additions, which is infeasible. Is there a shortcut? Yes: the **double-and-add** algorithm computes $kP$ in only $O(\log k)$ point operations, just as square-and-multiply computes $g^k$ efficiently in $(\mathbb{Z}/p\mathbb{Z})^*$.

---

**Prerequisites.** You should be comfortable with:
- Point addition and doubling on elliptic curves (Notebook 06b)
- Group structure and point orders (Notebook 06d)
- Square-and-multiply for modular exponentiation (Module 04/05)

**Learning objectives.** By the end of this notebook you will be able to:
1. Implement naive repeated addition and understand why it is too slow.
2. Implement the double-and-add algorithm and prove it is $O(\log k)$.
3. Compare the performance of naive vs double-and-add.
4. Understand the analogy between double-and-add and square-and-multiply.
5. Appreciate that scalar multiplication is the one-way function underlying EC crypto.

## 1. Naive Scalar Multiplication

The simplest approach: add $P$ to itself $k$ times.

$$kP = \underbrace{P + P + \cdots + P}_{k \text{ times}}$$

This requires $k - 1$ point additions. For $k \approx 2^{256}$, this is $\approx 10^{77}$ operations, far beyond feasibility.

In [None]:
def scalar_mul_naive(P, k):
    """Compute kP by repeated addition. O(k), only for small k!"""
    E = P.curve()
    result = E(0)  # O (identity)
    for _ in range(k):
        result = result + P
    return result

# Test on a small curve
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)

# Small k: works fine
k = 17
result = scalar_mul_naive(P, k)
print(f"P = {P}")
print(f"{k}·P (naive) = {result}")
print(f"{k}·P (SageMath) = {k * P}")
print(f"Match? {result == k * P}")

In [None]:
import time

# Measure how naive scales
print("k Time (ms) # additions")
for k in [100, 1000, 10000, 50000]:
    start = time.time()
    result = scalar_mul_naive(P, k)
    elapsed = (time.time() - start) * 1000
    assert result == k * P, "Mismatch!"
    print(f"{k} {elapsed:>12.1f} {k-1}")

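# Extrapolate from the last (largest) measurement; a rough order of magnitude only
years = float(elapsed / 1000 / k * 2^256 / (365 * 24 * 3600))
print(f"\nAt this rate, k = 2^256 would take roughly {years:.0e} years. Not practical!")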

> **Checkpoint 1.** Why is naive scalar multiplication $O(k)$ while we need $O(\log k)$? The key insight is the same as for exponentiation: we can *double* the point instead of adding it one at a time. Doubling $P$ to get $2P$, then doubling again to get $4P$, and so on, reaches $2^m P$ in only $m$ steps.
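
The doubling idea in Checkpoint 1 can be checked directly on the small curve used above. The following is a minimal pure-Python sketch (no SageMath), assuming affine coordinates on $y^2 = x^3 + x + 1$ over $\mathbb{F}_{101}$. The helper names `ec_add`, `double_m_times`, and `add_n_times` are hypothetical and exist only for this illustration.

```python
# Pure-Python sketch: m doublings reach (2^m)P, the same point as adding P 2^m times.
# Assumed curve: y^2 = x^3 + A*x + 1 over F_PRIME (the small curve from this notebook).
PRIME, A = 101, 1
P0 = (0, 1)  # the point P = (0, 1); the identity O is represented by None


def ec_add(P, Q):
    """Add two affine points on the assumed curve (None is the identity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % PRIME == 0:
        return None  # Q = -P, so the sum is the identity
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % PRIME, -1, PRIME)  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % PRIME, -1, PRIME)  # chord slope
    x3 = (lam * lam - x1 - x2) % PRIME
    y3 = (lam * (x1 - x3) - y1) % PRIME
    return (x3, y3)


def double_m_times(P, m):
    """Return (2^m)P using exactly m doublings."""
    Q = P
    for _ in range(m):
        Q = ec_add(Q, Q)
    return Q


def add_n_times(P, n):
    """Return nP by starting at the identity and adding P n times."""
    R = None
    for _ in range(n):
        R = ec_add(R, P)
    return R


m = 5
print("5 doublings :", double_m_times(P0, m))
print("32 additions:", add_n_times(P0, 2 ** m))
```

Both lines should print the same point: five doublings do the work of thirty-two single additions, and the gap widens exponentially as $m$ grows.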

## 2. The Double-and-Add Algorithm

**Idea:** Write $k$ in binary: $k = \sum_{i=0}^{m} k_i \cdot 2^i$ where $k_i \in \{0, 1\}$. Then:

$$kP = \sum_{i: k_i = 1} 2^i P$$

We precompute $P, 2P, 4P, 8P, \ldots$ by repeated doubling, and add together only the terms where $k_i = 1$.
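
As a quick illustration of the binary expansion, the sketch below lists the exponents $i$ with $k_i = 1$ for a given $k$. The function name `binary_terms` is a hypothetical helper, not part of SageMath.

```python
def binary_terms(k):
    """Return the exponents i for which bit k_i of k equals 1 (LSB first)."""
    return [i for i in range(k.bit_length()) if (k >> i) & 1]


k = 43
terms = binary_terms(k)
print(terms)  # → [0, 1, 3, 5]
# So 43P = P + 2P + 8P + 32P: four terms instead of 43 copies of P.
print(" + ".join(f"{2 ** i}P" for i in terms))
```

The number of terms in the sum is the number of 1-bits of $k$, which is what determines how many additions the algorithm performs.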

| Analogy | Multiplicative group | Additive (EC) group |
|---------|---------------------|--------------------|
| Operation | Multiplication | Addition |
| Repeated operation | $g^k$ (exponentiation) | $kP$ (scalar multiplication) |
| Efficient method | Square-and-multiply | Double-and-add |
| Complexity | $O(\log k)$ squarings + multiplications | $O(\log k)$ doublings + additions |
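
The analogy in the table can be made concrete: one generic routine covers both columns if the group operation is passed in as a parameter. This is a sketch only. The name `binary_method` and its signature are assumptions made for this example, and integers modulo 101 stand in for the two kinds of group.

```python
def binary_method(x, k, op, identity):
    """Combine k copies of x under the group operation `op` in O(log k) steps."""
    result = identity
    base = x
    while k > 0:
        if k & 1:
            result = op(result, base)  # "multiply" or "add" step
        base = op(base, base)          # "square" or "double" step
        k >>= 1
    return result


MOD = 101

# Multiplicative notation: square-and-multiply computes g^k mod p.
power = binary_method(3, 13, lambda a, b: a * b % MOD, 1)
print(power == pow(3, 13, MOD))

# Additive notation: double-and-add computes k*x mod p.
multiple = binary_method(5, 13, lambda a, b: (a + b) % MOD, 0)
print(multiple == 5 * 13 % MOD)
```

Swapping in elliptic-curve point addition for `op` and the point at infinity for `identity` gives double-and-add, which the next cell implements directly.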

In [None]:
def double_and_add(P, k):
    """
    Compute kP using double-and-add. O(log k) point operations.
    
    Algorithm (right-to-left binary method):
    - R starts as O (identity)
    - Q starts as P
    - For each bit of k (LSB to MSB):
        - If bit is 1: R = R + Q
        - Q = 2Q (double)
    """
    E = P.curve()
    R = E(0)   # accumulator, starts at identity
    Q = P       # current power of 2 times P
    
    additions = 0
    doublings = 0
    
    while k > 0:
        if k % 2 == 1:  # current bit is 1
            R = R + Q
            additions += 1
        Q = Q + Q       # double Q
        doublings += 1
        k = k // 2
    
    return R, additions, doublings

# Test
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)
k = 73

result, adds, dbls = double_and_add(P, k)
print(f"k = {k} = {bin(k)} (binary)")
print(f"Bits of k: {k.bit_length()}")
print(f"Additions: {adds}, Doublings: {dbls}")
print(f"Total operations: {adds + dbls} (vs {k-1} for naive!)")
print(f"Result: {result}")
print(f"SageMath: {k * P}")
print(f"Match? {result == k * P}")

> **Misconception alert.** "Double-and-add always does exactly $\log_2 k$ additions." Not quite. It does one doubling per bit of $k$ (about $\log_2 k$ in total), but the number of additions equals the **Hamming weight** of $k$ (the number of 1-bits in its binary representation). For a random 256-bit $k$, this is about 128 additions + 256 doublings $\approx 384$ total operations.
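
The operation counts can be predicted from $k$ alone, without touching a curve. The sketch below uses a hypothetical helper `op_counts` and assumes the right-to-left variant shown above, which performs one doubling per bit of $k$. The sample of random 256-bit scalars uses a fixed seed only to make the run repeatable.

```python
import random


def op_counts(k):
    """Return (additions, doublings) that right-to-left double-and-add performs for k."""
    additions = bin(k).count("1")  # Hamming weight: one addition per 1-bit
    doublings = k.bit_length()     # one doubling per bit
    return additions, doublings


print(op_counts(73))  # → (3, 7)   since 73 = 0b1001001
print(op_counts(43))  # → (4, 6)   since 43 = 0b101011

# Average number of additions over random 256-bit scalars (top bit forced to 1).
rng = random.Random(0)
samples = [rng.getrandbits(255) | (1 << 255) for _ in range(1000)]
avg_adds = sum(op_counts(k)[0] for k in samples) / len(samples)
print(f"average additions for 256-bit k: {avg_adds:.1f}")
```

The average should land close to 128, in line with the estimate above.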

In [None]:
# Trace the algorithm step by step for k = 43
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)
k = 43  # binary: 101011

print(f"Computing {k}·P where P = {P}")
print(f"k = {k} = {bin(k)} in binary")
print(f"Binary digits (LSB first): {[int(b) for b in reversed(bin(k)[2:])]}")
print()

R = E(0)
Q = P
temp_k = k
step = 0

print("Step Bit Action R Q (=2^i·P)")
while temp_k > 0:
    bit = temp_k % 2
    if bit == 1:
        R = R + Q
        action = f"R += Q (2^{step}·P)"
    else:
        action = "(skip)"
    
    r_str = f"({int(R[0])}, {int(R[1])})" if R != E(0) else "O"
    q_str = f"({int(Q[0])}, {int(Q[1])})" if Q != E(0) else "O"
    print(f"{step} {bit} {action} {r_str} {q_str}")
    
    Q = Q + Q  # double
    temp_k = temp_k // 2
    step += 1

print(f"\nResult: {k}·P = {R}")
print(f"Verify:       {k * P}")
print(f"Match? {R == k * P}")

## 3. Performance Comparison

Let us compare naive vs double-and-add on increasingly large scalars.

In [None]:
import matplotlib.pyplot as plt

# Use a larger curve for meaningful timing
p = next_prime(2^32)
E = EllipticCurve(GF(p), [1, 1])
P = E.random_point()

# Double-and-add for various k sizes
bits_list = [16, 32, 64, 128, 256, 512]
daa_times = []

print("Bits of k k (hex, first 8 chars) D&A time (ms) # ops")
for bits in bits_list:
    k = randint(2^(bits-1), 2^bits - 1)
    
    start = time.time()
    result, adds, dbls = double_and_add(P, k)
    elapsed = (time.time() - start) * 1000
    
    assert result == k * P
    daa_times.append(elapsed)
    k_hex = hex(k)[:10] + "..."
    print(f"{bits} {k_hex} {elapsed:>15.2f} {adds+dbls}")

# Plot
fig, ax = plt.subplots(1, 1, figsize=(9, 5))
ax.plot(bits_list, daa_times, 'bo-', linewidth=2, markersize=8, label='Double-and-add')
ax.set_xlabel('Bits of $k$', fontsize=12)
ax.set_ylabel('Time (ms)', fontsize=12)
ax.set_title('Double-and-add: time scales linearly with bit-length of $k$', fontsize=13)
ax.legend(fontsize=11)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print(f"\nTime is O(log k) = O(bits of k). Doubling the bit-length roughly doubles the time.")

> **Checkpoint 2.** For a 256-bit scalar, double-and-add needs $\approx 256$ doublings and $\approx 128$ additions, for a total of $\approx 384$ point operations. Naive would need $\approx 2^{256}$ additions. The speedup factor is $\frac{2^{256}}{384} \approx 3 \times 10^{74}$. This is the difference between "impossible" and "milliseconds."

## 4. The One-Way Function

Scalar multiplication is the **one-way function** of elliptic curve cryptography:

| Direction | Operation | Complexity |
|-----------|----------|------------|
| **Forward** (easy) | Given $P, k$, compute $Q = kP$ | $O(\log k)$ via double-and-add |
| **Backward** (hard) | Given $P, Q$, find $k$ such that $Q = kP$ | Best known for generic curves: $O(\sqrt{n})$ group operations (Pollard's rho), where $n$ is the order of $P$ |

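The backward problem is the **Elliptic Curve Discrete Logarithm Problem (ECDLP)**. The gap between the two directions is what lets scalar multiplication serve as a (conjectured) one-way function.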
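
To give a feel for what an $O(\sqrt{n})$ attack looks like, here is a pure-Python sketch of **baby-step giant-step**. Note that this is a different generic method from Pollard's rho: it has the same $O(\sqrt{n})$ running time but needs $O(\sqrt{n})$ memory, and it is used here only because it is short and deterministic. The toy curve $y^2 = x^3 + x + 1$ over $\mathbb{F}_{101}$ and all helper names (`ec_add`, `ec_neg`, `ec_mul`, `point_order`, `bsgs`) are assumptions made for this illustration.

```python
from math import isqrt

PRIME, A = 101, 1  # assumed toy curve y^2 = x^3 + x + 1 over F_101


def ec_add(P, Q):
    """Affine point addition; None represents the identity O."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % PRIME == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % PRIME, -1, PRIME)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % PRIME, -1, PRIME)
    x3 = (lam * lam - x1 - x2) % PRIME
    return (x3, (lam * (x1 - x3) - y1) % PRIME)


def ec_neg(P):
    """Negate a point: -(x, y) = (x, -y)."""
    return None if P is None else (P[0], -P[1] % PRIME)


def ec_mul(k, P):
    """Double-and-add (forward direction, O(log k))."""
    R, Q = None, P
    while k > 0:
        if k & 1:
            R = ec_add(R, Q)
        Q = ec_add(Q, Q)
        k >>= 1
    return R


def point_order(P):
    """Order of P by exhaustive search (fine for a toy curve only)."""
    n, Q = 1, P
    while Q is not None:
        Q = ec_add(Q, P)
        n += 1
    return n


def bsgs(G, Q, n):
    """Find k in [0, n) with Q = kG using about 2*sqrt(n) point additions."""
    m = isqrt(n) + 1
    baby, R = {}, None
    for j in range(m):            # baby steps: store j*G for j = 0 .. m-1
        baby.setdefault(R, j)
        R = ec_add(R, G)
    step = ec_neg(ec_mul(m, G))   # giant step: subtract m*G each round
    gamma = Q
    for i in range(m):
        if gamma in baby:         # Q - i*m*G = j*G  =>  k = i*m + j
            return i * m + baby[gamma]
        gamma = ec_add(gamma, step)
    return None                   # Q is not a multiple of G


G = (0, 1)
n = point_order(G)
k_secret = 73 % n
Q = ec_mul(k_secret, G)
print("recovered k matches:", bsgs(G, Q, n) == k_secret)
```

Even with the square-root saving, a group of order near $2^{256}$ would still require on the order of $2^{128}$ steps, which is why the backward direction is considered infeasible.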

---

> **Bridge from Module 05.** In Module 05, the one-way function was $g^k \bmod p$ (fast via square-and-multiply, hard to invert). Here the one-way function is $kP$ (fast via double-and-add, hard to invert). The mathematical structure is different, but the cryptographic paradigm is identical.

In [None]:
# Demonstrate the asymmetry: fast forward, hard backward
p = next_prime(2^40)
E = EllipticCurve(GF(p), [1, 1])
G = E.gens()[0]
n = G.order()

k_secret = randint(2, n - 2)

# Forward: fast
start = time.time()
Q = k_secret * G
forward_time = (time.time() - start) * 1000

# Backward: much slower (Sage's generic discrete_log; feasible only because the group is small)
start = time.time()
k_found = discrete_log(Q, G, n, operation='+')
backward_time = (time.time() - start) * 1000

print(f"Curve over F_p where p ≈ 2^40; the order of G has {n.ndigits()} digits")
print(f"\nForward (compute kG):    {forward_time:.2f} ms")
print(f"Backward (find k from Q): {backward_time:.1f} ms")
print(f"Ratio: backward is {backward_time/forward_time:.0f}x slower")
print(f"Correct? {k_found == k_secret}")

## 5. Windowed and Montgomery Methods (Overview)

Double-and-add is the basic algorithm, but real implementations use optimizations:

| Method | Idea | Speed-up |
|--------|------|----------|
| **Double-and-add** (basic) | Process 1 bit at a time | Baseline |
| **$w$-ary method** | Process $w$ bits at a time, precompute $P, 2P, \ldots, (2^w - 1)P$ | Fewer additions |
| **NAF (Non-Adjacent Form)** | Use digits $\{-1, 0, 1\}$ to reduce additions | ~33% fewer additions |
| **Montgomery ladder** | Always do same operations regardless of bit value | **Side-channel resistant** |

The Montgomery ladder is particularly important for security: because every bit triggers the same operations, it makes it much harder for attackers to learn secret bits by measuring timing or power consumption.
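
The NAF row of the table can be illustrated at the level of integers. The sketch below computes the non-adjacent form of $k$ and compares its number of nonzero digits with the ordinary binary Hamming weight. The function name `naf` and the sampling setup are assumptions for this example. On a curve, a digit $-1$ would mean subtracting the current multiple, which is cheap because $-(x, y) = (x, -y)$.

```python
import random


def naf(k):
    """Return the non-adjacent form of k as digits in {-1, 0, 1}, LSB first."""
    digits = []
    while k > 0:
        if k & 1:
            d = 2 - (k % 4)  # +1 if k = 1 (mod 4), -1 if k = 3 (mod 4)
            k -= d           # makes k divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits


print(naf(7))  # → [-1, 0, 0, 1], i.e. 7 = 8 - 1 (two nonzero digits instead of three)

# Compare the average number of nonzero digits for random 256-bit scalars.
rng = random.Random(1)
scalars = [rng.getrandbits(255) | (1 << 255) for _ in range(500)]
avg_bin = sum(bin(k).count("1") for k in scalars) / len(scalars)
avg_naf = sum(sum(1 for d in naf(k) if d) for k in scalars) / len(scalars)
print(f"binary: {avg_bin:.1f} nonzero digits, NAF: {avg_naf:.1f} nonzero digits")
```

The NAF average should come out at roughly one third of the bit-length, compared with one half for plain binary, which is where the "~33% fewer additions" figure comes from.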

In [None]:
def montgomery_ladder(P, k):
    """
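    Compute kP using the Montgomery ladder.
    Uniform operation pattern: one addition and one doubling per bit, whichever
    branch is taken. This Sage version still branches on the bit, so it
    illustrates the idea rather than being truly constant-time.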
    """
    E = P.curve()
    R0 = E(0)  # will hold floor(k/2^i) * P at each step
    R1 = P      # will hold (floor(k/2^i) + 1) * P
    
    # Process bits from MSB to LSB
    for i in range(k.bit_length() - 1, -1, -1):
        bit = (k >> i) & 1
        if bit == 0:
            R1 = R0 + R1
            R0 = R0 + R0  # double
        else:
            R0 = R0 + R1
            R1 = R1 + R1  # double
    
    return R0

# Test Montgomery ladder
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)

for k in [1, 2, 7, 43, 73, 97]:
    result = montgomery_ladder(P, k)
    expected = k * P
    status = "✓" if result == expected else "✗"
    print(f"{k}·P: Montgomery = ({int(result[0])}, {int(result[1])}), "
          f"SageMath = ({int(expected[0])}, {int(expected[1])})  {status}")

> **Checkpoint 3.** Why is constant-time execution important? Plain double-and-add performs an extra addition for every 1-bit, so its running time and power profile depend on the bits of $k$. An attacker who can measure these can deduce the secret scalar bit by bit. This is a **timing side-channel attack**. The Montgomery ladder counters this by performing the same sequence of operations (one addition and one doubling) for every bit, regardless of its value.
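
The leak described in Checkpoint 3 can be simulated with a toy model. The sketch below assumes an idealized attacker who observes the exact sequence of point operations (`"A"` for addition, `"D"` for doubling). Real attacks work from noisy timing or power traces, and the function names here are hypothetical.

```python
def op_trace(k):
    """Operation sequence of right-to-left double-and-add for scalar k."""
    trace = []
    while k > 0:
        if k & 1:
            trace.append("A")  # an addition happens only for 1-bits
        trace.append("D")      # a doubling happens for every bit
        k >>= 1
    return "".join(trace)


def recover_scalar(trace):
    """Rebuild k from the observed operation sequence."""
    k, position = 0, 0
    for op in trace:
        if op == "A":
            k |= 1 << position  # an addition reveals a 1-bit at this position
        else:
            position += 1       # each doubling moves on to the next bit
    return k


def ladder_trace(k):
    """Operation sequence of the Montgomery ladder: the same for every bit."""
    return "AD" * k.bit_length()


secret = 43
print(op_trace(secret))                  # → ADADDADDAD
print(recover_scalar(op_trace(secret)))  # → 43
print(ladder_trace(43) == ladder_trace(63))  # → True
```

In this model the double-and-add trace determines $k$ completely, while the ladder trace reveals only the bit-length of $k$.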

> **Crypto foreshadowing.** In the next notebook, we will use scalar multiplication as the core operation in ECDH (key exchange) and ECDSA (digital signatures). The security of both protocols rests on the ECDLP: computing $kP$ is easy, but recovering $k$ from $P$ and $kP$ is believed infeasible for properly chosen curves.

## 6. Exercises

### Exercise 1 (Worked): Tracing Double-and-Add

**Problem.** On $E: y^2 = x^3 + x + 1$ over $\mathbb{F}_{23}$, compute $11P$ where $P = (0, 1)$ using double-and-add. Show each step.

**Solution.** $11 = 1011_2$, so we process bits right-to-left:

| Step | Bit | Action | $R$ | $Q$ |
|------|-----|--------|-----|-----|
| 0 | 1 | $R = R + Q$ | $P$ | $P$ |
| 1 | 1 | $R = R + Q$ | $P + 2P = 3P$ | $2P$ |
| 2 | 0 | (skip) | $3P$ | $4P$ |
| 3 | 1 | $R = R + Q$ | $3P + 8P = 11P$ | $8P$ |

Result: $R = 11P$. Let us verify:

In [None]:
# Exercise 1: verification
E = EllipticCurve(GF(23), [1, 1])
P = E(0, 1)

# Step by step
print(f"P = {P}")
print(f"2P = {2*P}")
print(f"3P = P + 2P = {3*P}")
print(f"4P = {4*P}")
print(f"8P = {8*P}")
print(f"11P = 3P + 8P = {11*P}")
print()

# Using our double-and-add
result, adds, dbls = double_and_add(P, 11)
print(f"double_and_add(P, 11) = {result}")
print(f"Additions: {adds}, Doublings: {dbls}")
print(f"Match? {result == 11 * P}")

### Exercise 2 (Guided): Operation Count

**Problem.** For a 256-bit scalar $k$:
1. How many doublings does double-and-add perform?
2. On average, how many additions? (Hint: each bit is 1 with probability 1/2.)
3. Compare the total operation count with naive scalar multiplication.

*Fill in the TODO cells below.*

In [None]:
# Exercise 2: fill in the TODOs

bits = 256

# TODO 1: doublings = ?
# doublings = ???

# TODO 2: expected additions = ?
# avg_additions = ???

# TODO 3: Compare with naive
# naive_ops = 2^bits - 1
# daa_ops = doublings + avg_additions
# speedup = naive_ops / daa_ops
# print(f"Naive: {naive_ops:.2e} operations")
# print(f"D&A:   {daa_ops} operations")
# print(f"Speedup: {speedup:.2e}x")

### Exercise 3 (Independent): ECDLP Brute Force

**Problem.**
1. On $E: y^2 = x^3 + x + 1$ over $\mathbb{F}_{101}$, pick a generator $G$ and compute $Q = 73G$ using SageMath.
2. Write a brute-force function `ecdlp_brute(G, Q)` that finds $k$ by computing $G, 2G, 3G, \ldots$ and comparing with $Q$.
3. Time your brute-force ECDLP solver for curves over $\mathbb{F}_p$ with $p$ having 10, 15, and 20 bits. Plot the time vs $p$ (or vs $|E|$).
4. Based on the trend, estimate how long brute force would take for a 256-bit curve.

In [None]:
# Exercise 3: write your solution here


## Summary

| Concept | Key Fact |
|---------|----------|
| **Naive scalar multiplication** | $kP$ by repeated addition: $O(k)$, infeasible for large $k$ |
| **Double-and-add** | Process bits of $k$: $O(\log k)$ doublings + additions |
| **Operation count** | $\approx \log_2 k$ doublings + $\frac{1}{2}\log_2 k$ additions on average |
| **Analogy** | Double-and-add is to EC what square-and-multiply is to $(\mathbb{Z}/p\mathbb{Z})^*$ |
| **Montgomery ladder** | Constant-time variant; resists timing side-channel attacks |
| **One-way function** | Computing $kP$ is fast; recovering $k$ from $P, kP$ is the ECDLP |

Scalar multiplication is the workhorse operation of elliptic curve cryptography. Every key generation, key exchange, and signature computation reduces to one or a few scalar multiplications. In the final notebook, we put it all together with **ECDH** and **ECDSA**.

---

**Next:** [06f: ECDH and ECDSA](06f-ecdh-and-ecdsa.ipynb)