# Set 5: Diffie-Hellman and friends

This is the first set of **number-theoretic cryptography** challenges, and also our coverage of message authentication.

This set is **significantly harder** than the last set.  The concepts are new, the attacks bear no resemblance to those of the previous sets, and... math.

On the other hand, **our favorite cryptanalytic attack ever** is in this set (you'll see it soon).  We're happy with this set.  Don't wimp out here.  You're almost done!

- [Preliminaries](#Preliminaries)
- [Challenge 33: Implement Diffie-Hellman](#Challenge-33:-Implement-Diffie-Hellman)
- [Challenge 34: Implement a MITM key-fixing attack on Diffie-Hellman with parameter injection](#Challenge-34:-Implement-a-MITM-key-fixing-attack-on-Diffie-Hellman-with-parameter-injection)
- [Challenge 35: Implement DH with negotiated groups, and break with malicious "g" parameters](#Challenge-35:-Implement-DH-with-negotiated-groups,-and-break-with-malicious-"g"-parameters)

## Preliminaries

In [1]:
from random import randint, randbytes

# From pyca/cryptography
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.primitives.hashes import Hash, SHA1

def pad_pkcs7(text):
    padder = padding.PKCS7(128).padder()
    return padder.update(text) + padder.finalize()

def unpad_pkcs7(text):
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(text) + unpadder.finalize()

def aes_128_cbc_encrypt(ptext, key, iv):
    encryptor = Cipher(algorithms.AES128(key), modes.CBC(iv)).encryptor()
    return encryptor.update(pad_pkcs7(ptext)) + encryptor.finalize()

def aes_128_cbc_decrypt(ctext, key, iv):
    decryptor = Cipher(algorithms.AES128(key), modes.CBC(iv)).decryptor()
    return unpad_pkcs7(decryptor.update(ctext) + decryptor.finalize())

def sha1(message):
    h = Hash(SHA1())
    h.update(message)
    return h.finalize()

def modexp(a, e, n):
    # Compute (a**e)%n where e may be very large
    r = 1
    p = a
    while e > 0:
        if e%2 == 1:
            r = (r*p)%n
        p = (p*p)%n
        e >>= 1
    return r
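
As a quick sanity check, the hand-rolled `modexp` should agree with Python's built-in three-argument `pow`, which also does modular exponentiation. The sketch below repeats the routine so that it runs on its own; the input ranges are arbitrary choices.

```python
from random import randint

def modexp(a, e, n):
    # Square-and-multiply, same routine as in the cell above
    r = 1
    p = a
    while e > 0:
        if e % 2 == 1:
            r = (r * p) % n
        p = (p * p) % n
        e >>= 1
    return r

# Compare against the built-in on random inputs (modulus of at least 2)
for _ in range(100):
    a = randint(0, 2**64)
    e = randint(0, 2**64)
    n = randint(2, 2**64)
    assert modexp(a, e, n) == pow(a, e, n)
```

The modulus is kept at 2 or more because the routine returns 1 for `e = 0` without reducing it, so it would not give 0 when `n = 1`. That case never arises with the primes used in this set.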

## Challenge 33: Implement Diffie-Hellman

For one of the most important algorithms in cryptography this exercise couldn't be a whole lot easier.

Set a variable "p" to 37 and "g" to 5.  This algorithm is so easy I'm not even going to explain it.  Just do what I do.

Generate "a", a random number mod 37.  Now generate "A", which is "g" raised to the "a" power mod 37 --- A = (g**a) % p.

Do the same for "b" and "B".

"A" and "B" are public keys.  Generate a session key with them; set "s" to "B" raised to the "a" power mod 37 --- s = (B**a) % p.

Do the same with A**b, check that you come up with the same "s".

To turn "s" into a key, you can just hash it to create 128 bits of key material (or SHA256 it to create a key for encrypting and a key for a MAC).

Ok, that was fun, now repeat the exercise with bignums like in the real world. Here are parameters NIST likes:

```
p:
ffffffffffffffffc90fdaa22168c234c4c6628b80dc1cd129024
e088a67cc74020bbea63b139b22514a08798e3404ddef9519b3cd
3a431b302b0a6df25f14374fe1356d6d51c245e485b576625e7ec
6f44c42e9a637ed6b0bff5cb6f406b7edee386bfb5a899fa5ae9f
24117c4b1fe649286651ece45b3dc2007cb8a163bf0598da48361
c55d39a69163fa8fd24cf5f83655d23dca3ad961c62f356208552
bb9ed529077096966d670c354e4abc9804f1746c08ca237327fff
fffffffffffff

g: 2
```

This is very easy to do in Python or Ruby or other high-level languages that auto-promote fixnums to bignums, but it isn't "hard" anywhere.

Note that you'll need to write your own modexp (this is blackboard math, don't freak out), because you'll blow out your bignum library raising "a" to the 1024-bit-numberth power.  You can find modexp routines on Rosetta Code for most languages.

---

The 1,536-bit prime `p` and generator value `g = 2` are among the various values recommended in [RFC 3526](https://datatracker.ietf.org/doc/html/rfc3526).  There does not seem to be a required size for the private key, but 256 bits appears as a recommendation in a few places.  The session keys computed by A and B are equal by the laws of exponents: $B^a = (g^b)^a = g^{ab} = (g^a)^b = A^b \pmod p$.

In [2]:
P = int(
    "ffffffffffffffffc90fdaa22168c234c4c6628b80dc1cd129024"
    "e088a67cc74020bbea63b139b22514a08798e3404ddef9519b3cd"
    "3a431b302b0a6df25f14374fe1356d6d51c245e485b576625e7ec"
    "6f44c42e9a637ed6b0bff5cb6f406b7edee386bfb5a899fa5ae9f"
    "24117c4b1fe649286651ece45b3dc2007cb8a163bf0598da48361"
    "c55d39a69163fa8fd24cf5f83655d23dca3ad961c62f356208552"
    "bb9ed529077096966d670c354e4abc9804f1746c08ca237327fff"
    "fffffffffffff",
    16
)

G = 2

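def generate_dh_keypair(prime, generator):
    # Return private key, public key
    # Note: the random module is not a cryptographically secure source.
    # That is fine for this exercise, but real keys should come from a
    # CSPRNG such as the secrets module.
    k = randint(2**255, 2**256-1)
    K = modexp(generator, k, prime)
    return k, K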

def package_session_key_value(value):
    return sha1(value.to_bytes(192, "big"))[:16]

def session_key(private_key, public_key, prime):
    # My private key, their public key
    return package_session_key_value(modexp(public_key, private_key, prime))

a, A = generate_dh_keypair(P, G)
b, B = generate_dh_keypair(P, G)

session_key(a, B, P) == session_key(b, A, P)

True
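
The challenge's warm-up with the toy parameters `p = 37` and `g = 5` is not shown above; a minimal sketch might look like the following. It uses the built-in `pow` in place of `modexp` so that it runs on its own, and the private-key range `1..p-2` is an assumption rather than something the challenge specifies.

```python
from random import randint

p, g = 37, 5

# Each side picks a private exponent and publishes g**k % p
a = randint(1, p - 2)
A = pow(g, a, p)
b = randint(1, p - 2)
B = pow(g, b, p)

# Each side combines its own private key with the other's public key
s_alice = pow(B, a, p)
s_bob = pow(A, b, p)

assert s_alice == s_bob
```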

## Challenge 34: Implement a MITM key-fixing attack on Diffie-Hellman with parameter injection

Use the code you just worked out to build a protocol and an "echo" bot.  You don't actually have to do the network part of this if you don't want; just simulate that.  The protocol is:

**A->B**\
Send "p", "g", "A"\
**B->A**\
Send "B"\
**A->B**\
Send AES-CBC(SHA1(s)[0:16], iv=random(16), msg) + iv\
**B->A**\
Send AES-CBC(SHA1(s)[0:16], iv=random(16), A's msg) + iv

(In other words, derive an AES key from DH with SHA1, use it in both directions, and do CBC with random IVs appended or prepended to the message.)

Now implement the following MITM attack:

**A->M**\
Send "p", "g", "A"\
**M->B**\
Send "p", "g", "p"\
**B->M**\
Send "B"\
**M->A**\
Send "p"\
**A->M**\
Send AES-CBC(SHA1(s)\[0:16], iv=random(16), msg) + iv\
**M->B**\
Relay that to B\
**B->M**\
Send AES-CBC(SHA1(s)\[0:16], iv=random(16), A's msg) + iv\
**M->A**\
Relay that to A

M should be able to decrypt the messages.  "A" and "B" in the protocol --- the public keys, over the wire --- have been swapped out with "p".  Do the DH math on this quickly to see what that does to the predictability of the key.

Decrypt the messages from M's vantage point as they go by.

Note that you don't actually have to inject bogus parameters to make this attack work; you could just generate Ma, MA, Mb, and MB as valid DH parameters to do a generic MITM attack.  But do the parameter injection attack; it's going to come up again.
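
For comparison, the generic attack mentioned in the note above can be sketched with plain arithmetic: M runs one ordinary DH exchange with each side and ends up holding two session secrets. The names `Ma`, `MA`, `Mb`, `MB` follow the challenge text, but which pair is shown to which party is an assumption, as are the small stand-in prime and the use of the built-in `pow`.

```python
from random import randint

p = 2**127 - 1  # small stand-in prime (assumption), not the RFC 3526 group
g = 2

def keypair():
    # Private key, public key (random is fine for a sketch, not for real keys)
    k = randint(2, p - 2)
    return k, pow(g, k, p)

a, A = keypair()      # Alice
b, B = keypair()      # Bob
Ma, MA = keypair()    # M's pair, shown to Bob in place of Alice's
Mb, MB = keypair()    # M's pair, shown to Alice in place of Bob's

# Alice receives MB believing it is B; Bob receives MA believing it is A
s_alice = pow(MB, a, p)
s_bob = pow(MA, b, p)

# M derives both secrets from the genuine public keys it intercepted
s_m_alice = pow(A, Mb, p)
s_m_bob = pow(B, Ma, p)

assert s_alice == s_m_alice
assert s_bob == s_m_bob
```

Unlike the parameter injection attack, the two secrets here generally differ, so M has to decrypt each message with one key and re-encrypt it with the other before relaying it.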

In [3]:
class Person:

    def __init__(self, name):
        self.name = name
        self.name_bytes = bytes(name, encoding="ASCII")
        self.is_initiator = False

    def start(self, prime, generator, other_person):
        self.is_initiator = True
        self.prime = prime
        self.generator = generator
        self.private_key, self.public_key = (
            generate_dh_keypair(prime, generator)
        )
        self.send_open_channel_request(other_person)

    def send_open_channel_request(self, recipient):
        recipient.recv_open_channel_request(
            self,
            self.prime,
            self.generator,
            self.public_key
        )

    def recv_open_channel_request(
        self, sender, prime, generator, sender_public_key
    ):
        self.private_key, self.public_key = (
            generate_dh_keypair(prime, generator)
        )
        self.session_key = (
            session_key(self.private_key, sender_public_key, prime)
        )
        self.send_open_channel_ack(sender)

    def send_open_channel_ack(self, recipient):
        recipient.recv_open_channel_ack(self, self.public_key)

    def recv_open_channel_ack(self, sender, sender_public_key):
        self.session_key = (
            session_key(self.private_key, sender_public_key, self.prime)
        )
        self.send_message(sender, self.name_bytes + b" says hi")

    def send_message(self, recipient, message):
        iv = randbytes(16)
        ctext = aes_128_cbc_encrypt(message, self.session_key, iv)
        recipient.recv_message(self, ctext, iv)

    def recv_message(self, sender, ctext, iv):
        message = aes_128_cbc_decrypt(ctext, self.session_key, iv)
        print(f"{self.name} received: {message.decode('ASCII')}")
        if self.is_initiator:
            return
        iv = randbytes(16)
        response = message + b", " + self.name_bytes + b" says hi back"
        ctext = aes_128_cbc_encrypt(response, self.session_key, iv)
        sender.recv_message(self, ctext, iv)

A = Person("Alice")
B = Person("Bob")
A.start(P, G, other_person=B)

Bob received: Alice says hi
Alice received: Alice says hi, Bob says hi back


If the public key is set to $p$ (which is a little odd, as it's supposed to be a value modulo $p$, but let's roll with the idea), then the session key is predictable: since $p \equiv 0 \pmod p$, we get $s = p^a \bmod p = 0$ for any private key $a \geq 1$.
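
This is easy to confirm numerically. The sketch below uses the built-in `pow` and `hashlib` rather than the helpers defined earlier, plus a small stand-in prime, so that it runs on its own. The 192-byte encoding mirrors what `package_session_key_value` does.

```python
import hashlib
from random import randint

p = 2**127 - 1  # small stand-in prime (assumption)

# Whatever the victim's private key is, a "public key" of p yields s = 0
for _ in range(20):
    a = randint(1, p - 1)
    assert pow(p, a, p) == 0

# So M can derive the AES key without knowing any private key:
# the first 16 bytes of SHA-1 over 192 zero bytes
key = hashlib.sha1((0).to_bytes(192, "big")).digest()[:16]
```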

In [4]:
class MITM(Person):

    def __init__(self, downstream_person):
        super().__init__("MITM")
        self.downstream_person = downstream_person

    def recv_open_channel_request(
        self, sender, prime, generator, sender_public_key
    ):
        self.upstream_person = sender
        self.prime = prime
        self.generator = generator
        self.private_key = 0  # arbitrary
        self.public_key = prime  # the weird part
        self.session_key = package_session_key_value(0)
        self.send_open_channel_request(self.downstream_person)
        self.send_open_channel_ack(self.upstream_person)

    def recv_open_channel_ack(self, sender, sender_public_key):
        pass

    def recv_message(self, sender, ctext, iv):
        message = aes_128_cbc_decrypt(ctext, self.session_key, iv)
        print(f"{self.name} intercepted: {message.decode('ASCII')}")
        if sender == self.upstream_person:
            other_person = self.downstream_person
        else:
            other_person = self.upstream_person
        self.send_message(other_person, message)

A = Person("Alice")
B = Person("Bob")
A.start(P, G, other_person=MITM(downstream_person=B))

MITM intercepted: Alice says hi
Bob received: Alice says hi
MITM intercepted: Alice says hi, Bob says hi back
Alice received: Alice says hi, Bob says hi back


## Challenge 35: Implement DH with negotiated groups, and break with malicious "g" parameters

**A->B**\
Send "p", "g"\
**B->A**\
Send ACK\
**A->B**\
Send "A"\
**B->A**\
Send "B"\
**A->B**\
Send AES-CBC(SHA1(s)[0:16], iv=random(16), msg) + iv\
**B->A**\
Send AES-CBC(SHA1(s)[0:16], iv=random(16), A's msg) + iv

Do the MITM attack again, but play with "g". What happens with:

```
g = 1
g = p
g = p - 1
```

Write attacks for each.

> **When does this ever happen?**
>
> Honestly, not that often in real-world systems.  If you can mess with "g", chances are you can mess with something worse.  Most systems pre-agree on a static DH group.  But the same construction exists in Elliptic Curve Diffie-Hellman, and this becomes more relevant there.

---

There's not much to do here.  Setting $g$ to any of these values means that the session key is predictable.  As the previous challenge admitted, a man-in-the-middle doesn't need to inject parameter values because it can just create session keys as normal.  But for the record:

- If $g = 1$, then $A = B = s = 1$.
- If $g = p$, then $A = B = s = 0$.
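- If $g = p-1$, then $A = (-1)^a \bmod p$ and $B = (-1)^b \bmod p$, so each is $1$ or $p-1$ depending on the parity of the corresponding private key, and $s = (-1)^{ab} \bmod p$.  That is, $s = p-1$ only when both private keys are odd (exactly when $A = B = p-1$), and $s = 1$ otherwise.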
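
The three cases can be checked numerically. The sketch below assumes both parties end up using the tampered `g`, and uses a small stand-in prime and the built-in `pow` so that it runs on its own. The helper names `exchange` and `predict_secret` are made up for illustration.

```python
from random import randint

p = 2**127 - 1  # small stand-in prime (assumption)

def exchange(g):
    # An honest DH exchange carried out with the (malicious) generator g
    a = randint(1, p - 2)
    b = randint(1, p - 2)
    A = pow(g, a, p)
    B = pow(g, b, p)
    s = pow(B, a, p)
    assert s == pow(A, b, p)
    return A, B, s

def predict_secret(g, A, B):
    # What M can infer from the public values alone
    if g == 1:
        return 1
    if g == p:
        return 0
    if g == p - 1:
        # s = (-1)^(ab): p-1 only when both private keys are odd,
        # which is exactly when A == B == p-1
        return p - 1 if A == p - 1 and B == p - 1 else 1
    raise ValueError("not one of the malicious generators")

for g in (1, p, p - 1):
    for _ in range(20):
        A, B, s = exchange(g)
        assert predict_secret(g, A, B) == s
```

In each case M recovers `s` from values that cross the wire, after which the AES key follows from the same SHA-1 derivation as before.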