It has been a while since I wrote an introduction to cryptography blogpost! My intention is bridging a big gap - namely the Diffie-Hellman Key Exchange algorithm.
While this is going to be a short blogpost, I believe it's going to be worth your while.
Historically, cryptography has spent most of its time around Symmetric cryptography.
I have mentioned what Symmetric cryptography is in a past blogpost but in essence it means that encryption and decryption use the same key.
There are many examples of Symmetric ciphers, including really old ones like Caesar cipher, which developed with time to more sophisticated ones such as Vigenere cipher.
Even today modern cryptography relies on Symmetric ciphers, with AES being a prime example of such a cipher.
In our current modern world, Symmetric ciphers are very reliable in terms of security - with a very short key length (generated with sufficient randomness!) we can rely on an attacker not being able to decrypt or break the cipher (in fact, even with Quantum computers you won't be able to do much - more on this here if you are curious).
So, the problem is not the Symmetric ciphers we use, but how to exchange the Symmetric key - it's the chicken and the egg problem in some sense - to exchange the key with you need a cipher to begin with!
Up till 1976 (or before, if you count military intelligence) there was no known way to exchange keys reliably and remotely (in an untrusted medium, that is).
If you're thinking about my previously published RSA blogpost then know that RSA does solve the problem, but it was discovered much later.
So, the motivation for us is to exchange a secret key created with sufficient randomness, without anyone in the untrusted medium being able to decipher it.
I have discussed both modular arithmetic and basic Group theory in a previous blogpost - I recommend reading those before.
However, here's a refresher:
- We work within a frame of
modular arithmetic, which means we only look at the remainder of our common operations of addition, substraction and multiplication. - Division can be achieved if and only if our modulus is a
prime number, with something called the Extended Euclidean Algorithm. - The set of numbers
{ 1, 2, ... p-1 }form amultiplicative Group, which means addition, substraction, multiplication and division work "nicely". - Our Group also has something called a
Generator, which is a member of that group that you can multiply by itself over and over again that iterates the entire Group. - In practice, we will also look at Generators that do not generate the entire Group, but only a Subgroup instead. More on that later.
To give an example, let's take p=7 (i.e. (mod 7)) - the Group members are { 1, 2, 3, 4, 5, 6 }.
The number 3 is a generator - let's multiply it by itself over and over again:
3^1 = 3 = 3 (mod 7)
3^2 = 3*3 = 9 = 2 (mod 7)
3^3 = 3*3*3 = 27 = 6 (mod 7)
3^4 = 3*3*3*3 = 81 = 4 (mod 7)
3^5 = 3*3*3*3*3 = 243 = 5 (mod 7)
3^6 = 3*3*3*3*3*3 = 729 = 1 (mod 7)
(By the way, you do not need to calculate 3^6 directly - you can just look at 3^5 = 5 (mod 7) and multiply by 3: 3^6 = 3*5 (mod 7) = 15 (mod 7) = 1 (mod 7))
Note how the generator created all of the elements in our Group, but with a "weird" order - this happens due to modular arithmetic. In the next section we'll take advantage of that.
If we take a different value (let's say g=4) then we do not get all elements, but we still get a Subgroup:
4^1 = 4 (mod 7)
4^2 = 16 = 2 (mod 7)
4^3 = 64 = 1 (mod 7)
I hope all of these definitions and arithmetic operations are clear - if not, please read my previous blogposts again.
Let us assume we have a well-known large prime number p and a generator for the group which we will mark as g.
In our illustration we'll say Alice and Bob want to share a secret, and do that over an untrusted medium (let's say, the internet).
Let's just jump right into it:
AliceandBobagree on the numberspandg, in whichpis a very large prime number andgis a generator.Alicecreates a random numberaand keeps it to herself.AlicesendsBobthe numberg^a (mod p).Bobreceivesg^a (mod p)and does something similar - he generates a random secret numberband keeps it to himself, but sendsAliceg^b (mod p).Alicetakes the input fromBoband raises it to thea-th power, effectively calculating(g^b)^a (mod p) = g^(ba) (mod p) = g^(ab) (mod p).Bobdoes the same thing but raisesAlice's input to theb-th power, calculating(g^a)^b (mod p) = g^(ab) (mod p).- Now
AliceandBobhave a joint keyg^(ab) (mod p), and they can use it for further communications (e.g. as a secret key for anAESSymmetric cipher).
Here's a nice illustration too:
Alice Bob
----- ---
Agrees on: p, g
<------------------------->
Sends: g^a (mod p)
(generates: a) -------------------------->
Sends: g^b (mod p)
<-------------------------- (generates: b)
(key = g^(ab) (mod p)) (key = g^(ab) (mod p))
Let us now imagine an evesdropper in the middle - what can they do?
One important aspect here is that g^a (mod p) appears just as a number. Attacker knows g and p, but magically cannot deduce a efficiently.
In real numbers, we use Logarithms to solve the problem of finding the value of a, but those do not have an easy-to-calculate modular counterpart.
What Diffie-Hellman exploits here is called the Discrete Logarithm problem, which is computationally believed to be difficult for a computer to solve, assuming a very large value of p.
The most obvious danger is that g generates a really small number of elements in the Group.
Therefore, there are some interesting notes I'd like to mention on the parameter selection (p and g).
- The prime
pmust be large to make the Discrete logarithm problem hard; it'd usually be 2048 bits or more. I've written a blotpost on how to generate random prime numbers, you should check it out for details on how that's achieved. - Often,
pis a safe prime, which is a prime wherep = 2q + 1andqis also prime. This ensures the multiplicative Group modulophas a large prime-order subgroup, helping avoid small subgroup attacks. - The generator
gshould be a primitive root modulop— meaning that for every numberxin the Group, there exists an integerkwhereg^k = x (mod p). - More practically:
gis often selected to generate a large cyclic subgroup of orderq(the sameqthat was used to definepifpis a safe prime). - In practice,
pandgare not just randomly selected, but only selected from a standard agreed-upon list (e.g. RFC 3526 has some of those).
That's quite important - when I learned about Diffie-Hellman I thought g is a generator of the entire multiplicative Group and that is not the case - it's a generator of a sufficiently large subgroup.
We said that an evesdropper won't be able to derive the joint key (without paying a very high computational cost), but we haven't said anything about active attackers.
Here is an example of a Man-in-The-Middle (MiTM) attack on the protocol employed by Alice and Bob:
Alicesends the correctpandgcombination toBob, butMalis lurking on the internet and intercepts that, taking note of the agreedpandg.Alicegenerates the numberaand sendsg^a (mod p), believing it was sent toBob. At this point,Malgenerates her own numberxand sendsg^x (mod p)toAlice.Alicethinks she is talking withBob, while both her andMalboth created a joint keyg^(ax) (mod p).Malcan even do the same thing withBob- generate a numberyand sendBob(g^y (mod p)), interceptingBob'sg^b (mod p)value and creating a joint key withBob:g^(by) (mod p).- Now
Malcan decrypt packets sent fromAlicetoBob, as well as the other diretion!
If you think about it - that's a problem in every key exchange protocol, and in fact - a fundemental problem - how do we validate the authenticity of the other party?
This problem is (practically) solved with Public Key Infrastructure (PKI) which I might elaborate on some day.
If you've ever heard of Certifiates (used in SSL \ TLS, for instance) then they're very much related to PKI!
While this was a short blogpost, I thought it was necessary - my plan is to blog on Elliptic Curve Cryptography soon, and I wanted everyone to have the right background on Diffie-Hellman so I could show how it's implemented in the context of Elliptic Curves.
Stay tuned!
Jonathan Bar Or