# Modern Cryptography

In this section, we will learn about modern cryptography and how it is used to protect data through the use of encryption.


## Contents

1. [Introduction](#introduction)
   - [Key Generation and Distribution](#keygendist)
   - [Message Encryption and Decryption](#encdec)

2. [Example](#example)

3. [Problems](#problems)

## 1. Introduction <a id='introduction'></a>

Every day we generate, accumulate, and communicate tremendous amounts of data: everything from our personal emails, financial transactions, and medical records, to a corporation’s intellectual property, business strategy, and online sales, to the daily operation and functioning of our government and military organizations. In each of these cases we expect, even require, that our information is stored and communicated securely such that it’s available to ourselves and to our intended recipients, but to nobody else.  
Public key cryptography was developed in the 1970s by several groups. Whitfield Diffie and Martin Hellman published the first public key cryptography scheme, the [Diffie-Hellman key exchange](https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange) protocol, in 1976. Later, in 1978, the [RSA algorithm](https://simple.wikipedia.org/wiki/RSA_algorithm) was published by Ronald Rivest, Adi Shamir, and Leonard Adleman. In fact, researchers at the British Government Communications Headquarters (GCHQ) had developed similar public key cryptography schemes in the early 1970s, but those achievements were only declassified in 1997.  
Public key cryptography is based on asymmetric keys, that is, a public key is used for encryption and a separate private key is used for decryption. The two keys are necessarily related to one another mathematically, and the information security **relies on it being computationally impractical to determine the private key given the public key**. This is generally achieved through the use of one-way mathematical functions. For example, RSA cryptography is based on the idea that it’s mathematically easy for a computer to multiply two large prime numbers, $p$ and $q$, and obtain their product, $N$. But it’s a very hard problem for a computer to do the inverse, that is, factor that number, $N$, into its constituent prime factors, $p$ and $q$. So with one-way mathematical functions, ease of use is based on the mathematical *simplicity in the forward direction* and *security is based on mathematical complexity in the inverse direction*.  
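
The asymmetry between multiplying and factoring can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: the primes `1009` and `1013` and the helper name `trial_division_factor` are chosen here for the example, and trial division is just the simplest factoring method, not the best one known.

```python
import math


def trial_division_factor(n):
    """Return a nontrivial factor pair (p, q) of n, or None if n is prime."""
    # Try every candidate divisor up to sqrt(n); the work grows with sqrt(n).
    for candidate in range(2, math.isqrt(n) + 1):
        if n % candidate == 0:
            return candidate, n // candidate
    return None


# Two small primes, assumed here purely for illustration.
p, q = 1009, 1013

# Forward direction: one multiplication.
N = p * q
print(N)

# Inverse direction: up to sqrt(N) trial divisions.
print(trial_division_factor(N))
```

For primes with hundreds of digits the multiplication stays cheap, while the loop above would need an astronomically large number of iterations.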

Information is protected through the use of encryption. [Encryption](https://en.wikipedia.org/wiki/Encryption) takes raw data, for example, a message, often referred to as *plaintext*, and translates it or encodes it into an unrecognizable message called a *cyphertext*. It does this by using an algorithm called a *cypher*. A cypher takes as inputs the plaintext being encoded and a key, typically a numerical bit string in today’s cryptography schemes.  

**Cypher**  
Transforms plaintext, together with a key, into an unreadable cyphertext through a prescribed algorithmic protocol based on mathematical operations.  

Similarly, a decryption key is used with a corresponding decryption protocol to reconstruct the plaintext message. The most common cryptography schemes historically are called [symmetric key cryptography](https://en.wikipedia.org/wiki/Symmetric-key_algorithm), in which the encryption and decryption keys are the same or can be trivially related to one another. Examples include the older [data encryption standard](https://en.wikipedia.org/wiki/Data_Encryption_Standard), **DES**, and the more modern [advanced encryption standard](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard), **AES**. The [one-time pad](https://en.wikipedia.org/wiki/One-time_pad) is another example of a symmetric key cryptography scheme. Since symmetric keys are used to both encrypt and decrypt a message, they must remain private. This isn’t a serious issue on, say, a personal computer where we want to encrypt the hard drive and so the keys are stored locally. However, it raises a challenge when communicating information. When two parties want to communicate securely, they either need to meet in advance to secretly share the symmetric keys (not a very practical option) or they need to establish private channels to securely distribute the key.  
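
As a minimal sketch of a symmetric scheme, the one-time pad mentioned above can be written as a byte-wise XOR in which the same key both encrypts and decrypts. The helper name `xor_bytes` and the sample message are assumptions made for this illustration.

```python
import secrets


def xor_bytes(data, key):
    """XOR each byte of data with the matching byte of key."""
    if len(data) != len(key):
        raise ValueError("a one-time pad key must be as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"attack at dawn"

# The shared secret: random, as long as the message, and never reused.
key = secrets.token_bytes(len(message))

ciphertext = xor_bytes(message, key)    # encryption
recovered = xor_bytes(ciphertext, key)  # decryption uses the very same key

print(ciphertext.hex())
print(recovered)
```

Both parties must already hold `key` before they can communicate, which is exactly the distribution problem described above.
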
In the future, efficient quantum key distribution protocols may provide a quantum enhanced level of security when transmitting a secret symmetric key. But today, secure communication links are generally established through the use of [public key cryptography](https://en.wikipedia.org/wiki/Public-key_cryptography) based on **asymmetric** keys.  

To get an idea for how this works, consider the following scenario.  


## Key Generation and Distribution <a id='keygendist'></a>
**Bob** wants to communicate a message securely to **Alice** without that message being read by an eavesdropper, Eve. To do this, Bob *requests* a public encryption key from Alice.  
The encryption and decryption keys are generated using a one-way function related to modular exponentiation. To begin, two distinct prime numbers $p$ and $q$ are chosen at random. Key generation then centers on the function

$$f(x) = a^x \mod N,$$
where $a$ is an integer that is coprime with $N$ (so $a$ is a multiple of neither $p$ nor $q$) and the modulus is $N = pq$. The term “modular exponentiation” refers to there being an exponential $a^x$ that is calculated modulo $N$.  
From the [number theory of modular exponentiation](https://en.wikipedia.org/wiki/Modular_exponentiation), it is known that the function $f(x)$ is periodic with period $r$, and finding the value of $r$ is called **order finding**. Order finding is a manageable task if one knows $p$ and $q$, but not if one only knows their product $N$. This is the one-way function that underlies the security of RSA.  
To see this, consider the following. When $x = 0$ in the above function, $f(0) = a^0 \mod N = 1$. If $f(x)$ is periodic with period $r$, then it follows that $f(r)$, $f(2r)$, $f(3r)$, ... must also equal $1$. A valid period $r$ can be derived from [Fermat’s little theorem](https://en.wikipedia.org/wiki/Fermat%27s_little_theorem) in the generalized form due to Euler, which uses [Euler’s totient function](https://en.wikipedia.org/wiki/Euler%27s_totient_function) of $N$:

$$r = (p-1)(q-1).$$

In essence, Euler’s totient function gives the number of positive integers smaller than the product $pq = N$ which have no prime factors in common with $N$ (that is, integers coprime with $p$, $q$, and $N$), and Fermat’s little theorem shows that this is a period $r$. Note that $r$ may be the fundamental period, or it may be an integer multiple of the fundamental period. The smallest period that works for every valid $a$ is the least common multiple of $(p-1)$ and $(q-1)$. Either the fundamental period or a multiple of it will work.
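
The relationship between the totient, the least common multiple, and the period can be checked numerically for a toy modulus. This is a sketch under assumptions: the primes $p = 11$ and $q = 13$, the base $a = 2$, and the brute-force helper `order` are chosen here for illustration only.

```python
import math

# Toy primes, assumed for illustration.
p, q = 11, 13
N = p * q

totient = (p - 1) * (q - 1)
# Least common multiple of (p - 1) and (q - 1), computed via the gcd.
fundamental = totient // math.gcd(p - 1, q - 1)


def order(a, N):
    """Smallest r > 0 with a**r % N == 1 (brute force, needs gcd(a, N) == 1)."""
    r, value = 1, a % N
    while value != 1:
        value = (value * a) % N
        r += 1
    return r


# All integers below N that share no prime factor with N.
units = [a for a in range(1, N) if math.gcd(a, N) == 1]

print(len(units) == totient)                             # totient counts them
print(all(pow(a, totient, N) == 1 for a in units))       # totient is a period
print(all(pow(a, fundamental, N) == 1 for a in units))   # so is the lcm
print(order(2, N), fundamental, totient)
```

The period of one particular base always divides both `fundamental` and `totient`, which is why either value can be used as $r$.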

With $p$ and $q$ known, the totient can be easily calculated with a classical computer. However, if one is only given $N$ (without knowledge of $p$ and $q$), then one must factor $N$ to find $p$ and $q$ and thereby find the period $r$. **Factoring is believed to be a hard problem for a classical computer: no efficient classical algorithm is known.**

With the period $r$ in hand, one can generate the public key $\{e, N\}$ and private key $\{d, N\}$. One first chooses a public encryption exponent $e$ that satisfies the following conditions:

$$1. ~~~3 ≤ e ≤ (N −1),$$

$$ 2. ~~~ e~~ \text{is coprime with} ~~r$$  

The second condition means that there is no integer (other than $1$) that divides both $e$ and $r$, and so their [greatest common divisor](https://mathworld.wolfram.com/GreatestCommonDivisor.html) $(\gcd)$ is $1$. This is written as $\gcd(e,r) = 1$.
Once a value of $e$ is chosen, then one calculates a private decryption exponent $d$ as the modular multiplicative inverse of $e$, modulo $r$:

$$d = e^{−1} \mod r.$$

Note that the notation for the modular multiplicative inverse above, while often found, can be misleading. $d = e^{−1} \mod r$ is another way of writing $d$ such that $$de \mod r = 1.$$ It does not mean to take the reciprocal of $e$.
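
A short numerical check makes the distinction concrete. The sketch below assumes toy values ($p = 11$, $q = 13$, $e = 7$) and uses Python's built-in three-argument `pow`, which accepts an exponent of `-1` from Python 3.8 onward.

```python
import math

# Toy values, assumed for illustration.
p, q = 11, 13
r = (p - 1) * (q - 1)

e = 7
# e must be coprime with r, otherwise no inverse exists.
assert math.gcd(e, r) == 1

# Modular multiplicative inverse of e modulo r (Python 3.8+).
d = pow(e, -1, r)

print(d)            # an integer, not the fraction 1/7
print((d * e) % r)  # the defining property: d * e mod r equals 1
```

If `e` shared a factor with `r`, `pow(e, -1, r)` would raise a `ValueError`, which is why the coprimality condition on $e$ is required.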



## Message Encryption and Decryption <a id='encdec'></a>
Alice then sends her public encryption key to Bob, but retains her private decryption key. Because the key is transmitted on a public channel, Eve can, in principle, obtain the encryption key as well. When Bob receives the key, he encrypts his plaintext message with it, creating cyphertext. Bob then sends the cyphertext back to Alice on a public channel. Although Eve can obtain the cyphertext in principle, she can’t decrypt it since she only has the public encryption key. The security here is based on it being very hard for Eve to figure out the decryption key based on what she has in her possession. Alice, on the other hand, is able to trivially unlock the cyphertext with her decryption key, thereby converting the cyphertext to plaintext and reading the message. This type of public key encryption scheme is used ubiquitously today for secure communication. In many cases, it’s used to establish a private channel that provides [authentication](https://en.wikipedia.org/wiki/Message_authentication_code) and enables symmetric keys to be exchanged securely. This is because the public key approach based on asymmetric keys is generally slower than symmetric schemes. Public key encryption is a foundation for our information security today.  
This is how it is realized step by step:  
Having received the public encryption key $\{e,N\}$, Bob encrypts his message $m$ to create cyphertext $c$ using the public key and modular exponentiation:  

$$
c = m^e \mod N.
$$  
After the message is sent to Alice, she decrypts it with her private key $\{d,N\}$ using the properties of modular exponentiation,  

$$
c^d \mod N = (m^e)^d \mod N = m
$$  
provided the original message $m$ is smaller than the modulus $N$. Eve has access to the public key $\{e,N\}$, but this is of no use to her in trying to decrypt the cyphertext, since $c^e \mod N \neq m.$
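
The encryption and decryption formulas above can be traced with a toy key pair. The values $N = 143$, $e = 7$, $d = 103$ (from $p = 11$, $q = 13$) and the message $m = 42$ are assumptions made for this sketch; real keys are hundreds of digits long.

```python
# Toy key pair, assumed for illustration: N = 11 * 13, e * d = 1 mod 120.
N, e, d = 143, 7, 103

m = 42        # Bob's message as an integer, must be smaller than N
assert m < N

c = pow(m, e, N)          # Bob encrypts with the public key {e, N}
recovered = pow(c, d, N)  # Alice decrypts with the private key {d, N}
eve_guess = pow(c, e, N)  # Eve reuses the public exponent

print(c)
print(recovered == m)  # Alice recovers the message
print(eve_guess == m)  # the public exponent does not undo the encryption here
```

Python's built-in `pow(base, exponent, modulus)` performs the modular exponentiation without ever forming the huge intermediate power.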

In summary, the number theory and properties of modular exponentiation underlie the security of the RSA cryptosystem. As you have seen, the challenge of order finding can be related to the problem of factoring. While it is straightforward to multiply two numbers $p$ and $q$ to give the modulus $N$, the inverse problem is hard on a classical computer. And, if $N$ is chosen sufficiently large, then a brute-force search to find $p$ and $q$ is also inefficient. Thus, the RSA cryptosystem is believed to be secure provided the prime numbers $p$ and $q$, and the private exponent $d$ are not made public. However, the encryption exponent $e$ and modulus $N$ can be made public. This type of cryptosystem, where the encryption key is public and anyone can encrypt a message, while the decryption key is private and only the receiver can decrypt the cyphertext, is the backbone of our modern day communication systems.

## 2. Example <a id='example'></a>

Let’s consider a simple example. We’ll take the number $a=3$ and its exponents, modulo $10$ to calculate $f(x) = 3^x \mod 10$, where $x$ is an integer. 

$$3^1 = 3, \quad 3 \mod 10 = 3$$
$$3^2 = 9, \quad 9 \mod 10 = 9$$
$$3^3 = 27, \quad 27 \mod 10 = 7$$
$$3^4 = 81, \quad 81 \mod 10 = 1$$
$$3^5 = 243, \quad 243 \mod 10 = 3$$
$$3^6 = 729, \quad 729 \mod 10 = 9$$
$$3^7 = 2187, \quad 2187 \mod 10 = 7$$
$$3^8 = 6561, \quad 6561 \mod 10 = 1$$
    
We see that the numbers repeat with periodicity $r=4$. 
Let’s compare this with the totient formula. The modulus is $N = 10$, which has prime factors $p = 2$ and $q = 5$, so $2\times 5 = 10$. Then $p-1 = 1$ and $q-1 = 4$. Thus, $(p-1)(q-1) = 1\times 4 = 4$, which equals the period $r = 4$ found above.  
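
The table and the period for this example can be reproduced in a couple of lines. This is a brute-force sketch; the variable names are chosen here for illustration.

```python
a, N = 3, 10

# f(x) = a^x mod N for x = 1, ..., 8
values = [pow(a, x, N) for x in range(1, 9)]
print(values)  # → [3, 9, 7, 1, 3, 9, 7, 1]

# The period is the smallest positive x with f(x) = 1.
period = next(x for x in range(1, N) if pow(a, x, N) == 1)
print(period)  # → 4
```

The search works here only because $N$ is tiny; for a realistic modulus the range to scan is far too large.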

**Eve could try to find the period, but there is no known efficient algorithm to do so with a classical computer. We will later see that if Eve has a quantum computer, she can find the period, with Shor's algorithm, in polynomial time.**

## 3. Problems <a id='problems'></a>

Alice: Create a function that generates the key, $e$ and $N$. The steps are:

- create a function, `is_prime(number)`, that checks primality of a (positive) number
- choose two numbers, $p$ and $q$, randomly and check their primality with `is_prime(p)` and `is_prime(q)`
- multiply $p$ and $q$ to produce $N=pq$ to be used for modular exponentiation
- produce $r$, a period of the function $f(x) = a^x \mod N$: either $(p-1)(q-1)$ or the least common multiple of $(p-1)$ and $(q-1)$, which is the fundamental period

Import the modules you need in the cell below.

In [None]:
import random
import math

In [1]:
def is_prime(n):
    """
    checks whether n is a prime number
    
    Arguments
        n: a positive integer
    returns True or False
    """
    # YOUR CODE HERE

In [2]:
def lcm(a, b):
    """
    computes the least common multiple (lcm) of a and b
    
    Arguments
        a: an integer
        b: an integer
    returns lcm of a and b
    
    """
    # YOUR CODE HERE

Did you use `math.gcd()` in the implementation of the function `lcm()` above? Now try to implement your own function for GCD calculation using [Euclid's algorithm](https://en.wikipedia.org/wiki/Euclidean_algorithm).

In [None]:
def euclid_gcd(a, b):
    """
    computes the GCD for a and b using Euclid's algorithm
    
    Arguments
        a: an integer
        b: an integer
    returns gcd of a and b
    """
    # YOUR CODE HERE

In [None]:
def modular_exp(base, degree, modulo):
    """
    implements modular exponentiation base^degree mod modulo
    
    Arguments
        base: the number to be exponentiated
        degree: the degree of exponentiation
        modulo: the modulo
    returns the result of the modular exponentiation
    """
    # YOUR CODE HERE
    

    
def inv_modular_exp(a, modulo):
    """
    computes the modular multiplicative inverse of a mod modulo 
    
    Arguments
        a: an integer
        modulo: modulo
    returns inverse of a such that inva * a = 1 mod modulo. Returns -1 if inverse does not exist
    """   
    # YOUR CODE HERE
    # Hint: Use extended Euclidean algorithm

In [1]:
def text_to_int(text):
    """
    converts text to integer
    
    Arguments
        text: a string
    returns integer representation of text
    """
    # YOUR CODE HERE
    


def int_to_text(n):
    """
    converts integer to text
    
    Arguments
        n: an integer
    returns the textual representation of n
    """
    # YOUR CODE HERE

In [2]:
def random_prime(s, e):
    """
    generates a random prime number in the given interval
    
    Arguments
        s: an integer specifying the start of the interval
        e: an integer specifying the end of the interval
    returns the generated prime number
    """
    # YOUR CODE HERE


def generate_key():
    """
    generates public key {e,N} and private key d by following the protocol 
    described in the beginning of the notebook and summarized below
    N = p*q (with p and q being distinct random prime numbers)
    r = lcm(p - 1, q - 1)
    
    e = random.randint(3, N-1), redrawn until gcd(e, r) == 1
    
    create decryption key d = e^(-1) mod r
    d = inv_modular_exp(e, r)
    
    return e, N, d
    """
    # YOUR CODE HERE

Bob: encrypts his message with the public key provided by Alice and sends the cyphertext back to Alice

In [3]:
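def encrypt(text, e, N):
    """
    encrypts text with the public key {e, N}
    
    the integer form of text must be smaller than N
    returns the cyphertext c as an integer
    """
    m = text_to_int(text)
    c = modular_exp(m, e, N)
    
    return c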

Alice: receives the encrypted cyphertext and decodes it

In [4]:
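def decrypt(c, d, N):
    """
    decrypts the cyphertext c with the private key {d, N}
    
    returns the recovered plaintext as a string
    """
    m = modular_exp(c, d, N)
    text = int_to_text(m)
    
    return text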

In [None]:
e, N, d = generate_key()
text = 'hi'
# Bob encrypts with Alice's public key {e, N}; Alice decrypts with her private key d
print("message encoded and sent by Bob: {}".format(text))
c = encrypt(text, e, N)
m = decrypt(c, d, N)
print("message received and decoded by Alice: {}".format(m))