# RSA Cryptography explanation

RSA remains a widely used method of asymmetric cryptography whose security rests on the difficulty of factoring large numbers.  In this notebook, we will provide an example and explain the mathematical fundamentals of the encryption and decryption process.

## Background

Data sent through the internet passes through various nodes (switches, routers and hubs) before reaching its final destination.  It must be assumed that at every node, there exists an opportunity for an eavesdropper to monitor and log all data that goes through.  Thus, data encryption is necessary to prevent the interception of sensitive data in this manner.  In the simplest scheme, a sender and receiver both possess a key that encrypts data before it is sent through the internet and decrypts it after it has reached its destination.  Any data intercepted in between will be meaningless to people without the encryption/decryption keys.

This introduces another logistical issue.  An encryption key sent through the internet (for other users to encrypt data with) can also be intercepted by eavesdroppers, thus defeating the purpose of any further encryption.  To counter this, it is necessary for the encryption key to be *different* from the decryption key.  For example, Person A could broadcast an encryption key on the internet that the public can use to encrypt data to send to him.  But only Person A will have the decryption key to decrypt the encrypted data sent to him by the public.

This is the basis of asymmetric cryptography: the encryption and decryption keys are different.



## Mathematical concept

RSA relies on the fact that three numbers *a*, *b*, and *c* can be chosen so that an encryption and decryption cycle is completed as below:

$$ plaintext^{\small a} \enspace mod \enspace c = ciphertext $$

<p style="text-align: center;">and</p>

$$ ciphertext^{\small b} \enspace mod \enspace c = plaintext $$


A very simple example can be demonstrated with *a* = 13, *b* = 37, *c* = 77, and *plaintext* = 2.

Since 2 is the number we want to keep secret, we first raise *plaintext* (2) to the power *a*; the encrypted number (ciphertext) is the remainder when the result is divided by *c*.

$$
2^{13} \enspace mod \enspace 77
$$

<p style="text-align: center;">or</p>

$$
8192 \enspace mod \enspace 77 = 30
$$

The ciphertext is 30 since that is the remainder when 8192 is divided by 77.

To revert the number 30 back to the plaintext, we raise it to the *b*th power and take the remainder modulo *c*:

$$
30^{37} \enspace mod \enspace 77
$$

<p style="text-align: center;">or</p>

$$
4502839058909973630000000000000000000000000000000000000 \enspace mod \enspace 77 = 2
$$

which results in 2, the same as the original plaintext.
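The cycle above can be reproduced in a few lines of Python.  This is a minimal sketch using the built-in three-argument pow, which computes a power modulo a number; the variable names are chosen here for illustration only.

```python
# Toy values from the example above
a, b, c = 13, 37, 77
plaintext = 2

# Encrypt: plaintext^a mod c
ciphertext = pow(plaintext, a, c)

# Decrypt: ciphertext^b mod c
recovered = pow(ciphertext, b, c)

print(ciphertext, recovered)  # 30 2
```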

Since *a*, *b*, and *c* are all necessary to complete an encryption and decryption cycle, publicly broadcasting *a* and *c* only gives the public the ability to encrypt data.  Only the person who keeps *b* private is able to decrypt data that was encrypted with *a* and *c*.

The above example only demonstrates a simple case.  In practice, *c* and *b* are hundreds of digits long (a 2048-bit modulus has 617 decimal digits), while *a* is often a small fixed value such as 65537.

**Thus *a* and *c* together form the public key, the numbers necessary to encrypt data, and *b* is the private key, the number required to decrypt data.**

While the public and private keys are mathematically linked, determining *b* from *a* and *c* requires knowing the factors of *c*.  If *c* is a product of two very large prime numbers, then determining *b* is extremely difficult, as there is no known efficient way to factor such a product on a classical computer.  A person can create their own public and private key set by generating two large prime numbers and calculating *a*, *b*, and *c* from them; the keys remain safe so long as nobody can factor *c*.
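To make the link between the keys concrete, the sketch below recovers *b* in the toy example once the factors of *c* = 77 are known.  It assumes Python 3.8 or later, where pow accepts an exponent of -1 to compute a modular inverse.  The helper name factor_small is hypothetical and is only practical for tiny numbers, which is exactly why large primes are used in real keys.

```python
def factor_small(c):
    """Find a factor pair of c by trial division (only feasible for small c)."""
    f = 2
    while f * f <= c:
        if c % f == 0:
            return f, c // f
        f += 1
    raise ValueError("c has no nontrivial factors")

a, c = 13, 77
p, q = factor_small(c)          # 7 and 11
totient = (p - 1) * (q - 1)     # 60
b = pow(a, -1, totient)         # modular inverse of a (Python 3.8+)
print(p, q, totient, b)         # 7 11 60 37
```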

## Python Implementation

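First we will create three functions: a modular exponentiation function, the Miller Rabin primality test, and a greatest common divisor (GCD) function.  The GCD function is named Eulers_GCD in the code, although the method it implements is the Euclidean algorithm.  The Modulo Exponent function is necessary for the Miller Rabin primality test.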


In [13]:
import random

def ModuloExponent(base, exponent, modulus):
    if modulus == 1:
        return 0
    result = 1
    base = base % modulus
    while exponent > 0:
        #Check if the rightmost bit is a 1.  If so, the result is multiplied by the current base modulo
        if exponent % 2 == 1:
            result = (result * base) % modulus
        #Shift the bits of the exponent right by one
        exponent //= 2
        #Update the base to the next squared exponent
        base = (base*base) % modulus
    return result

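def miller_rabin(n, k):
    #Returns False only when n is definitely composite; True means n is prime with overwhelming probability after k rounds
    #Numbers below 2 are not prime (without this guard, n = 1 would loop forever below)
    if n < 2:
        return False

    if n == 2 or n == 3:
        return True

    if n % 2 == 0:
        return False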

    r, s = 0, n - 1
    while s % 2 == 0:
        r += 1
        s //= 2
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = ModuloExponent(a, s, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def Eulers_GCD(a, b):
    #Euclidean algorithm: replace (a, b) with (b, a mod b) until the remainder is 0
    while b != 0:
        a, b = b, a % b
    return a

The Miller Rabin primality test returns True if an input number n has an overwhelmingly high probability of being prime, while Eulers_GCD returns the greatest common divisor of two input numbers, which is 1 if they share no common factor.
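As a quick sanity check on the GCD step, the sketch below runs the Euclidean algorithm on two small pairs and compares one result with math.gcd from the standard library.  The function name euclid_gcd is an illustrative stand-in and is not part of the notebook's code.

```python
import math

def euclid_gcd(a, b):
    """Euclidean algorithm: replace (a, b) with (b, a mod b) until b is 0."""
    while b != 0:
        a, b = b, a % b
    return a

# 48 = 2*18 + 12, 18 = 1*12 + 6, 12 = 2*6 + 0, so the GCD is 6
print(euclid_gcd(48, 18), math.gcd(48, 18))  # 6 6

# 35 and 12 share no common factor, so they are coprime
print(euclid_gcd(35, 12))                    # 1
```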

The next step is to find two prime numbers p and q.  In the line below, p is already set to an initial value, but you can set p to any number simply by modifying the line.

In [14]:
#You can modify this to whatever number you like
p = 2344190

In the next block of code, p will be incremented until it passes the Miller Rabin primality test.

In [15]:
while not miller_rabin(p, 40):
    p += 1

Another prime number q is needed.  Similar to p before, q is given an initial value, but it can be modified to any other number.

In [16]:
#You can modify this to whatever number you like
q = 8099343

Like before, q will be incremented until it passes the Miller Rabin primality test.

In [17]:
while not miller_rabin(q, 40):
    q += 1

print(p,q)

2344193 8099347


These are the two prime numbers we obtained.  
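Because Miller Rabin is probabilistic, numbers of this size can also be cross-checked with plain trial division, which is deterministic but far too slow for real key sizes.  The sketch below is one way to do that; the names is_prime_trial and next_prime are hypothetical helpers introduced here.  Starting from the same initial values, it should arrive at the same p and q as printed above.

```python
def is_prime_trial(n):
    """Deterministic primality check by trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True

def next_prime(start):
    """Smallest prime greater than or equal to start."""
    candidate = start
    while not is_prime_trial(candidate):
        candidate += 1
    return candidate

# Same starting points as the cells above
print(next_prime(2344190), next_prime(8099343))
```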

Using p and q, we acquire two other numbers of importance, the **modulus**:

In [18]:
modulus = p*q

and the **totient**

In [19]:
totient = (p-1)*(q-1)

print("The modulus and totient respectively:", modulus,totient)

The modulus and totient respectively: 18986432541971 18986422098432


Next, we need a prime number n that is coprime with the totient, i.e. the greatest common factor between them is 1.

Here we increment n until it passes the Miller Rabin test (so that it is prime) and the Eulers_GCD function returns 1 (so that it shares no common factors with the totient).  n should also be greater than 1 and smaller than the totient (and therefore smaller than the modulus).

In [20]:
#You can modify this number to whatever you like so long as it is < modulus
n = 65537

while not miller_rabin(n, 40) or Eulers_GCD(totient, n) != 1:
    n += 1 

print(n)

65537


Now that we have n and the totient, it is time to calculate the private key.
The private key (pk) is chosen so that the equation below is satisfied:

$$
n \times private \enspace key \enspace mod \enspace totient = 1
$$

In other words, when n * private key is divided by the totient, the remainder must be 1.  To find it, we increment an integer i until the equation below is satisfied

$
(i \times totient + 1) \enspace mod \enspace n = 0
$
 and the resulting quotient will be the private key.  The code below performs this search.

In [21]:
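def Get_Private_Key(totient, n):
    #Try i = 0, 1, 2, ... until i*totient + 1 is an exact multiple of n
    for i in range(0, n):
        if ((i * totient)+1) % n == 0:
            #The quotient pk satisfies n * pk = i*totient + 1, i.e. n * pk mod totient = 1
            return ((i * totient)+1)//n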
        
pkey = Get_Private_Key(totient, n)

print("The Modulus", modulus)
print("The n exponent", n)
print("The private key", pkey)

The Modulus 18986432541971
The n exponent 65537
The private key 6313259843585
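The search above needs up to n iterations, which is fine for n = 65537 but grows with the exponent.  As an alternative sketch, assuming Python 3.8 or later, the built-in pow can compute the modular inverse directly when given an exponent of -1.  The name pkey_alt is introduced here for illustration, and the totient is copied from the output printed earlier.

```python
totient = 18986422098432   # value printed earlier in this notebook
n = 65537                  # public exponent

# Modular inverse of n modulo the totient (requires Python 3.8+)
pkey_alt = pow(n, -1, totient)

# The defining property of the private key: n * pk mod totient = 1
print(pkey_alt, (n * pkey_alt) % totient)
```

The first printed number should match the private key found by Get_Private_Key, and the second should be 1.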


Now we have all the numbers we need.  The modulus and n are publicly broadcast as the public key, while the private key is kept secret.  To encrypt data, the formula is (plain text)^n mod modulus.

To efficiently take the modulus of a large exponential number, we use the following properties of modulo division as demonstrated in this example:

$$
a \enspace mod \enspace b = c
$$
$$
a^2 \enspace mod \enspace b = c^2 \enspace mod \enspace b = d
$$
$$
a^4 \enspace mod \enspace b = d^2 \enspace mod \enspace b = e
$$
$$
a^8 \enspace mod \enspace b = e^2 \enspace mod \enspace b = f
$$

To calculate 
$
a^{13} \enspace mod \enspace b
$
we write the exponent as a sum of powers of two, 13 = 8 + 4 + 1, so that it is equivalent to 
$
((a^8 \enspace mod \enspace b) \times (a^4 \enspace mod \enspace b) \times (a \enspace mod \enspace b)) \enspace mod \enspace b
$

which can be expressed as 
$
(f \times e \times c) \enspace mod \enspace b
$
using the coefficients from the equations above.

The trick to tackling a large exponential modulo function (even when the numbers are hundreds of digits long) is to determine all necessary squared modulo coefficients, then multiply together the coefficients whose exponents sum to the original exponent, and take the modulus of that product.

This is the approach implemented in the ModuloExponent function defined earlier in the first code box.
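The repeated-squaring steps can be traced by hand for a small case.  The sketch below uses the illustrative values a = 2 and b = 77 (the toy example from earlier), reuses the letters c, d, e, f from the equations above, and compares the result with Python's built-in pow.

```python
a, b = 2, 77

c = a % b            # a^1 mod b
d = (c * c) % b      # a^2 mod b
e = (d * d) % b      # a^4 mod b
f = (e * e) % b      # a^8 mod b

# 13 = 8 + 4 + 1, so a^13 = a^8 * a^4 * a^1
result = (f * e * c) % b

print(c, d, e, f, result)       # 2 4 16 25 30
print(result == pow(a, 13, b))  # True
```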


Now we have all the necessary functions and variables defined.  We will attempt to encrypt a number.

In [22]:
#Replace the plaintext variable with any number you like
plaintext = 2

Now to encrypt it using the modulo exponent function.  We will take the plaintext to the nth power mod Modulus

In [23]:
ciphertext = ModuloExponent(plaintext, n, modulus)
print("The plaintext number has been changed to: ",ciphertext)

The plaintext number has been changed to:  11201462320055


The ciphertext variable now holds the encrypted plaintext number.  The ciphertext can be sent across the internet, and only the person with the private key can decrypt it.  To decrypt the ciphertext, we take the ciphertext to the private key's power mod Modulus.

In [24]:
decryptedtext = ModuloExponent(ciphertext, pkey, modulus)
print("The decrypted number is: ", decryptedtext)
print("which should match with the original plaintext", plaintext)

The decrypted number is:  2
which should match with the original plaintext 2
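The same round trip can be repeated for several plaintext values.  The sketch below copies the modulus, exponent and private key from the outputs above and uses the built-in pow in place of ModuloExponent; the sample messages are arbitrary.  It also illustrates one assumption of this scheme: the plaintext must be smaller than the modulus, otherwise it comes back reduced modulo the modulus.

```python
modulus = 18986432541971   # p * q from this notebook
n = 65537                  # public exponent
pkey = 6313259843585       # private key printed above

for message in (2, 42, 123456789, modulus - 1):
    encrypted = pow(message, n, modulus)
    decrypted = pow(encrypted, pkey, modulus)
    print(message, "->", encrypted, "->", decrypted)
    assert decrypted == message

# A plaintext that is not smaller than the modulus cannot be recovered:
# it comes back reduced modulo the modulus.
too_big = modulus + 5
print(pow(pow(too_big, n, modulus), pkey, modulus))  # 5
```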
