# Public Key Cryptography (RSA)

Adapted from _Mathematics: A Discrete Introduction_ by Edward R. Scheinerman

## The Problem

Suppose Wally the Wart (Wally) is at Harvey Mudd College in Claremont, California, and the Three-Eyed Alien (TEA) is on the moon! TEA has discovered the Answer to the Ultimate Question of Life, the Universe, and Everything, and it's not 42!

TEA desperately wants to tell Wally the _real_ answer, but there's a problem: their communications are being monitored by the dastardly picobot.

**How can TEA and Wally communicate without picobot knowing the contents of their messages?**

We don't want picobot to know the letters in a message. We might consider encrypting the letters somehow (with a Caesar cipher, say). However, this would require Wally and TEA to already share a key for the encryption. The problem is that they cannot communicate to agree on a key, because picobot could intercept it!

The key insight to RSA Key Cryptography is that _everyone_ knows _part_ of how it works! However, it is practically impossible for anyone to decrypt a message without knowing _all_ of how it works.

The RSA system (named after its inventors R. Rivest, A. Shamir, and L. Adleman) does this by taking advantage of the difficulty of finding the prime factorization of large numbers!

First, we must be able to convert letters to numbers and vice versa.

### Task #1

Every ASCII (American Standard Code for Information Interchange) character (English letters, digits, punctuation, etc.) has a unique code number. Python's `ord()` returns this number; for characters outside ASCII (accented letters, kanji, etc.) it returns the Unicode code point, which can be much larger.
- For example, the letter 'e' has number 101, the letter 'a' has number 97, and the letter 'r' has number 114 (and yes, capitalization makes a difference!).

Our first step towards a solution is to convert strings to numbers and vice versa. Converting strings to numbers is easy, but converting numbers to strings requires a little more thought. Most important English characters can be represented as 2- or 3-digit numbers. For consistency, we will make every number a 3-digit number by adding leading zeroes.
- For example, the word 'ear' can be represented as the number '101097114' 
- NOT '10197114'

Unfortunately, Python does not allow integers to have leading zeroes... so let's create strings of digits instead!
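
To see why strings are needed, here is a small sketch (the variable names are only illustrative) comparing an integer with a zero-padded string. `str.zfill` and the `03d` format spec are two standard ways to pad.

```python
code = ord('a')                     # the code number for 'a'
as_int = int('097')                 # converting to int silently drops the leading zero
padded_zfill = str(code).zfill(3)   # pad on the left with zeroes up to width 3
padded_format = f"{code:03d}"       # same result using a format spec

# Applying the same padding to every character of a word
word = ''.join(f"{ord(ch):03d}" for ch in 'ear')

print(as_int, padded_zfill, padded_format, word)
```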

Your first task is to write two functions: 
- `letterToNumber(string)` takes a string and converts it to a long string of digits that holds the ASCII representations of the characters in the string. For example, 'ear' should become '101097114'

- `numberToLetter(num)` does the reverse! It takes a long string of digits that holds the ASCII representations of the characters in a string and converts it back to that string. For example, '101097114' should become 'ear'

In [None]:
%pprint

In [None]:
# ord(letter) gives the ASCII number
# chr(number) gives the ASCII letter
def letterToNumber(string):
    """Converts a string into a long number based on ASCII.
    Make sure that each number has three digits! 
    For example, 'a' should be '097', not '97'"""
    
    print(f"Converting '{string}' to number...")
    codedMessage = ''                      # the number we will return
    for s in string:                       # loop through string chrs
        order = f"{ord(s)}"                # get the ASCII number
        while len(order) < 3:               
            order = '0' + order            # add leading 0s if needed
        codedMessage += order              # add number for return
    print(f"Converted to {codedMessage}\n")
    return codedMessage

In [None]:
def numberToLetter(num):
    """Converts a number into a string based on ASCII.
    Make sure that you account for leading zeroes if needed!
    Consider checking the length of the number..."""
    
    print(f"Converting {num} to message...")
    codedMessage = ''             # the message we will return
    for s in range(0,len(num),3):
        character = chr(int(num[s:s+3]))
        codedMessage += character
    print(f"Converted to '{codedMessage}'\n")
    return codedMessage
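
One limitation worth noting: the three-digit scheme only works for characters whose code number is at most 999. The sketch below is a hedged illustration of a round-trip check with that guard added. `to_number` and `to_text` are hypothetical compact stand-ins written for this example, not replacements for the two functions above.

```python
def to_number(text):
    """Encode text as a digit string, three digits per character."""
    for ch in text:
        if ord(ch) > 999:
            # e.g. the euro sign has code 8364, which does not fit in 3 digits
            raise ValueError(f"{ch!r} has code {ord(ch)}, which needs more than 3 digits")
    return ''.join(f"{ord(ch):03d}" for ch in text)

def to_text(digits):
    """Decode a digit string produced by to_number."""
    if len(digits) % 3 != 0:
        raise ValueError("digit string length must be a multiple of 3")
    return ''.join(chr(int(digits[i:i+3])) for i in range(0, len(digits), 3))

message = 'Hello, TEA!'
assert to_text(to_number(message)) == message   # the round trip gives back the input
```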

Once we have the string as a long number, what happens next? 

Well, Wally can create two functions: one to encrypt a large number, E(M), and one to decrypt it, D(N). He will then send the encryption function E(M) to TEA. TEA can encrypt the message M using the function E(M) and send the result N back to Wally. Wally can then use the decryption function D(N) to see the message M!

Here's how it works:  

|     Wally     |     picobot     |     TEA     |
|:-------------:|:---------------:|:-----------:|
| creates functions |             |             |
| sends encryption to TEA --> | read by picobot| receives encryption|
|  |             |  encrypts message  |
| receives message | read by picobot| <-- sends message|
|decrypts message  |             |  |

We see that picobot will know the encryption function E(M) and will also know the encrypted message N. However, we want this to be insufficient to figure out the decrypted message!

The goal is to create functions that satisfy the following conditions:
- D(E(M)) = M; the functions are inverses!
- It is difficult to create D even if you know E

Here are the functions used in the RSA method:  
- Encryption function: $E(M) = M^e \text{ mod } n$  
- Decryption function: $D(N) = N^d\text{ mod } n$

What does this mean?

- $M$ is the original (unencrypted) message.  
- $N$ is the encrypted message that TEA will send to Wally.
- $n$ is a very large number that is the product of two primes $p$ and $q$ (each hundreds of digits long)
- "mod" is simply a mathematical function (used in modular arithmetic) that is equivalent to finding the remainder of a number when doing division. For example, 5 divided by 2 leaves a remainder of 1, so we say that 5 = 1 mod 2
- The numbers $e$ and $d$ are chosen in a special way, but we won't go into the details... Feel free to return to this problem in Math55!
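
As a worked illustration of the two formulas, here is a sketch with deliberately tiny numbers. The values p = 3, q = 11, e = 3 and M = 4 are chosen purely for this example and are far too small to be secure. Python's built-in three-argument `pow` computes a power modulo n, and `pow(e, -1, phi)` (Python 3.8+) finds a matching d.

```python
print(5 % 2)                # the remainder operator: 5 = 1 mod 2

p, q = 3, 11                # two (tiny) primes
n = p * q                   # 33, the public modulus
phi = (p - 1) * (q - 1)     # 20, used only when choosing e and d
e = 3                       # shares no common factor with phi
d = pow(e, -1, phi)         # 7, because 3 * 7 = 21 leaves remainder 1 mod 20

M = 4                       # the message
N = pow(M, e, n)            # E(M) = 4^3 mod 33 = 64 mod 33 = 31
recovered = pow(N, d, n)    # D(N) = 31^7 mod 33 = 4

print(n, phi, d, N, recovered)
```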

The point is that $n$ is a really large number with only two very large prime factors. In order to crack the code, picobot would need to be able to factor $n$. This is not yet feasible with modern technology...
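
To see why the size of $n$ matters, here is a sketch of the attack picobot could mount against a small modulus. The helper `factor_semiprime` and the public exponent e = 11 are assumptions made for this illustration; n = 1763 is the product of the primes 41 and 43. Trial division finishes instantly here, but the number of candidates grows far too quickly for it to work against primes that are hundreds of digits long.

```python
def factor_semiprime(n):
    """Find two factors of n by trial division (only practical for small n)."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 1
    raise ValueError("n has no nontrivial factors")

n, e = 1763, 11                   # a public key picobot might intercept
p, q = factor_semiprime(n)        # recovers the two primes
phi = (p - 1) * (q - 1)           # (p-1)*(q-1), needed to find d
d = pow(e, -1, phi)               # the "secret" exponent, now exposed (Python 3.8+)

print(p, q, d)
```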

Let's try it out! 

Below is a function that helps find $e$ and $d$. For the interested, it's the (extended) Euclidean algorithm! Either way, run the cells below...
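
For readers who want the idea in brief: the extended Euclidean algorithm finds the greatest common divisor $g$ of two numbers $a$ and $b$, along with integers $x$ and $y$ such that $ax + by = g$. The sketch below is a compact iterative version written for this explanation (the name `extended_gcd` is hypothetical); it is not the function used in the cells that follow.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b      # remainders
    old_x, x = 1, 0      # coefficients of a
    old_y, y = 0, 1      # coefficients of b
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

g, x, y = extended_gcd(1680, 11)
assert g == 1 and 1680 * x + 11 * y == 1
# When g is 1, y (reduced mod 1680) is the number that "undoes" 11 modulo 1680
print(g, y % 1680)
```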

In [None]:
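# Extended Euclidean Algorithm
def euclidAl(a,b):
    """Performs the Euclidean Algorithm on two numbers.
    Returns (gcd, [x, y]) where x*a + y*b == gcd; the list is
    empty when the very first division leaves no remainder."""
    # find greatest common divisor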
    num1 = a
    num2 = b
    eqt = []
    r = 42
    while r != 0:
        q = num1 // num2 # quotient
        r = num1 % num2  # remainder
        eqt += [[num1,q,num2,r]]
        num1 = num2
        num2 = r
    L = eqt[:-1]   # list of equations to find linear combo
    # print(f"The greatest common divisor of {a} and {b} is {L[-1][-1]}")
    # if L[-1][-1] == 1:
        # print(f"{a} and {b} are relatively prime!")
    # print()
    # format lists nicely to find linear combo
    if L == []:
        if a >= b:
            return (b,[])
        else:
            return (a,[])
        
    revL = []
    for x in L:
        remSolve = [[x[3],x[0],-x[1],x[2]]]
        revL += remSolve
    revL.reverse()
    
    # Bezout's identity algorithm (back-substitution)
    coeff1 = 1
    coeff2 = revL[0][2]
    # print(f"Starting with {L[-1][-1]} = {coeff1}x{revL[0][1]} + {coeff2}x{revL[0][3]}\n")
    for i in range(0,len(revL)-1):
        # print(f"Substituting in {revL[i+1][0]} = 1x{revL[i+1][1]} + {revL[i+1][2]}x{revL[i+1][3]}")
        # print(f"Now we have {L[-1][-1]} = {coeff1}x{revL[i][1]} + {revL[i][2]}x(1x{revL[i+1][1]} + {revL[i+1][2]}x{revL[i+1][3]})")
        oldCoeff1 = coeff1
        coeff1 = coeff2
        # print(f"coeff1 is now {coeff1}")
        coeff2 = oldCoeff1 + (coeff2*revL[i+1][2])
        # print(f"coeff2 is now {coeff2}")
        # print(f"Simplifying gives us {L[-1][-1]} = {coeff1}x{revL[i+1][1]} + {coeff2}x{revL[i][1]}\n")
    return (L[-1][-1],[coeff1, coeff2])   

In [None]:
import random

In [None]:
p = 41
q = 43
print(f"Our prime numbers are {p} and {q}")
n = p*q
print(f"n (their product) is {n}")
phi = (p-1)*(q-1)
Z_mod = list(range(2,phi))   # candidates for e; skip 1, which would leave messages unencrypted and crash the lookup of d below
e = random.choice(Z_mod)
lc = euclidAl(phi,e)
while lc[0] != 1:
    e = random.choice(Z_mod)
    lc = euclidAl(phi,e)
print(f"We randomly chose e to be {e}")
# print(f"The euclidean algorithm tells us that {lc[1][0]}x{phi}+{lc[1][1]}x{e}=1")
d = lc[1][1]
while d not in Z_mod:
    if d < 0:
        d += phi
    elif d >= phi:
        d -= phi
print(f"Therefore d is {d}")
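
A quick sanity check on any key produced this way: $e \cdot d$ should leave remainder 1 when divided by `phi`, and decrypting an encrypted value should give back the original. The sketch below fixes e = 17 (an arbitrary choice for illustration, since the cell above picks e at random) and checks every possible value below n.

```python
p, q = 41, 43
n = p * q
phi = (p - 1) * (q - 1)
e = 17                      # assumed for this example; any e sharing no factor with phi works
d = pow(e, -1, phi)         # modular inverse of e (Python 3.8+)

assert (e * d) % phi == 1   # e and d undo each other modulo phi

# D(E(m)) == m for every m from 0 up to n-1
round_trip_ok = all(pow(pow(m, e, n), d, n) == m for m in range(n))
print(d, round_trip_ok)
```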

### Task #2

Your next task is to write two more functions: 
- `encrypt(num)` takes a string (the output of `letterToNumber()`) and converts it to a _new_ string of digits using the RSA formula

- `decrypt(num)` takes a string (the output of `encrypt()`) and converts it _back_ to the regular ASCII number string (the output of `letterToNumber()`)

Here are some hints:
- If you use the encryption and decryption functions on a single large number, your computer may never finish running the calculations! Consider having the functions run on one "character" at a time
- If you do this, be sure to check that the functions "read" the strings correctly. For example, if `encrypt()` converts '108' to '1365', you will have to make sure that `decrypt()` is "aware" of this length (`encrypt()` might not output three-digit numbers!). One option, used in the cells below, is to have `encrypt()` also return a list of piece lengths for `decrypt()` to consume
- Although `encrypt()` might convert 3-digit numbers to 4-digit numbers, make sure that `decrypt()` only returns 3-digit numbers! Otherwise we won't be able to convert it back to letters!
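
One alternative to tracking piece lengths, sketched here under the assumption that every plaintext piece is smaller than n, is to pad each encrypted piece to a fixed width (the number of digits in n). The names `encrypt_fixed` and `decrypt_fixed`, and the key values n = 1763, e = 11, d = 611, are illustrative only.

```python
def encrypt_fixed(num, e, n):
    """Encrypt 3-digit pieces; pad each result to the digit count of n."""
    width = len(str(n))
    pieces = [num[i:i+3] for i in range(0, len(num), 3)]
    return ''.join(str(pow(int(piece), e, n)).zfill(width) for piece in pieces)

def decrypt_fixed(secret, d, n):
    """Decrypt fixed-width pieces back into 3-digit pieces."""
    width = len(str(n))
    pieces = [secret[i:i+width] for i in range(0, len(secret), width)]
    return ''.join(str(pow(int(piece), d, n)).zfill(3) for piece in pieces)

n, e, d = 1763, 11, 611                  # example key for p = 41, q = 43
secret = encrypt_fixed('067083', e, n)   # two pieces, four digits each
assert len(secret) == 8
assert decrypt_fixed(secret, d, n) == '067083'
```

The trade-off is a slightly longer encrypted string in exchange for not having to send the list of lengths alongside it.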

Once you have done this, you should be able to use all four functions as follows:
1. Think of a message (e.g. "CS")
2. Call `letterToNumber('CS')`; it should output something like '067083'
3. Call `encrypt('067083')`; it should then output something like '7911600' - this is the encrypted message!
4. Call `decrypt('7911600')`; it should output '067083' again!
5. Call `numberToLetter('067083')`; it should finally output 'CS'

In [None]:
def encrypt(num):
    """Encrypts a message (in number form)"""
    print(f"Encrypting...")
    secret = ''
    L = []
    while len(num) > 0:
        piece = str(pow(int(num[:3]), e, n))   # modular power without a huge intermediate number
        secret += piece
        # print(f"Exchanging {piece} with {num[:3]}")
        L += [len(piece)]
        num = num[3:]
        
    print(f"Encrypted message is {secret}\n")
    return secret, L

In [None]:
def decrypt(num,L):
    """Decrypts a message (in number form)"""
    print(f"Decrypting...")
    secret = ''
    while len(num) > 0:                        # continue until every digit is consumed, even a final 1-digit piece
        piece = str(pow(int(num[:L[0]]), d, n))
        while len(piece) < 3:
            piece = '0' + piece
        secret += piece
        # print(f"Exchanging {piece} with {num[:L[0]]}")
        num = num[L[0]:]
        L = L[1:]
        
    print(f"Decrypted message is {secret}\n")
    return secret

In [None]:
send = 'omega poggers'
num = letterToNumber(send)
encrypted, eList = encrypt(num)
decrypted = decrypt(encrypted, eList)
receive = numberToLetter(decrypted)

## Have Fun!

Below is a function that opens txt files and gets a string of everything inside!

We've also included some txt files for you to play around with...

Feel free to encrypt and decrypt whatever you want! 

Now you can have some privacy when communicating with other people!

In [None]:
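def openFile(filename):
    """Returns the full contents of a text file as a string"""
    with open(filename, 'r', encoding='latin1') as openMe:   # the file is closed automatically
        content = openMe.read()
    return content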

In [None]:
send = openFile('DOI.txt')
num = letterToNumber(send)
encrypted, eList = encrypt(num)
decrypted = decrypt(encrypted, eList)
receive = numberToLetter(decrypted)

In [None]:
send = openFile('important.txt')
num = letterToNumber(send)
encrypted, eList = encrypt(num)
decrypted = decrypt(encrypted, eList)
receive = numberToLetter(decrypted)

In [None]:
send = openFile('song.txt')
num = letterToNumber(send)
encrypted, eList = encrypt(num)
decrypted = decrypt(encrypted, eList)
receive = numberToLetter(decrypted)