# The RSA Cryptosystem

In this Python notebook, we provide tools to implement the RSA cryptosystem.  This cryptosystem is one of the most popular for securely sending a short message.  In practice, RSA is used in the initial "handshake" of the TLS (Transport Layer Security) protocol.  See https://en.wikipedia.org/wiki/Transport_Layer_Security#Key_exchange_or_key_agreement for some details.  You have almost certainly used the TLS protocol, and very likely RSA, when visiting secure websites, chatting, etc.  For example, check your email:  do you see a padlock icon next to the URL, to the left of https?  Click on it, and you might see something about TLS.

We begin by importing the random package for Python.  Security depends on choosing large *random* numbers.  Significant breaches of security have occurred when parties have not used good randomization methods.  

In [None]:
from random import randint, getrandbits # The two functions from the random package that are used below.
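A note on "good randomization":  the default generator in Python's random package is a Mersenne Twister, which is built for simulations rather than for secrecy.  Here is a minimal sketch, assuming you want randomness from the operating system instead.  The standard library's SystemRandom class offers the same randint and getrandbits methods.  The names secure_rng, sample_bits and sample_witness are only for illustration and are not used elsewhere in this notebook.

```python
from random import SystemRandom

secure_rng = SystemRandom()  # Illustrative name.  Draws randomness from the operating system.

sample_bits = secure_rng.getrandbits(256)      # A random integer with at most 256 bits.
sample_witness = secure_rng.randint(2, 10**6)  # A random integer between 2 and 10**6, inclusive.
print(sample_bits)
print(sample_witness)
```

For classroom experiments the default generator is fine.  For keys that protect real secrets, a generator like this one is the safer choice.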

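Now we provide a function to test whether a number is prime, using the Miller-Rabin algorithm.  We enhance it a bit by checking for small factors first.  The test is probabilistic:  with 50 random witnesses, the chance that a composite number is mistakenly declared prime is less than 1 in 4^50, which is small enough to ignore in practice.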

In [None]:
small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31] # A list of small primes.

def is_prime(n, number_witnesses = 50):
    '''
    This function carries out the Miller-Rabin test
    to determine whether the input number n is prime.
    The test is probabilistic, and the probability of a 
    false-positive is less than 1 out of 4^number_witnesses.
    At the default 50 witnesses, the probability of a 
    false positive is less than 10^(-30).
    The code below is adapted from Gareth Rees, at
    http://stackoverflow.com/questions/14613304/rabin-miller-strong-pseudoprime-test-implementation-wont-work
    '''
    if n < 2: 
        return False  # numbers smaller than 2 are not prime.  Don't waste effort.
    if n in small_primes:
        return True
    for p in small_primes: # It's worth the time to check for small prime factors first.
        if n % p == 0: 
            return False
    if n >= 37*37:  # Below 37*37, a composite n has a prime factor at most 31, so the tests above already settle it.
        # Now we carry out the Miller-Rabin test.
        # Step one:  Decomposing n-1 as a power of 2 times an odd.
        e = 0
        m = n - 1
        while m % 2 == 0: #As long as m is even.
            e += 1
            m = m//2 # Integer division, to be safe in Python 2.7 or 3.x.
        # The result of the above process is that
        # n-1 = 2^e * m, and m is odd.
        
        for t in range(number_witnesses): # Repeat number_witnesses times.
            # Step two: computing witness^m mod n
            witness = randint(2, n - 1) # Choose a random witness each time.
            s = pow(witness,m,n) # Compute witness^m power to start.
            # Step three:  successive squaring, to look for ROO (roots of one) violations:  a square root of 1 other than 1 or n-1.
            k = 0
            while (k < e):
                ss = (s*s)%n # square the number.
                if (ss == 1) and (s != 1) and (s != n-1):  # the expression != means "is not equal to" in Python.
                    return False # A ROO violation implies n is not prime.
                s = ss
                k += 1
            # Step four:  if no ROO violations, check FLT (Fermat's Little Theorem):  witness^(n-1) should be 1 mod n.
            if s != 1:  
                return False # A FLT violation implies n is not prime.
        
    return True
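To see what a single witness does, here is a small sketch of the squaring chain from steps two and three.  The helper squaring_chain is hypothetical (it is not part of is_prime).  We try it on n = 561 = 3 * 11 * 17, a composite number that passes the Fermat test for every witness coprime to it.

```python
def squaring_chain(n, witness):
    # Illustrative helper, not part of is_prime above.
    # Write n-1 = 2^e * m with m odd.
    e = 0
    m = n - 1
    while m % 2 == 0:
        e += 1
        m = m // 2
    # Start with witness^m mod n, then square e times.
    s = pow(witness, m, n)
    chain = [s]
    for k in range(e):
        s = (s * s) % n
        chain.append(s)
    return chain

chain = squaring_chain(561, 2)
print(chain)  # [263, 166, 67, 1, 1]
```

The last entry is 1, so the witness 2 finds no FLT violation.  But 67 squares to 1 mod 561, and 67 is neither 1 nor 560.  That is a ROO violation, so 561 cannot be prime.  (The function is_prime never gets this far with 561, since it finds the small factor 3 first.)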

Now we test out a few examples.

In [None]:
is_prime(10)

In [None]:
is_prime(17)

In [None]:
is_prime(100000000000000123918739183)

In [None]:
is_prime(1000000000000066600000000000001) # Belphegor's prime.  See https://en.wikipedia.org/wiki/Belphegor%27s_prime

In [None]:
is_prime(2**127 - 1) # The largest known prime from 1876 until 1951.

In [None]:
is_prime(2**67 - 1) # A big composite number

In [None]:
193707721 * 761838257287 # It's pretty hard to find this prime factorization!!

In [None]:
2**67 - 1 # But it works.  Cole found the above factorization in 1903!  (No computers!)

In [None]:
is_prime(2**521 - 1) # A record-setting Mersenne prime found by computer.  Found by Raphael Robinson in 1952.

The following function cooks up a random prime number, with a desired number of binary digits (bits).

In [None]:
def make_prime(b):
    '''
    This function creates a random prime number
    whose binary expansion has b bits.'''
    if b > 3000:
        print("I don't want to make such a big prime.")
        return None
    else:
        max_attempts = 10*b # This is usually enough tries.
        t = 0
        while t < max_attempts:
            t += 1
            wanna_be_prime = getrandbits(b) | (1 << (b - 1)) # Set the top bit, so the number has exactly b bits.
            if is_prime(wanna_be_prime):
                return wanna_be_prime
    return None # Return None if no prime was found within max_attempts tries.
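Why are 10*b attempts usually enough?  Here is a rough estimate, assuming the heuristic from the prime number theorem that a random b-bit number is prime with probability about 1/(b ln 2), and treating the attempts as independent.  The function name failure_probability is hypothetical.

```python
import math

def failure_probability(b, attempts_per_bit=10):
    # Heuristic chance that a random b-bit number is prime.
    p_prime = 1.0 / (b * math.log(2))
    attempts = attempts_per_bit * b
    # Chance that every one of the attempts misses.
    return (1.0 - p_prime) ** attempts

print(failure_probability(512))
```

For b = 512 the estimate comes out below one in a million, so make_prime should very rarely return None.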
        

Let's test it out!  Don't try to make primes more than 2000 bits.  It might get slow.

In [None]:
make_prime(500) # Make a random five-hundred-bit prime.
    

Back to basics.  This is the Euclidean algorithm, applied to find the GCD of two numbers.

In [None]:
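def gcd(a,b):
    if a == 0:
        return b
    elif b == 0:
        return a
    else:
        dividend = a
        divisor = b
        while divisor != 0:
            remainder = dividend % divisor # Divide, and keep the remainder.
            dividend, divisor = divisor, remainder # The old divisor becomes the new dividend.
        return dividend # The last nonzero remainder is the GCD.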

In [None]:
gcd(91,221)
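To watch the Euclidean algorithm at work on this example, here is a small sketch that records every division step.  The helper euclid_steps is hypothetical and only meant for illustration.

```python
def euclid_steps(a, b):
    # Illustrative helper:  records each division a = q*b + r.
    steps = []
    while b != 0:
        q, r = a // b, a % b
        steps.append((a, b, q, r))
        a, b = b, r
    return steps

for (a, b, q, r) in euclid_steps(221, 91):
    print("%d = %d * %d + %d" % (a, q, b, r))
# 221 = 2 * 91 + 39
# 91 = 2 * 39 + 13
# 39 = 3 * 13 + 0
```

The last nonzero remainder is 13, which is the GCD of 91 and 221.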

Recall that x is called the multiplicative inverse of a, mod m, if ax is congruent to 1 mod m.  Equivalently, if ax - 1 = my for some integer y.  Equivalently, if ax - my = 1.

Finding a multiplicative inverse is thus equivalent to solving a linear Diophantine equation.  The Euclidean algorithm can be used to solve such equations.  The code below uses the "extended Euclidean algorithm" to quickly find modular inverses.  It's a bit hard to follow -- don't worry about it, unless you have extra time to figure it out.

In [None]:
# The following code is from 
# https://en.wikibooks.org/wiki/Algorithm_Implementation/Mathematics/Extended_Euclidean_algorithm

def egcd(a, b):
    if a == 0:
        return (b, 0, 1)
    else:
        g, y, x = egcd(b % a, a)
        return (g, x - (b // a) * y, y)

def modinv(a, m):
    g, x, y = egcd(a%m, m)
    if g != 1:
        raise Exception('modular inverse does not exist')
    else:
        return x % m

Let's find the modular inverse of 3 modulo 40.  It should be 27, since 3 * 27 = 81, which is congruent to 1 mod 40.

In [None]:
modinv(3,40)
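The recursive egcd above makes one nested call per division step, which for 1024-bit inputs can mean hundreds of nested calls.  Here is a sketch of an iterative variant that uses a loop instead.  The names egcd_iter and modinv_iter are hypothetical, and this is one possible way to write it, not the version from the wikibook.

```python
def egcd_iter(a, b):
    # Returns (g, x, y) with a*x + b*y = g = gcd(a, b), for a, b >= 0.
    x0, y0 = 1, 0  # Coefficients expressing the current value of a.
    x1, y1 = 0, 1  # Coefficients expressing the current value of b.
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return (a, x0, y0)

def modinv_iter(a, m):
    g, x, y = egcd_iter(a % m, m)
    if g != 1:
        raise ValueError('modular inverse does not exist')
    return x % m

print(egcd_iter(3, 40))    # (1, -13, 1), since 3*(-13) + 40*1 = 1
print(modinv_iter(3, 40))  # 27
```

On Python 3.8 or later, the built-in pow(3, -1, 40) computes the same inverse.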

## The cipher.
To send text messages, we need to translate strings of text into numbers, and be able to translate those numbers back into text.  The following functions carry this out in a simple fashion.  Don't worry about how it works... but if you're interested, the algorithm converts each character to a number between 0 and 255 using ASCII, and then pastes the numbers together byte-by-byte (with an initial 1).  Decoding breaks the number up into a sequence of bytes, then uses ASCII to convert these bytes to a string.

In [None]:
def str_to_num(s): # Converts a string to a number.
    bitlist = [1]
    for c in s:
        bits = bin(ord(c))[2:]
        bits = '00000000'[len(bits):] + bits
        bitlist.extend([int(b) for b in bits])
    out = 0
    for bit in bitlist:
        out = (out << 1) | bit
    return out

def num_to_str(n):  # Converts a number back to a string.
    bitlist = [1 if digit=='1' else 0 for digit in bin(n)[3:]]
    chars = []
    for b in range(len(bitlist) // 8):
        byte = bitlist[b*8:(b+1)*8]
        out = 0
        for bit in byte:
            out = (out << 1) | bit
        chars.append(chr(out))
    return ''.join(chars)

In [None]:
coded = str_to_num("Hello there!")
print(coded) # Here's the coded message.
decoded = num_to_str(coded)
print(decoded) # I hope it says "Hello there!"
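The same encoding can be described with arithmetic instead of lists of bits:  start with 1, and for each character multiply by 256 and add the character's number.  Here is a minimal sketch of that idea, assuming every character has a code below 256.  The names encode_text and decode_text are hypothetical.

```python
def encode_text(s):
    # Same idea as str_to_num:  start with 1, then append one byte per character.
    n = 1
    for c in s:
        n = n * 256 + ord(c)  # Assumes ord(c) < 256.
    return n

def decode_text(n):
    chars = []
    while n > 1:
        chars.append(chr(n % 256))  # Peel off the last byte.
        n = n // 256
    return ''.join(reversed(chars))

print(encode_text("Hi"))   # 84073, which is 1 01001000 01101001 in binary.
print(decode_text(84073))  # Hi
```

The initial 1 marks where the message starts, so that leading zero bits of the first character are not lost.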

Now we're ready to carry out the RSA cryptosystem.  The code below can be used by Alice and Bob.  Some team members (with one computer) should play Alice and other team members (with another computer) should play Bob.  In this scenario, Bob wants to send an encrypted message to Alice.  Here are the steps.

Bob should wait for Alice and then start at Step 2.

## Step 1:  Alice creates her private key, and announces her public key.

Alice should create two large (e.g. 512-bit) prime numbers, called p and q.  She keeps these private.  Don't tell anyone (not even Bob) these prime numbers!

Alice:  run the commands below to create the two prime numbers.

In [None]:
p = make_prime(512)
q = make_prime(512)

Now Alice computes her public key.  She declares N to be the product of p and q.  She uses the exponent e = 65537 for traditional reasons.  The exact choice is not so important, as long as e has no common factor with (p-1)(q-1); since 65537 is prime, this almost always holds.  She tells the public (including Bob) the numbers N and e.

Alice:  run the following commands and communicate N and e to Bob.  You can email them to Bob if you want.  

In [None]:
N = p*q
e = 65537
print(N)
print(e)

Alice:  Now wait for Bob's secret message!

## Step 2:  Bob receives the public key, decides on his message, encrypts it using the public key, and sends the result back to Alice.

When Bob receives the public key, enter the long number as N, and the exponent (probably 65537) as e below.

In [None]:
N = 

e = 

Bob:  Choose a short message and convert it to a number with the str_to_num() command.  Change "Hello!" to your secret message, in the command below.  Each letter of the message requires 8 bits, plus one extra bit for the initial 1.  The resulting number m must be less than N.  If Alice uses p,q with 512 bits, then N has 1023 or 1024 bits, so you can safely use up to 127 characters in your message.
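The size limit can be worked out from N directly.  Here is a small sketch, assuming the encoding used by str_to_num (8 bits per character, plus the initial 1).  The helper max_chars is hypothetical.

```python
def max_chars(N):
    # Illustrative helper.  A message of k characters becomes a number with 8k+1 bits,
    # which is guaranteed to be below N when 8k+1 <= (bit length of N) - 1.
    return (N.bit_length() - 2) // 8

print(max_chars(2**1022 + 1))  # An N with 1023 bits gives 127.
print(max_chars(2**1024 - 1))  # An N with 1024 bits gives 127.
```

This bound is a little conservative:  a message with exactly as many bits as N might still be less than N, but that is not guaranteed.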

Then compute the ciphertext c = m^e modulo N, where (N,e) is the public key you receive from Alice.

In [None]:
m = str_to_num("Hello!")
c = pow(m,e,N)
print(c)

Bob:  send your ciphertext c to Alice.  You might use e-mail to do this.  This c is your encrypted message.

## Step 3:  Alice decrypts the message.

Alice:  you should have received ciphertext (a big number) from Bob.  Now it's time to decode the message!

As a first step (you can do this while waiting for Bob), compute the decryption exponent d, which is the multiplicative inverse of e modulo the totient of N.  Since you know N = pq, with p and q distinct primes, you know phi(N) = phi(p) phi(q) = (p-1)(q-1).

In [None]:
totient = (p-1)*(q-1) # Thankfully, you know p and q.  Nobody else does, so nobody else can decode the message!
d = modinv(e,totient) # The modular inverse of e, modulo the totient.
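To see why this d undoes the encryption, here is a toy run with numbers small enough to check by hand.  Everything in it is an assumption for illustration:  the tiny primes 61 and 53, the exponent 17, the message 65, and the helper names toy_rsa and modinv_small.  The code is wrapped in a function so that it does not overwrite Alice's real p, q, N, e or d.

```python
def modinv_small(a, m):
    # Brute-force inverse, fine for toy numbers only.
    for x in range(1, m):
        if (a * x) % m == 1:
            return x
    raise ValueError('modular inverse does not exist')

def toy_rsa():
    p, q = 61, 53                 # Toy primes, far too small for real use.
    N = p * q                     # 3233
    e = 17                        # A toy public exponent, coprime to (p-1)*(q-1).
    totient = (p - 1) * (q - 1)   # 3120
    d = modinv_small(e, totient)  # 2753, since 17 * 2753 = 15 * 3120 + 1.
    m = 65                        # A toy message, as a number less than N.
    c = pow(m, e, N)              # Encrypt.
    decrypted = pow(c, d, N)      # Decrypt.
    return (N, e, d, c, decrypted)

print(toy_rsa())  # (3233, 17, 2753, 2790, 65)
```

Since e*d is one more than a multiple of the totient, raising to the power e and then to the power d brings back the original number.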

Now enter the encrypted message that Bob sends you.

In [None]:
c = 

Now decrypt!  pow(c,d,N) computes the decrypted number m = c^d modulo N, and num_to_str turns it back into a string.

In [None]:
num_to_str(pow(c,d,N))

Did you get Bob's message?  

Try switching roles, and sending messages until you understand the whole process.