# RSA tutorial

#### Caution: Fidelio is an educational program. Do not use it as a serious encryption tool!

It has known weaknesses, including:
- The primes used are too small to protect against skilled attackers.
- Fidelio's padding scheme is <a href="https://en.wikipedia.org/wiki/Padding_(cryptography)#Public_key_cryptography">not secure</a>.
- Python's `random` module [should not be used for security purposes](https://docs.python.org/3/library/random.html).

In [1]:
from fidelio_functions import *

## Public-key encryption
Alice wants to send Bob a message. She does not want anyone else to read it.  

Bob buys a padlock and keeps the only key. He mails the padlock to Alice.  
Alice locks the message in a sturdy box and mails it to Bob.  
If the lock and box are very difficult to break, then nobody but Bob can read the message.

To save shipping costs, Alice and Bob decide not to use a physical padlock.  
Instead, Bob sends Alice instructions for creating a mathematical puzzle.  
The puzzle is easy to create but hard for anyone (including Alice) to solve.  
Bob keeps a secret hint which helps him solve the puzzle.

In RSA encryption, the puzzle is the [RSA problem](https://en.wikipedia.org/wiki/RSA_problem).  
The instructions are Bob's **public key** $k$ and **RSA number** $n$. The hint is Bob's **private key** $x$.

In [2]:
message = "WHERE IS RPT WHERE IS TASK FORCE THIRTY FOUR RR THE WORLD WONDERS?"
print(message)

WHERE IS RPT WHERE IS TASK FORCE THIRTY FOUR RR THE WORLD WONDERS?


## Key generation
The RSA number $n$ is generated by multiplying two randomly chosen primes.  
Finding the public and private keys is more complicated. (Scroll down for details.)

Fidelio uses prime numbers from [this list](https://www.math.utah.edu/~pa/math/p10000.html) compiled by Peter Alfeld of the University of Utah Mathematics department.

In [3]:
n, public_key, private_key = generate_keys(verbose=True)

Loading prime numbers from Primes.txt
RSA number is 42437 * 29569 = 1254819653
Euler totient is 42436 * 29568 = 1254747648
Public key is 24473
Private key is 856425641


## RSA encryption
Alice converts the message to a sequence of integers $M_1, M_2, \ldots$ She turns each integer $M$ into a cipher $C$ by [modular exponentiation](https://en.wikipedia.org/wiki/Modular_exponentiation): she raises $M$ to the $k$th power (mod $n$), where $k$ is Bob's public key:

$$
C = M^k \ \% \ n
$$

Bob decrypts $C_1, C_2, \ldots$ by raising each $C$ to the $x$th power (mod $n$), where $x$ is his secret private key:

$$
C^x \ \% \ n = M^{kx} \ \% \ n = M
$$

It takes some effort to find a public key $k$ and private key $x$ which make this scheme work. The details are explained below.

Python's built-in [`pow()`](https://docs.python.org/3/library/functions.html#pow) function can do modular exponentiation. Undoing this operation without the private key means solving the [RSA problem](https://en.wikipedia.org/wiki/RSA_problem), which is believed to be extremely difficult if you don't know the prime factorization of $n$.
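
As a rough illustration of the two formulas above, the sketch below encrypts and decrypts a single integer with three-argument `pow()`. The key values are copied from the `generate_keys()` output shown earlier; the sample packet `M` is an arbitrary value chosen for this example, not something Fidelio produced.

```python
# Values copied from the key-generation output earlier in this tutorial.
n = 1254819653
public_key = 24473       # k
private_key = 856425641  # x

# Arbitrary sample packet (an assumption for illustration); it must be < n.
M = 12345678

C = pow(M, public_key, n)           # encrypt: C = M^k % n
recovered = pow(C, private_key, n)  # decrypt: C^x % n

print(C)
print(recovered == M)
```

Three-argument `pow()` reduces mod $n$ at every step, so it never has to build the enormous number $M^k$.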

It is important that $M < n$ for every $M$ in the message. By default, Fidelio chooses primes between 10,000 and 100,000, which ensures 100,000,000 < $n$ < 10,000,000,000. The `packetize()` function breaks a message into a sequence of "packets," each of which is a decimal number of at most 8 digits. The largest possible packet is 99,999,999, so Fidelio guarantees that $M < n$.
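
To see why the bound matters, here is a small sketch using the tiny key from the small-prime example later in this tutorial ($n = 143$, $k = x = 31$). The helper name `round_trip` is made up for this illustration. A packet that is not smaller than $n$ comes back reduced mod $n$, so the original value is lost.

```python
def round_trip(M, n=143, k=31, x=31):
    """Encrypt M with the public key, then decrypt with the private key."""
    C = pow(M, k, n)
    return pow(C, x, n)

print(round_trip(42))   # 42  (42 < n, so the packet survives)
print(round_trip(150))  # 7   (150 >= n, so only 150 % 143 comes back)
```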

In [4]:
cipher = rsa_encrypt(message,n,public_key)
print(cipher,'\n')

plaintext = rsa_decrypt(cipher,n,private_key)
print(plaintext)

[1008586977, 1084826872, 39941220, 235358966, 281264475, 806136397, 854497424, 328830749, 449789158, 606065983, 744965254, 1225276320, 586114684, 592182828, 888815245, 694012932, 671701982] 

WHERE IS RPT WHERE IS TASK FORCE THIRTY FOUR RR THE WORLD WONDERS?


In [5]:
# Close doesn't count in RSA encryption. You need the exact private key.
rsa_decrypt(cipher,n,private_key-1)

"d$g%$*fiJiS N1'+f$l$I-[O%0\\^)+ku)NH[?? UGv|!{-j;%*mVhV*ZI!*X^^y:t+ El$Z#g-K4'To"

In [6]:
# Good luck guessing the private key. There are many possibilities.
for j in range(10):
    badkey = random.randint(0,n)
    print( rsa_decrypt(cipher,n,badkey) )

/Nd0(,Ghz4[\"+:|'!+_BA7.DPr(K#H0_-_5 +19Qe^*_ +f8`YW\)^!qSlW#@C}'$AwH'*s)-tu^f
%x$n_\= ewBE$*5SV9M/B'oGG!H{>$#-T9M!+X2J+s(1}'1T%SvvI%+4^mmBpv'rKq/(`av-"
:UDz)KKZ:$*91D2G?=)%|p_{'hdSL$Ov<VZ|#!+(-\=-,xr%KZ(%$etGw(Ptd(3G@6$Is<t'>=/t&iO
9&V<w8=r),9N8}mK7q!O8%<)+Sg&4?&-#T)4M WxI#cQ%)F-_(,s=?#[#hq ?qvd)\xe$,Qu>un|b
+C(Rb+yk+w+&-,g,eb"*Xf!l>K/%vZF(qGh/"?oTy"vYtb^YqD'oWf/%0M4/$<7Qo<!a nbJK&Xo1%
*;)w2p"y #<WkI)"em^mZt$#_%3 "Qgw /<fs(+\&Xd4]]+#Fc{&^,ib#7j\#OouA%Hj!&E'M/&m,s 
GjkO&6X/^"jqX)z3*F'b&0"=lKZ 89s Z^V@#S h0!82 {!r)F5#*M1t*n+ER)buh,&l=iI%R8L%Q
+[(1k`,N' ?uA!&+bFb Q+>+(,MCZ>,Adu-[q&B#|s@E *b'7=/t6'+[Nzz-}1;&[g>X&zb?&#J82L"vr.)
L}lP!*zU=yk$*L5=|;j$u)g1r)9IJ N)BS Q/6+%vNmM%2o>p",&<XM`|om'rG7O(+lL.y,Ey)o6(
W! *a;"ciS]P(*F*QpbD<\!LOo *@IN[ge1m$WLV:!7m.X"^d/E$j/{? k*6$ ,I7uT3U!8(,;GjF?4


In [7]:
# Fidelio's RSA encryption can work with other alphabets if they have at most 100 chars.
# With shorter alphabets, decryption using the wrong key can make the message appear shorter.
# This is a side effect of Fidelio storing chars as 2-digit decimal numbers.
# It could be a security vulnerability!

cipher = rsa_encrypt(message,n,public_key,ALL_CAPS)
print(cipher,'\n')

for j in range(10):
    badkey = random.randint(0,n)
    print( rsa_decrypt(cipher,n,badkey,ALL_CAPS) )
print( )
          
plaintext = rsa_decrypt(cipher,n,private_key,ALL_CAPS)
print(plaintext)

[975131927, 409879555, 1226374712, 371039167, 520541530, 1023373321, 620514875, 578991679, 1209815208, 448039866, 220105078, 208799213, 543628307, 80032088] 

OGDIFIFKOJGFLGCCF
DVCVGMNCEITEQEBBD
LMGLPRMCJLYECBBJCLMWIUEA
NBSAUCBOFLCTDKIKIOYHKIQBXALC
KODOQFKQHDKSKWCRCJIXSJEDYBKG
FIDPVJECACAJOJJBYVFPOTJ
IFOCNAAJAYIYHDALVLZKBLRH
LNMOHKVMGTKDLLKAKHKNRUEPJLI
QPCILYHKMIGHCDQALGUCHDNATBL
KOCJOMKOYGHVECLSKMGEFHNDB

WHEREISRPTWHEREISTASKFORCETHIRTYFOURRRTHEWORLDWONDERS


## Packing and padding integers
Fidelio's other schemes represent text as 2-digit integers. For RSA encryption, we'd prefer bigger numbers.

The `packetize()` function converts a list of 2-digit integers into a list of larger integers. It pads the last packet (or creates a new packet) with random digits. The final digit of the last packet records how many random digits (including itself) were added.

In [8]:
digits = text_to_digits(message)
print(digits,'\n')

packets = packetize(digits)
print(packets,'\n')

test_digits = unpacketize(packets)
print(test_digits,'\n')

test_text = digits_to_text(test_digits)
print(test_text)

[55, 40, 37, 50, 37, 0, 41, 51, 0, 50, 48, 52, 0, 55, 40, 37, 50, 37, 0, 41, 51, 0, 52, 33, 51, 43, 0, 38, 47, 50, 35, 37, 0, 52, 40, 41, 50, 52, 57, 0, 38, 47, 53, 50, 0, 50, 50, 0, 52, 40, 37, 0, 55, 47, 50, 44, 36, 0, 55, 47, 46, 36, 37, 50, 51, 31] 

[55403750, 37004151, 504852, 554037, 50370041, 51005233, 51430038, 47503537, 524041, 50525700, 38475350, 505000, 52403700, 55475044, 36005547, 46363750, 51316434] 

[55, 40, 37, 50, 37, 0, 41, 51, 0, 50, 48, 52, 0, 55, 40, 37, 50, 37, 0, 41, 51, 0, 52, 33, 51, 43, 0, 38, 47, 50, 35, 37, 0, 52, 40, 41, 50, 52, 57, 0, 38, 47, 53, 50, 0, 50, 50, 0, 52, 40, 37, 0, 55, 47, 50, 44, 36, 0, 55, 47, 46, 36, 37, 50, 51, 31] 

WHERE IS RPT WHERE IS TASK FORCE THIRTY FOUR RR THE WORLD WONDERS?
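
The packing scheme described above can be sketched in a few lines. This is not Fidelio's source code: the names `pack` and `unpack` and the constant `PACKET_DIGITS` are hypothetical, and the behavior is inferred from the description and the sample output (four 2-digit codes per 8-digit packet, padding length stored in the final digit).

```python
import random

PACKET_DIGITS = 8  # assumed packet width, matching the 8-digit packets above

def pack(codes):
    """Join 2-digit codes into 8-digit integers, padding the tail with random digits."""
    text = "".join(f"{c:02d}" for c in codes)
    # Padding is always 2, 4, 6, or 8 digits; a full new packet is added
    # when the codes already fill the last packet exactly.
    pad_len = PACKET_DIGITS - len(text) % PACKET_DIGITS
    filler = "".join(str(random.randrange(10)) for _ in range(pad_len - 1))
    text += filler + str(pad_len)  # the final digit records the padding length
    return [int(text[i:i + PACKET_DIGITS]) for i in range(0, len(text), PACKET_DIGITS)]

def unpack(packets):
    """Reverse pack(): restore leading zeros, strip the padding, split into codes."""
    text = "".join(f"{p:0{PACKET_DIGITS}d}" for p in packets)
    pad_len = int(text[-1])
    text = text[:-pad_len]
    return [int(text[i:i + 2]) for i in range(0, len(text), 2)]

codes = [55, 40, 37, 50, 37, 0, 41, 51, 0, 50, 48, 52]
packets = pack(codes)
print(packets)
print(unpack(packets) == codes)
```

Packets are stored as integers, so leading zeros disappear (`00504852` becomes `504852`); `unpack` restores them by zero-padding each packet back to 8 digits.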


## How decryption works
Bob decrypts each $C$ by raising it to the power $x$ (mod $n$):

$$
C^x \ \% \ n
= (M^k)^x \ \% \ n
= M^{kx} \ \% \ n
$$

The prime factorization of $n$ is $pq$, where $p$ and $q$ are distinct primes. Since $M < n$, the [Chinese remainder theorem](https://en.wikipedia.org/wiki/Chinese_remainder_theorem) says that $M^{kx} \ \% \ n = M$ if and only if

$$
M^{kx} \ \% \ p = M \ \% \ p
\qquad \textrm{AND} \qquad
M^{kx} \ \% \ q = M \ \% \ q
$$

Let's do the mod $p$ test first. In the unlikely event that $M$ is a multiple of $p$, we know $M \ \% \ p = 0$ and it's easy:

$$
M^{kx} \ \% \ p
= 0^{kx} \ \% \ p
= 0
= M \ \% \ p
$$

What if $M$ is not a multiple of $p$? The trick is to choose a private key $x$ such that

$$
kx \ \% \ (p-1)(q-1) = 1
$$

which means $kx-1 = h(p-1)(q-1)$ for some $h$. We don't know what $h$ is, but we do know that

$$
M^{kx}
= M \cdot M^{kx-1}
= M \cdot M^{h(p-1)(q-1)}
$$

Since $p$ is prime and $M$ is not a multiple of $p$, we can quote [Fermat's Little Theorem](https://en.wikipedia.org/wiki/Fermat's_little_theorem):

$$
M^{p-1} \ \% \ p = 1
$$

which means

$$
M \cdot M^{h(p-1)(q-1)} \ \% \ p
= M \cdot 1^{h(q-1)} \ \% \ p
= M \ \% \ p
$$

To prove that $M^{kx} \ \% \ q = M \ \% \ q$, repeat the same logic with $p$ and $q$ trading places. Since $M < n$, the two results together give $M^{kx} \ \% \ n = M$.  
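
The argument above can be checked by brute force on a tiny example. The sketch below uses the values from the small-prime example later in this tutorial ($p = 11$, $q = 13$, $k = x = 31$) and tests every possible $M < n$; the variable names are chosen for this illustration only.

```python
p, q = 11, 13
n = p * q                # 143
phi = (p - 1) * (q - 1)  # 120
k, x = 31, 31            # tiny key pair from the small-prime example

# The key pair satisfies kx % (p-1)(q-1) = 1.
assert (k * x) % phi == 1

# Fermat's little theorem: M^(p-1) % p = 1 whenever M is not a multiple of p.
fermat_ok = all(pow(M, p - 1, p) == 1 for M in range(1, n) if M % p != 0)

# The two residue conditions, including the case where M is a multiple of p or q.
residues_ok = all(
    pow(M, k * x, p) == M % p and pow(M, k * x, q) == M % q
    for M in range(n)
)

# The conclusion: decryption recovers every M < n.
rsa_ok = all(pow(M, k * x, n) == M for M in range(n))

print(fermat_ok, residues_ok, rsa_ok)  # True True True
```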

In [9]:
packets = packetize(text_to_digits("Hello, world!"))
print(packets,'\n')

cipher = [ pow(m,public_key,n) for m in packets ]
print(cipher,'\n')

decipher = [ pow(c,private_key,n) for c in cipher ]
print(decipher,'\n')

plaintext = digits_to_text(unpacketize(decipher))
print(plaintext)

[40697676, 79120087, 79827668, 1504146] 

[595727890, 1250557756, 653430825, 568244566] 

[40697676, 79120087, 79827668, 1504146] 

Hello, world!


## Generating the RSA number and key pair

Generating $n$ is easy: just choose two primes, multiply them together, and don't tell anyone what the two primes are.  
Choosing a good public key $k$ and private key $x$ is more complicated.

When Fidelio generates $n$, it uses the **Euler totient method** to calculate the keypair $(k,x)$. This is only possible because Fidelio knows the $p$ and $q$ it chose to generate $n = pq$. The method is a bit complicated, but can be computed quickly:

1. Calculate [Euler's totient](https://en.wikipedia.org/wiki/Euler's_totient_function) of $n$: $\phi(n) = (p-1)(q-1)$.  

2. Choose a public key $k$ such that:  
  a. $k$ is prime  
  b. $k < \phi(n)$  
  c. $k$ is not a factor of $\phi(n)$.  

3. Find $x$ such that $kx \ \% \ \phi(n) = 1$.

This $x$ is the multiplicative inverse of $k$ using modular arithmetic (mod $\phi(n)$).
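
The three steps can be sketched as follows. This is a simplified illustration, not Fidelio's implementation: the function name `toy_keypair` is hypothetical, it takes the first prime that qualifies instead of choosing one at random, and it relies on `pow(k, -1, phi)`, which requires Python 3.8 or later.

```python
def toy_keypair(p, q, candidate_primes):
    """Return (k, x, phi) for the primes p and q, following Steps 1 to 3."""
    phi = (p - 1) * (q - 1)                # Step 1: Euler's totient of n = p*q
    for k in candidate_primes:             # Step 2: k is prime by assumption
        if k < phi and phi % k != 0:       #         k < phi and k does not divide phi
            x = pow(k, -1, phi)            # Step 3: inverse of k mod phi (Python 3.8+)
            return k, x, phi
    raise ValueError("no suitable public key among the candidates")

k, x, phi = toy_keypair(11, 13, [2, 3, 5, 7, 11, 13])
print(k, x, phi)     # 7 103 120
print(k * x % phi)   # 1
```

Here 2, 3, and 5 are rejected because they divide 120, so the first usable public key is 7.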

In the decryption proof above, we used Steps 1 and 3 when we assumed

$$
kx \ \% \ (p-1)(q-1) = 1
$$

Step 2 guarantees that $x$ exists and is unique. There is a unique positive $x < \phi(n)$ such that $kx \ \% \ \phi(n) = 1$ if and only if $k$ and $\phi(n)$ are [relatively prime](https://en.wikipedia.org/wiki/Modular_multiplicative_inverse#Modular_arithmetic). Since $k$ is chosen from a list of primes, it's enough to check that $k < \phi(n)$ and $k$ is not a factor of $\phi(n)$.
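
A brute-force search makes the existence and uniqueness claim concrete for $\phi(n) = 120$. The helper name `inverses` is made up for this sketch; it simply tries every positive $x < \phi(n)$.

```python
def inverses(k, phi):
    """List every positive x < phi with k*x % phi == 1."""
    return [x for x in range(1, phi) if (k * x) % phi == 1]

phi = 120
print(inverses(31, phi))  # [31]   31 is relatively prime to 120: exactly one inverse
print(inverses(7, phi))   # [103]  7 is relatively prime to 120: exactly one inverse
print(inverses(5, phi))   # []     5 is a factor of 120: no inverse exists
```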

Note that anyone can replicate this method if they can factor $n$ into its prime factors $p$ and $q$. This is why it's important to use large prime numbers. If $n$ is large enough, then figuring out $p$ and $q$ is extremely slow, unless you have a reliable quantum computer, in which case you can use [Shor's algorithm](https://en.wikipedia.org/wiki/Shor%27s_algorithm).
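
To illustrate how little protection small primes offer, the sketch below factors the RSA number printed by `generate_keys()` earlier in this tutorial using plain trial division, then rebuilds the private key from the public values alone. The function name `smallest_factor` is hypothetical, and `pow(k, -1, phi)` requires Python 3.8 or later.

```python
def smallest_factor(n):
    """Return the smallest prime factor of n by trial division."""
    if n % 2 == 0:
        return 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f
        f += 2
    return n  # n itself is prime

# Public values copied from the key-generation output earlier in this tutorial.
n = 1254819653
public_key = 24473

p = smallest_factor(n)
q = n // p
phi = (p - 1) * (q - 1)
recovered_private_key = pow(public_key, -1, phi)

print(p, q)                   # 29569 42437
print(recovered_private_key)  # 856425641, the private key printed earlier
```

The loop needs fewer than 15,000 trial divisions here. Real RSA keys use primes hundreds of digits long, which puts trial division far out of reach.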

In [10]:
# Let's use tiny primes for this example
small_primes = load_primes(too_large=50)
print(small_primes)

Loading prime numbers from Primes.txt
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]


In [11]:
# Choose p and q at random from our list of primes
small_n, small_totient = choose_rsa_number(small_primes,verbose=True)

RSA number is 11 * 13 = 143
Euler totient is 10 * 12 = 120


In [12]:
# Choose a public key which meets the criteria
small_public_key = choose_public_key(small_primes,small_totient,verbose=True)

Public key is 31


In [13]:
# Check that the public key and totient are relatively prime
check_gcd = gcd(small_public_key,small_totient)
show_numbers = (small_public_key,small_totient)
if check_gcd == 1:
    print( "%s and %s are relatively prime" % show_numbers )
else:
    raise ValueError( "%s and %s are not relatively prime!" % show_numbers )

31 and 120 are relatively prime


## Finding the private key
Finding the inverse of $k$ mod $\phi(n)$ takes some work.  
Fidelio's `gcd_and_inverse()` function uses the [extended Euclidean algorithm](https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm) to find $x$.  
It also computes $\gcd(k,\phi(n))$, which must equal 1 for the inverse to exist. Because $k$ is prime, this is the same as saying that $k$ is not a factor of $\phi(n)$.  
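
For readers curious about the algorithm itself, here is a minimal sketch of the extended Euclidean algorithm. It is not Fidelio's `gcd_and_inverse()` source; the name `egcd_inverse_sketch` and the choice to return `None` when no inverse exists are assumptions made for this illustration.

```python
def egcd_inverse_sketch(a, m):
    """Return (gcd(a, m), inverse of a mod m), with None if no inverse exists."""
    old_r, r = a, m  # remainders from the ordinary Euclidean algorithm
    old_s, s = 1, 0  # running coefficients with old_s * a = old_r (mod m)
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
    if old_r != 1:
        return old_r, None     # a and m share a factor, so a has no inverse
    return old_r, old_s % m    # reduce the coefficient into the range 0..m-1

print(egcd_inverse_sketch(31, 120))  # (1, 31)
print(egcd_inverse_sketch(7, 120))   # (1, 103)
print(egcd_inverse_sketch(5, 120))   # (5, None)
```

The loop keeps the invariant that `old_s * a` and `old_r` differ by a multiple of `m`, so when the remainder reaches 1 the coefficient `old_s` is the inverse.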







In [14]:
# Find the private key x such that kx % totient = 1
check_gcd, small_private_key = gcd_and_inverse(small_public_key,small_totient)
print( "Private key is %s" % small_private_key )

Private key is 31


In [15]:
# Is the private key really the inverse of the public key (mod totient)?
check_inverse = (small_public_key * small_private_key) % small_totient
show_numbers = (small_private_key,small_public_key,small_totient)
if check_inverse == 1:
    print( "%s is the multiplicative inverse of %s (mod %s)" % show_numbers )
else:
    raise ValueError( "%s is not the multiplicative inverse of %s (mod %s)" % show_numbers )

31 is the multiplicative inverse of 31 (mod 120)
