# Basic functions for “classical”/historical ciphers

## Simple stuff for converting letters to numbers mod 26, etc: 

In [1]:
from math116 import cleanstring, tochar, tonum

cleanstring("This is a test")

'THISISATEST'

In [2]:
tonum("A"), tonum("B"), tonum("Z")

(0, 1, 25)

In [3]:
tochar(0) + tochar(1) + tochar(2)

'ABC'
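If you're curious what these helpers do, here is a minimal sketch of how they could be written. This is not the actual `math116` source: the `_sketch` names are made up for illustration, and it assumes that `cleanstring` simply uppercases the text and drops everything except the letters A to Z, which is consistent with the output above.

```python
def cleanstring_sketch(text):
    """Uppercase the text and keep only the letters A-Z (assumed behavior)."""
    return "".join(ch for ch in text.upper() if "A" <= ch <= "Z")

def tonum_sketch(letter):
    """Map 'A' -> 0, 'B' -> 1, ..., 'Z' -> 25."""
    return ord(letter) - ord("A")

def tochar_sketch(number):
    """Map a number (reduced mod 26) back to a letter: 0 -> 'A', ..., 25 -> 'Z'."""
    return chr(number % 26 + ord("A"))

print(cleanstring_sketch("This is a test"))                    # THISISATEST
print(tonum_sketch("A"), tonum_sketch("B"), tonum_sketch("Z")) # 0 1 25
print(tochar_sketch(0) + tochar_sketch(1) + tochar_sketch(2))  # ABC
```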

## Caesar (shift) cipher

In [4]:
from math116 import caesar_encrypt, caesar_decrypt

plaintext = cleanstring("This is a test")
ciphertext = caesar_encrypt(plaintext, 11)
ciphertext

'ESTDTDLEPDE'

In [5]:
caesar_decrypt(ciphertext, 11)

'THISISATEST'
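Under the hood, a shift cipher is just addition mod 26 on each letter. The sketch below is an illustration with a made-up name (`caesar_shift_sketch`), not the library's code, and it assumes the input has already been cleaned to the letters A to Z. Decrypting is the same operation with the opposite shift.

```python
def caesar_shift_sketch(text, shift):
    """Shift every letter of an A-Z string by 'shift' positions, mod 26."""
    return "".join(
        chr((ord(ch) - ord("A") + shift) % 26 + ord("A")) for ch in text
    )

ciphertext = caesar_shift_sketch("THISISATEST", 11)
print(ciphertext)                             # ESTDTDLEPDE
print(caesar_shift_sketch(ciphertext, -11))   # THISISATEST
```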

## Vigenère (polyalphabetic shift) cipher

In [6]:
from math116 import vigenere_encrypt, vigenere_decrypt, vigenere_crack

plaintext = cleanstring("The quick brown fox jumped over the lazy dog.")
ciphertext = vigenere_encrypt(plaintext, "BROWN")
ciphertext

'UYSMHJTYXEPNBBBYAIICFUCRRSKVAYBQMZBH'

In [7]:
vigenere_decrypt(ciphertext, "BROWN")

'THEQUICKBROWNFOXJUMPEDOVERTHELAZYDOG'
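The Vigenère cipher is the same idea as the shift cipher, except that the shift for each letter comes from the corresponding letter of a repeating key. Here is a rough sketch (hypothetical name `vigenere_sketch`, not the library's implementation) that assumes both the text and the key contain only the letters A to Z.

```python
from itertools import cycle

def vigenere_sketch(text, key, decrypt=False):
    """Add (or, when decrypting, subtract) the repeating key, letter by letter, mod 26."""
    sign = -1 if decrypt else 1
    result = []
    for ch, k in zip(text, cycle(key)):
        shift = sign * (ord(k) - ord("A"))
        result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(result)

plaintext = "THEQUICKBROWNFOXJUMPEDOVERTHELAZYDOG"
ciphertext = vigenere_sketch(plaintext, "BROWN")
print(ciphertext)                                          # UYSMHJTYXEPNBBBYAIICFUCRRSKVAYBQMZBH
print(vigenere_sketch(ciphertext, "BROWN", decrypt=True))  # THEQUICKBROWNFOXJUMPEDOVERTHELAZYDOG
```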

### The `vigenere_crack` function, demonstrated below, 

puts together all the tricks we learned for doing a ciphertext-only attack on the Vigenère cipher (using letter frequency analysis). It takes a ciphertext, and optionally a bound on the key length, and returns its best guess at the key together with the corresponding plaintext. 

In [8]:
ciphertext = cleanstring("""
    YSFWPYGHZZUBUECFDDXDSGIMAQYKRPDSXDYKBRBBQMVVYTAKDQRHVOPGKBMG
    TUUCTEAEQQPZPVCPHTXDHTCNCTKSEBGFQSMCZIIMTTZIMDSDIPPXIXWOFOEB
    KBSHLMWEMUAMCCGGIFTYEGXBMPXFRJWZKHDILWBZCGHTMOAMCOMGKVZOBSJM
    HIZXEIWWOIVXGUMJSDTMCZMQMQNFJMNSYCAEKSGMQYVFLQFFWLXPIRQNEBFR
    JSDTLJMIMAGKJVGPHTTLVFMZAQYHYCECZUPHXVVBGLHRRGGAGTRTRTUKEWKY
    TMOPAEVMOGYAOKCXSDPREBRNBVASLLKHQSDXTXZAQYODCTWOPZVKIILGKGRG
    FHAISIXRZUKXGFDVVQALXMIMCPMWCNTCBTCXRISKJTBXCFWZHFGAQVVPXFRQ
    UVMAWFXQPBWTZCWCDBGZZXHWGVASICUDQREMOIVCVACIGVWQH
""")
vigenere_crack(ciphertext)

('COMPLETEIICTORY',
 'WETHEUNDRRSIGNEDPRISONEESOFWARBELONGINTTOTHEARMYOFNORGHERNVIRGINIAHAIINGBEENTHISDAYFURRENDEREDBYGEAERALROBERTELEEPOMMANDINGSAIDAEMYTOLIEUTENANTTENERALULYSSESSTRANTCOMMANDINGNRMIESOFUNITEDSGATESDOHEREBYGIIEOURSOLEMNPAROYEOFHONORTHATWEJILLNOTHEREAFTEESERVEINTHEARMIRSOFTHECONFEDERNTESTATESORINANLMILITARYCAPACIGYWHATEVERAGAINFTTHEUNITEDSTATRSOFAMERICAORREADERSAIDTOTHEENRMIESOFTHELATTEEUNTILPROPERTYEKCHANGEDINSUCHMNNNERASSHALLBEMHTUALLYAPPROVEDOYTHERESPECTIVENUTHORITIES')

### The above example *almost* worked perfectly. 

In this case, the plaintext was encrypted with a 15-letter key, and the ciphertext just isn't long enough for a fully automated attack based strictly on letter frequencies to work with a key that long. However, the result is close enough to be able to easily figure out the one incorrect letter in the key! 
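To see the fix concretely: the decryption is wrong at positions 8, 23, 38, ... (counting from 0), as in UNDRRSIGNED and PRISONEES, so the bad key letter is the one at index 8. Reading the guessed key as a phrase suggests changing that letter from I to V, giving COMPLETEVICTORY. That corrected key is an inference from the output above, not something the function returned. The sketch below uses a made-up helper name and only the first 15 letters of the ciphertext.

```python
from itertools import cycle

def vigenere_decrypt_sketch(ciphertext, key):
    """Subtract the repeating key from an A-Z ciphertext, letter by letter, mod 26."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + ord("A")) for c, k in zip(ciphertext, cycle(key))
    )

first_block = "YSFWPYGHZZUBUEC"   # first 15 letters of the ciphertext above
guessed_key = "COMPLETEIICTORY"   # what vigenere_crack returned
# Replace the key letter at index 8 (assumed fix, inferred from the garbled plaintext)
fixed_key = guessed_key[:8] + "V" + guessed_key[9:]

print(fixed_key)                                          # COMPLETEVICTORY
print(vigenere_decrypt_sketch(first_block, guessed_key))  # WETHEUNDRRSIGNE
print(vigenere_decrypt_sketch(first_block, fixed_key))    # WETHEUNDERSIGNE
```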




## Later on, when dealing with RSA, some textbook problems use an encoding 

in which each letter is mapped to a number in the range$^*$ 1 .. 26, and then all of these numbers are just concatenated together, in base 10. So, for example, the word BADGER encodes as follows: B $\to 02$, A $\to 01$, D $\to 04$, G $\to 07$, E $\to 05$, and R $\to 18$, so all together 
$$ \mathtt{BADGER} \to 20104070518 $$
(The leading 0 of B's $02$ disappears because the result is an ordinary integer.) 

$^*$ Note that this is different from the encoding of individual letters above, which maps each letter to a number between 0 and 25, inclusive. 

The following two functions can automate this encoding/decoding for you: 

In [9]:
from math116 import text_to_num, num_to_text

text_to_num("BADGER")

20104070518

In [10]:
num_to_text(4152515210705200920)

'DOYOUGETIT'
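In case it helps to see the encoding spelled out, here is a small sketch of both directions. The function names are hypothetical and this is not the `math116` source. The only subtle point is that an odd number of digits means a leading zero was dropped, so the decoder puts it back before splitting into pairs.

```python
def text_to_num_sketch(text):
    """Encode A-Z text as two decimal digits per letter (A=01, ..., Z=26)."""
    return int("".join(f"{ord(ch) - ord('A') + 1:02d}" for ch in text))

def num_to_text_sketch(number):
    """Decode an integer produced by the encoding above back into letters."""
    digits = str(number)
    if len(digits) % 2 == 1:
        digits = "0" + digits  # restore the leading zero lost by the integer
    pairs = (digits[i:i + 2] for i in range(0, len(digits), 2))
    return "".join(chr(int(pair) - 1 + ord("A")) for pair in pairs)

print(text_to_num_sketch("BADGER"))             # 20104070518
print(num_to_text_sketch(4152515210705200920))  # DOYOUGETIT
```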


# Basic number theory functions

## GCDs, the Extended Euclidean Algorithm, and one immediate application of it

In [11]:
from math116 import gcd, euclidean, inverse

In [12]:
# Do the “basic” Euclidean algorithm, i.e. just compute the gcd
help(gcd)

Help on function gcd in module math116:

gcd(a, b)
    Returns the GCD of a and b



In [13]:
gcd(42, 182)

14

In [14]:
gcd(-42, 182)

14

In [15]:
gcd(42, -182)

14

In [16]:
# Do the Extended Euclidean Algorithm
help(euclidean)

Help on function euclidean in module math116:

euclidean(a, b)
    Returns g, x, y, such that g = gcd(a, b) and ax + by = g



In [17]:
euclidean(42, 182)

(14, -4, 1)

In [18]:
g, x, y = euclidean(42, 182)
print(f"gcd(42, 182) = {g}, and, like, oh my gosh, 42×({x}) + 182×({y}) = {42*x + 182*y}. Wowie!")

gcd(42, 182) = 14, and, like, oh my gosh, 42×(-4) + 182×(1) = 14. Wowie!
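For reference, here is one standard way to write the Extended Euclidean Algorithm iteratively. It is a sketch under the assumption that both inputs are non-negative, and the name `euclidean_sketch` is made up; the library's own implementation may differ (for instance in how it handles negative inputs).

```python
def euclidean_sketch(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g, for a, b >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder stays equal to a*(its x) + b*(its y) throughout the loop
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

g, x, y = euclidean_sketch(42, 182)
print(g, x, y)             # 14 -4 1
print(42 * x + 182 * y)    # 14
```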


In [19]:
# Compute the multiplicative inverse of a (mod n)
help(inverse)

Help on function inverse in module math116:

inverse(a, n)
    Returns the inverse of a modulo n, raises ValueError if not coprime



In [20]:
inverse(15, 26)

7

In [21]:
7 * 15 % 26

1

In [22]:
inverse(6, 26) # Should raise a ValueError!

ValueError: 6 is not invertible modulo 26
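If you ever need a modular inverse without the library, Python's built-in `pow` accepts an exponent of -1 together with a modulus (Python 3.8 and later) and raises `ValueError` when no inverse exists. The wrapper below is only a sketch with a made-up name; it is not how `math116.inverse` is necessarily implemented.

```python
def inverse_sketch(a, n):
    """Return the inverse of a modulo n, or raise ValueError if gcd(a, n) != 1."""
    try:
        return pow(a, -1, n)  # built-in modular inverse (Python 3.8+)
    except ValueError:
        raise ValueError(f"{a} is not invertible modulo {n}") from None

print(inverse_sketch(15, 26))   # 7

try:
    inverse_sketch(6, 26)
except ValueError as error:
    print(error)                # 6 is not invertible modulo 26
```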

## Solving systems of congruences: the Chinese Remainder Theorem

In [23]:
from math116 import crt, crt_basic

In [24]:
# Do the basic Chinese Remainder Theorem, on two congruences, with coprime moduli
help(crt_basic)

Help on function crt_basic in module math116:

crt_basic(a1, a2, m1, m2)
    Performs the Chinese Remainder Theorem on a system of TWO congruences
    
    This function ONLY WORKS with a system of two congruences, and only works 
    if the two moduli m1 and m2 are coprime. In that case, the modulus of the 
    result should just be m1 * m2, so this function doesn't bother returning 
    that modulus. It just computes and returns a residue x that is congruent to 
    a1 modulo m1 and is congruent to a2 modulo m2. 
    
    This will raise ValueError if the moduli m1 and m2 are not relatively prime.



In [25]:
crt_basic(3, 5, 11, 15)

80

The above means that $\begin{cases} x \equiv 3 \pmod{11} \\ x \equiv 5 \pmod{15} \end{cases} \qquad \iff \qquad x \equiv 80 \pmod{165}$
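Here is one way the two-congruence case can be computed by hand or in code: write $x = a_1 + m_1 t$ and solve $m_1 t \equiv a_2 - a_1 \pmod{m_2}$ for $t$ using the inverse of $m_1$ modulo $m_2$. The sketch below uses a hypothetical name and Python's built-in modular inverse (`pow(m1, -1, m2)`, Python 3.8+); the library may do this differently.

```python
from math import gcd

def crt_two_sketch(a1, a2, m1, m2):
    """Return x with x = a1 (mod m1) and x = a2 (mod m2), for coprime m1, m2."""
    if gcd(m1, m2) != 1:
        raise ValueError(f"{m1} and {m2} are not coprime")
    # x = a1 + m1*t, where t solves m1*t = a2 - a1 (mod m2)
    t = ((a2 - a1) * pow(m1, -1, m2)) % m2
    return (a1 + m1 * t) % (m1 * m2)

print(crt_two_sketch(3, 5, 11, 15))   # 80
```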

In [26]:
# The following fails because 10 and 15 are not coprime, even though the system has a solution
crt_basic(3, 8, 10, 15)

ValueError: 10 and 15 are not coprime

In [27]:
# A slightly fancier version of the CRT, that handles more than two congruences
help(crt)
# If you look at the source code for this, you might notice that it implements 
# the algorithm exactly as described in problem 7 from Homework 4. 

Help on function crt in module math116:

crt(residues, moduli)
    Performs the Chinese Remainder Theorem
    
    Returns a pair (x, n) where n is the product of the moduli in the second 
    argument, and x is congruent to each residue in the first argument modulo 
    the corresponding modulus from the second argument. 
    
    Note that if you have the residues and moduli already paired off, you can 
    just "un-pair" them with zip, e.g.: 
        crt(*zip((2, 5), (4, 7), (1, 11))) # returns (67, 385)
    
    This will raise ValueError if the moduli are not pairwise relatively prime.



In [28]:
crt([2, 4, 1], [5, 7, 11])

(67, 385)

The above means that $\begin{cases} x \equiv 2 \pmod{5} \\ x \equiv 4 \pmod{7} \\ x \equiv 1 \pmod{11} \end{cases} \qquad \iff \qquad x \equiv 67 \pmod{385}$

In [29]:
# As explained in the documentation, if you have the a_i's and m_i's already 
# paired together, you can call crt with 'zip', as follows: 
cong1 = (2, 5)    # 2 (mod 5)
cong2 = (4, 7)    # 4 (mod 7)
cong3 = (1, 11)   # 1 (mod 11)
crt(*zip(cong1, cong2, cong3))

(67, 385)
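For more than two congruences with pairwise coprime moduli, the usual construction is $x = \sum_i a_i N_i (N_i^{-1} \bmod m_i)$ with $N_i = n / m_i$ and $n = \prod_i m_i$. The sketch below (made-up name, not the library source) follows that formula; whether it matches the homework algorithm line for line is an assumption. It fails with `ValueError` when some $N_i$ is not invertible modulo $m_i$, i.e. when the moduli are not pairwise coprime.

```python
from math import prod

def crt_sketch(residues, moduli):
    """Return (x, n) with n = product of the moduli and x = a_i (mod m_i) for each i."""
    n = prod(moduli)
    x = 0
    for a, m in zip(residues, moduli):
        partial = n // m                         # N_i, the product of the other moduli
        x += a * partial * pow(partial, -1, m)   # pow raises ValueError if not invertible
    return x % n, n

print(crt_sketch([2, 4, 1], [5, 7, 11]))             # (67, 385)
print(crt_sketch(*zip((2, 5), (4, 7), (1, 11))))     # (67, 385)
```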

In [30]:
# The following will fail since the moduli are not pairwise coprime, even though 
# the system of congruences does have a solution. 
cong1 = (4, 12)    # 4 (mod 12)
cong2 = (28, 50)   # 28 (mod 50)
cong3 = (13, 45)   # 13 (mod 45)
crt(*zip(cong1, cong2, cong3))

ValueError: 2250 is not invertible modulo 12

In [31]:
from math116 import crt_general

In [32]:
# If you want to solve systems with non-coprime moduli, you can use crt_general
help(crt_general)

Help on function crt_general in module math116:

crt_general(residues, moduli)
    Performs the Chinese Remainder Theorem
    
    Returns a pair (x, n) where n is the least common multiple of the moduli in 
    the second argument, and x is congruent to each residue in the first 
    argument modulo the corresponding modulus from the second argument. 
    
    Note that if you have the residues and moduli already paired off, you can 
    just "un-pair" them with zip, e.g.: 
        crt_general(*zip((2, 5), (4, 7), (1, 11))) # returns (67, 385)
    
    This version works even if the moduli are not pairwise relatively prime, if 
    there is a solution. It will raise ValueError if there is no solution.



In [33]:
# The example above that failed with 'crt_basic'
crt_general((3, 8), (10, 15))

(23, 30)

The above means that $\begin{cases} x \equiv 3 \pmod{10} \\ x \equiv 8 \pmod{15} \end{cases} \qquad \iff \qquad x \equiv 23 \pmod{30}$

In [34]:
# But the following fails because there actually is no solution
crt_general((3, 7), (10, 15))

ValueError: No solution to this system of congruences

In [35]:
# The example above that failed with 'crt'
cong1 = (4, 12)    # 4 (mod 12)
cong2 = (28, 50)   # 28 (mod 50)
cong3 = (13, 45)   # 13 (mod 45)
crt_general(*zip(cong1, cong2, cong3))

(328, 900)
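One way to handle moduli that are not coprime is to merge the congruences one at a time. Given $x \equiv x_0 \pmod{n}$ and a new congruence $x \equiv a \pmod{m}$ with $g = \gcd(n, m)$, a solution exists only if $g$ divides $a - x_0$, and the merged modulus is $\operatorname{lcm}(n, m)$. The following is a sketch of that idea with a hypothetical name; it is not taken from the `math116` source.

```python
from math import gcd

def crt_general_sketch(residues, moduli):
    """Return (x, n) with n = lcm of the moduli, or raise ValueError if no solution."""
    x, n = 0, 1  # every integer is 0 (mod 1), so this is a neutral starting point
    for a, m in zip(residues, moduli):
        g = gcd(n, m)
        if (a - x) % g != 0:
            raise ValueError("No solution to this system of congruences")
        # Look for t with x + n*t = a (mod m), i.e. (n/g)*t = (a - x)/g (mod m/g)
        reduced = m // g
        if reduced == 1:
            t = 0  # the new congruence is already implied by the old one
        else:
            t = ((a - x) // g * pow(n // g, -1, reduced)) % reduced
        x = x + n * t
        n = n // g * m  # lcm(n, m)
        x %= n
    return x, n

print(crt_general_sketch((3, 8), (10, 15)))                   # (23, 30)
print(crt_general_sketch(*zip((4, 12), (28, 50), (13, 45))))  # (328, 900)
```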


# Factoring and primality testing

## The Miller–Rabin strong pseudoprime test

In [36]:
from math116 import pseudoprime, is_probably_prime

In [37]:
help(pseudoprime)

Help on function pseudoprime in module math116:

pseudoprime(n, base)
    Perform the Miller–Rabin (strong pseudoprime) test on n
    
    Note that in some (rare) cases, the Miller–Rabin test can actually produce 
    a non-trivial factorization of n, however this function does not return 
    those factors. If you want to run this same algorithm, but obtain that 
    factorization when possible, you can use the 'universal_exponent_factor' 
    function in this library, and pass it n and n - 1. But note that it is 
    likely to raise ValueError if n is not prime. See its documentation. 
    
    Returns True or False
    If this returns True, n is either prime or a strong pseudoprime for 'base'. 
    If this returns False, n is guaranteed to be composite.



In [38]:
pseudoprime(251521, 5)

True

In [39]:
# 251521 appears to be prime, because it's a strong (Miller–Rabin) pseudoprime for base 5
# But... not for many other bases
pseudoprime(251521, 2)

False

In [40]:
pseudoprime(251521, 3)

False

In [41]:
pseudoprime(251521, 7)

False
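For reference, the strong pseudoprime test itself is short. Write $n - 1 = 2^s d$ with $d$ odd; then $n$ passes for a given base $b$ if $b^d \equiv 1$ or if $b^{2^r d} \equiv -1 \pmod{n}$ for some $0 \le r < s$. The sketch below uses a made-up name and assumes $n$ is odd and greater than 2; it illustrates the test and is not the library's code. The number 2047 = 23 × 89 is used because it is the smallest strong pseudoprime for base 2.

```python
def strong_pseudoprime_sketch(n, base):
    """Miller-Rabin test for a single base; assumes n is odd and n > 2."""
    d, s = n - 1, 0
    while d % 2 == 0:      # write n - 1 = 2^s * d with d odd
        d //= 2
        s += 1
    x = pow(base, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1): # square up to s - 1 times, looking for -1
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False           # n is definitely composite

print(strong_pseudoprime_sketch(2047, 2))   # True  (composite, but fools base 2)
print(strong_pseudoprime_sketch(2047, 3))   # False (base 3 catches it)
print(strong_pseudoprime_sketch(97, 2))     # True  (97 is prime)
```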

In [42]:
# is_probably_prime is a one-line function that just calls pseudoprime for several bases
help(is_probably_prime)

Help on function is_probably_prime in module math116:

is_probably_prime(n, bases=(2, 3, 5, 7, 11))
    Check if n is prime, by running Miller–Rabin test for several bases



In [43]:
# Here's its full source code: return all(pseudoprime(n, base) for base in bases)
is_probably_prime??

Signature: is_probably_prime(n, bases=(2, 3, 5, 7, 11))
Source:   
def is_probably_prime(n, bases=(2, 3, 5, 7, 11)):
    "Check if n is prime, by running Miller–Rabin test for several bases"
    return all(pseudoprime(n, base) for base in bases)
File:      ~/Home/Teaching/UCLA/2024-01 (Winter) Math 116/Computer materials/B

In [44]:
p = 722449162587487
is_probably_prime(p)

True

In [45]:
q = 929168008237529
is_probably_prime(q)

True

Both of the above numbers are apparently prime (or, at least, it's extremely likely they are). 

But if we multiply them, we get a number that's hard to factor by trial division (because it only has these two large prime factors). 

However, our primality-testing function should still tell us that it's not prime: 


In [46]:
n = p*q
is_probably_prime(n)

False

## Getting the complete factorization of a (small) number

In [47]:
from math116 import factor

In [48]:
help(factor)

Help on function factor in module math116:

factor(n)
    Returns the factorization of n by trial division, using the PRIMES list



In [49]:
factor(3000)

Counter({2: 3, 5: 3, 3: 1})

The `factor` function just does trial division, and only tries primes up to 4 digits ($< 10,000$). So don't expect to be able to use it to reliably factor any really large numbers. 

The nice thing about it, though, is that when it can factor the number, it actually gives a complete factorization of it, as a `Counter` object (like a Python dictionary). So the above result says that $3000 = 2^3 \cdot 3^1 \cdot 5^3$. 

This function will still work for a lot of random numbers that are in the 10-15 digit range. But it won't work for most random numbers that are much larger than that. 

In [50]:
factor(2938470912) # I just generated this by typing random digits, and it happens to work

Counter({2: 9, 3: 3, 13: 1, 83: 1, 197: 1})

A positive side-effect of this function having a limit, though, is that it will fail gracefully if $n$ has any prime factors that are above this limit, rather than just running for a million years. It will raise a `ValueError` exception in this case. 

In [51]:
factor(57238153663297070418971354211881034720)

ValueError: n has a 'large' prime factor. Known factorization Counter({2: 5, 3: 2, 7: 2, 5: 1, 11: 1, 31: 1, 101: 1}). Remaining unfactored part 23553274379156591769681969107

In [None]:
is_probably_prime(23553274379156591769681969107)

So in this example, there are some larger primes that still divide that big remaining factor...
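To make the behavior described above concrete, here is a sketch of trial division with a cutoff. It is not the library's `factor`: the name and the `limit` parameter are made up, and it tries every integer below the limit rather than a precomputed list of primes (composite trial divisors never divide anything, because their prime factors have already been removed). Like the real function, it gives up with a `ValueError` instead of running forever.

```python
from collections import Counter

def factor_sketch(n, limit=10_000):
    """Factor n by trial division with divisors below 'limit'; return a Counter."""
    factors = Counter()
    for d in range(2, limit):
        while n % d == 0:
            factors[d] += 1
            n //= d
        if n == 1:
            return factors
    raise ValueError(f"unfactored part {n} has no prime factor below {limit}")

print(factor_sketch(3000) == Counter({2: 3, 3: 1, 5: 3}))   # True
print(sorted(factor_sketch(2938470912).items()))
# [(2, 9), (3, 3), (13, 1), (83, 1), (197, 1)]
```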

## Other factoring algorithms

The `math116` library now also includes functions for the “universal exponent” factoring algorithm (which is very similar to how Miller–Rabin works, but specifically looking for a factor, rather than testing primality) and the Pollard $p - 1$ factoring algorithm. You can play around with these yourself, but remember that they're both sort of “special purpose” factoring algorithms. Each one only works efficiently in special circumstances. 

Remember, of course, that there is no known efficient (polynomial-time) general-purpose factoring algorithm for ordinary computers. The running time of every known general-purpose factoring algorithm grows faster than any polynomial in the length (number of digits, or number of bits) of the number that you're trying to factor. 

In [52]:
from math116 import universal_exponent_factor, pollard_factor

In [53]:
help(universal_exponent_factor)

Help on function universal_exponent_factor in module math116:

universal_exponent_factor(n, exponent, bases=(2, 3, 5, 7, 11))
    Factor 'n', given that 'exponent' is a universal exponent for n
    
    A universal exponent is a positive integer e for which 
        b^e = 1 (mod n)
    for any b that is relatively prime to n. If n is prime, then e is a 
    universal exponent for n iff e is a multiple of n - 1 (Fermat's Little 
    Theorem for one direction, existence of primitive roots for the other). 
    More generally, for any n, \phi(n) is a universal exponent for n (Euler's 
    Theorem). 
    
    This algorithm is not guaranteed to produce a factorization of n, but it's 
    very likely to. If it cannot find one, it will return None. In particular, 
    providing this function with a number n, and n - 1 as the (supposed) 
    universal exponent will do the equivalent of the Miller–Rabin (strong 
    pseudoprime) test on n for the bases in 'bases'. In that case, if the 
    func
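Here is a rough sketch of the idea behind factoring with a universal exponent. Write the exponent as $2^s d$ with $d$ odd, compute $b^d \bmod n$, and keep squaring. If some value $x \not\equiv \pm 1$ squares to 1, then $\gcd(x - 1, n)$ is a non-trivial factor. The name, the return value (a single factor, or `None`), and the lack of any error handling are simplifications for illustration and are not taken from the library.

```python
from math import gcd

def universal_exponent_factor_sketch(n, exponent, bases=(2, 3, 5, 7, 11)):
    """Try to find a non-trivial factor of n, given a universal exponent for n."""
    d, s = exponent, 0
    while d % 2 == 0:          # write exponent = 2^s * d with d odd
        d //= 2
        s += 1
    for base in bases:
        g = gcd(base, n)
        if 1 < g < n:
            return g           # lucky case: the base already shares a factor with n
        if g == n:
            continue           # base is a multiple of n, useless
        x = pow(base, d, n)
        for _ in range(s):
            y = pow(x, 2, n)
            if y == 1 and x != 1 and x != n - 1:
                return gcd(x - 1, n)   # x is a square root of 1 other than +-1
            x = y
    return None

# 91 = 7 * 13, and 12 = lcm(6, 12) is a universal exponent for 91
print(universal_exponent_factor_sketch(91, 12))   # 7
# 97 is prime, so no factor can be found with exponent 96
print(universal_exponent_factor_sketch(97, 96))   # None
```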

In [54]:
help(pollard_factor)

Help on function pollard_factor in module math116:

pollard_factor(n, bases=(2,), bound=100000)
    Runs the Pollard p - 1 factoring algorithm on n

