# Problem 1: Affine Cipher

> The affine cipher is a monoalphabetic substitution cipher that uses modular arithmetic to encrypt the letters of a message. The encryption formula is $c = (ap + b) \bmod n$, where the input $p$ is the plaintext, the output $c$ is the ciphertext, $n$ is the modulus, and $a$ and $b$ are non-negative integers less than $n$. For decryption to be possible, $a$ and $n$ must be relatively prime. We assume every letter is encoded as a unique integer.

## 1a

> Describe affine cipher in more detail
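
The decryption side follows from the encryption formula but is not spelled out in the statement. A brief sketch, where $a^{-1}$ denotes the multiplicative inverse of $a$ modulo $n$ (notation assumed here), which exists exactly when $\gcd(a, n) = 1$:

$$
\begin{aligned}
c &\equiv ap + b \pmod{n} \\
c - b &\equiv ap \pmod{n} \\
p &\equiv a^{-1}(c - b) \pmod{n}
\end{aligned}
$$

For example, with $n = 26$ and $a = 5$, the inverse is $a^{-1} = 21$, because $5 \cdot 21 = 105 = 4 \cdot 26 + 1$.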

## 1b
> What is the size of the key space for a fixed integer n? Hint: use Euler's totient Φ(n).
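
A key is a pair (a, b): b can be any of the n residues, while a must be coprime to n, which leaves Φ(n) choices. The count can be sketched as follows; `totient` and `affine_key_space_size` are illustrative names, and the totient is computed by brute force, which is only sensible for small n.

```python
from math import gcd

def totient(n: int) -> int:
    """Euler's totient: how many integers in [1, n] are coprime to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def affine_key_space_size(n: int) -> int:
    """Number of key pairs (a, b): a must be invertible mod n, b is unrestricted."""
    return totient(n) * n

print(totient(26))                # 12
print(affine_key_space_size(26))  # 312
```

This count includes the trivial key a = 1, b = 0, which leaves the message unchanged.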

## 1c

> Assume the plaintext consists of the 26 capital letters only, so $n = 26$. Given the affine cipher $c = (5p + 9) \bmod 26$, what is the ciphertext for the plaintext “CRYPTO IS FUN”? Spaces are kept as is.

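The ciphertext is `TQZGAB XV IFW`. With the encoding A = 0, ..., Z = 25, the first letter C = 2 maps to $(5 \cdot 2 + 9) \bmod 26 = 19$, which is T.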

In [31]:
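import math
import string
from typing import List, Tuple

Coefficients = Tuple[int, int]
Keyspace = List[str]

def check_assertions(n: int, a: int, b: int):
    if a < 0 or b < 0 or a >= n or b >= n:
        raise ValueError(f'a and b must be non-negative and less than {n}')
    # a must be invertible modulo n, otherwise the ciphertext cannot be decrypted
    if math.gcd(a, n) != 1:
        raise ValueError(f'a={a} and n={n} must be relatively prime')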

def encrypt(plaintext: str, coefficients: Coefficients, keyspace: Keyspace) -> str:
    n = len(keyspace)
    a, b = coefficients
    check_assertions(n, a, b)

    plaintext_words: List[str] = plaintext.split(' ')
    ciphertext_words: List[str] = []

    for word in plaintext_words:
        # Encode each letter as its position in the alphabet, then apply c = (a*p + b) mod n
        indices = [keyspace.index(p) for p in word]
        ciphertext_words.append(
            ''.join(keyspace[(a * idx + b) % n] for idx in indices)
        )

    return ' '.join(ciphertext_words)

def decrypt(ciphertext: str, coefficients: Coefficients, keyspace: Keyspace) -> str:
    n = len(keyspace)
    a, b = coefficients
    check_assertions(n, a, b)
    
    def inverse(a: int, n: int) -> int:
        # Search for the multiplicative inverse of a modulo n, i.e. the
        # value p with (a * p) % n == 1; one exists only if gcd(a, n) == 1
        for p in range(1, n):
            if (a * p) % n == 1:
                return p

        raise ValueError(f'No inverse for {a} mod {n}')
    
    ciphertext_words: List[str] = ciphertext.split(' ')
    plaintext_words: List[str] = []
    a_inverse = inverse(a, n)

    for word in ciphertext_words:
        # Undo the encryption: p = a^-1 * (c - b) mod n
        indices = [keyspace.index(c) for c in word]
        plaintext_words.append(
            ''.join(keyspace[(a_inverse * (idx - b)) % n] for idx in indices)
        )

    return ' '.join(plaintext_words)
    

coefficients: Coefficients = (5, 9)
keyspace: Keyspace = list(string.ascii_uppercase)

c = encrypt("CRYPTO IS FUN", coefficients, keyspace)
assert(c == "TQZGAB XV IFW")
assert("CRYPTO IS FUN" == decrypt(c, coefficients, keyspace))

## 1d

> Eve has the ciphertext “QJKES REOGH GXXRE OXEO”. She magically knows the cipher is an affine cipher and the letter T is encrypted to H and O to E. Recover the decryption function and decipher the message. Students shall solve it manually first and then use code to solve it. They both shall give the same results. The code shall be more general, not just in this case.
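
A manual solution can be sketched as follows, assuming the encoding A = 0, ..., Z = 25 from 1c. The known pairs T → H and O → E (that is, 19 → 7 and 14 → 4) give two congruences in the unknown key $(a, b)$:

$$
\begin{aligned}
19a + b &\equiv 7 \pmod{26} \\
14a + b &\equiv 4 \pmod{26}
\end{aligned}
$$

Subtracting the second from the first eliminates $b$:

$$
\begin{aligned}
5a &\equiv 3 \pmod{26} \\
a &\equiv 5^{-1} \cdot 3 \equiv 21 \cdot 3 = 63 \equiv 11 \pmod{26} \\
b &\equiv 7 - 19 \cdot 11 = -202 \equiv 6 \pmod{26}
\end{aligned}
$$

Since $11 \cdot 19 = 209 = 8 \cdot 26 + 1$, the inverse of $a$ is $19$, and the decryption function is $p = 19(c - 6) \bmod 26$. Applied letter by letter, it turns the ciphertext into `IFYOU BOWAT ALLBO WLOW`, which regroups as “IF YOU BOW AT ALL BOW LOW”.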

In [37]:
ciphertext = "QJKES REOGH GXXRE OXEO"
# Known pairs: plaintext T encrypts to H and plaintext O encrypts to E,
# so ciphertext H must decrypt to T and ciphertext E must decrypt to O
H_indices = [idx for idx, c in enumerate(ciphertext) if c == "H"]
E_indices = [idx for idx, c in enumerate(ciphertext) if c == "E"]

for a in range(len(keyspace)):
    for b in range(len(keyspace)):
        try:
            plaintext = decrypt(ciphertext, (a, b), keyspace)
        except ValueError:
            # a has no inverse modulo n, so (a, b) is not a valid key
            continue

        if all(plaintext[idx] == "T" for idx in H_indices) and \
                all(plaintext[idx] == "O" for idx in E_indices):
            print(a, b, plaintext)

assert "IFYOU BOWAT ALLBO WLOW" == decrypt(ciphertext, (11, 6), keyspace)

11 6 IFYOU BOWAT ALLBO WLOW
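
Two known plaintext/ciphertext pairs give two linear congruences, so the key can also be solved for directly instead of searching every (a, b) pair. A minimal sketch, assuming the encoding A = 0, ..., Z = 25 and Python 3.8+ for `pow(x, -1, n)`; `solve_affine_key`, `affine_decrypt`, and `ALPHABET` are names introduced here for illustration only.

```python
from math import gcd
from typing import Tuple

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumed encoding: A = 0, ..., Z = 25

def solve_affine_key(pair1: Tuple[str, str], pair2: Tuple[str, str],
                     n: int = 26) -> Tuple[int, int]:
    """Recover (a, b) from two known (plaintext, ciphertext) letter pairs."""
    p1, c1 = (ALPHABET.index(ch) for ch in pair1)
    p2, c2 = (ALPHABET.index(ch) for ch in pair2)
    dp = (p1 - p2) % n
    if gcd(dp, n) != 1:
        raise ValueError("these two pairs do not determine a unique key")
    # Subtracting the congruences eliminates b: a * (p1 - p2) = c1 - c2 (mod n)
    a = ((c1 - c2) * pow(dp, -1, n)) % n
    b = (c1 - a * p1) % n
    return a, b

def affine_decrypt(ciphertext: str, a: int, b: int, n: int = 26) -> str:
    """Apply p = a^-1 * (c - b) mod n to every letter; spaces are kept."""
    a_inv = pow(a, -1, n)  # raises ValueError if a is not invertible mod n
    return "".join(
        ch if ch == " " else ALPHABET[(a_inv * (ALPHABET.index(ch) - b)) % n]
        for ch in ciphertext
    )

a, b = solve_affine_key(("T", "H"), ("O", "E"))
print(a, b, affine_decrypt("QJKES REOGH GXXRE OXEO", a, b))
# 11 6 IFYOU BOWAT ALLBO WLOW
```

The direct solve works whenever the difference of the two plaintext indices is invertible modulo n. Otherwise several keys fit the pairs and a search over the candidates is still needed.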


## 1e

> What is the affine formula if we want to include the space and lowercase letters in the encoded set?

$$
\begin{aligned}
n &= \lvert \{A, \dots, Z\} \cup \{a, \dots, z\} \cup \{\text{space}\} \rvert = 53 \\
E(p) &= (ap + b) \bmod n \\
D(c) &= a^{-1}(c - b) \bmod n
\end{aligned}
$$
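
The 53-symbol variant can be sketched as below. The ordering of the symbols (capitals, then lowercase, then space) is an assumption made for this example, since any fixed ordering works as long as both sides agree on it. `ALPHABET_53`, `encrypt_53`, and `decrypt_53` are illustrative names, and `pow(a, -1, N)` needs Python 3.8+.

```python
import string

# Assumed symbol order: A-Z, a-z, then the space character
ALPHABET_53 = string.ascii_uppercase + string.ascii_lowercase + " "
N = len(ALPHABET_53)  # 53

def encrypt_53(plaintext: str, a: int, b: int) -> str:
    """E(p) = (a*p + b) mod 53, applied to every symbol including spaces."""
    return "".join(
        ALPHABET_53[(a * ALPHABET_53.index(ch) + b) % N] for ch in plaintext
    )

def decrypt_53(ciphertext: str, a: int, b: int) -> str:
    """D(c) = a^-1 * (c - b) mod 53."""
    a_inv = pow(a, -1, N)
    return "".join(
        ALPHABET_53[(a_inv * (ALPHABET_53.index(ch) - b)) % N] for ch in ciphertext
    )

message = "Crypto is fun"
secret = encrypt_53(message, 5, 9)
assert decrypt_53(secret, 5, 9) == message
```

Because 53 is prime, every $a$ from 1 to 52 is invertible, so the coprimality condition only rules out $a = 0$.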

# Problem 2: Frequency Analysis

> Alice uses a simple substitution cipher to send her message to Bob. It reads “TNFOS FOZSW PZLOC GQAOZ WAGQR PJZPN ABCZP QDOGR AMTHA RAXTB AGZJO GMTHA RAVAP ZW”. Spaces are kept as is. Eve intercepts the ciphertext and has also heard that the word “liberty” appears in the plaintext.

## 2a

> Describe the substitution cipher.

## 2b

> What is the size of the key space?
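
Assuming the alphabet is the 26 capital letters and spaces are left alone, as the problem states, each key is one permutation of the alphabet, so the key space has 26! elements. A quick check of how large that is:

```python
import math

key_space = math.factorial(26)
print(key_space)             # 403291461126605635584000000
print(math.log2(key_space))  # a little over 88, i.e. about 88 bits
```

The count includes the identity permutation, which leaves the message unchanged.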

## 2c

> Use the frequency of English letters as reference to recover the plaintext. We can do it manually. Optionally, we can do it by coding. It is a bit of a challenge. It is doable.

In [34]:
from typing import Dict, List

# Ciphertext exactly as given in the problem statement, including the leading 'T'
ciphertext = "TNFOS FOZSW PZLOC GQAOZ WAGQR PJZPN ABCZP QDOGR AMTHA RAXTB AGZJO GMTHA RAVAP ZW"
ciphertext_condensed = ciphertext.replace(' ', '')
known_word = 'LIBERTY'
known_word_length = len(known_word)

english_letter_frequency = {
    'E' : 12.0,
    'T' : 9.10,
    'A' : 8.12,
    'O' : 7.68,
    'I' : 7.31,
    'N' : 6.95,
    'S' : 6.28,
    'R' : 6.02,
    'H' : 5.92,
    'D' : 4.32,
    'L' : 3.98,
    'U' : 2.88,
    'C' : 2.71,
    'M' : 2.61,
    'F' : 2.30,
    'Y' : 2.11,
    'W' : 2.09,
    'G' : 2.03,
    'P' : 1.82,
    'B' : 1.49,
    'V' : 1.11,
    'K' : 0.69,
    'X' : 0.17,
    'Q' : 0.11,
    'J' : 0.10,
    'Z' : 0.07
}

# English letters, most frequent first
english_letters_sorted: List[str] = sorted(
    english_letter_frequency, key=english_letter_frequency.get, reverse=True
)

# Ciphertext letters, most frequent first (ties broken alphabetically so runs are reproducible)
letter_counts: Dict[str, int] = {
    k: ciphertext_condensed.count(k) for k in set(ciphertext_condensed)
}
cipher_letters_sorted: List[str] = sorted(
    letter_counts, key=lambda k: (-letter_counts[k], k)
)

potential_plaintexts = []

# Find potential slices where 'LIBERTY' could be
for i in range(len(ciphertext_condensed) - known_word_length + 1):
    characters = ciphertext_condensed[i:i + known_word_length]

    # 'LIBERTY' has no repeated letter, so neither can the slice
    if len(set(characters)) != len(set(known_word)):
        continue

    mapping = dict(zip(characters, known_word))

    # Most likely when the two most frequent letters in the ciphertext
    # are the two most frequent letters in the English language
    if mapping.get(cipher_letters_sorted[0]) == english_letters_sorted[0] and \
            mapping.get(cipher_letters_sorted[1]) == english_letters_sorted[1]:

        # Deciphered letters are written in lowercase
        output = ciphertext
        for k, v in mapping.items():
            output = output.replace(k, v.lower())

        potential_plaintexts.append(output)

# English letters that are still unassigned once 'LIBERTY' is placed
remaining_english = [c for c in english_letters_sorted if c not in known_word]

for potential_plaintext in potential_plaintexts:
    print('Before:', potential_plaintext)

    # Letters that are still uppercase have not been deciphered yet
    remaining_cipher = sorted(
        {c for c in potential_plaintext if c.isupper()},
        key=lambda c: (-potential_plaintext.count(c), c)
    )

    # Guess the rest by matching frequency ranks
    output = potential_plaintext
    for c, guess in zip(remaining_cipher, remaining_english):
        output = output.replace(c, guess.lower())

    print('After:', output)


Before: TNFrS FrtSy PtLrl ibert yeibR PJtPN eBltP bDriR eMTHe ReXTB eitJr iMTHe ReVeP ty
After: nmhrf hrtfy atgrl ibert yeibo autam eslta bwrio ecnde oevns eitur icnde oepea ty
Before: iNFOS FOtSW PtLOC rQeOt WerQR PytPN ebCtP QDOrR eMiHe Relib ertyO rMiHe ReVeP tW
After: ifuaw uatwh otpad rseat hersn oytof ebdto sgarn emice nelib ertya rmice neveo th

Rank matching alone does not produce readable text from only 67 letters, so the remaining letters still have to be worked out by hand from the two `Before` candidates.
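
One way to finish by hand is sketched below. Two windows place `LIBERTY` so that the two most frequent ciphertext letters land on E and T: `CGQAOZW` and `XTBAGZJ`. The second sits between two copies of the group `MTHARA`, which fits "give me liberty or give me". The substitution key in this sketch was completed manually from that second window for this illustration; it is an assumption of the example, not the output of an automatic solver.

```python
from collections import Counter

ciphertext = ("TNFOS FOZSW PZLOC GQAOZ WAGQR PJZPN ABCZP QDOGR "
              "AMTHA RAXTB AGZJO GMTHA RAVAP ZW")
condensed = ciphertext.replace(" ", "")
crib = "LIBERTY"

# The two most common ciphertext letters are assumed to stand for E and T
(top1, _), (top2, _) = Counter(condensed).most_common(2)

# Windows with no repeated letter where those two letters line up with the crib's E and T
likely = [
    condensed[i:i + len(crib)]
    for i in range(len(condensed) - len(crib) + 1)
    if len(set(condensed[i:i + len(crib)])) == len(crib)
    and condensed[i + crib.index("E")] == top1
    and condensed[i + crib.index("T")] == top2
]
print(likely)  # ['CGQAOZW', 'XTBAGZJ']

# Hand-derived key (an assumption of this sketch): ciphertext letter -> plaintext letter
key = str.maketrans("ABCDFGHJLMNOPQRSTVWXZ", "EBUFNRVYCGKOASMWIDHLT")
print(ciphertext.translate(key))
# IKNOW NOTWH ATCOU RSEOT HERSM AYTAK EBUTA SFORM EGIVE MELIB ERTYO RGIVE MEDEA TH
```

Regrouped, the plaintext reads "I KNOW NOT WHAT COURSE OTHERS MAY TAKE BUT AS FOR ME GIVE ME LIBERTY OR GIVE ME DEATH".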

## 2d

> (optional) The following message is from a Vigenère cipher with a 3-letter English keyword: “CTMYR DOIBS RESRR RIJYR EBYLD IYMLC CYQXS RRMLQ FSDXF OWFKT CYJRR IQZSM X”. Recover the plaintext.
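
With only 26³ = 17,576 possible three-letter keys, one possible approach is to try them all and rank the decryptions by how English-like they look. The sketch below assumes the five-letter groups are only formatting, so spaces do not advance the key, and it scores text with the letter frequencies from the table in 2c. `vigenere_decrypt` and `english_score` are names introduced here for illustration.

```python
import itertools
import math
import string

ALPHABET = string.ascii_uppercase

# English letter frequencies in percent, same values as the table in 2c
FREQ = dict(zip(
    "ETAOINSRHDLUCMFYWGPBVKXQJZ",
    [12.0, 9.10, 8.12, 7.68, 7.31, 6.95, 6.28, 6.02, 5.92, 4.32, 3.98, 2.88, 2.71,
     2.61, 2.30, 2.11, 2.09, 2.03, 1.82, 1.49, 1.11, 0.69, 0.17, 0.11, 0.10, 0.07],
))

def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Subtract the repeating key from each letter; spaces do not consume key letters."""
    out, i = [], 0
    for ch in ciphertext:
        if ch == " ":
            out.append(ch)
            continue
        shift = ALPHABET.index(key[i % len(key)])
        out.append(ALPHABET[(ALPHABET.index(ch) - shift) % 26])
        i += 1
    return "".join(out)

def english_score(text: str) -> float:
    """Sum of log letter frequencies; a higher score means more English-like."""
    return sum(math.log(FREQ[ch]) for ch in text if ch != " ")

ciphertext = ("CTMYR DOIBS RESRR RIJYR EBYLD IYMLC CYQXS RRMLQ "
              "FSDXF OWFKT CYJRR IQZSM X")

# Rank every three-letter key by the score of the text it produces
candidates = sorted(
    ("".join(k) for k in itertools.product(ALPHABET, repeat=3)),
    key=lambda k: english_score(vigenere_decrypt(ciphertext, k)),
    reverse=True,
)

# Print the best few candidates for manual inspection
for key in candidates[:5]:
    print(key, vigenere_decrypt(ciphertext, key))
```

Single-letter frequencies are a crude score on a text this short, so the top few candidates are printed for inspection instead of trusting the first one. Since the keyword is said to be an English word, that is a further filter on the list.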