# Task 2: Vigenère
<i>Krypto Lab, Solution by Felix Kleinsteuber, 185 709</i>

Write a program that automatically decrypts a Vigenère cipher.
- Input: encrypted Lorem ipsum text
- Output the key and the key length.

## 1. Recycling from last week: Additive decryption

In [22]:
# Reading a file completely
def readFile(filename):
    with open(filename, "r") as f:
        return f.read()

def writeFile(filename, content):
    with open(filename, "w") as f:
        return f.write(content)

from collections import Counter

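# Automatic additive decryption: assumes the most frequent ciphertext character
# is the encryption of `most_common` (a space by default)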
def findKeyAdditive(content, most_common=" "):
    counts = Counter(content).most_common(1)
    return (ord(counts[0][0]) - ord(most_common)) % 128
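
As a quick sanity check of the frequency heuristic, the following sketch encrypts a short sentence with an additive key and then recovers that key. The helper `encrypt_additive`, the snake_case name `find_key_additive`, the sample sentence, and the key 42 are illustrative assumptions and not part of the original lab. The recovery only works when the space really is the most frequent plaintext character.

```python
from collections import Counter

# Hypothetical helper for this sketch: shift every character by `key` modulo 128.
def encrypt_additive(content, key):
    return "".join(chr((ord(c) + key) % 128) for c in content)

# Same heuristic as findKeyAdditive: assume the most frequent ciphertext
# character is the encryption of `most_common`.
def find_key_additive(content, most_common=" "):
    top_char, _ = Counter(content).most_common(1)[0]
    return (ord(top_char) - ord(most_common)) % 128

# Assumed sample text in which the space is the most frequent character.
sample = "the quick brown fox jumps over the lazy dog and runs far away"
cipher = encrypt_additive(sample, 42)
recovered_key = find_key_additive(cipher)
print("Recovered key:", recovered_key)
```

If a different character dominated the plaintext, the recovered key would be off by the distance between that character and the space, which is why `most_common` is a parameter.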


## 2. Calculate Index of Coincidence (IC)

In [4]:
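# Returns the IC value for a given text
def calc_ic(content):
    counts = Counter(content).most_common()
    n = len(content)
    # The formula is undefined for fewer than two characters
    if n < 2:
        return 0.0
    # IC formula: sum of h * (h - 1) over all character counts h, divided by n * (n - 1)
    return 1 / (n * (n - 1)) * sum(h * (h - 1) for _, h in counts)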

# Test using Lorem ipsum:
lorem = r"Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet."
print("Lorem ipsum IC:", calc_ic(lorem))
print("Random text IC:", 1/26)

Lorem ipsum IC: 0.07351515672947317
Random text IC: 0.038461538461538464
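
To make the formula concrete, here is a small sketch with numbers that can be checked by hand. For the text "aabb" we have n = 4 and each character occurs twice, so IC = (2·1 + 2·1) / (4·3) = 1/3. The sketch also illustrates that an additive shift only renames characters and therefore leaves the IC unchanged. The function name `index_of_coincidence`, the toy text, and the shift value 5 are assumptions made for this example.

```python
from collections import Counter

def index_of_coincidence(text):
    n = len(text)
    if n < 2:
        return 0.0  # assumption: texts too short for the formula get IC 0
    return sum(h * (h - 1) for h in Counter(text).values()) / (n * (n - 1))

plain = "aabb"
# Additive shift by 5 modulo 128: "aabb" becomes "ffgg"
shifted = "".join(chr((ord(c) + 5) % 128) for c in plain)

ic_plain = index_of_coincidence(plain)
ic_shifted = index_of_coincidence(shifted)
print(ic_plain, ic_shifted)
```

Because the IC depends only on how often each character occurs and not on which characters they are, it can be measured on ciphertext without knowing the key.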


## 3. Find out key length
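As we can see, the IC of Lorem ipsum text is above 0.07, whereas the IC of uniformly random text over a 26-letter alphabet is 1/26, which is below 0.04. If the ciphertext is split using a wrong key length, each subtext mixes several different shifts and its IC moves towards the random value. We can use this clear gap to define the rule:

    The smallest key length with average IC >= 0.07 is most likely the true key length.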

In [16]:
# Returns the shortest key length with an average IC value >= threshold.
# If no key length reaches the threshold, returns the one with the highest average IC.
def calcKeyLength(content, min_val=1, max_val=100, threshold=0.07, verbose=False):
    opt_key_length = 1
    opt_ic = 0
    for key_length in range(min_val, max_val + 1):
        # Use average IC for all subtexts
        avg_ic = sum(calc_ic(content[offset::key_length]) for offset in range(key_length)) / key_length
        if verbose:
            print(f"Key length: {key_length} - IC: {avg_ic}")
        # IC high enough? Stop searching
        if avg_ic >= threshold:
            return key_length
        elif avg_ic > opt_ic:
            opt_key_length = key_length
            opt_ic = avg_ic
    return opt_key_length
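
The slice `content[offset::key_length]` collects every character that was encrypted with the same key entry. The toy example below sketches why the average IC jumps at the true key length: each subtext is then a plain additive cipher. The plaintext of 12 identical characters, the key `[0, 1, 2]`, and the names `index_of_coincidence` and `avg_ic` are assumptions chosen so that the numbers can be verified by hand. The absolute values are far from those of natural text, so the threshold of 0.07 does not apply here; only the relative jump matters.

```python
from collections import Counter

def index_of_coincidence(text):
    n = len(text)
    if n < 2:
        return 0.0  # assumption: texts too short for the formula get IC 0
    return sum(h * (h - 1) for h in Counter(text).values()) / (n * (n - 1))

# Average IC over all subtexts for a candidate key length
def avg_ic(content, key_length):
    subtexts = [content[offset::key_length] for offset in range(key_length)]
    return sum(index_of_coincidence(s) for s in subtexts) / key_length

key = [0, 1, 2]
plain = "a" * 12
# Vigenère encryption modulo 128 yields "abcabcabcabc"
cipher = "".join(chr((ord(c) + key[i % len(key)]) % 128) for i, c in enumerate(plain))

# With the true key length, every column contains a single repeated character
columns = [cipher[offset::3] for offset in range(3)]
ics = {k: avg_ic(cipher, k) for k in (1, 2, 3)}
print(columns)
print(ics)
```

For key lengths 1 and 2 the subtexts still mix all three shifts, so their average IC stays well below the value reached at key length 3.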

First test: we load the given Lorem ipsum files (the key length of Lorem1.txt is known to be 3) and try to determine the key length automatically.

In [17]:
for filename in ["Lorem1.txt", "Lorem2.txt", "Lorem3.txt"]:
    print(filename, "-", calcKeyLength(readFile(filename), verbose=True))

Key length: 1 - IC: 0.037950122108870096
Key length: 2 - IC: 0.03793914432725792
Key length: 3 - IC: 0.07074343034592807
Lorem1.txt - 3
Key length: 1 - IC: 0.030770178565867428
Key length: 2 - IC: 0.03599529731138498
Key length: 3 - IC: 0.039557492746836544
Key length: 4 - IC: 0.04145850919874842
Key length: 5 - IC: 0.030813316645997006
Key length: 6 - IC: 0.05102520195161165
Key length: 7 - IC: 0.030758813396837222
Key length: 8 - IC: 0.04144259459424841
Key length: 9 - IC: 0.03953363117164174
Key length: 10 - IC: 0.036071837252544675
Key length: 11 - IC: 0.03073619546768608
Key length: 12 - IC: 0.07066709618131563
Lorem2.txt - 12
Key length: 1 - IC: 0.029161516652549367
Key length: 2 - IC: 0.02915982062244037
Key length: 3 - IC: 0.03045065206044041
Key length: 4 - IC: 0.029154361843660322
Key length: 5 - IC: 0.029190944589104612
Key length: 6 - IC: 0.030439953303931917
Key length: 7 - IC: 0.029146568061626552
Key length: 8 - IC: 0.02913614722304237
Key length: 9 - IC: 0.0304461518746

Success: the detected key length of 3 for Lorem1.txt matches the known value.

## 4. Find the key
We define the encryption and decryption functions for the Vigenère cipher. They work the same way as the additive cipher functions, but expect the key to be a list and cycle through its entries using a modulo operation.

In [19]:
def encryptVigenere(content, key):
    key_length = len(key)
    return "".join(chr((ord(c) + key[i % key_length]) % 128) for i, c in enumerate(content))

def decryptVigenere(content, key):
    key_length = len(key)
    return "".join(chr((ord(c) - key[i % key_length]) % 128) for i, c in enumerate(content))

# Finds the optimal key length and then decrypts each subtext as an additive cipher (see first task, 01_Additive)
def autoDecryptVigenere(content):
    key_length = calcKeyLength(content)
    key = [findKeyAdditive(content[offset::key_length]) for offset in range(key_length)]
    # Return key and decrypted text
    return key, decryptVigenere(content, key)
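
A round trip is a cheap way to check that encryption and decryption are inverses, and a toy text makes the per-column key recovery easy to follow. The sketch below re-declares both functions under hypothetical snake_case names so that it runs on its own. The key built from the string "stz", the toy plaintext, and the known key length of 3 are assumptions for illustration; the key length detection step is deliberately left out.

```python
from collections import Counter

def encrypt_vigenere(content, key):
    return "".join(chr((ord(c) + key[i % len(key)]) % 128) for i, c in enumerate(content))

def decrypt_vigenere(content, key):
    return "".join(chr((ord(c) - key[i % len(key)]) % 128) for i, c in enumerate(content))

# Key as a list of integers, built from an ASCII string
key = [ord(c) for c in "stz"]  # [115, 116, 122]

# Assumed toy text: in each of the three columns, half of the characters are spaces
plain = "a b c d e f g h i j k l "
cipher = encrypt_vigenere(plain, key)
round_trip = decrypt_vigenere(cipher, key)

# Per-column key recovery with a known key length, assuming the space
# is the most frequent plaintext character in every column
recovered = []
for offset in range(len(key)):
    column = cipher[offset::len(key)]
    top_char, _ = Counter(column).most_common(1)[0]
    recovered.append((ord(top_char) - ord(" ")) % 128)

print(round_trip == plain, recovered)
```

On real text the per-column heuristic needs enough characters per column for the space to dominate, so long keys require correspondingly long ciphertexts.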


Test: we open all files and decrypt them automatically. The first 100 characters are printed, and the whole text is written to "decrypted_{filename}".

In [23]:
for filename in ["Lorem1.txt", "Lorem2.txt", "Lorem3.txt"]:
    key, decrypted = autoDecryptVigenere(readFile(filename))
    asciiKey = "".join(chr(i) for i in key)
    print(f"{filename} - Key: {key}, ASCII: {asciiKey}")
    print("Decrypted message:", decrypted[:100], "...")
    writeFile("decrypted_" + filename, decrypted)

Lorem1.txt - Key: [115, 116, 122], ASCII: stz
Decrypted message: QUISQUE RUTRUM. AENEAN IMPERDIET. ETIAM ULTRICIES NISI VEL AUGUE. CURABITUR ULLAMCORPER ULTRICIES NI ...
Lorem2.txt - Key: [115, 116, 122, 117, 118, 120, 121, 115, 118, 117, 115, 116], ASCII: stzuvxysvust
Decrypted message: ETIAM ULTRICIES NISI VEL AUGUE. CURABITUR ULLAMCORPER ULTRICIES NISI. NAM EGET DUI. ETIAM RHONCUS. M ...
Lorem3.txt - Key: [115, 116, 122, 117, 118, 120, 121, 115, 118, 117, 115, 116, 115, 116, 117, 121, 122, 120, 119, 122, 121, 117, 115, 116, 118, 122, 119, 120, 121, 115, 119, 119, 116, 120, 121, 117, 122, 120, 119, 115, 116, 119, 120, 122, 121, 117, 115, 115, 116, 122, 117, 119, 119, 120, 117, 121, 122], ASCII: stzuvxysvuststuyzxwzyustvzwxyswwtxyuzxwstwxzyusstzuwwxuyz
Decrypted message: CURABITUR ULLAMCORPER ULTRICIES NISI. NAM EGET DUI. ETIAM RHONCUS. MAECENAS TEMPUS. TELLUS EGET COND ...


All results look good!