# **Lab 3**: February 6, 2019.
# Topic: Vigenère Ciphers

Main concept: Known ciphertext attack on a Vigenère cipher.

You have obtained the ciphertext from a Vigenère cipher and are tasked with decrypting it. This worksheet will guide you through performing a known ciphertext attack.

### Part 1: The ciphertext



The ciphertext you find is:

### Part 2: Finding the key length

For a Vigenère cipher it is important to find the *key length* before finding the key. Since the Vigenère cipher is a collection of interwoven shift ciphers, we can find the key length by counting coincidences between the ciphertext and a shifted copy of itself. The displacements with the most coincidences are *most likely* to be multiples of the key length.

The exercises below will guide you through this process.

**Exercise 1**: Fill in the "Displacement-Coincidences" table below. To do this, run the $\texttt{coincidenceCount}$ function for each displacement $1 \leq d \leq 15$ and record its output in the table. You could also write the ciphertext on two pieces of paper and count by hand; your choice!

In [62]:
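#This function counts the positions where the string L agrees with a copy of itself shifted by d places (d must be at least 1).
def coincidenceCount(L,d):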
    count = 0
    for i in zip(L[:-d],L[d:]):
        if i[0] == i[1]:
            count += 1
    return count
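
To see what the counts can look like, here is a minimal sketch on a toy string rather than the lab ciphertext. The helper names `coincidence_count` and `coincidence_table` and the sample string are assumptions made for illustration; the counting logic mirrors `coincidenceCount` above.

```python
def coincidence_count(text, d):
    """Count positions where text agrees with itself shifted by d (d >= 1)."""
    return sum(1 for a, b in zip(text[:-d], text[d:]) if a == b)


def coincidence_table(text, max_d=15):
    """Return (displacement, coincidences) pairs for 1 <= d <= max_d."""
    return [(d, coincidence_count(text, d)) for d in range(1, max_d + 1)]


# Toy input with an obvious period of 3 (not the lab ciphertext).
sample = "ABCABCABCABC"
for d, count in coincidence_table(sample, max_d=6):
    print(d, count)
```

In this toy case only the displacements 3 and 6 produce coincidences. Real ciphertext is noisier, so look for displacements whose counts stand out rather than for zeros elsewhere.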

| Displacement  | Coincidences  |
| ------------- | ------------- |
|       1       |               |
|       2       |               |
|       3       |               |
|       4       |               |
|       5       |               |
|       6       |               |
|       7       |               |
|       8       |               |
|       9       |               |
|       10      |               |
|       11      |               |
|       12      |               |
|       13      |               |
|       14      |               |
|       15      |               |

**Exercise 2:** Analyze the table. What is the key length for this ciphertext? Write it down below.

##### *CHECK IN: call me over!* ######

### Part 3: Finding the key

Now that we have an educated guess on the key length, it is time to find the key. This will also involve frequency analysis; however, since adjacent letters are encrypted with different shifts, we have to split the ciphertext accordingly.

Let the key length you found in Part 2 be $k$. To find the key, follow the next steps.

**Exercise 3:** First, we must split the ciphertext $k$-many ways by grouping all letters in the same position modulo $k$ together. Run the code below to create $k$-many lists. Label them appropriately.

In [2]:
#string should be what you are splitting, i is the position modulo k, and k is the key length.
#This function splits the string provided by taking only letters in position i modulo k.
def splitString(string,i,k):
    L = []
    for j in range(len(string)):
        if j % k == i:
            L.append(string[j])
    return "".join(L)
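
As a rough illustration of producing all $k$ substrings at once, the sketch below uses Python slicing, which picks out the same positions as `splitString`. The helper name `split_all` and the sample string are hypothetical and not part of the lab code.

```python
def split_all(text, k):
    """Return k substrings; substring i holds the letters at positions i, i+k, i+2k, ..."""
    return [text[i::k] for i in range(k)]


# Toy input (not the lab ciphertext), split as if the key length were 3.
for i, part in enumerate(split_all("ABCDEFGH", 3)):
    print(i, part)
```

Printing the index next to each substring is one simple way to keep the lists labeled.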

**Question:** Why did we reorganize the ciphertext this way? Explain.

Now we want to do frequency analysis on these individual substrings of the ciphertext. To find the frequency for each letter in each substring, do the following exercises.

**Exercise 4:** For each substring, count the number of times each letter occurs and divide it by the total number of letters in that substring. You may use the code below or work by hand.

In [3]:
#This function takes in a string and outputs the frequency count for each letter in that string.
def frequencyCount(string):
    alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    n = len(string)
    freq_dict = {key: 0.0 for key in alphabet}
    for char in string:
        freq_dict[char] += 1.0/n
    return freq_dict
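
Note that `frequencyCount` expects only the capital letters A through Z; a lowercase letter or a space in the input would raise a `KeyError`. If your text might contain other characters, one option is to clean it first. The helper name `clean_text` below is an assumption, not part of the lab code.

```python
import string


def clean_text(text):
    """Upper-case the text and keep only the letters A-Z."""
    return "".join(ch for ch in text.upper() if ch in string.ascii_uppercase)


# Toy input (not the lab ciphertext).
print(clean_text("Attack at dawn!"))
```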
    

If you are working by hand, it might be helpful to keep track of the frequencies in this table. Each row should correspond to a different substring.

| | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
|-| - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
||   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
||   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
||   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
||   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |

**Exercise 5:** Now you have to do some computations with the frequencies found in Exercise 4. To do that, for each substring we enter its frequencies in a Python list, in alphabetical order from A to Z (see example below). Label the list so that you know which substring it belongs to. If you would rather not do this by hand, use the code below.
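
A minimal sketch of one way to build such a list without copying numbers by hand, assuming a dictionary shaped like the output of `frequencyCount` (letters as keys, frequencies as values). The name `frequencyList` and the toy dictionary are hypothetical.

```python
ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'


def frequencyList(freq_dict):
    """Return the 26 frequencies in alphabetical order, A first and Z last."""
    # Letters missing from the dictionary are treated as frequency 0.
    return [freq_dict.get(letter, 0.0) for letter in ALPHABET]


# Toy dictionary standing in for the frequencies of one substring.
example = {'A': 0.5, 'E': 0.25, 'T': 0.25}
L0 = frequencyList(example)
print(L0[0], L0[4], L0[19])
```

The order matters: position 0 must hold the frequency of A, position 1 that of B, and so on, because the comparison with English frequencies is done position by position.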

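**Exercise 6:** We can now determine the key! For each list created in Exercise 5, run the $\texttt{findthekey}$ function on that list. The output is the position (counting from 0) with the highest score, together with that score. That position is the shift, and it corresponds to the key letter for that substring (0 is A, 1 is B, and so on).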

In [5]:
#The list below gives the frequency of each letter in the English language, in order from A to Z. For example, the letter 'C' is used about 2.78% of the time.
f=[0.0816, 0.0149, 0.0278, 0.0425, 0.127, 0.0222, 0.0201, 0.0609, 0.0696, 0.0015, 0.0077, 0.0402, 0.024, 0.0674, 0.075, 0.0192, 0.0009, 0.0598, 0.0632, 0.0905, 0.0275, 0.0097, 0.0236, 0.0015, 0.0197, 0.0007]

#The function below compares the frequencies in the parameter list L to the list above of English frequencies.
def findthekey(L):
    M = []
    for i in range(26):
        K = [f[(j-i)%26] for j in range(26)]
        M.append(round(sum([x*y for x,y in zip(L,K)]),5))
    #Return every (shift, score) pair whose score equals the maximum.
    best = max(M)
    return [(i, val) for i, val in enumerate(M) if val == best]
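
To get a feel for what the scores mean, the sketch below feeds in an artificial frequency list: the English frequencies themselves, rotated by 3 positions, as if every letter had been shifted by the key letter `d`. The helper names `shift_scores` and `best_shift` are assumptions; the scoring mirrors `findthekey`.

```python
# English letter frequencies, A through Z (same values as the list f above).
ENGLISH = [0.0816, 0.0149, 0.0278, 0.0425, 0.127, 0.0222, 0.0201, 0.0609,
           0.0696, 0.0015, 0.0077, 0.0402, 0.024, 0.0674, 0.075, 0.0192,
           0.0009, 0.0598, 0.0632, 0.0905, 0.0275, 0.0097, 0.0236, 0.0015,
           0.0197, 0.0007]


def shift_scores(freqs):
    """Score each shift 0..25 by the dot product with shifted English frequencies."""
    return [sum(freqs[j] * ENGLISH[(j - s) % 26] for j in range(26))
            for s in range(26)]


def best_shift(freqs):
    """Return the shift with the highest score."""
    scores = shift_scores(freqs)
    return scores.index(max(scores))


# Artificial data: plaintext letter j becomes ciphertext letter (j + 3) % 26.
shifted = [ENGLISH[(j - 3) % 26] for j in range(26)]
s = best_shift(shifted)
print(s, 'abcdefghijklmnopqrstuvwxyz'[s])
```

With this artificial input the best shift is 3, which is the letter `d`. Frequencies measured from a short substring will match English less cleanly, so the winning score may be less clear-cut.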

**Exercise 7:** By putting all of the pieces of the key together, we will have the key for the encryption of this Vigenère cipher. Write the key below.

#### *CHECK IN: call me over so we can discuss finding the key* ######

### Part 4: Decrypt the ciphertext

Now that you know the key, decrypt the message using the Vigenère table or by using the code below.

In [1]:
#Enter the ciphertext as "string" in all capitals and the key should be a string as well (the letters represent the shifts). Note that the key should be lowercase.
def decodeVigenere(string, key):
    let = 'abcdefghijklmnopqrstuvwxyz'
    plaintext = []
    for k,i in enumerate(string):
        plaintext.append(let[(let.index(i.lower())-let.index(key[k % len(key)])) % 26])
    return "".join(plaintext)
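
A quick way to check your understanding of the decryption step is a round trip on a message you choose yourself. The sketch below is self-contained; the function name `shiftVigenere` and the sample message and key are hypothetical, and the arithmetic follows `decodeVigenere`.

```python
LETTERS = 'abcdefghijklmnopqrstuvwxyz'


def shiftVigenere(text, key, direction):
    """Shift each letter of text by the matching key letter.

    direction = +1 encrypts, direction = -1 decrypts.
    Assumes text and key contain only the letters a-z (any case).
    """
    out = []
    for pos, ch in enumerate(text.lower()):
        shift = LETTERS.index(key[pos % len(key)].lower())
        out.append(LETTERS[(LETTERS.index(ch) + direction * shift) % 26])
    return "".join(out)


# Toy message and key (not the lab ciphertext or its key).
ciphertext = shiftVigenere('attackatdawn', 'lemon', +1).upper()
print(ciphertext)
print(shiftVigenere(ciphertext, 'lemon', -1))
```

Encrypting and then decrypting with the same key should return the original message in lowercase.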

**Exercise 8:** The plaintext is: 

In [0]:
decodeVigenere('', '')  #fill in the ciphertext first, then the key

#### *CHECK IN: call me over so I know you were able to decrypt* ######

### Part 5: Why does this work?

**Exercise 9:** Write down any ideas you have about why this process works. We will discuss together at the end of class.