In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("hw01.ipynb")

# Homework 01: Monoalphabetic Ciphers and Cryptanalysis

At this point, we know enough about monoalphabetic substitution ciphers and how to analyze them for you to complete a homework assignment showing that you also know enough Python to program them. If you've been keeping up with the activities for each lesson, this homework should not take you long to complete!

## Part 1: Programming With Loops, Functions, and Conditional Branches

### Question 1.1: Triangular Numbers

A *triangular number*, $T_n$, is the number obtained by adding all positive integers less than or equal to $n$. For example:
  * $T_1 = 1$ 
  * $T_2 = 3 = 1 + 2$
  * $T_3 = 6 = 1 + 2 + 3$
  * $T_4 = 10 = 1 + 2 + 3 + 4$

are all triangular numbers.
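
The examples above follow a simple pattern: each triangular number is the previous one plus $n$. As a sketch, and assuming $T_0 = 0$ (the empty sum, which is an assumption about the expected value for an input of 0), the pattern can be written as:

$$T_0 = 0, \qquad T_n = T_{n-1} + n \quad (n \ge 1)$$

For instance, $T_5 = T_4 + 5 = 10 + 5 = 15$.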

In the code cell below, write a function named `triangular_number` that takes an integer `n` and **returns** the triangular number $T_n$ as an integer. The function only needs to work for non-negative integer inputs (0 or above).

**Hint:** A loop should make short work of this!

In [None]:
def triangular_number(n):
    # YOUR CODE GOES BELOW THIS LINE
    

In [None]:
grader.check("q1_1")

### Question 1.2: Conditional Branches

In the code cell below, write a function named `integer_compare()` that compares two integer values and returns a string depending on their relative size.

The function should **return** one of the following strings:
  * `'x is larger than y'` if $x \gt y$
  * `'y is larger than x'` if $x \lt y$ 
  * `'x and y are equal'` if $x = y$

In [None]:
def integer_compare(x, y):
    # YOUR CODE GOES BELOW THIS LINE
    

In [None]:
# try out your code here
print( integer_compare(3, 7) )

In [None]:
grader.check("q1_2")

## Part 2: Function Checkpoint

You've been asked to write several functions throughout the course so far. Please use that work to quickly provide code for the following functions.

### Question 2.1: `text_clean`

`text_clean` is the function we use to prepare text for enciphering or deciphering. As a reminder, this function takes in a string `text`, makes all of its characters uppercase, and then returns only those characters that are also in the `LETTERS` string. By default, `LETTERS` contains only the 26 uppercase English letters, but this value can be overridden by passing a different argument for the `LETTERS` parameter.

In [None]:
def text_clean( text, LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """
    Arguments:
        text (str): a piece of text for cleaning
        LETTERS (str, optional): defines the alphabet of allowable characters
    Returns:
        (str): text with only the characters also found in LETTERS
               lower-case letters in text will be made upper-case  
    """
    # YOUR CODE GOES BELOW THIS LINE
    

In [None]:
grader.check("q2_1")

### Question 2.2: `text_block`

This function should take in a string and return the same exact string, only grouped into blocks of `size` characters. By default it should be blocks of 5 characters, but this could be overridden by passing a different integer argument to the parameter `size`.

You can assume that any input to this function is already "clean" so you don't need to clean `text` against a `LETTERS` string inside this function.

In [None]:
def text_block( text, size = 5 ):
    """
    Arguments:
        text (str): text to block
        size (int, optional): # of characters in a block
    Returns:
        (str): text blocked into groups of specified size
    """
    # YOUR CODE GOES BELOW THIS LINE
    

In [None]:
grader.check("q2_2")

### Question 2.3: `caesar`

The `caesar` function takes in a key and a message and returns the enciphered or deciphered version of that message, depending on whether the parameter `encipher` is set to `True` or `False`. This function should use `text_clean` to clean the message before attempting to encipher or decipher it.

Strings returned that are ciphertext should only contain those characters specified in `LETTERS` and should be blocked into groups of 5 characters.

Strings returned that are plaintext should only contain those characters specified in `LETTERS`, but lowercase, and should **not** be blocked into groups.

In [None]:
def caesar(key, message, encipher=True, LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """
    Arguments:
        key (int): an integer used as the key to encipher the message
        message (str): the message to encipher
        encipher (bool, optional): True --> encipher the message, False --> decipher
        LETTERS (str, optional): defines the alphabet of allowable characters
    Returns:
        (str): encrypted / decrypted version of message
    """
    # YOUR CODE GOES BELOW THIS LINE
    

In [None]:
grader.check("q2_3")

### Question 2.4: `affine`

The `affine` function takes in a multiplicative key, an additive key, and a message and returns the affine enciphered or deciphered version of that message, depending on whether the parameter `encipher` is set to `True` or `False`. This function should use `text_clean` to clean the message before attempting to encipher or decipher it. The decipher process needs to compute the multiplicative inverse of the multiplicative key modulo the length of `LETTERS`. You can use the provided `multiplicative_inverse` function to perform this step.

Strings returned that are ciphertext should only contain those characters specified in `LETTERS` and should be blocked into groups of 5 characters.

Strings returned that are plaintext should only contain those characters specified in `LETTERS`, but lowercase, and should **not** be blocked into groups.
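
To make the role of the multiplicative inverse concrete, here is a minimal sketch that enciphers and then deciphers a single letter index. It assumes the common affine convention $c = (k_m \cdot p + k_a) \bmod m$, which the text above does not spell out, and it uses Python's built-in `pow(km, -1, m)` (Python 3.8+) in place of the provided helper. The keys 5 and 8 are arbitrary illustration values.

```python
# Assumed affine convention (not confirmed by the assignment text):
#   encipher: c = (km * p + ka) mod m
#   decipher: p = km_inverse * (c - ka) mod m
LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
m = len(LETTERS)
km, ka = 5, 8                      # example keys, chosen only for illustration

p = LETTERS.index('H')             # plaintext index 7
c = (km * p + ka) % m              # ciphertext index 17
km_inverse = pow(km, -1, m)        # 21, because (5 * 21) % 26 == 1
recovered = (km_inverse * (c - ka)) % m

print(LETTERS[c], LETTERS[recovered])   # prints "R H"
```

The round trip only works because `km` has an inverse modulo 26. This single-letter check leaves out cleaning, blocking, and lowercasing, which your `affine` function still has to handle.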

In [None]:
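def multiplicative_inverse(n, m):
    """
    Arguments:
        n (int): the number to invert; must share no common factor with m
        m (int): the modulus
    Returns:
        (int): the inverse of n modulo m, i.e. the value i with (n * i) % m == 1
    """
    # Extended Euclidean algorithm: each row [r, s, t] satisfies r == s*m + t*n
    row1 = [m, 1, 0]
    row2 = [n, 0, 1]

    # Reduce until the remainder is 1; the t entry of that row is the inverse
    while row2[0] != 1:
        k = row1[0] // row2[0]

        row3 = [ row1[0] - k*row2[0], row1[1] - k*row2[1], row1[2] - k*row2[2] ]

        row1 = row2
        row2 = row3

    inverse = row2[2] % m

    return inverse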

In [None]:
def affine(km, ka, message, encipher=True, LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """
    Arguments:
        km (int): an integer used as the multiplicative key to encipher the message
        ka (int): an integer used as the additive key to encipher the message
        message (str): the message to encipher
        encipher (bool, optional): True --> encipher the message, False --> decipher
        LETTERS (str, optional): defines the alphabet of allowable characters
    Returns:
        (str): encrypted / decrypted version of message
    """
    # YOUR CODE GOES BELOW THIS LINE
    

In [None]:
grader.check("q2_4")

### Question 2.5: `chi_squared_score`

The `chi_squared_score` function should take in a candidate plaintext, clean it, and then score it using the $\chi^2$ method. Because there is no single agreed-upon set of letter frequencies for the English language, your function should use the frequencies provided in the list `standard_frequencies`.
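
As a reminder, one common form of the $\chi^2$ statistic is sketched below. The symbols are assumptions introduced here for illustration: $O_i$ is the observed count of the $i$-th letter in the cleaned candidate, $N$ is the candidate's length, and $f_i$ is the matching entry of `standard_frequencies`. If the lessons used a different variant, follow the lessons.

$$\chi^2 = \sum_{i=1}^{26} \frac{(O_i - E_i)^2}{E_i}, \qquad E_i = N \cdot f_i$$

A lower score means the candidate's letter counts sit closer to what standard English would predict.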

In [None]:
def chi_squared_score( candidate, LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' ):
    candidate = text_clean( candidate, LETTERS )
    standard_frequencies = [0.08167, 0.01492, 0.02782, 0.04253, 0.12702, 0.02228, 0.02015, 0.06094, 0.06966, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749, 0.07507, 0.01929, 0.00095, 0.05987, 0.06327, 0.09056, 0.02758, 0.00978, 0.02360, 0.00150, 0.01974, 0.00074]
    # YOUR CODE GOES BELOW THIS LINE
    

In [None]:
grader.check("q2_5")

## Part 3: Cryptanalysis

You'll be using the ciphertext contained in `hw01.txt` to complete the following questions. Run the cell below to load it as the string `ciphertext`. The first 100 characters will be printed to verify it's been loaded correctly.

In [None]:
with open('hw01.txt') as f: 
    ciphertext = f.read() 

print(ciphertext[0:100], '...')

### Question 3.1

In the cell below, create a list of frequencies for each character in `LETTERS`. Then, use that list to create a bar chart that displays the frequency of the 26 English letters in the current `ciphertext` message. Each frequency should be between 0 and 1 (a proportion, not a percentage) and the bar chart should have labels for both the x-axis and y-axis, as well as a title for the entire chart.
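
To illustrate the difference between a proportion and a percentage, here is a small sketch on a made-up string (not the homework ciphertext). The names `sample`, `l_count`, and `l_proportion` are hypothetical.

```python
sample = 'HELLOWORLD'                   # toy text, 10 characters long

l_count = sample.count('L')             # 'L' appears 3 times
l_proportion = l_count / len(sample)    # a value between 0 and 1

print(l_count, l_proportion)
```

Here the frequency of `L` is the proportion 3 out of 10, not the percentage 30.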

This question will be graded manually, but there are a few grader checks provided to help you check for some common mistakes.

In [None]:
import matplotlib.pyplot as plt
LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
frequencies = []
ciphertext_length = ...
# YOUR CODE GOES BELOW THIS LINE

    
# Everything above this comment should be about computing the frequencies    
# Everything below this comment should be about creating the bar chart

...
...
...
...

...

In [None]:
grader.check("q3_1")

### Question 3.2

Use the `chi_squared_score` function and the provided loops to score all 312 possible candidates for the correct plaintext. Store each of the 312 individual results in a variable named `one_result` in the format:

`[<multiplicative-key>, <additive-key>, <chi-squared score>]` 

which should then be appended to the list named `results`, which will eventually hold all 312 result lists.
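
The figure of 312 comes from the number of usable key pairs: a multiplicative key must have no common factor with 26 so that it has an inverse, and each such key can be paired with any of the 26 additive keys. The short check below is a sketch of that count; the names `valid_km` and `total_candidates` are hypothetical.

```python
from math import gcd

m = 26  # size of the default alphabet

# Multiplicative keys that are invertible modulo 26
valid_km = [k for k in range(1, m) if gcd(k, m) == 1]
total_candidates = len(valid_km) * m

print(valid_km)          # [1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, 25]
print(total_candidates)  # 312
```

The list matches the multiplicative keys in the provided outer loop.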

The last 4 lines of code will sort `results` by the chi-squared score as long as `results` is constructed correctly, and then extract the most likely km and ka values from the top single result.
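
To see what the sorting step does, here is a small sketch on made-up data. The scores in `toy_results` are invented for illustration and have nothing to do with the homework ciphertext.

```python
from operator import itemgetter

# Each entry is [multiplicative key, additive key, chi-squared score]
toy_results = [[3, 4, 812.5], [5, 8, 31.2], [7, 0, 455.0]]

# itemgetter(2) picks the score, so the lowest score ends up first
ranked = sorted(toy_results, key=itemgetter(2))
best_km, best_ka = ranked[0][0], ranked[0][1]

print(ranked[0])          # [5, 8, 31.2]
print(best_km, best_ka)   # 5 8
```

This only works if every entry keeps the score in position 2, which is why the format of `one_result` matters.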

In [None]:
results = []

for km in [1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, 25]:
    for ka in range(26):
        candidate_text = ...
        one_result = ...
        ...
        
# DON'T CHANGE THE CODE BELOW THIS COMMENT
# THE CODE BELOW WILL SORT YOUR RESULTS FOR YOU
from operator import itemgetter
results = sorted(results, key=itemgetter(2))

likely_km = results[0][0]
likely_ka = results[0][1]

In [None]:
print( "Likely multiplicative key:", likely_km)
print( "Likely additive key:", likely_ka )

In [None]:
grader.check("q3_2")

## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(pdf=False)