In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("hw02.ipynb")

# Homework 02: Polyalphabetic Ciphers and Cryptanalysis

At this point, we know enough about polyalphabetic substitution ciphers and how to analyze them to complete a homework that covers the Vigenère cipher, the Autokey cipher, and some concepts from the One Time Pad (OTP).

## Imports

To get you started, run the cell below to load functions that have been written in earlier homework assignments. Functions that are included:
* `text_clean`
* `text_block`

You can use these functions in any of your code below after you've imported them.

In [None]:
from hw02toolkit import text_clean, text_block

## Question 1: Vigenère Key Generation

When using the Vigenère cipher, you have the luxury of knowing exactly how long the plaintext / ciphertext is before you start to encrypt or decrypt. You'll see that with other cipher types, like stream ciphers, you don't always have the complete message before you start your encryption or decryption process. For the Vigenère cipher, that means you can create the entire keystream from the keyword or primer before you get started with the message.

In the cell below write a function that will take in a Vigenère primer / keyword and generate the correct Vigenère keystream when provided the length of the message.

**NOTE:** You can assume that `primer` is already "cleaned" before it's received in this function, so do NOT clean the primer string.

**Example**:
```
>>> print( vigenere_keygen('TEST', 10) )
TESTTESTTE
```

In [None]:
def vigenere_keygen(primer, message_length):
    """
    Arguments:
        primer (str): the primer / keyword that will create the entire keystream
        message_length (int): # of characters in the CLEANED message (no spaces or punctuation)
    Returns:
        keystream (str): The entire keystream that can be combined with the message to encipher or decipher
    """

    keystream = ''
    # Your code after this comment
    

print(vigenere_keygen('TEST', 10))

In [None]:
grader.check("q1")

## Question 2: Vigenère Cipher Function

Write a function that implements the Vigenère cipher. The function should be able to encipher and decipher messages depending on the value of the `encipher` parameter. Your function should clean the provided message based on the provided `LETTERS` string using the `text_clean` function. Ciphertext output should be blocked into groups of 5 uppercase characters, and plaintext output should be returned as lowercase characters with no spaces.

**Hint:** You can use your `vigenere_keygen` function from the previous question if you'd like, but there are no tests that will specifically ensure that you do.

**Examples**:
```
>>> print( vigenere('hospital', 'onaplaneaplaneisdue') )
VBSET TNPHD DPVXI DKIW

>>> print( vigenere('hospital', 'VBSET TNPHD DPVXI DKIW', encipher=False) )
onaplaneaplaneisdue
```
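As a sanity check on the first example, the sketch below works through its first five letters by hand. It assumes the standard Vigenère rule (add the index of the keystream letter to the index of the plaintext letter, modulo the length of the alphabet), which is consistent with the example output above. `shift_letter` is a hypothetical helper name used only for illustration; it is not part of the toolkit and is not the full function you are asked to write.

```python
LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def shift_letter(plain_letter, key_letter, letters=LETTERS):
    """Illustrative helper (hypothetical name): encipher one letter with one key letter."""
    plain_index = letters.index(plain_letter.upper())
    key_index = letters.index(key_letter.upper())
    # Add the two positions and wrap around the end of the alphabet.
    return letters[(plain_index + key_index) % len(letters)]

# First five plaintext letters paired with the first five keystream letters.
first_block = ''.join(shift_letter(p, k) for p, k in zip('onapl', 'HOSPI'))
print(first_block)  # VBSET
```

For instance, `n` is at index 13 and `O` is at index 14, so their sum 27 wraps around to index 1, which is `B`.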

In [None]:
def vigenere(keyword, message, encipher=True, LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """
    Arguments:
        keyword (str): the primer / keyword that will be used to create the entire keystream
        message (str): either the plaintext or ciphertext to work with
        encipher (bool, optional): True --> encipher the message, False --> decipher
        LETTERS (str, optional): defines the alphabet of allowable characters
    Returns:
        (str): encrypted / decrypted version of message formatted to specifications
    """
    cleaned_keyword = ...
    cleaned_message = ...
    keystream = ...
    output = '' 
    
    # Your code below this comment
    

print( vigenere('hospital', 'onaplaneaplaneisdue') )
print( vigenere('hospital', 'VBSET TNPHD DPVXI DKIW', encipher=False) )

In [None]:
grader.check("q2")

## Question 3: Autokey Cipher

The autokey cipher does not allow you to compute the entire keystream from the start when deciphering messages, since you need to recover some of the plaintext before you can continue constructing the keystream. As a result, the function for the autokey cipher will need to be a bit different from the others you've already written for the Caesar, affine, and now Vigenère ciphers. You will need to think carefully about how to extend the keystream after each letter you encipher or decipher to ensure that it has enough characters to finish processing the message.

**Examples**:
```
>>> print( autokey('UNICORN', 'acceptthegreaterchallenge' ) )
UPKGD KGHGI VTTML VIYEL EIEIL

>>> print( autokey('unicorn', 'UPKGD KGHGI VTTML VIYEL EIEIL', False) )
acceptthegreaterchallenge
```
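To see why the keystream cannot be built up front when deciphering, it helps to look at what the keystream is made of when enciphering. The sketch below assumes the common autokey convention in which the primer is followed by the plaintext itself, which is consistent with the example above. The variable names are illustrative only.

```python
primer = 'UNICORN'
plaintext = 'ACCEPTTHEGREATERCHALLENGE'

# When enciphering, the whole plaintext is known, so the keystream is the
# primer followed by the plaintext, cut to the length of the message.
keystream = (primer + plaintext)[:len(plaintext)]
print(keystream)  # UNICORNACCEPTTHEGREATERCH
```

When deciphering, only the first seven keystream letters (the primer) are known at the start. The eighth keystream letter is the first plaintext letter, so it only becomes available once the first ciphertext letter has been deciphered.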

In [None]:
def autokey(keyword, message, encipher=True, LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """
    Arguments:
        keyword (str): the primer / keyword that will be used to create the keystream
        message (str): either the plaintext or ciphertext to work with
        encipher (bool, optional): True --> encipher the message, False --> decipher
        LETTERS (str, optional): defines the alphabet of allowable characters
    Returns:
        (str): encrypted / decrypted version of message formatted to specifications
    """
    cleaned_keyword = ...
    cleaned_message = ...
    output = ''
    
    keystream = cleaned_keyword
    
    # Your code below this comment
    
    
print(autokey('UNICORN', 'acceptthegreaterchallenge'))
print(autokey('unicorn', 'UPKGD KGHGI VTTML VIYEL EIEIL', False))

In [None]:
grader.check("q3")

## `random`

In order to use a One Time Pad with the Vigenère cipher, we need a way to generate random letters for the keystream. You can create "random" values using the `random` module in Python. We say "random" in quotes because it is very difficult to create truly random numbers, but Python can do a fairly good job of creating *pseudorandom* values, sometimes abbreviated as PR.

Run the cell below to import the `random` module.

In [None]:
import random

The `randint` function in the `random` module will output a random integer between the two provided arguments (inclusive of those values). You can call this function by first specifying the module the function is a part of (`random`), then a `.`, and then the function name (`randint`) with any required arguments. For example, running the cell below will create a value between 0 and 9. Run it a few times to confirm that it does in fact produce pseudorandom outputs.

In [None]:
random.randint(0, 9)
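To confirm that both endpoints are included, one option is to draw many values and record which distinct integers appear. This is only a quick experiment, not a proof; the number of draws (10,000) is an arbitrary choice for illustration.

```python
import random

# Draw many values and record which distinct integers show up.
seen = set()
for _ in range(10_000):
    seen.add(random.randint(0, 9))

# With this many draws, every integer from 0 through 9 should almost
# certainly be present, including both endpoints.
print(sorted(seen))
```

This differs from `range(0, 9)`, which stops before 9.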

We can combine this ability to produce random integers with our existing ability to convert an integer to a character to produce random letters. For example:

In [None]:
LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
print( LETTERS[ random.randint(0, 25) ] )

And combined with a loop, you can create a pseudorandom keystream of characters:

In [None]:
LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
keystream = ''
for i in range(100):
    keystream += LETTERS[ random.randint(0, 25) ]

print(keystream)

But one important thing we need to consider with OTP is that both the enciphering and deciphering individuals need the **same** pseudorandom keystream. Luckily, there is a way to control the randomness in the `random` library using what's called a "seed". The seed value syncs up the random number generator to a certain point and can be set before creating the keystream. The ability to set a seed value is how we know `random.randint` is not *truly* random.

Notice that when you run the code below, which includes `random.seed(n)` where `n` is an integer value, you always get the same output. Changing the value of `n` can produce a different pseudorandom keystream. In that sense, `n` is the *actual* key, since it determines the pseudorandom keystream that's generated.

In [None]:
LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
keystream = ''

random.seed(5)

for i in range(100):
    keystream += LETTERS[ random.randint(0, 25) ]

print(keystream)
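The sketch below illustrates the point about sharing a seed: two generators that start from the same seed produce the same letters. It uses separate `random.Random` instances only to keep a "sender" and a "receiver" apart in one cell; this is an illustrative choice, and the homework functions are still expected to use `random.seed` as shown above. The variable names are hypothetical.

```python
import random

LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

# Two independent generators, standing in for the sender and the receiver.
sender_rng = random.Random(5)
receiver_rng = random.Random(5)

sender_stream = ''.join(LETTERS[sender_rng.randint(0, 25)] for _ in range(20))
receiver_stream = ''.join(LETTERS[receiver_rng.randint(0, 25)] for _ in range(20))

print(sender_stream == receiver_stream)  # True: same seed, same keystream
```

Anyone who learns the seed can regenerate the keystream in the same way, which is why the seed has to be treated as the key.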

## Question 4: OTP Key Generation

Write a function `otp_keygen` that takes as its arguments an integer that represents the length of a cleaned message, an integer that is used as the seed value, and a string that represents the alphabet it should use. The function should create a one time pad key that is the same length as the message.

**Example:**

```
>>> print( otp_keygen(100, 17354763458) )
HAVETKTBPNZAIQLUZMNUTBDXEETQUJHLXQLRTXVFXQYEUPLEUAYUGBZEVDFXVIJIGLOVTIYBIQYYPIGEKGRDTBSYZXXAXFPXIYUO
```

In [None]:
def otp_keygen(message_length, seed_value, LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """
    Arguments:
        message_length (int): the length of the cleaned message that needs an OTP key
        seed_value (int): a number to set the seed for the random module
        LETTERS (str, optional): defines the alphabet of allowable characters
    Returns:
        (str): a pseudorandom one time pad keystream that contains only the characters in LETTERS
    """
    
    # Your code below this comment
    

print( otp_keygen(100, 17354763458) )

In [None]:
grader.check("q4")

## Question 5: The OTP Vigenère Cipher

Write a function, `otp_vigenere`, that takes in an integer `seed_value` that represents the seed value to be used when creating random characters, a string `message` that represents the message to be enciphered or deciphered, a boolean `encipher`, and a string `LETTERS`.

**Hint:** Use the functions you've already written in this homework! It should make this a very straightforward problem to solve.

**Examples:**
```
>>> print( otp_vigenere(42, 'I have no special talent. I am only passionately curious. Albert Einstein') )
RQSZM UNHDO DHCGY EXFUN HAZNK QCYTM RDSDT KDDZV MURXM WISAT FDFHP NYIUO

>>> print( otp_vigenere(42, 'CKASM UVWMH XFRNL NMEPZ PQFOE RUJWJ FPCFI SEJXH QESWY YYVEG DWPTG ASFFB', False) )
ihavenospecialtalentiamonlypassionatelycuriousalberteinstein
```

In [None]:
def otp_vigenere(seed_value, message, encipher=True, LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """
    Arguments:
        seed_value (int): the seed value for the random module
        message (str): either the plaintext or ciphertext to work with
        encipher (bool, optional): True --> encipher the message, False --> decipher
        LETTERS (str, optional): defines the alphabet of allowable characters
    Returns:
        (str): encrypted / decrypted version of message formatted to specifications
    """
    cleaned_message = ...
    keystream = ...
    output = '' 
    
    # Your code below this comment
    

print( otp_vigenere(42, 'I have no special talent. I am only passionately curious. Albert Einstein') )
print( otp_vigenere(42, 'CKASM UVWMH XFRNL NMEPZ PQFOE RUJWJ FPCFI SEJXH QESWY YYVEG DWPTG ASFFB', False) )

In [None]:
grader.check("q5")

<!-- BEGIN QUESTION -->

## Question 6: Character Frequency

One goal of the one time pad is to disguise character frequencies. Ideally, each letter of the alphabet appears in the ciphertext with an almost uniform frequency, making it impossible for an attacker to use frequency analysis to help crack the message. Let's see how well your OTP function disguises a message.
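As a starting point for measuring character frequencies, the sketch below counts letters in a short stand-in string using `collections.Counter` from the standard library. The sample text and variable names are assumptions chosen for illustration; they are not the homework file or a required approach.

```python
from collections import Counter

LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
sample = 'THESCARLETLETTER'  # short stand-in text, not the homework file

counts = Counter(sample)
total = len(sample)

# Relative frequency of each letter of the alphabet
# (0.0 for letters that never appear in the sample).
frequencies = {letter: counts[letter] / total for letter in LETTERS}

print(frequencies['T'])  # 0.25
```

In ordinary English text a few letters dominate, as `T` does here. In a well-disguised ciphertext, the 26 values should all be close to 1/26, or roughly 0.038.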

Run the code cell below to load the sample plaintext found in `hw02plaintext.txt` to the variable `plaintext`. The file contains the entire book, *The Scarlet Letter*.

In [None]:
with open('hw02plaintext.txt') as f: 
    plaintext = f.read() 

In the code cell below, encipher this message using the `otp_vigenere` function with a seed of your choosing. Save this result to the variable `ciphertext`.

**Note:** This may take some time as the computer will need to generate a very long OTP keystream and then use it to encipher the entire book. Test runs while developing this assignment usually took around 5 minutes, so hit that run button and go grab a cup of tea while you wait!

_Type your answer here, replacing this text._

In [None]:
ciphertext = ...

In [None]:
# Your bar chart code should go in this cell


<!-- END QUESTION -->



## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

Before exporting, SAVE your notebook, then RESTART AND RUN ALL CELLS. This will run the export cell. Make sure you submit the most recent copy by using the date-time stamp on the file.

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(pdf=False)