# Caesar and Vigenere Ciphers
# Galen Wilkerson



To land an interview, please include (in addition to your resume):

    Part 1: What experience do you have programming? What languages? If you have any links to past projects, we’d love to check them out.

I have programmed in Python for roughly 8 years, in various roles.
Recently I have used it every day for deep learning, and I have also written a variety of demos.

I have also worked in R, MATLAB, C++, Objective-C, Perl, and Racket (a variant of Scheme).

My github repositories:

https://github.com/galenwilkerson

    Part 2: What experience do you have teaching and/or tutoring?

I taught robotics to 11-year-old children for the Johns Hopkins Center for Talented Youth, where we taught them Python and Arduino circuitry.

I have also worked as an outdoor instructor for Outward Bound, an experience that proved very useful, especially for the robotics position above.

I have also tutored and taught undergraduate-level Pre-Calculus and Calculus at the University of Vermont.

    Part 3: Sample Assignment:  

        Create a data visualization of the letter frequency distribution in an English message and demonstrate how it’s transformed by Caesar and Vigenère ciphers. 

Sample Assignment Instructions:

    Add documentation within the code that is at a level a new Python learner could understand easily.

    Indicate places in the code where a learner might experiment, changing the code in ways that have an interesting impact on the program.

    Briefly (in 2-4 paragraphs) explain what you would emphasize when sharing this code with a learner in the context of teaching them this new topic.  

        Assume that the learner is an introductory-level Python learner who knows what variables, random variables, functions, and loops are.

        In the case of Dijkstra's shortest path algorithm, assume that learners have just learned the basic idea of recursion from the immediately preceding unit (via Towers of Hanoi or something similar).

        For the Data Visualization project, assume that learners have seen Caesar shifts before but that the Vigenère cipher is new territory.

    Give two to three examples of multiple choice questions you could follow up with to make sure that your learner understood the ideas you explained.

In [1]:
from __future__ import print_function
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

import matplotlib.pyplot as plt

from IPython.display import display, HTML
from pandas import Series, DataFrame

%matplotlib inline

def caesar_cipher(shift = 0, text_input = 'hello'):
    '''
    drop every character that is not a lowercase letter a-z (spaces included)
    and shift each remaining letter of text_input by shift positions
    (if shift == 0, the letters are left unchanged)
    
    # TO TRY:  
    #   HOW WOULD YOU HANDLE CAPITAL LETTERS?  (HINT: TRY 'HELLO'.lower() )
    #
    #   CAN YOU FILTER OUT NON-LETTERS?
    #
    #   CAN YOU KEEP THE PUNCTUATION IN THE ENCRYPTED MESSAGE?
    #
    #   IS THIS A GOOD IDEA?
    #
    #   CAN YOU STILL READ A MESSAGE IF VOWELS ARE MISSING?   TRY MODIFYING THE FUNCTION TO DO THIS
    #
    #   HOW WOULD YOU DE-CIPHER A CAESAR CIPHER?
    #          CAN YOU MAKE A SIMPLE MODIFICATION OF THIS FUNCTION TO DE-CIPHER A MESSAGE?
    #
    #   CAN YOU MULTIPLY BY THE SHIFT VALUE THEN TAKE THE MODULUS 26 (% IS MODULUS IN PYTHON)
    #
    #   HOW WOULD YOU DE-CIPHER THIS?
    '''
    
    # the output message list
    message = []  
    alphabet = 'abcdefghijklmnopqrstuvwxyz'

    # iterate through the letters in the input text
    for letter in text_input:             
        
        # check that the letter is in a-z
        if letter in alphabet:               
            # if yes, add to the output message
            # if the shifted position runs past 'z' (or before 'a'), wrap around (% is the modulus operator)
            message.append(alphabet[(alphabet.index(letter) + shift) % len(alphabet)])    

    # join the output message list into one string
    output = ''.join(message)
    
    return output


def vigenere_cipher(key, text_input):
    '''
    use the key (repeated) to encrypt text_input
    key letter 'a' == shift of 0, 
     "    "    'b' == shift of 1,..
     
    # TO TRY:  HOW WOULD YOU DE-CIPHER A VIGENERE CIPHER?
    #          CAN YOU MAKE A SIMPLE MODIFICATION TO DE-CIPHER A MESSAGE?
    '''
    
    # the output message list
    message = []
    alphabet = 'abcdefghijklmnopqrstuvwxyz'

    # remove spaces in input text
    text_without_spaces = text_input.replace(' ', '')
    
    # how many whole times does the key fit into the input text?
    num_keys_in_text = len(text_without_spaces) // len(key)
    
    # repeat the key once more than that, so it covers the whole text
    repeated_key = key * (num_keys_in_text + 1)
    
    # convert the repeated key to shift values
    repeated_key_shifts = [alphabet.index(x) for x in repeated_key]
    
    # step through the length of the input text without spaces
    for i in range(len(text_without_spaces)):
        
        # get the next letter
        letter = text_without_spaces[i]
        
        # check that the letter is in a-z
        if letter in alphabet:
            # if yes, add to the output message
            # if the shifted position runs past 'z', wrap around to 'a' (% is the modulus operator)
            message.append(alphabet[(alphabet.index(letter) + repeated_key_shifts[i]) % len(alphabet)])    

    # join the output message list into one string
    output = ''.join(message)
    
    return output

def plotFreq(text, title, sort = False):
    '''
    make a bar plot of letter frequency
    if sort == True, sorts columns by decreasing frequency
    else, keeps columns fixed
    '''
    
    # make a pandas Series to easily count and plot the letter frequencies
    ser = Series(list(text))
    
    # count the occurrences of each unique letter and plot them
    if not sort:
        # keep the bars in alphabetical order
        ser.value_counts().sort_index().plot(kind = 'bar', title = title, figsize = [12,5])
    else:
        # value_counts() already orders the bars by decreasing count
        ser.value_counts().plot(kind = 'bar', title = title, figsize = [12,5])

    plt.xlabel('unique letters')
    plt.ylabel('letter frequency')
    
    # TO TRY:
    #   SORT THE LETTER COUNT COLUMNS FROM LARGEST TO SMALLEST COUNT AND PLOT THE LETTER COUNT
    #     TRY THIS WITH *LONG* INPUT TEXT AND LOOK AT THE LETTER COUNTS OF THE TEXT INPUT
    #     (YOU MAY HAVE TO ADJUST THE PLOT TO SEE IT BETTER)
    #
    #   WHAT SHAPE OF LETTER COUNTS HAS THE MOST UNCERTAINTY ABOUT THE ORIGINAL MESSAGE WHEN USING THE CAESAR CIPHER?
    #   
    #   WHICH CIPHER SCRAMBLES THE LETTER COUNTS MORE?
    #
    #   WHICH KEYWORDS SCRAMBLE LETTER COUNTS BEST WHEN USING THE VIGENERE CIPHER?
    #
    #   HOW DO SHORT OR LONG VIGENERE KEYS INFLUENCE THE LETTER COUNT FREQUENCY?
    #
    #   WHAT HAPPENS TO THE LETTER COUNTS IF YOU USE A VERY LONG INPUT TEXT? (CUT AND PASTE)
    #       YOU MAY HAVE TO RE-WRITE THE CODE WITHOUT interact() AND INSTEAD USE text_input = input(), 
    #       OR ELSE TO READ IN A TEXTFILE (see https://docs.python.org/3/tutorial/)
    #
    #   HOW COULD YOU USE THIS TO DETERMINE WHICH CIPHER IS BEING USED IF YOU SEE ENCRYPTED TEXT?
    #         (SEE ALSO https://en.wikipedia.org/wiki/Zipf%27s_law )
    #
    #   HOW COULD YOU USE IT TO EVALUATE YOUR VIGENERE KEY?
    #
    #   CAN YOU WRITE A FUNCTION THAT GIVES A SCORE TO HOW WELL-ENCRYPTED A CIPHER IS?
    #
    #   CAN YOU WRITE A FUNCTION TO AUTOMATICALLY DE-CIPHER A CAESAR CIPHER USING LETTER FREQUENCIES, 
    #     *WITHOUT* KNOWING THE SHIFT!?
    #
    #   HOW ABOUT A VIGENERE CIPHER?  
    #     (HINT: THIS IS PRETTY CHALLENGING, TRY FIRST WITH A KNOWN KEY WORD LENGTH AND A LONG TEXT)

    
def displayTextCipher(text, cipher):
    '''
    display the original input text and the cipher as two rows of text
    '''
    
    # remove spaces in input text
    text_without_spaces = text.replace(' ', '')
    
    # make a pandas DataFrame with the message as the first row
    # and the cipher as the second row
    # (DataFrame.append was removed in pandas 2.0, so both rows are passed at once)
    df = DataFrame([list(text_without_spaces), list(cipher)])

    # display the DataFrame without index or column names
    display(HTML(df.to_html(index=False, header=False)))

    
def display_caeser(shift, text = 'the quick onyx goblin jumps over the lazy dwarf', sort_columns = False):
    '''
    according to the shift number, 
    display the Caesar-ciphered message and plot the letter counts of the original text and the cipher
    '''
    
    # find the cipher using caesar cipher
    cipher = caesar_cipher(shift, text)
    
    title = 'letter counts:'+ '\n' + 'original text shifted by ' + str(shift)
    
    text_without_spaces = text.replace(' ', '')

    # call the plot function on the original text and the ciphered text
    #
    # TO TRY: CAN YOU MEASURE THE DIFFERENCE BETWEEN THE ORIGINAL AND CIPHERED LETTER COUNTS AND PLOT IT?
    #
    plt.figure()
    plt.subplot(211)
    plotFreq(text_without_spaces, 'original text', sort_columns)
    plt.subplot(212)
    plotFreq(cipher, title, sort_columns)
    plt.subplots_adjust(hspace=1)

    # display the input text and cipher
    displayTextCipher(text, cipher)
    
def display_vigenere(key, text = 'the quick onyx goblin jumps over the lazy dwarf', sort_columns = False):
    '''
    according to the key, 
    display the vigenere ciphered message and make a plot of the cipher letter counts
    '''
    
    cipher = vigenere_cipher(key, text)
    
    # build a nice title string using the key and cipher variables
    title = 'letter counts:' + '\n' + 'The cipher (original text shifted by \'' + str(key) +'\')'
    
    text_without_spaces = text.replace(' ', '')
        
    plt.figure()
    plt.subplot(211)
    plotFreq(text_without_spaces, 'original text', sort_columns)
    plt.subplot(212)
    plotFreq(cipher, title, sort_columns)
    plt.subplots_adjust(hspace=1)
    
    # display the input text and cipher
    displayTextCipher(text, cipher)

def choose_cipher(cipher_name = 'Caesar'):
    '''
    based on the input cipher_name, 
    interactively run one of the display_ functions above, 
    *Notice that these interactions allow the user to determine the caesar shift or the vigenere key*
    '''
    
    alphabet = 'abcdefghijklmnopqrstuvwxyz'

    if cipher_name == 'Caesar':
        interact(display_caeser, shift=widgets.IntSlider(min=-len(alphabet),max=len(alphabet),step=1,value=1));
        
    elif cipher_name == 'Vigenere':
        interact(display_vigenere, key=['aardvark', 'galileo', 'hubble', 'a', 'b', 'm', 'az', 'za']);
        
# create an interactive display, allowing the user to select the cipher, then the Caesar shift or the Vigenere key
interact(choose_cipher, cipher_name = ['Caesar', 'Vigenere']);
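
The Caesar slider above runs from -26 to 26, so the index arithmetic has to wrap in both directions. A minimal sketch of that wrap-around, assuming a hypothetical one-letter helper `shift_letter` (it is not part of the notebook) that reuses the same `(index + shift) % len(alphabet)` expression as `caesar_cipher`:

```python
ALPHABET = 'abcdefghijklmnopqrstuvwxyz'

def shift_letter(letter, shift):
    # hypothetical helper for illustration only:
    # same index arithmetic as caesar_cipher, applied to a single letter
    return ALPHABET[(ALPHABET.index(letter) + shift) % len(ALPHABET)]

print(shift_letter('z', 1))    # wraps past the end of the alphabet: a
print(shift_letter('a', -1))   # a negative shift wraps the other way: z
print(shift_letter('h', 26))   # a full turn of the alphabet changes nothing: h
print(-1 % 26)                 # Python's % gives a non-negative result here: 25
```

Because `%` in Python returns a non-negative result for a positive modulus, the negative half of the slider needs no special case.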


Give two to three examples of multiple choice questions you could follow up with to make sure that your learner understood the ideas you explained.

1.  IN ENGLISH, WHICH LETTER WOULD YOU EXPECT TO HAVE THE HIGHEST COUNT IN A TEXT FROM A RANDOM NEWSPAPER ARTICLE?

    a. 'z'
    
    b. 'm'

    c. 'e'
    
    d. 'u'
    
    
2.  IS A TEXT WELL-ENCRYPTED IF ITS LETTER COUNTS STILL SHOW A CLEAR PATTERN (A FEW VERY COMMON LETTERS AND MANY RARE ONES)?

    a. yes
    
    b. no
    
    
3.  WHICH CIPHER TENDS TO SCRAMBLE INPUT TEXT BETTER?

    a. the Caesar cipher
    
    b. the Vigenere cipher
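
One way to check question 3 without the plots is to compare sorted letter counts directly. The sketch below is an illustration under stated assumptions, not the notebook's code: `shift_text` and `count_profile` are hypothetical helpers, a Caesar shift is treated as a Vigenere key of length one, and the tiny message `'eeeett'` is chosen only so the counts are easy to follow by hand.

```python
from collections import Counter
from itertools import cycle

ALPHABET = 'abcdefghijklmnopqrstuvwxyz'

def shift_text(text, shifts):
    # hypothetical helper: keep only a-z, then apply the shifts in turn,
    # starting over at the first shift when the list runs out
    letters = [c for c in text if c in ALPHABET]
    return ''.join(ALPHABET[(ALPHABET.index(c) + s) % len(ALPHABET)]
                   for c, s in zip(letters, cycle(shifts)))

def count_profile(text):
    # letter counts sorted from largest to smallest, ignoring which letter is which
    return sorted(Counter(text).values(), reverse=True)

plain = 'eeeett'
caesar = shift_text(plain, [3])       # one shift for every letter
vigenere = shift_text(plain, [0, 1])  # the shifts of the key 'ab'

print(caesar, count_profile(caesar))      # hhhhww [4, 2]
print(vigenere, count_profile(vigenere))  # efeftu [2, 2, 1, 1]
print(count_profile(plain))               # [4, 2]
```

The Caesar shift only relabels the bars, so the sorted counts match the original. The two-letter key splits the four e's across two cipher letters, which flattens the profile.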
    
