## Caesar Cipher

The Caesar cipher, also known as the shift cipher, is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet.

For example, with a left shift of 3, D would be replaced by A, E would become B, and so on:

```
Plain:    ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher:   XYZABCDEFGHIJKLMNOPQRSTUVW
```

With the above encryption rule, a message can be encrypted as

```
Plain:  Codebuster is fun!
Cipher: Zlabyrpqbo fp crk!

```

Mathematically, if A=0, B=1, ..., Z=25, the encryption of a letter x by a shift n can be described as

$$E_n(x) = (x+n) \bmod 26 $$

while decryption applies the reverse shift:

$$D_n(x) = (x-n) \bmod 26 $$
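
As a quick check of the arithmetic, the sketch below applies the two formulas to single letters. The helper names `encrypt_letter` and `decrypt_letter` are illustrative assumptions, and the code handles upper-case letters only. A left shift of 3 corresponds to n = -3, which is the same as n = 23 modulo 26.

```python
# Minimal sketch of the modular formulas, assuming A=0, B=1, ..., Z=25.
# The function names are hypothetical, chosen for illustration only.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encrypt_letter(letter, n):
    """Shift one upper-case letter by n positions: (x + n) mod 26."""
    x = ALPHABET.index(letter)
    return ALPHABET[(x + n) % 26]

def decrypt_letter(letter, n):
    """Undo a shift of n positions: (x - n) mod 26."""
    x = ALPHABET.index(letter)
    return ALPHABET[(x - n) % 26]

# A left shift of 3 is n = -3, equivalent to n = 23
print(encrypt_letter("D", -3))   # A
print(encrypt_letter("D", 23))   # A
print(decrypt_letter("A", -3))   # D
```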


### Codebuster strategies

A message encrypted with the Caesar cipher is rather easy to decrypt, since the distance between any two letters stays the same after the shift.
A good place to start is the shortest words. For example, in "Zlabyrpqbo fp crk!" we start with "fp" and simply write down all 26 possible shifts:

```
ak bl cm dn eo fp gq hr is jt ku lv mw nx oy pz qa rb sc td ue vf wg xh yi zj
```

Evidently, only "is" is a word, so 'f' decrypts to 'i' and the encryption shift is -3 (equivalently, 23). The rest of the decryption is straightforward; you may use the Vigenère alphabet table provided at competitions, taking one row as the shift table.
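
The list of shifts above can also be generated in a few lines. This is a minimal sketch; `all_shifts` is a hypothetical helper name, and it assumes the word contains lower-case letters a-z only.

```python
# Sketch: list all 26 shifted versions of a short cipher word.
# `all_shifts` is a hypothetical helper; it assumes lower-case letters a-z only.
def all_shifts(word):
    """Return the 26 candidates obtained by shifting every letter by 0..25."""
    candidates = []
    for shift in range(26):
        shifted = "".join(
            chr((ord(c) - ord("a") + shift) % 26 + ord("a")) for c in word
        )
        candidates.append(shifted)
    return candidates

candidates = all_shifts("fp")
# sorted alphabetically, this reproduces the hand-written list
print(" ".join(sorted(candidates)))
# shifting the cipher word forward by 3 gives the plain word
print(candidates[3])   # is
```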

Other useful patterns are one-letter words ('a' or 'I') and doubled letters ('ee', 'll', 'oo', ...).

More details, as well as the frequent patterns in English, can be found [here](https://www3.nd.edu/~busiforc/handouts/cryptography/cryptography%20hints.html).
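
One-letter words and doubled letters can be picked out of a ciphertext automatically. The sketch below is one possible way to do it; `find_clues` is a hypothetical helper name, and it assumes the text contains no apostrophes (a word such as "don't" would otherwise be split in two).

```python
import re

# Sketch: collect pattern clues from a ciphertext.
# `find_clues` is a hypothetical helper, not a library function.
def find_clues(ciphertext):
    """Return (one-letter words, doubled letters) found in the ciphertext."""
    words = re.findall(r"[A-Za-z]+", ciphertext.lower())
    # one-letter words are candidates for 'a' or 'I'
    one_letter_words = sorted({w for w in words if len(w) == 1})
    # doubled letters are candidates for 'ee', 'll', 'oo', ...
    doubled_letters = sorted(
        {m.group(0) for w in words for m in re.finditer(r"([a-z])\1", w)}
    )
    return one_letter_words, doubled_letters

print(find_clues("Dolu fvb ylhjo aol luk vm fvby yvwl, apl h ruva pu pa huk ohun vu."))
print(find_clues("Itaqhqd ue tmbbk iuxx ymwq aftqde tmbbk faa."))
```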

### Python implementation of Cipher/Decipher

In computers, letters are usually represented by ASCII codes: 'A'=65, 'B'=66, ..., 'a'=97, ..., which can be obtained with the Python `ord()` function. Implementations of the Caesar encryptor and decryptor are shown below.
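
As a quick illustration, `ord()` and its inverse `chr()` can be used to turn a letter into its position in the alphabet, shift it, and convert it back. The variable names here are illustrative only.

```python
# Sketch: converting between letters and ASCII codes with ord() and chr()
print(ord("A"), ord("B"), ord("a"))   # 65 66 97

# position of a letter in the alphabet (A=0, ..., Z=25)
position = ord("D") - ord("A")
print(position)                        # 3

# shift by -3 (wrapping around with % 26) and convert back to a letter
shifted = chr((position - 3) % 26 + ord("A"))
print(shifted)                         # A
```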

In [1]:
### Python code for Caesar Cipher

def Caesar_Encryptor(text,shift): 
    """
    Caesar Cipher to encrypt a {text} with a given {shift}
    Only letters A-Za-z are encrypted
    Return: encrypted text
    """
    # create an empty string for output
    result = "" 
  
    # iterate over the characters of the input text
    for char in text:
        # if it's an upper-case letter A-Z
        # (an explicit range check, since isupper() is also true for non-ASCII letters)
        if 'A' <= char <= 'Z':
            # shift the letter: (c + shift) % 26
            result += chr((ord(char) - ord('A') + shift) % 26 + ord('A'))
        # if it's a lower-case letter a-z
        elif 'a' <= char <= 'z':
            result += chr((ord(char) - ord('a') + shift) % 26 + ord('a'))
        # All others including space, numbers, symbols
        else:
            # just copy it
            result += char
    # return the encrypted text
    return result 

def Caesar_Decryptor(text, shift): 
    """
    Caesar Cipher to decrypt a {text} if the {shift} has been figured out
    Only letters A-Za-z are decrypted
    Return: decrypted text
    """
    # simply reverse
    return Caesar_Encryptor(text, -shift) 
    

In [2]:
# Test the above functions
# Change text and/or shift for your own message,
text = "Codebuster is fun!"
shift = -3
print("Plain  : " + text ) 
print("Shift : " + str(shift))
encrypted_text = Caesar_Encryptor(text,shift)
print("Cipher: " + encrypted_text)
decrypted_text = Caesar_Decryptor(encrypted_text, shift)
print("Decrypted: " + decrypted_text)

### Computer-assisted decryption: a message with two-letter words 

If the encrypted message contains any two-letter words, it is rather easy to decrypt, since English has only a small number of two-letter words.

In the Caesar cipher, the shift is unknown during decryption, but the relative distance between two letters remains the same. We define the distance between two letters as the number of letters counted from the first letter to the second (not including the first). For example, for 'be' we count 'c', 'd', 'e', which is three letters, so the distance is 3.

In [3]:
def TwoLetterDistance(word):
    """
    Calculate the distance between two letters in {word}, excluding the first one
    i.e., distance('be')=3 
    """
    # check whether it's a two-letter word
    if len(word)!=2:
        print("This function can only compute the distance for two-letter words")
        return
    # convert both to lower case 
    firstLetter = word[0].lower()
    secondLetter = word[1].lower()
    # compute the distance
    distance = (ord(secondLetter)-ord(firstLetter))% 26
    return distance

def TwoLetterWordDictionary():
    """
    Set up a dictionary {"word": distance} for known two-letter words
    """
    WordDict = {}
    TwoLetterWordList="of, to, in, it, is, be, as, at, so, we, he, by, or, on, do, if, me, my, up, an, go, no, us, am".split(', ')
    for word in TwoLetterWordList:
        distance = TwoLetterDistance(word)
        WordDict.update({word : distance})
    WordDict = dict(sorted(WordDict.items(), key=lambda item: item[1]))
    return WordDict    

# create my two-letter word dictionary
MyTwoLetterWordDictionary = TwoLetterWordDictionary()
# print out all items in the dictionary
for word,distance in MyTwoLetterWordDictionary.items():
    print("Word ", word, "has a distance=", distance, ", reverse distance=", distance-26)

For example, the encrypted message "Zlabyrpqbo fp crk!" contains the two-letter word "fp". The distance between 'f' and 'p' is 10, and the only word in the dictionary with that distance is "is". The shift is then the distance from 'i' to 'f', which is 23, or equivalently -3. We can now solve the puzzle with the Python code below:

In [4]:
# find the distance between two letters in the two-letter word
twoLetterWord = 'fp'
distance=TwoLetterDistance(twoLetterWord) # return 10
print("Distance between ", twoLetterWord, " is ", distance)
# Check the two-letter word dictionary and possible matches
for word,dist in MyTwoLetterWordDictionary.items(): 
    if dist == distance:
        print("A possible match is '"+word+"' with the shift ", TwoLetterDistance(word[0]+twoLetterWord[0]))    
# return 'is' with the shift 23
# we now use the shift to decrypt
shift=23
decrypted_text = Caesar_Decryptor("Zlabyrpqbo fp crk!", shift)
print("Decrypted: ", decrypted_text)

The integrated program below deciphers a message that contains a two-letter word:

In [5]:
def Caesar_Decryptor_TwoLetterWord(encrypted_text, twoLetterWord):
    """
    Caesar Decryptor to use a two-letter word to decrypt a message
    """
    
    # compute the distance between the two letters in twoLetterWord
    distance=TwoLetterDistance(twoLetterWord)
    print("Distance between ", twoLetterWord, " is ", distance)
    
    # create a list of all possible decrypted messages
    result = []
    
    # Check the two-letter word dictionary and possible matches
    for word,dist in MyTwoLetterWordDictionary.items(): 
        # found a matched two-letter word in dictionary
        if dist == distance:
            # calculate the shift
            shift = TwoLetterDistance(word[0]+twoLetterWord[0])
            print("A possible match is '"+word+"' with the shift ", shift)
            # decrypt the message
            decrypted_text = Caesar_Decryptor(encrypted_text, shift)
            print("The decrypted message is "+decrypted_text)
            # append the decrypted text to the answer list
            result.append(decrypted_text)
    # return the list of all possible answers
    return result
    
# test 
print(Caesar_Decryptor_TwoLetterWord("Zlabyrpqbo fp crk!", "fp"))

# more tests with two or more possibilities
print(Caesar_Decryptor_TwoLetterWord('SQJSX CU YV OEK SQD', 'CU'))
print(Caesar_Decryptor_TwoLetterWord('SQJSX CU YV OEK SQD', 'YV'))
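
The two-letter words do not have to be picked out by hand. The sketch below finds them with a regular expression; `find_two_letter_words` is a hypothetical helper name, and it assumes the message contains no apostrophes or hyphens inside words.

```python
import re

# Sketch: find the two-letter words in a cipher text.
# `find_two_letter_words` is a hypothetical helper name.
def find_two_letter_words(text):
    """Return the distinct two-letter words in text, in order of first appearance."""
    words = re.findall(r"\b[A-Za-z]{2}\b", text)
    # dict.fromkeys removes duplicates while keeping the original order
    return list(dict.fromkeys(words))

print(find_two_letter_words("SQJSX CU YV OEK SQD"))   # ['CU', 'YV']
print(find_two_letter_words("Zlabyrpqbo fp crk!"))    # ['fp']
```

Each word returned this way can be passed as the second argument of `Caesar_Decryptor_TwoLetterWord`.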

### Computer-assisted decryptor, based on an English dictionary

It is easy to have the computer try all possible shifts. We then check whether the decrypted words are in an English dictionary, using the pyspellchecker package.

In [6]:
def Caesar_Decryptor_Dict(encrypted, matchCounts=4):
    """
    A Caesar decryptor that uses a spell checker to test whether words are in the dictionary
    Use {matchCounts} to control how many words need to be identified in the dictionary
    Note that the spell checker treats certain abbreviations as real words, so
    use a larger matchCounts if there are many two-letter words
    """
    # use a spellchecker to check whether words are in dictionary
    from spellchecker import SpellChecker
    # create an English spell checker
    spell = SpellChecker(language=u'en')
    
    # iterate all possible shifts
    for shift in range(26):
        # decrypt with the current shift
        decrypted = Caesar_Decryptor(encrypted, shift)
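        # split the text into a list of words
        wordsList = spell.split_words(decrypted)
        # count distinct words, because spell.known() returns a set
        wordsCount = len(set(word.lower() for word in wordsList))
        # criterion for a match: at least {matchCounts} distinct words
        # (or every word, for shorter messages) must be in the dictionary
        wordsMatchedCriterion = min(wordsCount, matchCounts)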
        
        # check whether it is a real word
        dictWordsList = spell.known(wordsList)
        if len(dictWordsList) >= wordsMatchedCriterion:
            print("Find dictionary words at shift ", shift)
            return decrypted
    
    print("All trials failed")
    return ""
        
# a test
Caesar_Decryptor_Dict("SQJSX CU, YV OEK SQD!")

Find dictionary words at shift  16


'CATCH ME, IF YOU CAN!'
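
If pyspellchecker is not available, the same idea can be sketched with a small hand-made word list. Everything below is an assumption for illustration: `COMMON_WORDS` is a tiny stand-in for a real dictionary, and `shift_text` and `best_shift` are hypothetical helper names. Instead of a fixed threshold, this variant keeps the shift with the most word-list hits.

```python
import re

# A tiny stand-in word list (an assumption for illustration, far from a real dictionary)
COMMON_WORDS = {"the", "and", "you", "can", "is", "if", "me", "to",
                "of", "in", "it", "be", "fun", "catch"}

def shift_text(text, shift):
    """Shift letters A-Z and a-z by `shift` positions; copy everything else."""
    result = []
    for char in text:
        if "A" <= char <= "Z":
            result.append(chr((ord(char) - ord("A") + shift) % 26 + ord("A")))
        elif "a" <= char <= "z":
            result.append(chr((ord(char) - ord("a") + shift) % 26 + ord("a")))
        else:
            result.append(char)
    return "".join(result)

def best_shift(ciphertext, words=COMMON_WORDS):
    """Return (decryption shift, decrypted text) with the most word-list hits."""
    best = (0, ciphertext, -1)
    for shift in range(26):
        # decrypting with `shift` means shifting the letters back by `shift`
        candidate = shift_text(ciphertext, -shift)
        hits = sum(w in words for w in re.findall(r"[a-z]+", candidate.lower()))
        if hits > best[2]:
            best = (shift, candidate, hits)
    return best[0], best[1]

print(best_shift("SQJSX CU, YV OEK SQD!"))   # (16, 'CATCH ME, IF YOU CAN!')
```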

### Practice Problems

1. 'Bpm ozmibmab otwzg qv tqdqvo tqma vwb qv vmdmz nittqvo, jcb qv zqaqvo mdmzg bqum em nitt.'

2. 'Vjg yca vq igv uvctvgf ku vq swkv vcnmkpi cpf dgikp fqkpi.'

3. 'Dolu fvb ylhjo aol luk vm fvby yvwl, apl h ruva pu pa huk ohun vu.'

4. 'Itaqhqd ue tmbbk iuxx ymwq aftqde tmbbk faa.'

5. 'Vgrvtn mzhzhwzm ocvo tjp vmz vwnjgpozgt pidlpz. Epno gdfz zqzmtjiz zgnz.'


See the answers [here](Answer.txt).

You can also use the following code to generate an encrypted message for practice (of course, ask someone else to do it).

In [7]:
# Generate your own ciphers
import random
message = "My message to be encrypted"
# pick a shift from 1 to 25; 0 or a multiple of 26 would leave the message unchanged
shift = random.randint(1, 25)
encrypted = Caesar_Encryptor(message, shift)
print("Cipher: ", encrypted)

Cipher:  Bn bthhpvt id qt tcrgneits


In [8]:
# to check your answer
decrypted = Caesar_Decryptor_Dict(encrypted)
print("Decrypted: ", decrypted)

Find dictionary words at shift  15
Decrypted:  My message to be encrypted
