# Monday, April 10th, 2023

## Project 6: Code breakers

The `ord` function returns the ASCII code of a single-character string.

In [7]:
for c in "message":
    print('{:>3} {}'.format(ord(c), "'"+c+"'"))

109 'm'
101 'e'
115 's'
115 's'
 97 'a'
103 'g'
101 'e'


In [3]:
ord('^')

94

The `chr()` function does the reverse: it converts an integer to the corresponding character. For integers `0`-`127`, that is the character with that ASCII code:

In [4]:
chr(94)

'^'

In [5]:
chr(55)

'7'

Encryption:

In [8]:
key_word = 'buffalo'
key_ascii = [ord(c) for c in key_word]

key_ascii

[98, 117, 102, 102, 97, 108, 111]

In [9]:
message = 'Top secret!'
message_ascii = [ord(c) for c in message]

message_ascii

[84, 111, 112, 32, 115, 101, 99, 114, 101, 116, 33]

In [41]:
def str_to_ascii(s):
    return [ord(c) for c in s]

key_ascii = str_to_ascii(key_word)
message_ascii = str_to_ascii(message)

In [48]:
def ascii_to_str(int_list):
    return ''.join([chr(i) for i in int_list])

In [49]:
ascii_to_str(key_ascii)

'buffalo'

In [13]:
encrypted_message_ascii = []
for c_key, c_message in zip(key_ascii, message_ascii):
    print(c_key, c_message)
    encrypted_message_ascii.append( (c_key + c_message) % 128 )
    
encrypted_message_ascii

98 84
117 111
102 112
102 32
97 115
108 101
111 99


[54, 100, 86, 6, 84, 81, 82]

Problem: we didn't encrypt the full message. `zip` stops at the end of the shorter sequence, and our key is shorter than the message. We'll need to fix this.

In [14]:
encrypted_message_ascii

[54, 100, 86, 6, 84, 81, 82]

The message receiver now needs to decrypt the message. To do so, we subtract the secret key from each encrypted value and reduce the result mod 128:

In [15]:
decrypted_message_ascii = []

for c_key, c_encrypted in zip(key_ascii, encrypted_message_ascii):
    decrypted_message_ascii.append( (c_encrypted - c_key) % 128 )
    
decrypted_message_ascii

[84, 111, 112, 32, 115, 101, 99]
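
The wraparound is easiest to see on a single character. The sketch below uses the first key/message pair from the encryption loop above and relies on the fact that Python's `%` never returns a negative result when the right-hand operand is positive:

```python
c_key, c_message = 98, 84        # ord('b'), ord('T')

# Encrypt: the sum 182 is larger than 127, so it wraps around to 182 - 128 = 54.
c_encrypted = (c_key + c_message) % 128
print(c_encrypted)               # 54

# Decrypt: 54 - 98 = -44, and -44 % 128 is 84 in Python.
c_decrypted = (c_encrypted - c_key) % 128
print(c_decrypted, chr(c_decrypted))   # 84 T
```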

In [18]:
decrypted_message = [chr(c) for c in decrypted_message_ascii]
print(decrypted_message)

['T', 'o', 'p', ' ', 's', 'e', 'c']


We can use the `.join` method to convert our list of characters into a string:

In [19]:
''.join(decrypted_message)

'Top sec'

It will be useful to have functions that:
 - Convert a string to a sequence of ASCII integers
 - Convert a sequence of ASCII integers to a corresponding string
 - Encrypt a sequence of ASCII characters given an ASCII character key (or encrypt a string given a key string)
 - Decrypt a sequence of ASCII characters given an ASCII character key (or decrypt a string given a key string)

In [21]:
key_length = len(key_ascii)
key_length

7

In [24]:
message_length = len(message_ascii)
message_length

11

In [26]:
key_ascii + key_ascii

[98, 117, 102, 102, 97, 108, 111, 98, 117, 102, 102, 97, 108, 111]

In [30]:
for i, c_message in enumerate(message_ascii):
    print(i,c_message, key_ascii[i % key_length])

0 84 98
1 111 117
2 112 102
3 32 102
4 115 97
5 101 108
6 99 111
7 114 98
8 101 117
9 116 102
10 33 102
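
The indexing pattern above, `key_ascii[i % key_length]`, is what lets a short key cover the whole message. A minimal sketch of encryption and decryption built on it follows. The function names `encrypt` and `decrypt` are placeholders chosen for illustration, and the sketch assumes every character in the message has an ASCII code below 128.

```python
def encrypt(message, key_word):
    """Shift each character of message by the matching key character, mod 128."""
    key_ascii = [ord(c) for c in key_word]
    key_length = len(key_ascii)
    encrypted = []
    for i, c in enumerate(message):
        # i % key_length cycles 0, 1, ..., key_length - 1, 0, 1, ...
        encrypted.append((ord(c) + key_ascii[i % key_length]) % 128)
    return ''.join(chr(n) for n in encrypted)


def decrypt(encrypted, key_word):
    """Undo encrypt by subtracting the same repeating key, mod 128."""
    key_ascii = [ord(c) for c in key_word]
    key_length = len(key_ascii)
    decrypted = []
    for i, c in enumerate(encrypted):
        decrypted.append((ord(c) - key_ascii[i % key_length]) % 128)
    return ''.join(chr(n) for n in decrypted)


secret = encrypt('Top secret!', 'buffalo')
print([ord(c) for c in secret])
print(decrypt(secret, 'buffalo'))   # Top secret!
```

The first seven encrypted values match the partial result computed earlier, and the remaining four characters are now covered because the key starts over at `'b'`.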


In [38]:
print(key_word)
print(key_ascii)

buffalo
[98, 117, 102, 102, 97, 108, 111]


## Working with files in Python

In [31]:
with open('5desk.txt') as file:
    s = file.read()

In [35]:
s[:100]

'A\na\nAachen\nAalborg\naardvark\nAarhus\nAaron\nAB\nAb\nabaci\naback\nabacus\nAbadan\nabaft\nabalone\nabandon\naband'

The `.split()` method converts a string into a list of strings, separated by a chosen delimiter (any whitespace by default):

In [36]:
words = s.split()

In [37]:
words[:10]

['A',
 'a',
 'Aachen',
 'Aalborg',
 'aardvark',
 'Aarhus',
 'Aaron',
 'AB',
 'Ab',
 'abaci']
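
How `.split()` treats its delimiter is worth seeing on a small example. With no argument it splits on any run of whitespace (spaces, tabs, newlines) and drops empty strings; with an explicit delimiter it splits on exactly that string. The sample strings below are made up for illustration.

```python
sample = 'A\na\nAachen\n'

# No argument: split on whitespace, no empty strings in the result.
print(sample.split())        # ['A', 'a', 'Aachen']

# Explicit delimiter: the trailing newline leaves an empty string at the end.
print(sample.split('\n'))    # ['A', 'a', 'Aachen', '']

# Two delimiters in a row produce an empty string between them.
print('a,b,,c'.split(','))   # ['a', 'b', '', 'c']
```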

In [40]:
list('buffalo')

['b', 'u', 'f', 'f', 'a', 'l', 'o']

## Sets in Python

In [51]:
my_str = "This is a test string. Some of these ;dfkalsjf;ldja are words some are not. as;ldfja."

We can use the `.replace()` method to replace every occurrence of one substring with another (note that the `is` inside `This` is replaced too):

In [53]:
my_str.replace('is', 'is not')

'This not is not a test string. Some of these ;dfkalsjf;ldja are words some are not. as;ldfja.'

In [55]:
my_str = my_str.replace('.','')
my_str

'This is a test string Some of these ;dfkalsjf;ldja are words some are not as;ldfja'

We can loop through a list of punctuation marks to remove them:

In [61]:
punctuations = ['.',',','"',"'",'!','?',':',';']

for punctuation in punctuations:
    my_str = my_str.replace(punctuation,'')

We can force all letters to be lowercase using the `.lower()` method:

In [64]:
my_str = my_str.lower()
my_str

'this is a test string some of these dfkalsjfldja are words some are not asldfja'

In [65]:
my_words = my_str.split()
print(my_words)

['this', 'is', 'a', 'test', 'string', 'some', 'of', 'these', 'dfkalsjfldja', 'are', 'words', 'some', 'are', 'not', 'asldfja']
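
The cleanup steps above (strip punctuation, lowercase, split) can be bundled into one helper so that they are easy to apply to any candidate decryption. This is only a sketch: the name `clean_words` is hypothetical, and the default punctuation list is the one defined above.

```python
def clean_words(text, punctuations=('.', ',', '"', "'", '!', '?', ':', ';')):
    """Remove punctuation, lowercase the text, and split it into words."""
    for punctuation in punctuations:
        text = text.replace(punctuation, '')
    return text.lower().split()


print(clean_words("Top secret! Isn't it?"))   # ['top', 'secret', 'isnt', 'it']
```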


Recall: we read in a list of 60,000 words:

In [67]:
words[:10]

['A',
 'a',
 'Aachen',
 'Aalborg',
 'aardvark',
 'Aarhus',
 'Aaron',
 'AB',
 'Ab',
 'abaci']

In [68]:
count = 0
for word in my_words:
    if word in words:
        count += 1
        
print(count)

13


We can also use sets to count how many of our words appear in the word list:

In [69]:
my_list = [1,2,3,'a','b',2,1]
my_set = set(my_list)

print(my_list)
print(my_set)

[1, 2, 3, 'a', 'b', 2, 1]
{1, 2, 3, 'b', 'a'}


The `.intersection` method lets us find the elements that two sets have in common:

In [72]:
my_set.intersection({'a',3,5,'c'})

{3, 'a'}

For our purposes, we can convert the list of words to a set, then use the `.intersection` method:

In [73]:
word_set = set(words)

In [76]:
num_word_list = []

for word in words:
    # ... decrypt the message
    # ... get a set of "words" in the decrypted message
    # ... count how many "words" are actual words
    num_words = len(word_set.intersection(my_words))
    num_word_list.append(num_words)

{'a',
 'are',
 'is',
 'not',
 'of',
 'some',
 'string',
 'test',
 'these',
 'this',
 'words'}

In [77]:
len(word_set.intersection(my_words))

11
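
Note that the two counts differ: the earlier loop found 13 matches, but the intersection has 11 elements. A set keeps each value only once, so the repeated words `'some'` and `'are'` are counted a single time. The sketch below reproduces both counts with a small stand-in for the dictionary (the real `words` list comes from `5desk.txt`).

```python
my_words = ['this', 'is', 'a', 'test', 'string', 'some', 'of', 'these',
            'dfkalsjfldja', 'are', 'words', 'some', 'are', 'not', 'asldfja']

# Stand-in for the full dictionary, just large enough for this example.
word_set = {'a', 'are', 'is', 'not', 'of', 'some', 'string', 'test',
            'these', 'this', 'words'}

# Counts every occurrence, so 'some' and 'are' are each counted twice.
total_matches = sum(1 for word in my_words if word in word_set)

# Counts distinct matching words only.
distinct_matches = len(word_set.intersection(my_words))

print(total_matches, distinct_matches)   # 13 11
```

Testing `word in word_set` is also much faster than `word in words` for a large word list, because a set lookup does not have to scan every element.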

The `np.argmax` function returns the index of the maximum value:

In [78]:
mylist = [5,6,7,9,100, 10, -2]

In [80]:
import numpy as np

In [81]:
np.argmax(mylist)

4
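
This is useful for code breaking: if each candidate key is scored by how many real words its decryption contains, `np.argmax` gives the position of the best-scoring key. The candidate keys and scores below are invented for illustration and are not results from the project.

```python
import numpy as np

candidate_keys = ['apple', 'buffalo', 'cat']   # hypothetical keys to try
num_word_list = [0, 2, 1]                      # made-up score for each key

best_index = np.argmax(num_word_list)
print(best_index, candidate_keys[best_index])  # 1 buffalo

# When several entries tie for the maximum, argmax returns the first one.
print(np.argmax([3, 7, 7]))                    # 1
```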

In [82]:
mylist = [[1,2,3],[4,5,6],[7,8,9]]

In [None]:
for ascii_key in decode(itemized_dictionary):
    key_length = len(ascii_key)