## Getting the Info

Before we start, we need to talk for a moment about **API**s on the internet.

We will work with the information about the late, great [Grace Hopper](https://en.wikipedia.org/wiki/Grace_Hopper) from Wikipedia.
First, let's get the information.

The API request looks like this:

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&explaintext&redirects=1&titles=Grace_Hopper
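
To see what that request is actually asking for, here is a small sketch that pulls the URL apart into its parameters using only the standard library. The variable names (`url`, `parts`, `params`) are just for illustration.

```python
from urllib.parse import urlsplit, parse_qs

url = (
    "https://en.wikipedia.org/w/api.php"
    "?format=json&action=query&prop=extracts"
    "&explaintext&redirects=1&titles=Grace_Hopper"
)

parts = urlsplit(url)
# keep_blank_values=True keeps flags such as "explaintext" that carry no value
params = parse_qs(parts.query, keep_blank_values=True)

print(parts.netloc + parts.path)
for name, values in params.items():
    print(name, "=", values)
```

Everything after the `?` is a list of `name=value` pairs, which is exactly what we pass to `requests` as a dictionary in the next cell.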

In [None]:
import requests

payload = {
    "format":"json",
    "action":"query",
    "prop": "extracts",
    "explaintext": "1",
    "redirects": "1",
    "titles": "Grace_Hopper"
}
json_response = requests.get('https://en.wikipedia.org/w/api.php', params=payload).json()
json_response

## Cleaning up the text

In [None]:
text = ""
for pagenum, page_info in json_response['query']['pages'].items():
    text = text + page_info["extract"]
    
text
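
The same clean-up can be written as a small function. This is only a sketch: `sample_response` is a made-up, shortened stand-in with the same shape as the API output, and `extract_text` is a hypothetical helper name. It assumes a page can come back without an `"extract"` field (for example for a missing title), so it uses `.get()`.

```python
# Made-up response with the same nesting as the real one (page ids are invented)
sample_response = {
    "query": {
        "pages": {
            "1": {"title": "First", "extract": "Hello "},
            "2": {"title": "Second", "extract": "world"},
            "-1": {"title": "Nothing here", "missing": ""},
        }
    }
}


def extract_text(response):
    """Concatenate the "extract" field of every page in the response."""
    pages = response["query"]["pages"].values()
    # .get() returns "" for pages that have no extract instead of raising KeyError
    return "".join(page.get("extract", "") for page in pages)


sample_text = extract_text(sample_response)
print(sample_text)
```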

## Let's encrypt this

We will encrypt (actually, encode) this text using a randomly generated replacement dictionary: every letter is always swapped for the same other letter. So let's do that.

*Python is actually pretty useful with random stuff*

#### First we will create the replacement dictionary randomly

In [None]:
import random
from string import ascii_lowercase

print(ascii_lowercase)
replacement = random.sample(ascii_lowercase, len(ascii_lowercase))
replacement = "".join(replacement)
print(replacement)

In [None]:
replacement_dict = {}
for a, b in zip(ascii_lowercase, replacement):
    replacement_dict[a] = b
    
print(replacement_dict)

#### Now let's encode this

In [None]:
encoded = ''
for char in text:
    lower_char = char.lower()
    if lower_char in replacement_dict:
        encoded = encoded + replacement_dict[lower_char]
    else:
        encoded = encoded + char
        
print(encoded)

> Please note that I have left the numbers, spaces and punctuation alone, but I did ruin the capital letters.
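
As an alternative to the loop, Python strings have a built-in way to do this kind of letter swapping. The sketch below uses `str.maketrans` and `str.translate` on a short made-up message. The fixed seed is there only so the sketch is repeatable, and all the names (`rng`, `shuffled`, `message`, ...) are just for illustration.

```python
import random
from string import ascii_lowercase

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
shuffled = "".join(rng.sample(ascii_lowercase, len(ascii_lowercase)))

# One table for encoding, and the reverse table for decoding
encode_table = str.maketrans(ascii_lowercase, shuffled)
decode_table = str.maketrans(shuffled, ascii_lowercase)

message = "Hello, World 1906!"
secret = message.lower().translate(encode_table)  # capitals are lost here too
restored = secret.translate(decode_table)

print(secret)
print(restored)
```

Characters that are not in the table (digits, spaces, punctuation) pass through untouched, just like in the loop above.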

## Breaking the encryption

This is not going to be a hard one, but I will demonstrate a few things along the way, using [Frequency Analysis](https://en.wikipedia.org/wiki/Frequency_analysis).

So now we need to count the letters.

In [None]:
counters = {}
for char in encoded:
    if char not in ascii_lowercase:
        continue
    elif char in counters.keys():
        counters[char] += 1
    else:
        counters[char] = 1
        
print(counters)

#### Getting total number of letters

In [None]:
total_letters = 0
for num in counters.values():
    total_letters += num
    
print(total_letters)
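
The standard library can do this counting for us. Here is a minimal sketch with `collections.Counter`; `sample` is a short stand-in for the encoded text, and the other names are made up for the example.

```python
from collections import Counter
from string import ascii_lowercase

sample = "hello, world 1906!"  # stand-in for the encoded text

# Count only lowercase letters; digits, spaces and punctuation are skipped
letter_counts = Counter(ch for ch in sample if ch in ascii_lowercase)
letter_total = sum(letter_counts.values())

print(letter_counts.most_common(3))  # the three most frequent letters
print(letter_total)
```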

### Let's try to show this more clearly

In [None]:
%matplotlib notebook
import matplotlib.pyplot as plt

letters = []
appearences = []

for letter, counter in counters.items():
    letters.append(letter)
    appearences.append(counter)

print(letters)
print(appearences)

plt.bar(letters, appearences, 0.8, color='red' )
plt.show()

### Let's try this again with frequencies as a fraction of the total

Let's also look at the [frequency table for the English language](http://pi.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html) and the [list of common words in English](https://en.wikipedia.org/wiki/Most_common_words_in_English).

In [None]:
%matplotlib notebook
import matplotlib.pyplot as plt

letters = []
appearences = []

for letter, counter in counters.items():
    if letter in "abinosuz":          # Comment out this line and the next to plot every letter
        continue
    letters.append(letter)
    appearences.append(counter / total_letters)

plt.bar(letters, appearences, 0.6, color='b' )

plt.show()
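
One way to turn the chart into a starting point is to line up the most common cipher letters with the most common English letters. The sketch below does that on a toy string. `ENGLISH_ORDER` is an approximate ranking (sources differ slightly on the exact order) and `first_guess` is a hypothetical helper. Treat the result as a first guess only, to be fixed by trial and error.

```python
from collections import Counter
from string import ascii_lowercase

# Approximate English letter order, most to least frequent (an assumption)
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"


def first_guess(ciphertext):
    """Pair cipher letters with English letters by frequency rank."""
    counts = Counter(ch for ch in ciphertext if ch in ascii_lowercase)
    ranked = [letter for letter, _ in counts.most_common()]
    # zip stops at the shorter sequence, so unseen letters get no guess
    return dict(zip(ranked, ENGLISH_ORDER))


guess = first_guess("xxxx yyy zz w")
print(guess)
```

On a long text the top few letters are usually a decent guess, while the rare ones tend to need manual correction.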

## Now let's try to reconstruct the text

With trial and error...

In [None]:
d = {
 'a': 'A',
 'b': 'B',
 'c': 'C',
 'd': 'D',
 'e': 'E',
 'f': 'F',
 'g': 'G',
 'h': 'H',
 'i': 'I',
 'j': 'J',
 'k': 'K',
 'l': 'L',
 'm': 'M',
 'n': 'N',
 'o': 'O',
 'p': 'P',
 'q': 'Q',
 'r': 'R',
 's': 'S',
 't': 'T',
 'u': 'U',
 'v': 'V',
 'w': 'W',
 'x': 'X',
 'y': 'Y',
 'z': 'Z'
}

decoded = ''
for l in encoded:
    if l in d:
        decoded += d[l]
    else:
        decoded += l
        
print(decoded)
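
While guessing, it helps to fill in only the letters you are fairly sure about. Here is a small sketch of that idea: `apply_guesses` is a hypothetical helper that shows solved letters in capitals and leaves everything else as it was, so you can see at a glance what is still unsolved.

```python
def apply_guesses(ciphertext, guesses):
    """Show guessed letters as uppercase plaintext; leave the rest unchanged."""
    out = []
    for ch in ciphertext:
        if ch in guesses:
            out.append(guesses[ch].upper())  # solved: plaintext in capitals
        else:
            out.append(ch)  # unsolved letter or non-letter: keep as-is
    return "".join(out)


# Toy example: suppose we think "x" is "t" and "q" is "h"
partial = apply_guesses("xqj zxqj", {"x": "t", "q": "h"})
print(partial)  # THj zTHj
```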

# So what did we see?

* Doing API requests
    * The start of a simple bot...
* Some work with randomization
* Some basic encoding
    * Python has a lot of encoding options, and some decent encryption algorithms
* Analyzing text
    * This was just a letter count, but Python has NLTK, which is amazing
* Graphical plotting
* Attempts at breaking a code