## Problem Statement: Vigenere Cipher

##### Implement a Vigenere Cipher

##### Plain text:      attackatdawn
##### Key:                 python
##### Cipher text:      prmhqxprwhka

In [8]:
from string import ascii_lowercase
from itertools import cycle, islice

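# CODEBOOK[k][p] is the ciphertext letter for plaintext letter p under key letter k.
# The row for the i-th key letter is the alphabet rotated left by i positions.
CODEBOOK = {x: {y: z for y, z in zip(ascii_lowercase, islice(cycle(ascii_lowercase), i, None))}
            for i, x in enumerate(ascii_lowercase)}
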
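def encipher(message, key, codebook=CODEBOOK):
    # Lowercase the message and drop every character that is not in the codebook.
    # The key is expected to consist of lowercase letters only.
    message = ''.join(m for m in message.lower() if m in codebook)
    # Repeat the key along the message and look each letter up in the key letter's row.
    return ''.join(codebook[k][m] for k, m in zip(cycle(key), message))

def decipher(message, key, codebook=CODEBOOK):
    # Invert every row so that ciphertext letters map back to plaintext letters.
    decodebook = {x: {z: y for y, z in yz.items()} for x, yz in codebook.items()}
    return ''.join(decodebook[k][m] for k, m in zip(cycle(key), message))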

In [9]:
msg = 'Attack at dawn!'
key = 'python'
enc_msg = encipher(msg, key)
enc_msg

'prmhqxprwhka'

In [10]:
decipher(enc_msg, key)

'attackatdawn'
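
The lookup table above is equivalent to adding letter positions modulo 26 (and subtracting them to decipher). The following is a minimal sketch of that arithmetic view, not a replacement for the notebook's code. The names `shift_encipher` and `shift_decipher` are hypothetical, and the key is assumed to contain only lowercase ASCII letters.

```python
from itertools import cycle
from string import ascii_lowercase

A = ord('a')  # position 0 of the alphabet

def shift_encipher(message, key):
    # Keep only ASCII letters, lowercased, mirroring the filtering done above.
    letters = [c for c in message.lower() if c in ascii_lowercase]
    # Add the key letter's position to the plaintext letter's position, modulo 26.
    return ''.join(
        chr((ord(m) - A + ord(k) - A) % 26 + A)
        for k, m in zip(cycle(key), letters)
    )

def shift_decipher(ciphertext, key):
    # Subtract the key letter's position instead of adding it.
    # Assumes the ciphertext contains only lowercase ASCII letters.
    return ''.join(
        chr((ord(c) - ord(k)) % 26 + A)
        for k, c in zip(cycle(key), ciphertext)
    )

print(shift_encipher('Attack at dawn!', 'python'))  # prmhqxprwhka
print(shift_decipher('prmhqxprwhka', 'python'))     # attackatdawn
```

For example, the second letter `t` (position 19) under key letter `y` (position 24) gives (19 + 24) mod 26 = 17, which is `r`.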

---------------------------------------------------

## Problem Statement: Concordance (Word Count)

##### Find the ten most commonly used words in a text file

In [11]:
! wget -O paradise-lost.txt 'http://www.gutenberg.org/cache/epub/26/pg26.txt'

--2018-02-22 20:36:08--  http://www.gutenberg.org/cache/epub/26/pg26.txt
Resolving www.gutenberg.org... 152.19.134.47
Connecting to www.gutenberg.org|152.19.134.47|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 200284 (196K) [text/plain]
Saving to: 'paradise-lost.txt'


2018-02-22 20:36:10 (166 KB/s) - 'paradise-lost.txt' saved [490469]



In [13]:
! head -n 10 paradise-lost.txt

The Project Gutenberg EBook of Paradise Lost, by John Milton

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.net


Title: Paradise Lost



In [18]:
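from re import sub

def concordance(text):
    # Map each word to the number of times it occurs in the text.
    freq = {}
    for word in text.split():
        # Lowercase the token and strip punctuation and other non-word characters.
        word = sub(r'[^\w]', '', word.lower())
        if not word:
            continue  # the token consisted of punctuation only
        if word not in freq:
            freq[word] = 0
        freq[word] += 1
    return freq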

In [19]:
with open('paradise-lost.txt') as f:
    text = f.read()

freq = concordance(text)
sorted(freq.items(), key=lambda kv: kv[1], reverse=True)[:10]

[('and', 3483),
 ('the', 3162),
 ('to', 2326),
 ('of', 2186),
 ('in', 1430),
 ('with', 1208),
 ('his', 1181),
 ('or', 795),
 ('that', 720),
 ('all', 712)]
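
The counting and sorting steps can also be delegated to `collections.Counter` from the standard library. The sketch below is one possible variant under the same tokenization (split on whitespace, lowercase, strip non-word characters). The function name `top_words` and the sample sentence are hypothetical and do not come from the notebook.

```python
import re
from collections import Counter

def top_words(text, n=10):
    # Lowercase, split on whitespace, and strip non-word characters from each token.
    cleaned = (re.sub(r'\W', '', token) for token in text.lower().split())
    # Drop tokens that contained no word characters, then count the rest.
    counts = Counter(word for word in cleaned if word)
    # most_common(n) returns the n (word, count) pairs with the highest counts.
    return counts.most_common(n)

sample = "The cat and the hat. THE END!"
print(top_words(sample, 1))  # [('the', 3)]
```

Applied to the downloaded file, the call would be `top_words(text)` with `text` read as in the cell above.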