# Introduction to Letter Frequency Analysis with Python
### By Adam Erck

## What is Jupyter?
> "The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more."  _-jupyter.org_

## What is Letter Frequency Analysis?

_Frequency analysis is determining how often a letter appears in text samples._

Wiki Book: [In the field of cryptanalysis, frequency analysis is a methodology for "breaking" simple substitution ciphers.](https://en.wikibooks.org/wiki/Cryptography/Frequency_analysis)

![Crypto-compare](files/crypto.jpg) [_image source_](http://theamazingking.com/crypto-ana101.php)
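
To make the idea concrete, here is a minimal sketch of how letter frequencies can hint at a substitution. The helper `relative_frequencies` and the sample ciphertext are illustrative assumptions, not part of the examples that follow. The ciphertext is "hello world" with each letter shifted forward by 3 (a Caesar cipher), so its most frequent letter should correspond to the most frequent plaintext letter, `l`.

```python
from collections import Counter
import string

def relative_frequencies(text):
    """Return {letter: share of all letters} for the letters a-z found in text."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {letter: counts[letter] / total for letter in counts}

# Hypothetical sample: "hello world" with every letter shifted forward by 3.
ciphertext = "khoor zruog"
freqs = relative_frequencies(ciphertext)

# The most frequent ciphertext letter is the best guess for the most
# frequent plaintext letter.
most_common = max(freqs, key=freqs.get)
print(most_common, round(freqs[most_common], 2))  # o 0.3
```

A sample this short is far too small for reliable analysis; real frequency analysis works best on long texts, where the letter shares settle toward the typical values for the language.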

## Example 1: Simple String of Text (Slow)
For the first two examples, we will look at a test string containing `abcde` 5 times over. 

In [28]:
#test string to be analyzed
test_string = "abcdeabcdeabcdeabcdeabcde"

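Creating a function will allow us to reuse this piece of code and pass different text samples to it.   
In this function, we will use a loop with a counter to tally how often a single character occurs. It uses three names:
* `text`: the string to search
* `char`: the character to look for
* `count`: the running tally, which the function returns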

In [29]:
def count_char(text, char):
    """This function uses count to tally the number of times 
    a character occurs in a string"""
    count = 0
    for c in text:
        if c == char:
            count += 1
    return count

In [30]:
def letterCounterS(text):
    """This function was constructed using examples from the code learning app Solo Learn. 
    For each letter a-z, it calls count_char() to count how often that letter occurs in the 
    supplied string 'text'. Returns a list of strings such as 'a5' (the letter followed by its count)."""
    counts = []
    
    #convert string 'text' to all lowercase.
    text = text.lower()

    for char in "abcdefghijklmnopqrstuvwxyz":
        counts.append(char + str(count_char(text, char)))
    return counts

In [31]:
#Using our test string in the letterCounterS() function.
print(letterCounterS(test_string))

['a5', 'b5', 'c5', 'd5', 'e5', 'f0', 'g0', 'h0', 'i0', 'j0', 'k0', 'l0', 'm0', 'n0', 'o0', 'p0', 'q0', 'r0', 's0', 't0', 'u0', 'v0', 'w0', 'x0', 'y0', 'z0']


As we can see, after running the `letterCounterS()` function, there were 5 instances each of the letters `a`, `b`, `c`, `d`, and `e`, as expected, and 0 instances of every other letter.
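
One plausible reason this version earns the "Slow" label is that `count_char()` scans the whole string once for each of the 26 letters. The sketch below is an illustration only; the helper `comparisons_per_letter_scan` is hypothetical and simply counts how many character comparisons that approach performs on the test string.

```python
import string

def comparisons_per_letter_scan(text):
    """Count the comparisons made when text is rescanned once per alphabet letter."""
    comparisons = 0
    for letter in string.ascii_lowercase:
        for c in text:
            comparisons += 1  # stands in for one `c == letter` check
    return comparisons

sample = "abcde" * 5  # the same 25-character test string
print(comparisons_per_letter_scan(sample))  # 650, i.e. 26 * 25
print(len(sample))                          # 25 characters visited in a single pass
```

For a 25-character string the difference is negligible, but the number of comparisons grows 26 times faster than the length of the text.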

## Example 2: Simple String of Text (Fast)

In [32]:
#Our starting test string that will be analyzed.
test_string = "abcdeabcdeabcdeabcdeabcde"

In [33]:
def letterCounter(text):
    """This function takes an input string and counts the occurrences of 
    the letters a-z and A-Z and returns a dictionary with
    key:value pairs representing the key letter and an associated value 
    representing the number of occurrences.
    Note: This function counts both 'a' and 'A' as an occurrence of 'a'."""
    letters_of_interest="abcdefghijklmnopqrstuvwxyz"


    #The dictionary object that will contain the frequency (number of occurrences) 
    #of each letter in our test string.
    #Each key:value pair represents the key letter and an associated value representing 
    #the number of occurrences.
    letter_frequency={"a" : 0, "b" : 0, "c" : 0, "d" : 0, "e" : 0, "f" : 0, "g" : 0, "h" : 0, "i" : 0, "j" : 0, "k" : 0,
         "l" : 0, "m" : 0, "n" : 0, "o" : 0, "p" : 0, "q" : 0, "r" : 0, "s" : 0, "t" : 0, "u" : 0, "v" : 0, "w" : 0, "x" : 0, "y" : 0, "z" : 0}

    #convert string 'text' to all lowercase.
    text = text.lower()

    #Checking each character (char) in the string 'text' to see if it is a letter a-z, 
    #and counting it if it is a letter
    for char in text:
        if char in letters_of_interest:
            letter_frequency[char] += 1

    #return the letter-frequency dictionary
    return letter_frequency

In [34]:
#Using our test string in the letterCounter() function.
print(letterCounter(test_string))

{'a': 5, 'b': 5, 'c': 5, 'd': 5, 'e': 5, 'f': 0, 'g': 0, 'h': 0, 'i': 0, 'j': 0, 'k': 0, 'l': 0, 'm': 0, 'n': 0, 'o': 0, 'p': 0, 'q': 0, 'r': 0, 's': 0, 't': 0, 'u': 0, 'v': 0, 'w': 0, 'x': 0, 'y': 0, 'z': 0}
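
As an alternative sketch, the same single-pass tally can be written with `collections.Counter` from the Python standard library. The function name `letter_counter_with_counter` is an assumption made for this illustration; it is not part of the notebook above, and it is meant to return the same dictionary shape as `letterCounter()`.

```python
from collections import Counter
import string

def letter_counter_with_counter(text):
    """Return a dict mapping each letter a-z to its number of occurrences in text.
    Upper- and lowercase letters are counted together."""
    # Count only the characters that are letters a-z, in one pass over the text.
    counts = Counter(c for c in text.lower() if c in string.ascii_lowercase)
    # A Counter returns 0 for missing keys, so every letter gets an entry.
    return {letter: counts[letter] for letter in string.ascii_lowercase}

result = letter_counter_with_counter("abcdeabcdeabcdeabcdeabcde")
print(result["a"], result["f"])  # 5 0
```

Building the result from `string.ascii_lowercase` keeps letters with a count of 0 in the output, which matches the dictionary printed above.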
