# The Jabberwocky

In this demo, we'll analyze Lewis Carroll's poem "Jabberwocky".  Make sure you have the file ```jabberwocky.txt``` in the same directory as this Python notebook before you run your code.

We'll use the ```with``` statement when we open our text file to read it into memory.  This statement ensures that the file is closed when we're done with it, even if an error occurs inside the block.  If you don't use ```with``` (and don't close the file yourself), the file may stay open, and you may have trouble opening it in other programs.
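
To see the closing behavior for ourselves, here is a minimal sketch. The scratch file and the variable names are made up for this illustration and are not part of the demo; the ```closed``` attribute of a file object reports whether the file has been closed.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "scratch.txt")

with open(path, "w") as f:
    f.write("'Twas brillig")

# Once the with block ends, the file object is closed.
closed_after_block = f.closed

# The file is also closed when the block exits because of an error.
try:
    with open(path) as g:
        raise ValueError("simulated error inside the with block")
except ValueError:
    closed_after_error = g.closed

print(closed_after_block, closed_after_error)
```

Both values should be ```True```: the file is closed whether the block finishes normally or raises an error.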

In [1]:
#Because the poem is fairly short, we can read all of it into memory with a single command

with open("jabberwocky.txt") as jab:
    poem = jab.read()

In [2]:
#Is the variable poem a string?

type(poem)

str

In [3]:
# How many characters are in the poem?

len(poem)

1018

In [4]:
# Here's a version of the poem that is in block caps.

poem_shout = poem.upper()
print(poem_shout)

'TWAS BRILLIG, AND THE SLITHY TOVES
      DID GYRE AND GIMBLE IN THE WABE:
ALL MIMSY WERE THE BOROGOVES,
      AND THE MOME RATHS OUTGRABE.

"BEWARE THE JABBERWOCK, MY SON!
      THE JAWS THAT BITE, THE CLAWS THAT CATCH!
BEWARE THE JUBJUB BIRD, AND SHUN
      THE FRUMIOUS BANDERSNATCH!"

HE TOOK HIS VORPAL SWORD IN HAND;
      LONG TIME THE MANXOME FOE HE SOUGHT--
SO RESTED HE BY THE TUMTUM TREE
      AND STOOD AWHILE IN THOUGHT.

AND, AS IN UFFISH THOUGHT HE STOOD,
      THE JABBERWOCK, WITH EYES OF FLAME,
CAME WHIFFLING THROUGH THE TULGEY WOOD,
      AND BURBLED AS IT CAME!

ONE, TWO! ONE, TWO! AND THROUGH AND THROUGH
      THE VORPAL BLADE WENT SNICKER-SNACK!
HE LEFT IT DEAD, AND WITH ITS HEAD
      HE WENT GALUMPHING BACK.

"AND HAST THOU SLAIN THE JABBERWOCK?
      COME TO MY ARMS, MY BEAMISH BOY!
O FRABJOUS DAY! CALLOOH! CALLAY!"
      HE CHORTLED IN HIS JOY.

'TWAS BRILLIG, AND THE SLITHY TOVES
      DID GYRE AND GIMBLE IN THE WABE:
ALL MIMSY WERE THE BOROGOVES,
      AND THE MO

In [5]:
# We remove the line breaks, replacing them with spaces

poem = poem.replace("\n", " ")

In [6]:
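# We make a list of all of the words in the poem by splitting on single spaces.
# Runs of consecutive spaces (from the indented lines) produce empty strings,
# which show up as blank entries in the output below.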

wordlist = poem.split(" ")

# Print the first 20 words in our list
for i in range(20):
    print(wordlist[i])

'Twas
brillig,
and
the
slithy
toves






Did
gyre
and
gimble
in
the
wabe:
All


In [7]:
len(wordlist)

256
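
The blank entries in the output above are empty strings: splitting on a single space yields one empty string for each extra space in a run, so they are included in the length of our list. If we only want the words themselves, one option is to call ```split()``` with no argument, which treats any run of whitespace as a single separator. The sketch below uses a made-up sample string rather than the poem itself.

```python
# A made-up sample with a run of spaces, similar to the indented lines of the poem.
sample = "slithy toves      Did gyre"

# Splitting on a single space yields empty strings between consecutive spaces.
by_single_space = sample.split(" ")

# Splitting with no argument treats any run of whitespace as one separator.
by_whitespace = sample.split()

print(by_single_space)
print(by_whitespace)
```

With no argument, ```split()``` also splits on line breaks, so it could be applied to the original text without replacing the line breaks first.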

In [8]:
# Let's count how many words in wordlist contain the letter 'a' (case-insensitive)

a_count = 0
for w in wordlist:
    if 'a' in w.lower():
        a_count += 1

print(a_count)
    

59
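
The same kind of count can be written more compactly with ```sum``` and a generator expression. This is a sketch on a short, made-up word list standing in for ```wordlist```, not a replacement for the loop above.

```python
# A short, made-up word list standing in for wordlist (note the empty string).
words = ["'Twas", "brillig,", "And", "the", "wabe:", ""]

# Add 1 for every word that contains the letter 'a', ignoring case.
a_count = sum(1 for w in words if "a" in w.lower())

print(a_count)  # prints 3
```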


To analyze punctuation, we import the [```string``` module](https://docs.python.org/3.2/library/string.html).

In [9]:
import string

In [10]:
# The constant string.punctuation contains ASCII punctuation characters
punct = string.punctuation
print(punct)

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~


In [11]:
# We build a dictionary which contains the counts of each punctuation mark in our poem

# Start with an empty dictionary 
punct_dict = {}

for c in punct:
    punct_dict[c] = poem.count(c)

print(punct_dict)

{'!': 11, '"': 4, '#': 0, '$': 0, '%': 0, '&': 0, "'": 2, '(': 0, ')': 0, '*': 0, '+': 0, ',': 16, '-': 3, '.': 5, '/': 0, ':': 2, ';': 1, '<': 0, '=': 0, '>': 0, '?': 1, '@': 0, '[': 0, '\\': 0, ']': 0, '^': 0, '_': 0, '`': 0, '{': 0, '|': 0, '}': 0, '~': 0}


In [12]:
# Now our dictionary can tell us how many exclamation marks are in the poem

punct_dict['!']

11

In [13]:
# Another way to complete the same task is to use a dict comprehension

punct_dict_2 = {c:poem.count(c) for c in punct}
print(punct_dict_2)

{'!': 11, '"': 4, '#': 0, '$': 0, '%': 0, '&': 0, "'": 2, '(': 0, ')': 0, '*': 0, '+': 0, ',': 16, '-': 3, '.': 5, '/': 0, ':': 2, ';': 1, '<': 0, '=': 0, '>': 0, '?': 1, '@': 0, '[': 0, '\\': 0, ']': 0, '^': 0, '_': 0, '`': 0, '{': 0, '|': 0, '}': 0, '~': 0}
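
Most of the entries above are zero. If we only care about the punctuation marks that actually occur, we can add a condition to the dict comprehension. The sketch below assumes a single sample line from the poem in place of the full text.

```python
import string

# One line of the poem, used as a small sample in place of the full text.
sample = '"Beware the Jabberwock, my son!'

# Keep only the punctuation marks that occur at least once in the sample.
counts = {c: sample.count(c) for c in string.punctuation if c in sample}

print(counts)
```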


Another way to count characters in the poem is to use the ```Counter``` class in the [```collections``` module](https://docs.python.org/3.3/library/collections.html).

In [14]:
from collections import Counter

In [15]:
poem_charcount = Counter(poem)
print(poem_charcount)

Counter({' ': 255, 'e': 80, 't': 60, 'h': 57, 'a': 52, 'o': 50, 's': 37, 'i': 37, 'n': 36, 'r': 33, 'd': 31, 'l': 29, 'b': 28, 'm': 26, 'w': 23, 'g': 21, 'u': 21, ',': 16, 'y': 16, 'c': 12, '!': 11, 'f': 10, 'A': 9, 'T': 7, 'k': 7, 'v': 6, '.': 5, '"': 4, 'J': 4, 'j': 4, 'H': 4, 'C': 4, 'B': 3, 'p': 3, '-': 3, 'O': 3, "'": 2, 'D': 2, ':': 2, ';': 1, 'L': 1, 'x': 1, 'S': 1, '?': 1})


In [16]:
# How often does the character 'e' appear in this poem?

poem_charcount['e']

80
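
A ```Counter``` can also rank its entries with the ```most_common``` method, and it accepts any iterable, so we can filter or lowercase the characters before counting. The sketch below uses one line of the poem as a sample; the variable names are ours.

```python
from collections import Counter

# One line of the poem, used as a small sample in place of the full text.
sample = "O frabjous day! Callooh! Callay!"

char_counts = Counter(sample)

# The three most frequent characters, as (character, count) pairs.
top_three = char_counts.most_common(3)

# A case-insensitive count restricted to letters.
letter_counts = Counter(ch for ch in sample.lower() if ch.isalpha())

print(top_three)
print(letter_counts["o"])
```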

# Writing to files

We often want to save data we have computed to a file.

In [17]:
# We open a new text file in "write" mode, and save the uppercase version of our poem to it.

with open("jabberwocky_uppercase.txt","w") as jab_up:
    jab_up.write(poem_shout)
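
Note that opening a file in "write" mode replaces any existing file with the same name. To check that a write worked, we can read the file back and compare it with the original string. This is a minimal sketch that assumes a throwaway file in a temporary directory rather than the file written above.

```python
import os
import tempfile

# Hypothetical output path in a temporary directory, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "shout.txt")

text = "O frabjous day! Callooh! Callay!".upper()

# Write the string to the file.
with open(path, "w") as out:
    out.write(text)

# Read the file back and compare it with what we wrote.
with open(path) as src:
    round_trip = src.read()

print(round_trip == text)
```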

We'd like to save the information in our punctuation counts dictionary as a ```.csv``` file, because these files are easy to analyze with many different types of software.  One way to do this is to use the [```csv``` module](https://docs.python.org/3/library/csv.html). 

In [18]:
import csv

In [19]:
# When we open the .csv file for writing, we pass newline="" so that Python does not translate line endings.
# The csv module writes its own line endings; without this, extra blank rows can appear on some platforms (notably Windows).

with open("jabberwocky_punct.csv", "w", newline="") as jab_punct:
    
    writer = csv.writer(jab_punct)
    # Our .csv file will have two columns, one for the dict keys and one for the dict values
    writer.writerows(punct_dict.items())
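
To confirm the file has the shape we expect, we can read it back with ```csv.reader```. The sketch below is self-contained: it writes a small dictionary (three of the counts from above) to a temporary file and then rebuilds the dictionary from it. The header row and the file path are additions for illustration and are not part of the file written above.

```python
import csv
import os
import tempfile

# Three of the punctuation counts from above, used as a small sample.
counts = {"!": 11, ",": 16, '"': 4}

# Hypothetical output path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "punct_sample.csv")

with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["punctuation", "count"])  # header row (an addition for this sketch)
    writer.writerows(counts.items())

with open(path, newline="") as f:
    reader = csv.reader(f)
    header = next(reader)  # skip the header row
    # Every value comes back as a string, so we convert the counts to int.
    restored = {mark: int(n) for mark, n in reader}

print(restored == counts)
```

The csv module quotes fields such as the comma and the double quote when writing, and the reader undoes that quoting, so these characters survive the round trip.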