# The Jabberwocky

In this demo, we'll analyze Lewis Carroll's poem "Jabberwocky".  Make sure you have the file ```jabberwocky.txt``` in the same directory as this Python notebook before you run your code.

We'll use the ```with``` statement when we open our text file to read it into memory.  This statement ensures that the file is closed when we're done with it, even if there's an error in our code.  If you open a file without ```with``` and forget to close it, other programs may have trouble opening that file while Python still holds it open.

In [None]:
# Because the poem is fairly short, we can read all of it into memory with a single call

with open("jabberwocky.txt") as jab:
    poem = jab.read()
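
To see the closing behavior described above, here is a minimal sketch. It writes a scratch file (the name `sample.txt` and the temporary directory are illustrative assumptions, not part of this demo's data) and checks the file object's `closed` attribute after a normal exit and after a simulated error.

```python
import os
import tempfile

# Hypothetical scratch file, so this sketch does not depend on jabberwocky.txt
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("'Twas brillig, and the slithy toves\n")

# The file is closed as soon as the with block ends
with open(path, encoding="utf-8") as f:
    first_line = f.readline()
print(f.closed)

# The file is also closed when the block is left because of an exception
try:
    with open(path, encoding="utf-8") as g:
        raise ValueError("simulated error while reading")
except ValueError:
    pass
print(g.closed)
```

Both checks should report that the file is closed, which is the guarantee we rely on in the cell above.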

In [None]:
# Is the variable poem a string?

type(poem)

In [None]:
# How many characters are in the poem?

len(poem)

In [None]:
# Here's a version of the poem that is in block caps.

poem_shout = poem.upper()
print(poem_shout)

In [None]:
# We remove the line breaks, replacing them with spaces

poem = poem.replace("\n", " ")

In [None]:
# We make a list of all of the words in the poem (assuming words are divided by spaces).

wordlist = poem.split(" ")

# Print the first 20 words in our list
for word in wordlist[:20]:
    print(word)

In [None]:
len(wordlist)
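
The word list above assumes that words are separated by exactly one space. A blank line between stanzas becomes two consecutive spaces once the line breaks are replaced, and splitting on `" "` then yields an empty string. The sketch below illustrates this on a two-line stand-in for the poem (the `sample` text is an assumption for illustration) and compares it with `split()` called without arguments, which splits on any run of whitespace.

```python
# Two lines separated by a blank line, standing in for a stanza break
sample = "And the mome raths outgrabe.\n\nBeware the Jabberwock, my son!"

# Approach used above: replace line breaks, then split on single spaces
flattened = sample.replace("\n", " ")
by_space = flattened.split(" ")

# Alternative: split() with no argument treats any run of whitespace as one separator
by_whitespace = sample.split()

print(by_space)
print(by_whitespace)
print(len(by_space), len(by_whitespace))
```

If empty strings in the word list matter for your counts, `poem.split()` may be the simpler choice.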

In [None]:
# Let's count how many words in wordlist contain the letter 'a' (case-insensitive)

a_count = 0
for w in wordlist:
    if 'a' in w.lower():
        a_count += 1

print(a_count)
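
The same count can be written more compactly by summing over a generator expression. This is a sketch on a short, made-up word list (`sample_words` is an illustrative assumption); it relies on `True` counting as 1 when summed.

```python
# Hypothetical short word list standing in for wordlist
sample_words = ["Twas", "brillig,", "And", "the", "slithy", "toves"]

# Each membership test is True (1) or False (0), so the sum is the number of matches
a_count_sample = sum('a' in w.lower() for w in sample_words)
print(a_count_sample)
```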
    

To analyze punctuation, we import the [```string``` module](https://docs.python.org/3.2/library/string.html).

In [None]:
import string

In [None]:
# The constant string.punctuation contains ASCII punctuation characters
punct = string.punctuation
print(punct)

In [None]:
# We build a dictionary that contains the count of each punctuation mark in our poem

# Start with an empty dictionary 
punct_dict = {}

for c in punct:
    punct_dict[c] = poem.count(c)

print(punct_dict)

In [None]:
# Now our dictionary can tell us how many exclamation marks are in the poem

punct_dict['!']

In [None]:
# Another way to complete the same task is to use a dict comprehension

punct_dict_2 = {c: poem.count(c) for c in punct}
print(punct_dict_2)

Another way to count characters in the poem is to use the ```Counter``` class from the [```collections``` module](https://docs.python.org/3.3/library/collections.html).

In [None]:
from collections import Counter

In [None]:
poem_charcount = Counter(poem)
print(poem_charcount)

In [None]:
# How often does the character 'e' appear in this poem?

poem_charcount['e']
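
A `Counter` can also reproduce the punctuation counts from earlier. The sketch below uses a short sample string (an assumption for illustration, not the full poem) and counts only characters found in `string.punctuation`. Unlike the dictionary built above, the `Counter` stores only the marks that actually occur, and looking up a missing mark returns 0 instead of raising a `KeyError`.

```python
import string
from collections import Counter

# Hypothetical sample text standing in for the poem
sample = "Beware the Jabberwock, my son! The jaws that bite, the claws that catch!"

# Count only the characters that are ASCII punctuation
punct_counts = Counter(c for c in sample if c in string.punctuation)

# most_common() lists the marks from most to least frequent
print(punct_counts.most_common())

# A mark that never appears has a count of 0
print(punct_counts["?"])
```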

# Writing to files

We often want to save data we have computed to a file.

In [None]:
# We open a new text file in "write" mode, and save the uppercase version of our poem to it.

with open("jabberwocky_uppercase.txt", "w") as jab_up:
    jab_up.write(poem_shout)

We'd like to save the information in our punctuation counts dictionary as a ```.csv``` file, because these files are easy to analyze with many different types of software.  One way to do this is to use the [```csv``` module](https://docs.python.org/3/library/csv.html). 

In [None]:
import csv

In [None]:
# When we open the .csv file for writing, we pass newline="" so that the csv module
# controls the line endings itself. Without it, extra blank lines can appear between
# rows on platforms that translate line endings (such as Windows).

with open("jabberwocky_punct.csv", "w", newline="") as jab_punct:
    
    writer = csv.writer(jab_punct)
    # Our .csv file will have two columns, one for the dict keys and one for the dict values
    writer.writerows(punct_dict.items())
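
To check what was written, the file can be read back with `csv.reader`. The following is a minimal round-trip sketch under stated assumptions: the counts in `counts`, the file name `punct_demo.csv`, and the header row are all made up for illustration and are not the demo's actual output. Note that values come back from a `.csv` file as strings, so the counts are converted to `int` again.

```python
import csv
import os
import tempfile

# Hypothetical punctuation counts, used only to illustrate the round trip
counts = {"!": 11, ",": 20, ";": 3}
path = os.path.join(tempfile.mkdtemp(), "punct_demo.csv")

with open(path, "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["mark", "count"])  # optional header row
    writer.writerows(counts.items())

with open(path, newline="") as src:
    reader = csv.reader(src)
    header = next(reader)  # skip the header row
    restored = {mark: int(count) for mark, count in reader}

print(restored == counts)
```

The comma key survives the round trip because the `csv` module quotes fields that contain the delimiter.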