# Monday, October 23

## Announcements and Reminders

- Chapter 10 reading: something is broken
- Quiz on Wednesday (dictionaries)
- Celebration of Mind Wednesday Evening
- Exercises for Chapter 10: due Friday


## Activity: Counting Words with Dictionaries

Today we will continue to explore the dictionary type (`dict`) in Python.

### Motivating Question

Which word appears most frequently in Edgar Allan Poe's poem *The Raven*?

#### Strategy

How would you do this by hand?  How would you keep track of your data?

### Dictionary Basics

A *dictionary* is a data type in Python that holds a collection of *key:value* pairs.  The keys must be distinct; the values need not be.

Here is an example:

In [7]:
fav_nums = {'Oscar': 42, 'Sheldon': 73}
print(fav_nums['Oscar']) #needs to be one of the keys
print(fav_nums) 
for person in fav_nums: #looping over a dictionary gives its keys
    print(fav_nums[person])
for x in fav_nums.values(): #everyone's fav number
    print(x)
num_list = list(fav_nums.values())
print(num_list)

42
{'Oscar': 42, 'Sheldon': 73}
42
73
42
73
[42, 73]


Some things to try:
* How could we find the favorite number of someone in the dictionary?
* How could we print out each favorite number?
* How could we get a list of the favorite numbers?
* How could we add a new person's favorite number?
* How could we check whether 496 is anyone's favorite number?
* How could we find the largest favorite number of anyone in the dictionary?

Here are some dictionary methods you might try to use: `.get()`, `.items()`, `.keys()`, `.values()`, `.update()`, `.pop()`.


In [15]:
print(list(fav_nums.items())) #.items()
fav_nums.update({'Gigi':8, 'Tom':21})
print(fav_nums)
fav_nums['Bendi'] = 19
print(fav_nums)
print(42 in fav_nums.values()) #checks whether 42 is one of the values; without .values(), `in` checks the keys and this would be False

[('Oscar', 42), ('Sheldon', 73)]
{'Oscar': 42, 'Sheldon': 73, 'Gigi': 8, 'Tom': 21}
{'Oscar': 42, 'Sheldon': 73, 'Gigi': 8, 'Tom': 21, 'Bendi': 19}
True
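
Two of the methods listed above, `.keys()` and `.pop()`, do not appear in the cell above. Here is a minimal sketch of how they behave, using a small made-up dictionary (`demo_nums` is a hypothetical name, so `fav_nums` is left alone):

```python
# Hypothetical example dictionary, separate from fav_nums
demo_nums = {'Oscar': 42, 'Sheldon': 73, 'Gigi': 8}

# .keys() gives a view of the keys; wrap it in list() to get a list
names = list(demo_nums.keys())
print(names)

# .pop(key) removes the key and returns its value
removed = demo_nums.pop('Gigi')
print(removed)
print(demo_nums)

# With a second argument, .pop() returns that default for a missing key
# instead of raising a KeyError
missing = demo_nums.pop('Nobody', None)
print(missing)
```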


In [23]:
nums = list(fav_nums.values())
nums.sort(reverse=True) #sorts the numbers from highest to lowest
largest = nums[0]
print(largest)
for person in fav_nums:
    if fav_nums[person] == largest: #look up this person's number and compare
        print(person)

73
Sheldon
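
Sorting works, but Python's built-in `max()` may answer the same question more directly. This is a sketch, assuming a dictionary shaped like `fav_nums` (redefined here under the made-up name `sample_nums` so the block runs on its own):

```python
# Hypothetical copy of the favorite-number data
sample_nums = {'Oscar': 42, 'Sheldon': 73, 'Gigi': 8, 'Tom': 21, 'Bendi': 19}

# Largest value in the dictionary
largest = max(sample_nums.values())

# Key with the largest value: max() compares the keys by sample_nums.get(key)
top_person = max(sample_nums, key=sample_nums.get)

print(largest, top_person)
```

If two people share the largest number, `max()` returns only the first one it meets, so a loop over the dictionary is still the way to list them all.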


#### Adding new elements to a dictionary

With lists, we used `.append()` to add a new item to the end of the list.  Or, if we wanted to change the item at position 7 to the value 42, say, we could write `mylist[7] = 42`.

Dictionaries can be updated in a similar way: either with `.update()` or by assigning to a key.

In [None]:
fav_nums.update({"Wes":1})
fav_nums['Edgar'] = 13

print(fav_nums)

##### Caution!

What happens if you use `.update()` but the key you want to add is already in the dictionary?  The old value is overwritten without any warning.  One way to check first is the `.get()` method, which returns the value for the key or, if the key isn't present, `None`.

(You can also pass a second argument to `.get()`, which is returned instead of `None` when the key is missing.)

In [None]:
fav_nums.get('Edgar')
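
To make the caution concrete, here is a small sketch on a throwaway dictionary (`test_nums` and the name `'Lenore'` are made up for this example). It shows `.get()` with and without a default, and `.update()` overwriting an existing key:

```python
# Hypothetical dictionary for experimenting
test_nums = {'Edgar': 13}

# .get() returns the value for a key that exists
print(test_nums.get('Edgar'))

# For a missing key, .get() returns None, or the default if one is given
print(test_nums.get('Lenore'))
print(test_nums.get('Lenore', 0))

# .update() on an existing key replaces the old value without warning
test_nums.update({'Edgar': 7})
print(test_nums)

# To avoid overwriting, only add the key if it is not already present
if 'Edgar' not in test_nums:
    test_nums['Edgar'] = 99
print(test_nums)
```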

### Back to our main question

Let's import the text of the poem (saved as `raven.txt` in this folder).  Print it out to make sure it worked.

In [24]:
with open('raven.txt', 'r') as f:
    text = f.read()

words = text.split()
counter = 0
for word in words:
    if word == "Nevermore":
        counter += 1
print(counter)

Next we should make a list of words.

What about punctuation?  There are lots of different types of punctuation we need to consider.  Luckily, Python has our backs with the `string` module.

In [32]:
import string

print(string.punctuation)

clean_words =[]
for word in words:
    word = word.lower()
    for p in string.punctuation:
        word = word.replace(p, "")
    clean_words.append(word)

#print(clean_words)

counter = 0
for word in clean_words:
    if word == "nevermore":
        counter += 1
print(counter)


Remove all punctuation, and then create a list of words.

Also, make all of them lower-case.

### Counting words in the list

Now we should have a list of words from the poem, not including any whitespace or punctuation, and all in the same case.  How can we count how many times each word appears?

A few options:

* For each word, we can use `.count()` to see how many times it appears and just print this.  
* We could create a list of the number of times each word occurs in the same position as the word in the list of words.  
* We could create a variable for each word and have it store the number of times the word occurs
* Or we can do exactly this, but the right way: with a dictionary.

Let's start working with a smaller example of a string.

In [33]:
import string

lyrics = "We can dance if we want to We can leave your friends behind 'Cause your friends don't dance And if they don't dance Well, they're no friends of mine"
for mark in string.punctuation:
    lyrics = lyrics.replace(mark, '')

word_list = lyrics.lower().split()
print(word_list)



Now we will create a dictionary and start adding words to it.

In [35]:
# start with an empty dictionary
wordcount = {}
# loop over the words in clean_words, adding them to the dictionary.
# Check to see if the word is in the dictionary.  If so, add to its counter.  Otherwise, start a new entry with counter 1.
for word in clean_words:
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
print(wordcount)

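
The same counting pattern can be tried on the shorter lyrics example. The sketch below is self-contained, so it rebuilds `word_list` first; `lyric_counts` is a name chosen here for illustration. It uses `.get(word, 0)` as a compact alternative to the `if`/`else` test.

```python
import string

lyrics = "We can dance if we want to We can leave your friends behind 'Cause your friends don't dance And if they don't dance Well, they're no friends of mine"
for mark in string.punctuation:
    lyrics = lyrics.replace(mark, '')
word_list = lyrics.lower().split()

lyric_counts = {}
for word in word_list:
    # .get(word, 0) is the current count, or 0 if the word is new
    lyric_counts[word] = lyric_counts.get(word, 0) + 1

print(lyric_counts['dance'])
print(lyric_counts['friends'])
```

Every word in `word_list` ends up as a key, and the counts add up to the number of words in the list.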

#### Sorting the dictionary

Dictionaries have no `.sort()` method!  But you can sort lists, and you can create a list out of a dictionary.  The tricky bit is figuring out how to sort that list based not on the words themselves, but on the value (the count) that corresponds to each word in the dictionary.

The idea is this: define a function that returns some value for each key of the dictionary, and sort by these returned values.

In [36]:
results = list(wordcount.items())
print(results)
results.sort()
print(results)

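
Calling `.sort()` with no arguments orders the (word, count) pairs alphabetically by word. To sort by count instead, one option is to pass a function as the `key` argument. Here is a sketch on a small hand-made dictionary (`sample_counts` and `get_count` are names invented for this example, and the counts are made up, not taken from the poem):

```python
# Hypothetical counts, standing in for wordcount
sample_counts = {'tea': 5, 'cup': 2, 'spoon': 3, 'milk': 1}

def get_count(pair):
    # pair is a (word, count) tuple; return the count to sort by
    return pair[1]

results = list(sample_counts.items())
results.sort(key=get_count, reverse=True)  # highest count first
print(results)

# The first pair now holds the most frequent word
most_common_word, most_common_count = results[0]
print(most_common_word, most_common_count)
```

Replacing `sample_counts` with `wordcount` should give the most frequent word in the poem as the first entry of `results`.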