# Intro to Jupyter Notebooks

This is a notebook. Notebooks mix markdown text (like this cell) with code.

This lets you explain what you are doing in context.

For example, in the next cell, I'm going to just make sure that my Python kernel is working. Press `shift` + `enter` to run a cell and advance to the next one.

In [2]:
print("Hello, world!")

Hello, world!


That should have printed out "Hello, world!" If that didn't work, then raise your hand - something might be broken.

The one tricky thing to remember about Jupyter notebooks when compared to a .py file is that they run cells in the order that you choose to run them. For example, let's create a variable:

In [3]:
x = 1

Now, I can choose to run either of the following cells, in whichever order I want, as many times as I want.

In [None]:
x = x * 3

In [15]:
x = x - 1

Depending on what I do, the result of the following cell will change.

In [16]:
print(x)

-9


As you can imagine, this can cause problems, so be careful!

The best practice is to keep cells in order and to make sure that running every cell from top to bottom always gives the same output.
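To see why order matters, here is a small sketch that imitates the two cells above with plain functions and runs them in two different orders. The function names are hypothetical, made up only for this illustration.

```python
# Hypothetical stand-ins for the two cells above
def times_three(x):
    return x * 3

def minus_one(x):
    return x - 1

# Order A: multiply first, then subtract
x = 1
x = times_three(x)
x = minus_one(x)
result_a = x
print(result_a)  # 2

# Order B: subtract first, then multiply
x = 1
x = minus_one(x)
x = times_three(x)
result_b = x
print(result_b)  # 0
```

Same starting value, same two steps, different answers. In a notebook this happens silently whenever cells are run out of order.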

## Introduction to Dictionaries

Dictionaries are sort of like lists, except that we access them with a key, rather than with the index. A key can be a number of different objects: a string, a number, or even a tuple (which we will talk about in a moment).

Dictionaries are written inside "curly braces" -- {} -- and each key is separated from its value by a colon.

The following creates a new dictionary, and then shows how to add or edit entries.

In [11]:
basketball_wins = {'Purdue': 5,
                   'IU': 2,
                   'Northwestern': 0}

# To add a new entry
basketball_wins['Michigan'] = 5

# The same syntax updates an existing entry
basketball_wins['Purdue'] = 6

print(basketball_wins)

{'Purdue': 6, 'IU': 2, 'Northwestern': 0, 'Michigan': 5}


Note that the dictionary prints in the order the items were added: since Python 3.7, dictionaries keep insertion order (in older versions the order could differ). Unlike lists, though, dictionaries are not indexed by position, so you can't access an item in a dictionary by an index number.

Rather, you access the data associated with a key by entering the name of the key.

In [12]:
basketball_wins['Purdue']

6

In [14]:
# But you get a KeyError if the key doesn't exist

basketball_wins['Wisconsin']

KeyError: 'Wisconsin'
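If you aren't sure whether a key exists, two common ways to avoid the KeyError are the `in` operator and the dictionary's `get` method. A minimal sketch, using a fresh copy of the dictionary from above:

```python
basketball_wins = {'Purdue': 6, 'IU': 2, 'Northwestern': 0, 'Michigan': 5}

# `in` checks whether a key exists
has_wisconsin = 'Wisconsin' in basketball_wins
print(has_wisconsin)

# get() returns a default (here 0) instead of raising a KeyError
wisconsin_wins = basketball_wins.get('Wisconsin', 0)
purdue_wins = basketball_wins.get('Purdue', 0)
print(wisconsin_wins, purdue_wins)
```

Note that `get` does not add the missing key to the dictionary; it only returns the default.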

While the keys must be unique, the values can change. The following code takes in a string and counts the letters in it.

In [3]:
string = """
I have been one acquainted with the night.
I have walked out in rain—and back in rain.
I have outwalked the furthest city light.

I have looked down the saddest city lane.
I have passed by the watchman on his beat
And dropped my eyes, unwilling to explain.

I have stood still and stopped the sound of feet
When far away an interrupted cry
Came over houses from another street,

But not to call me back or say good-bye;
And further still at an unearthly height,
One luminary clock against the sky

Proclaimed the time was neither wrong nor right. 
I have been one acquainted with the night.
"""
string = string.lower()
letter_dict = {}
for letter in string:
    if letter in ['\n',' ']:
        continue
    if letter in letter_dict:
        letter_dict[letter] = letter_dict[letter] + 1
    else:
        letter_dict[letter] = 1
        
print(letter_dict)


{'i': 34, 'h': 32, 'a': 44, 'v': 8, 'e': 57, 'b': 8, 'n': 37, 'o': 29, 'c': 13, 'q': 2, 'u': 13, 't': 46, 'd': 21, 'w': 11, 'g': 9, '.': 7, 'l': 18, 'k': 7, 'r': 22, '—': 1, 'f': 6, 's': 19, 'y': 12, 'p': 8, 'm': 8, ',': 3, 'x': 1, '-': 1, ';': 1}
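The if/else bookkeeping above is worth writing by hand once, but the standard library's `collections.Counter` can do the same counting. This is a sketch on a short sample string (made up for brevity, not the full poem), offered as an alternative rather than a replacement for the loop above.

```python
from collections import Counter

sample = "I have been one"  # short sample string, not the full poem

# Count every character except spaces and newlines
letter_counts = Counter(ch for ch in sample.lower() if ch not in ('\n', ' '))

print(letter_counts['e'])
print(letter_counts.most_common(1))
```

A `Counter` behaves like a dictionary, except that looking up a letter it has never seen returns 0 instead of raising a KeyError.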


**Exercise:** See if you can modify the code above to count how often each word appears instead.

In [14]:
string = """
I have been one acquainted with the night.
I have walked out in rain—and back in rain.
I have outwalked the furthest city light.

I have looked down the saddest city lane.
I have passed by the watchman on his beat
And dropped my eyes, unwilling to explain.

I have stood still and stopped the sound of feet
When far away an interrupted cry
Came over houses from another street,

But not to call me back or say good-bye;
And further still at an unearthly height,
One luminary clock against the sky

Proclaimed the time was neither wrong nor right. 
I have been one acquainted with the night.
"""
string.split()

['I',
 'have',
 'been',
 'one',
 'acquainted',
 'with',
 'the',
 'night.',
 'I',
 'have',
 'walked',
 'out',
 'in',
 'rain—and',
 'back',
 'in',
 'rain.',
 'I',
 'have',
 'outwalked',
 'the',
 'furthest',
 'city',
 'light.',
 'I',
 'have',
 'looked',
 'down',
 'the',
 'saddest',
 'city',
 'lane.',
 'I',
 'have',
 'passed',
 'by',
 'the',
 'watchman',
 'on',
 'his',
 'beat',
 'And',
 'dropped',
 'my',
 'eyes,',
 'unwilling',
 'to',
 'explain.',
 'I',
 'have',
 'stood',
 'still',
 'and',
 'stopped',
 'the',
 'sound',
 'of',
 'feet',
 'When',
 'far',
 'away',
 'an',
 'interrupted',
 'cry',
 'Came',
 'over',
 'houses',
 'from',
 'another',
 'street,',
 'But',
 'not',
 'to',
 'call',
 'me',
 'back',
 'or',
 'say',
 'good-bye;',
 'And',
 'further',
 'still',
 'at',
 'an',
 'unearthly',
 'height,',
 'One',
 'luminary',
 'clock',
 'against',
 'the',
 'sky',
 'Proclaimed',
 'the',
 'time',
 'was',
 'neither',
 'wrong',
 'nor',
 'right.',
 'I',
 'have',
 'been',
 'one',
 'acquainted',
 'with',
 

In [12]:
## Your code here


## Tuples

Tuples are very similar to lists. They are created with parentheses -- () -- rather than with square brackets. 

However, tuples are "immutable", which means that they cannot be changed.

In [8]:
my_tuple = (4,13,'hello')

Like lists, items in a tuple can be accessed by indexing.

In [10]:
my_tuple[1]

13

Because tuples can't be changed, list methods like "append" and "pop" won't work on them.
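A short sketch of what that looks like in practice, catching the errors so that the cell keeps running:

```python
my_tuple = (4, 13, 'hello')

# Tuples have no append method, so this raises an AttributeError
try:
    my_tuple.append(5)
except AttributeError as err:
    append_error = type(err).__name__
    print(append_error)

# Assigning to a position raises a TypeError
try:
    my_tuple[0] = 99
except TypeError as err:
    assign_error = type(err).__name__
    print(assign_error)

# To "change" a tuple, build a new one instead
longer_tuple = my_tuple + (5,)
print(longer_tuple)
```

The original `my_tuple` is untouched; `longer_tuple` is a brand new tuple.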

This immutability is (for complicated reasons) an important attribute of dictionary keys, and tuples are often used in dictionaries. For example, let's say you wanted to store the population of cities in the US. You might create a dictionary like this:

In [1]:
population_dict = {('Georgia', 'Atlanta'): 498000,
                   ('Illinois', 'Atlanta'): 1692,
                   ('Illinois', 'Chicago'): 2750000}
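To look something up, pass the whole tuple as the key. The sketch below also tries a list as a key, to show that Python rejects mutable keys; the 'Peoria' entry is a made-up example and is never actually stored.

```python
population_dict = {('Georgia', 'Atlanta'): 498000,
                   ('Illinois', 'Atlanta'): 1692,
                   ('Illinois', 'Chicago'): 2750000}

# Look up a value by passing the whole tuple as the key
illinois_atlanta = population_dict[('Illinois', 'Atlanta')]
print(illinois_atlanta)

# A list is mutable, so Python refuses to use it as a key
list_key_failed = False
try:
    population_dict[['Illinois', 'Peoria']] = 0
except TypeError as err:
    print('TypeError:', err)
    list_key_failed = True
```

Using the state as part of the key is what keeps the two Atlantas apart.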

In [3]:
d = {'a': 1,
    'b': 3}

In [6]:
d_list = list(d.items())

In [8]:
d_list[0]

('a', 1)

In [10]:
x, y = [3,4]

In [11]:
x

3
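The cells above show two pieces that are often used together: `items()` gives each entry as a `(key, value)` tuple, and Python can unpack a sequence into several names at once. A small sketch combining the two in a loop:

```python
d = {'a': 1, 'b': 3}

pairs = []
# Each item is a (key, value) tuple, unpacked here into two names
for key, value in d.items():
    print(key, value)
    pairs.append((key, value))

# The same unpacking works on a tuple like the keys of population_dict
state, city = ('Illinois', 'Chicago')
print(state)
```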

The following code takes a CSV table of city populations from https://simplemaps.com/data/us-cities and converts it into a dictionary that looks like the one above.

To run this code, you will need to download the file from Brightspace (or the site above) and put it into the same directory as this file.

In [41]:
import csv

population_dict = {}
# Open the file for reading; the with-block closes it for us when we're done
with open('./uscities.csv', 'r', newline='') as f:
    f_csv = csv.reader(f)
    next(f_csv) # This just skips the header row, so it isn't in our data
    for row in f_csv:
        # To get these numbers, I just opened the CSV file and looked at which columns had this data
        city = row[0]
        state = row[3]
        population = int(float(row[10])) # A few of the populations are recorded as floats, so we have to convert them
        if (state, city) in population_dict: # Check for the same city twice in the same state
            print(state, city)
        else:
            population_dict[(state, city)] = population

# This code prints the first few items in the dictionary, to make sure it looks right
print(list(population_dict.items())[:5])

[(('Washington', 'South Creek'), 2500), (('Washington', 'Roslyn'), 947), (('Washington', 'Sprague'), 441), (('Washington', 'Gig Harbor'), 9507), (('Washington', 'Lake Cassidy'), 3591)]


It looks right, so let's press on.

By using tuples as keys, you can do things like summarize by either entry in the tuple.

In [42]:
state_populations = {}
for key in population_dict: # Each key is a (state, city) tuple
    state = key[0] # Extract the state from the key
    city_pop = population_dict[key] # Look up the population for this key
    try: # If the key exists, then add the population
        state_populations[state] += city_pop
    except KeyError: # Otherwise set the value to the population
        state_populations[state] = city_pop
    
print(state_populations)

{'Washington': 10134988, 'Delaware': 557808, 'District of Columbia': 5289420, 'Wisconsin': 5936025, 'West Virginia': 1375886, 'Hawaii': 1839050, 'Florida': 30250154, 'Wyoming': 487683, 'New Hampshire': 799613, 'New Jersey': 5697038, 'New Mexico': 2102495, 'Texas': 33020438, 'Louisiana': 4894945, 'Alaska': 695791, 'North Carolina': 9203228, 'North Dakota': 727244, 'Nebraska': 1885071, 'Tennessee': 6065103, 'New York': 37847876, 'Pennsylvania': 15641201, 'Rhode Island': 1653103, 'Nevada': 4434080, 'Virginia': 8197045, 'Colorado': 7499152, 'California': 59608877, 'Alabama': 4299420, 'Arkansas': 2608972, 'Vermont': 258863, 'Illinois': 18266059, 'Georgia': 10750970, 'Indiana': 6161106, 'Iowa': 3167941, 'Massachusetts': 8972816, 'Arizona': 9581518, 'Idaho': 1632041, 'Connecticut': 4072787, 'Maine': 616831, 'Maryland': 6880970, 'Oklahoma': 3657380, 'Ohio': 13764673, 'Utah': 4854868, 'Missouri': 7463356, 'Minnesota': 7275566, 'Michigan': 10429997, 'Kansas': 2586907, 'Montana': 816367, 'Mississ
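The try/except pattern above is one way to build running totals. Another common option is `get` with a default of 0. This sketch uses the three-city dictionary from earlier in this notebook rather than the full CSV, so the totals are only for those three cities.

```python
population_dict = {('Georgia', 'Atlanta'): 498000,
                   ('Illinois', 'Atlanta'): 1692,
                   ('Illinois', 'Chicago'): 2750000}

state_populations = {}
# Unpack each key into state and city, and each value into city_pop
for (state, city), city_pop in population_dict.items():
    # Start from 0 if this state hasn't been seen yet
    state_populations[state] = state_populations.get(state, 0) + city_pop

print(state_populations)
```

Both versions produce the same dictionary; which one reads better is mostly a matter of taste.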

**Exercise:** Reuse and modify the code above so that it prints a dictionary of the population for each letter that a city starts with.

E.g., {'a':4539, 'b': 489399, ...}

In [13]:
# Your code here