# Agenda: Day 3 (Dictionaries and files)

1. Q&A
2. Dictionaries
    - What are dictionaries?
    - Defining / retrieving
    - Modifying dictionaries
    - Accumulating using dictionaries
    - Accumulating the unknown
    - Looping over dicts
    - How do dictionaries work?
3. Files
    - What does it mean to work with files?
    - Reading from files
    - Turning files into data structures
    - Writing to files, as well (using `with`)


In [1]:
# You can run both mylist += [80, 90, 100] and mylist += (50, 60, 70), and they work
# in the same way. But you cannot run mylist.append['abc'] or
# mylist.append[40]. Why?

In [2]:
mylist = [10, 20, 30]

# when we use += on a list, += looks to its right
# and runs a "for" loop on what it sees.  Each element we
# get in that loop is appended to mylist
mylist += [80, 90, 100]

In [3]:
# the above is basically the same as saying

mylist = [10, 20, 30]

for one_item in [80, 90, 100]:
    mylist.append(one_item)

mylist

[10, 20, 30, 80, 90, 100]

In [4]:
# let's try this with a tuple!

mylist = [10, 20, 30]

# += doesn't care what type we have on the right side, as long as it's iterable!
# It will run a for loop on whatever we give it.
mylist += (80, 90, 100)   

mylist

[10, 20, 30, 80, 90, 100]
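Since `+=` on a list runs a `for` loop on whatever is on its right, any iterable should work, including a string, which contributes one character at a time. Here's a minimal sketch of that idea; it also shows what happens with something that can't be iterated over, such as an integer:

```python
mylist = [10, 20, 30]

# a string is iterable, so += appends each character separately
mylist += 'abc'
print(mylist)            # [10, 20, 30, 'a', 'b', 'c']

# an integer is not iterable, so += raises TypeError and mylist is unchanged
try:
    mylist += 5
except TypeError as e:
    print(f'TypeError: {e}')

print(mylist)            # [10, 20, 30, 'a', 'b', 'c']
```

If the goal is to add the string as a single element, `mylist.append('abc')` is the tool for that.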

In [5]:
# why can't we run mylist.append['abc'] or mylist.append[40]?

# append is a method
# we need to use () to invoke it
# whatever is in the () is appended

mylist.append('abc')
mylist

[10, 20, 30, 80, 90, 100, 'abc']

In [6]:
# if you were to say

mylist.append['abc']   

# the above means (to Python): Find mylist, and grab its "append" method. Then, on the method,
# retrieve whatever is at index 'abc'


TypeError: 'builtin_function_or_method' object is not subscriptable

In [7]:
mylist.append(40)
mylist

[10, 20, 30, 80, 90, 100, 'abc', 40]
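To see why the square brackets fail, it can help to look at `mylist.append` on its own. Without parentheses, we get the method object itself, and nothing is appended. The sketch below (the variable `method` is just a name chosen for illustration) catches the error so that the rest of the code can run:

```python
mylist = [10, 20, 30]

# without parentheses, we get the method itself; nothing is appended
method = mylist.append
print(callable(method))     # True

# square brackets ask the method object for an item, which it cannot provide
try:
    mylist.append['abc']
except TypeError as e:
    print(f'TypeError: {e}')

# parentheses actually call the method
mylist.append('abc')
print(mylist)               # [10, 20, 30, 'abc']
```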

In [None]:
## From last session
# In Google Colab, I ran the program to get a number from the user as input and display it
# 1 * 10 ** 3
# 2 * 10 ** 2
# 3 * 10 ** 1

# I didn't have to subtract 1 from the index. When I ran

user_input = input(' enter a number :  ').strip()
for index,one_char in enumerate(user_input):
    print(f'{one_char} * 10 ** {len(user_input) - index}')

# output is
# 1 * 10 ** 3
# 2 * 10 ** 2
# 3 * 10 ** 1

# vs. if I do index - 1
# output is
# 1 * 10 ** 2
# 2 * 10 ** 1
# 3 * 10 ** 0

In [None]:
user_input = input(' enter a number :  ').strip()
for index,one_char in enumerate(user_input):
    print(f'{one_char} * 10 ** {len(user_input) - index}')


In [2]:
user_input = '4321'   # this is 4*10**3  3*10**2  2*10**1   1*10**0

for index,one_char in enumerate(user_input):
    print(f'{one_char} * 10 ** {len(user_input) - index - 1}')


4 * 10 ** 3
3 * 10 ** 2
2 * 10 ** 1
1 * 10 ** 0
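One way to check which exponent is right is to add up the terms and compare the sum with the original number. In this sketch (the names `total` and `exponent` are added for illustration), using `len(user_input) - index - 1` gives back 4321; without the `- 1`, every term would be ten times too large.

```python
user_input = '4321'

total = 0
for index, one_char in enumerate(user_input):
    exponent = len(user_input) - index - 1   # the rightmost digit gets exponent 0
    total += int(one_char) * 10 ** exponent

print(total)                      # 4321
print(total == int(user_input))   # True
```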


# Dictionaries

Dictionaries (aka "dicts" in the Python world) are the most important data structure in Python. They aren't unique to Python! Many other languages have a similar data structure, often called:

- hash tables
- hashes
- hash maps
- maps
- key-value stores
- name-value stores
- associative arrays

The basic idea behind a dictionary is as follows:

When we store things in a list, we determine the value, but Python normally determines the index. If we want, we can replace a value at a given index, but normally, we're just adding to the end of a list.

That's fine, except that the index numbers don't really have any meaning. What if I want to have a list of people in my company? Wouldn't it be better if I could store them not at arbitrary indexes (0, 1, 30, 256) but rather at indexes corresponding to their user IDs?

In other words: I'd like to determine not just the values, but also the indexes we use to store those values.

If I'm already wishing, I'd like to use not just integers, but also strings as indexes. Those could really come in handy.

What I've just described is (basically) a dictionary!

- We decide on both the keys (what we call the indexes in a dict) and the values
- The keys can be any immutable value (typically integers and strings, but also floats and tuples, so long as the tuple itself contains only immutable values)
- The values can be absolutely anything at all
- Each key must be unique in a dict, so each key can appear at most once

Dicts are incredibly efficient and fast, as well as convenient. They're a big win for everyone.
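Here's a short sketch of those rules, using the curly-brace syntax for defining dicts. All of the names and data (`users`, `ages`, `points`) are made up for illustration:

```python
# keys can be integers (here, hypothetical user IDs)
users = {101: 'alice', 205: 'bob'}
print(users[205])          # bob

# keys can be strings
ages = {'alice': 30, 'bob': 25}
print(ages['alice'])       # 30

# keys can be tuples, too (here, made-up x-y coordinates)
points = {(0, 0): 'origin', (1, 2): 'somewhere else'}
print(points[(1, 2)])      # somewhere else

# a list is mutable, so it cannot be used as a key
try:
    bad = {[1, 2]: 'nope'}
except TypeError as e:
    print(f'TypeError: {e}')
```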

# Defining dicts

- We define a dictionary using `{}` (curly braces)
- Each key-value pair is defined with a colon separating the key from the value
- The pairs are separated from one another using commas.

In [3]:
d = {'a':10, 'b':20, 'c':30}

type(d)

dict

In [4]:
# how many pairs are in this dict?

len(d) 

3

In [5]:
# what if I want to retrieve a value via a key?
d['a']   # put the key in square brackets

10

In [6]:
# what if I try to retrieve a key that doesn't exist?
d['x']

KeyError: 'x'

In [7]:
# how can I know if a key is in a dict?
# we can ask with "in"
# "in" *only* searches in the keys; it ignores the values

'a' in d

True

In [8]:
10 in d

False
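Checking with `in` before retrieving is one way to avoid a `KeyError`. The built-in `dict.get` method is another option: it returns `None` (or a default that we pass) when the key is missing. A short sketch of both approaches:

```python
d = {'a': 10, 'b': 20, 'c': 30}

# check before retrieving, so we never trigger KeyError
for key in ['a', 'x']:
    if key in d:
        print(f'{key} -> {d[key]}')
    else:
        print(f'{key} is not a key in d')

# dict.get returns None for a missing key, rather than raising an exception
print(d.get('x'))        # None
print(d.get('x', 0))     # 0, the default we passed
print(d.get('a', 0))     # 10, because 'a' is a key
```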

In [9]:
# we can use variables

s = 'a'
d[s]   # this first evaluates s, getting back 'a', and then evaluates d['a']

10

In [10]:
# we saw that we can get a value via a key.
# can we do the opposite? The answer: no

# dicts are one way, from keys to values
# we could search through a dict, pair by pair, for a value and get its corresponding key
# but if you're doing that, you're almost certainly doing something wrong

# remember: keys are unique, but values aren't (or don't have to be)
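For the rare case in which we really do need to go from a value to its keys, the search can be sketched as a loop over the pairs. Because values don't have to be unique, the result is a list of keys rather than a single key. The dict here is made up for illustration, and the loop uses the `items` method, which returns one key-value pair with each iteration:

```python
d = {'a': 10, 'b': 20, 'c': 10}

matching_keys = []
for key, value in d.items():   # each iteration gives one (key, value) pair
    if value == 10:
        matching_keys.append(key)

print(matching_keys)     # ['a', 'c']
```

Unlike retrieving via a key, this has to look at every pair in the dict, which is part of why it's usually a sign that the keys and values should be the other way around.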

# Exercise: Restaurant

1. Define a dict, `menu`, representing a menu in a restaurant. The keys will be the menu items (strings), and the values will be the prices of those items (integers).
2. Set a variable, `total`, to be 0.
3. Ask the user repeatedly to order something.
    - If they give us the empty string, stop asking and print the total.
    - If they give us the name of something on the menu (i.e., a key in our `menu` dict), then add the price to the total, and print the item, price, and new total.
    - If they give us the name of something *not* on the menu, then scold them lightly.
4. Print `total`.

Example:

    Order: sandwich
    sandwich costs 10, total is 10
    Order: tea
    tea costs 5, total is 15
    Order: elephant
    we are fresh out of elephant today!
    Order: [ENTER]
    Your total is 15

In [11]:
menu = {'sandwich':10, 'tea':5, 'cookie':3, 'apple':2}
total = 0

while True:
    s = input('Order: ').strip()

    if s == '':   # empty string? break out of the loop
        break

    # is the user's input a key in our dict?
    if s in menu:
        price = menu[s]   # get the price for the user's order
        total += price    # increase total by this price
        print(f'{s} costs {price}; total is now {total}')

    else:
        print(f'Sorry, we are fresh out of {s} today.')

print(total)

Order:  sandwich


sandwich costs 10; total is now 10


Order:  tea


tea costs 5; total is now 15


Order:  elephant


Sorry, we are fresh out of elephant today.


Order:  


15


# When would I use a dict this way?

If you have key-value associations that might make sense throughout your program, you might well do this:

- Month names and month numbers
- Month numbers and month names
- User IDs and usernames
- Usernames and further data
- IP address and computer names
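For instance, the first two items on that list might look like this. It's only a sketch, with three months to keep it short, and the variable names are invented for illustration:

```python
# month names -> month numbers
month_numbers = {'Jan': 1, 'Feb': 2, 'Mar': 3}

# month numbers -> month names
month_names = {1: 'Jan', 2: 'Feb', 3: 'Mar'}

print(month_numbers['Feb'])   # 2
print(month_names[3])         # Mar
```

Since dicts only go from keys to values, having two dicts, one for each direction, is a reasonable way to support both kinds of lookup.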

In [12]:
menu = {'sandwich':10.50, 'tea':5.75, 'cookie':3.25, 'apple':2.99}
total = 0

while True:
    s = input('Order: ').strip()

    if s == '':   # empty string? break out of the loop
        break

    # is the user's input a key in our dict?
    if s in menu:
        price = menu[s]   # get the price for the user's order
        total += price    # increase total by this price
        print(f'{s} costs {price}; total is now {total}')

    else:
        print(f'Sorry, we are fresh out of {s} today.')

print(total)

Order:  apple


apple costs 2.99; total is now 2.99


Order:  sandwich


sandwich costs 10.5; total is now 13.49


Order:  


13.49
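A word of caution when prices are floats: floats can't represent every decimal fraction exactly, so a running total may pick up tiny rounding errors. The total above came out cleanly, but that isn't guaranteed. This sketch shows the classic example, along with two common ways of coping; the `menu_cents` dict is a hypothetical variation on the menu, not part of the exercise:

```python
total = 0.1 + 0.2
print(total)               # 0.30000000000000004
print(total == 0.3)        # False

# option 1: format with two digits after the decimal point when printing
print(f'{total:.2f}')      # 0.30

# option 2: store prices as integer cents, and divide only when displaying
menu_cents = {'sandwich': 1050, 'apple': 299}
total_cents = menu_cents['sandwich'] + menu_cents['apple']
print(f'{total_cents / 100:.2f}')    # 13.49
```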


# Mutable dictionaries

We've seen that some data structures in Python are immutable (e.g., `int`, `float`, `str`, and `tuple`). But we know that others are mutable (e.g., `list`). Which is true for dictionaries?

Answer: Dicts are mutable. They can be changed:

- We can modify existing key-value pairs
- We can add new key-value pairs
- We can remove existing key-value pairs

Every key needs to have a value, and every value needs to have a key. So there's no such thing as removing a key but keeping its value around, or removing a value but keeping its key around.

In [13]:
d = {'a':10, 'b':20, 'c':30}

# let's update the value associated with 'b'
d['b'] = 25   # assigning to an existing key updates the value for that key
d

{'a': 10, 'b': 25, 'c': 30}

In [14]:
# I can even use += with an existing dict value
d['b'] += 1  # this will add 1 to the existing value

d

{'a': 10, 'b': 26, 'c': 30}

In [15]:
# what happens if I try to update a value for a key that doesn't exist?

d['x'] += 1

KeyError: 'x'

In [16]:
# what if I want to add a new key-value pair?
# we know, from lists, that we can use "append"
# BUT NOT IN DICTS!

# In dictionaries, to add a new key-value pair, all we do is assign
d['z'] = 1234

# it looks just like updating! But it's really adding a new key-value pair

In [17]:
d

{'a': 10, 'b': 26, 'c': 30, 'z': 1234}

In [18]:
d['a', 'b'] += 1    # no, this doesn't work

KeyError: ('a', 'b')
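Why does the error mention a tuple? Inside the square brackets, Python treats `'a', 'b'` as a single key, the tuple `('a', 'b')`, and no such key exists in `d`. If the goal is to update the values for several keys, one option is a loop, sketched here:

```python
d = {'a': 10, 'b': 26, 'c': 30}

# update each key separately
for key in ['a', 'b']:
    d[key] += 1

print(d)    # {'a': 11, 'b': 27, 'c': 30}

# a tuple really can be a key, if we assign to it explicitly
d['a', 'b'] = 100
print(d[('a', 'b')])    # 100
```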

In [19]:
# Removing key-value pairs
# we can use the "pop" method to remove a key-value pair -- specify the key, and the value is returned + removed

d.pop('z')

1234

In [20]:
d

{'a': 10, 'b': 26, 'c': 30}
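What if the key we pass to `pop` isn't in the dict? By default, we get a `KeyError`, just as with retrieval. But `pop` also accepts an optional second argument, a default value to return when the key is missing. A brief sketch:

```python
d = {'a': 10, 'b': 26, 'c': 30}

# popping a missing key raises KeyError
try:
    d.pop('z')
except KeyError as e:
    print(f'KeyError: {e}')     # KeyError: 'z'

# with a default, a missing key is not an error
result = d.pop('z', None)
print(result)    # None

# with an existing key, the default is ignored
removed = d.pop('c', None)
print(removed)   # 30
print(d)         # {'a': 10, 'b': 26}
```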

In [21]:
# can we add multiple key-value pairs to a dict?
# yes -- we can do that by defining a second dict, and using the | (union) operator (Python 3.9 and later)

d

{'a': 10, 'b': 26, 'c': 30}

In [22]:
new_stuff = {'c':77, 'd':88, 'e':99}

d | new_stuff    # this returns a new dict -- doesn't affect either d or new_stuff

{'a': 10, 'b': 26, 'c': 77, 'd': 88, 'e': 99}
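Because `|` returns a new dict, `d` itself still has only three pairs. If we want to change `d` in place, one option is the `dict.update` method, sketched here next to `|` for comparison (the variable `merged` is a name chosen for illustration). Either way, when a key appears in both dicts, the value from the dict on the right wins:

```python
d = {'a': 10, 'b': 26, 'c': 30}
new_stuff = {'c': 77, 'd': 88, 'e': 99}

merged = d | new_stuff       # a new dict; d is untouched (Python 3.9+)
print(d)                     # {'a': 10, 'b': 26, 'c': 30}

d.update(new_stuff)          # modifies d in place
print(d)                     # {'a': 10, 'b': 26, 'c': 77, 'd': 88, 'e': 99}
print(d == merged)           # True
```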

# Next up

- Accumulating in dicts (known keys)
- Accumulating in dicts (unknown keys)

We've seen that we can define a dict and then use it as a read-only database inside of a program.  But we can also use it as a read-write data structure. 

One of the most common ways to use a dict in this way is to define it at the start of the program with some keys and some initial values, such as 0. Then, as the program proceeds, you add to the values.

In other words: The keys don't change (no new ones, no removals) but the values do.



In [23]:
counts = {'a':0, 'e':0, 'i':0, 'o':0, 'u':0}

s = 'This is a bunch of letters for my course'

for one_character in s:
    if one_character in counts:      # if the current character is a key in our dict
        counts[one_character] += 1   # add 1 to the value!

print(counts)

{'a': 1, 'e': 3, 'i': 2, 'o': 3, 'u': 2}


# Exercise: Vowels, digits, and others (dict edition)

1. Define a dict in which the keys are `vowels`, `digits`, and `others`, and all values are 0.
2. Ask the user, repeatedly, to enter a string.
    - If they enter an empty string, stop asking
3. Go through each character in the string they gave you
    - If the character is a vowel, add 1 to `vowels`
    - If the character is a digit, add 1 to `digits`
    - Otherwise, add 1 to `others`
4. In the end, print the dict with the counts.

In [24]:
counts = {'vowels':0,
          'digits':0,
          'others':0}

while True:
    s = input('Enter text: ').strip()
 
    if s == '':    # if we got the empty string, exit the loop
        break

    for one_character in s:
        if one_character in 'aeiou':
            counts['vowels'] += 1
        elif one_character.isdigit():
            counts['digits'] += 1
        else:
            counts['others'] += 1

print(counts)

Enter text:  hello!! 123
Enter text:  what about now?!? 456
Enter text:  


{'vowels': 7, 'digits': 6, 'others': 19}


In [None]:
# AB

dicto={'vowels':0,'digits':0,'others':0}
while True:
    s = input("Enter a string? ").strip().lower()
    if s == '':
        break
    for one_character in s:
        if one_character in 'aeoiuy':
            dicto['vowels'] += 1
        elif one_character.isdigit():
            dicto['digits'] += 1
        else:
            dicto['others'] += 1
print(dicto)

In [25]:
# MK
# can we define a dict with a string of vowels as a value?

d = {'vowels':'aeiou'}

d['vowels']

'aeiou'

In [26]:
vowels = 'aeiou'

# Loops and dicts

We know that we can iterate over a number of different data structures:

- string, we get one character at a time
- list, we get one element at a time
- tuple, we get one element at a time

This raises an obvious question: Can we iterate over dicts? If so, what do we get?

In [27]:
d = {'a':10, 'b':20, 'c':30}

for one_item in d:
    print(one_item)

a
b
c


In [28]:
# iterating over a dict gives us the dictionary's keys
# we can use this to iterate over the dict

for one_key in d:
    print(f'{one_key}: {d[one_key]}')

a: 10
b: 20
c: 30


In [29]:
# some people discover that there is a "keys" method for dicts
# what am I going to get from this?

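# exactly the same result as iterating over d
# **EXCEPT** that it's an unnecessary extra method call; in Python 3, keys() returns
# a view of the keys, so there's no benefit over iterating on d directly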

for one_key in d.keys():
    print(f'{one_key}: {d[one_key]}')

a: 10
b: 20
c: 30


In [30]:
# there is also a dict.values method
# that returns all of the values in a special data structure that's sort of (but not really) a list

d.values()

dict_values([10, 20, 30])

In [31]:
# if you want to search or iterate over the values, you can use this
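Here's a sketch of a few things we might do with `d.values()`: search in it with `in`, add up the values with the built-in `sum`, iterate over it, and (because it isn't really a list) convert it to a list if we need to retrieve by index. The name `values_list` is chosen for illustration:

```python
d = {'a': 10, 'b': 20, 'c': 30}

# "in" on the values, rather than on the keys
print(20 in d.values())    # True
print('a' in d.values())   # False

# add up all of the values
print(sum(d.values()))     # 60

# iterate over the values
for one_value in d.values():
    print(one_value)

# d.values() doesn't support indexes, but a list built from it does
values_list = list(d.values())
print(values_list[0])      # 10
```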

In [33]:
# there is a better way to iterate over a dict
# the dict.items method returns one key-value pair for each iteration

for t in d.items():  # get a (key, value) tuple with each iteration
    key, value = t   # unpacking to get the key and value 
    print(f'{key}: {value}')

a: 10
b: 20
c: 30


In [34]:
# we can use unpacking directly in the for loop!

for key, value in d.items():  # get a (key, value) tuple with each iteration
    print(f'{key}: {value}')

a: 10
b: 20
c: 30


# Paradigm 3 for dict usage

- In paradigm 1, we define a dict and treat it as a read-only database, never changing it
- In paradigm 2, we define a dict and modify the values, but not the keys (no new ones, no removal)
- In paradigm 3, we start with an *empty* dict, adding new keys as needed and updating values as needed

Paradigm 3 is perfect for when you don't know what keys or values you'll get, but you know what to do with them when you get them.

Imagine that you'll get user IDs and names for users on your system. You don't know in advance all of the users' names and IDs. But you can still define a dict and expect that when new users show up, you'll add a new pair to the dict.
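That scenario can be sketched as follows. The user IDs and names are invented, and the list `new_arrivals` stands in for wherever the data would really come from:

```python
users = {}   # start with an empty dict

# hypothetical (user ID, name) pairs, arriving one at a time
new_arrivals = [(1001, 'alice'), (1002, 'bob'), (1001, 'alice_a')]

for user_id, name in new_arrivals:
    if user_id in users:
        print(f'Updating {user_id}: {users[user_id]} -> {name}')
    else:
        print(f'Adding {user_id}: {name}')

    users[user_id] = name     # assignment either adds a pair or updates one

print(users)   # {1001: 'alice_a', 1002: 'bob'}
```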

In [35]:
# counting characters
# what if I want to count *all* of the characters in a string?
# do I really want to define a new dict with all characters in its keys?
# instead, I'll start with an empty dict, and add new keys (characters) as we encounter them.

counts = {}

s = 'This is another amazing sentence that I can use in my Python course'

for one_character in s:
    if one_character in counts:
        counts[one_character] += 1   # add 1 to the count if we've seen it before
    else:
        counts[one_character] = 1    # otherwise, start the count with 1

for key, value in counts.items():
    print(f'{key}: {value}')

T: 1
h: 4
i: 4
s: 5
 : 12
a: 5
n: 7
o: 3
t: 5
e: 6
r: 2
m: 2
z: 1
g: 1
c: 3
I: 1
u: 2
y: 2
P: 1
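The `if`/`else` in that loop can be shortened with the `dict.get` method, which returns a default value (here, 0) when the key isn't yet in the dict. This is an alternative sketch, not a replacement for the version above; it uses a shorter string so that the output is easy to check by hand:

```python
counts = {}

s = 'hello'

for one_character in s:
    # get the current count (or 0, if we haven't seen this character), then add 1
    counts[one_character] = counts.get(one_character, 0) + 1

print(counts)   # {'h': 1, 'e': 1, 'l': 2, 'o': 1}
```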


# Exercise: Rainfall

1. Define an empty dict, `rainfall`. Each key-value pair that we add will be a string (a key, the name of a city) and an integer (a value, an amount of rain, in mm, that fell there).
2. Ask the user to repeatedly enter the name of a city.
    - If we get an empty string, exit from the loop.
3. If we got a city name, ask the user how much rain fell there.
4. Check:
    - If the city already exists in the dict as a key, add the rainfall to the existing value
    - If the city does *not* exist in the dict, then add a new key-value pair, the city and the rainfall
5. Iterate over our dict, printing the cities and amounts.

Example:

    City: a
    Rain: 5
    City: b
    Rain: 4
    City: a
    Rain: 3
    City: [ENTER]

    a: 8
    b: 4