# Agenda: Week 3

1. Q&A
2. Dictionaries
    - Defining them
    - Retrieving from them
    - Modifying them
    - Three paradigms of dictionary use
    - Looping over dictionaries
    - How do dicts work behind the scenes?
3. Files (text files)
    - How do we read from a file?
    - Iterating over file objects
    - Writing to files and the `with` construct

# Dictionaries

So far, we've talked about several types of "collections" in Python, data structures that contain other data structures:

- Strings (which contain characters)
- Lists (the main container, which contain anything -- traditionally many items of the same type)
- Tuples (containers that traditionally hold items of different types)

In all of these cases, we store and retrieve via the index, which starts at 0 and goes up to the length - 1.

Two problems: 

1. If we want to search for something, it can take a long time to find it, because we need to iterate over the entire data structure in order to search. The longer the string/list/tuple, the longer it can take to find out if a value is there.
2. Storing and retrieving by numeric index is not very intuitive.  We might want to store/retrieve information about employees by their ID number. Or about cars by their license plates. We *can* use the indexes, but something else might be nicer/better/easier to work with.

This is where dicts come in. Python is not the only language with dictionaries!  In other languages, we call them:

- Hash tables
- Hashes
- Hash maps
- Maps
- Name-value pairs
- Key-value pairs
- Associative arrays

The idea is that we are going to store not individual values, but *pairs* of values. One is called the "key" (it's the equivalent of an index) and the other is the "value."

We can only work with pairs, never just a key or just a value.

Some rules for dicts:

- Every key has a value, and every value has a key.
- Values can be *ANYTHING* at all in the Python world.
- Keys are more restricted: They must be immutable (basically: numbers, strings, tuples), and they cannot repeat.
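
The key rules above can be tried out in a short sketch. The names `locations` and `bad_key` are made up for illustration. It shows a tuple working as a key, a list being refused, and assignment to an existing key replacing the value rather than creating a duplicate key.

```python
# Hypothetical example data: (row, column) tuples as keys
locations = {(0, 0): 'start', (2, 3): 'treasure'}
print(locations[(2, 3)])        # treasure

# A list is mutable, so Python refuses to use it as a key
bad_key = [2, 3]
try:
    locations[bad_key] = 'trap'
except TypeError as error:
    print(f'TypeError: {error}')   # the exact wording depends on the Python version

# Keys cannot repeat: assigning to an existing key replaces its value
locations[(0, 0)] = 'home'
print(len(locations))           # 2
```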

It's most common for us to define a dict with strings as keys, and something else (integers, floats, other strings) as the values.

You can think of a dict in some ways as a list in which we get to control not only the value that's stored, but also the index (key) we use to store it.

# Dict syntax

To define a dict:

- We use `{}`
- Each key-value pair has a `:` between the key and the value
- Pairs are separated by `,`


In [1]:
d = {'a':10, 'b':20, 'c':30}    # here, I'm creating a dict with three pairs

In [2]:
# we retrieve from a dict using [], just as we do with strings, lists, and tuples
# but in the [], we put the key we want

d['a']  

10

In [3]:
k = 'a'   # assign to a variable
d[k]  

10

In [4]:
d['wxyz']   # ask for a key that doesn't exist...

KeyError: 'wxyz'

In [5]:
# get the length of a dict with len()
len(d)

3

In [6]:
d

{'a': 10, 'b': 20, 'c': 30}

In [7]:
# if we want to avoid an error, then we don't want to request a key that doesn't exist
# we can use "in" to search in a dict's keys, to know if the key exists.
# note that "in" NEVER EVER searches in the values, only in the keys

'a' in d   # is 'a' a key in d?

True

In [8]:
'x' in d

False
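
Because `in` only looks at keys, asking whether a *value* is in the dict gives `False`. A minimal sketch, reusing the same three pairs, shows the difference and one way to search the values explicitly with `d.values()`:

```python
d = {'a': 10, 'b': 20, 'c': 30}

print('a' in d)            # True: 'a' is a key
print(10 in d)             # False: 10 is a value, not a key
print(10 in d.values())    # True: this searches the values explicitly
```

Searching `d.values()` goes through the values one at a time, so it doesn't get the speed benefit of a key lookup.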

- Keys are unique and immutable
- Values can be anything at all, no restrictions on types or repetition

In [9]:
person = {'first_name':'Reuven', 'last_name':'Lerner', 'shoe_size':46}

In [10]:
# I've now created a dict with three key-value pairs

person['first_name']

'Reuven'

In [11]:
person['shoe_size']

46

In [12]:
person['email']

KeyError: 'email'

In [14]:
if 'email' in person:
    print(person['email'])
else:
    print('Who has email any more?')

Who has email any more?


When we use `in` to search in a string, list, or tuple, we're searching through each of the values, one at a time.

When we use `in` to search a dict's keys, we're actually going straight to the place in memory where the key might (if it's there) be stored. And we know right away. This is a far faster search, and takes much less time.
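
To get a feel for the difference, here is a rough timing sketch. The sizes and the repeat count are arbitrary assumptions, and the exact numbers will vary from computer to computer, so no expected output is shown. On most machines the dict search should come out far faster.

```python
import time

# Assumed size, chosen only to make the difference visible
numbers_list = list(range(100_000))

numbers_dict = {}
for one_number in numbers_list:
    numbers_dict[one_number] = True     # same numbers, but as dict keys

target = 99_999    # the last item, so the list search must check every element

start = time.perf_counter()
for _ in range(100):
    found_in_list = target in numbers_list    # walks through the list
list_time = time.perf_counter() - start

start = time.perf_counter()
for _ in range(100):
    found_in_dict = target in numbers_dict    # goes straight to the key
dict_time = time.perf_counter() - start

print(f'list search: {list_time:.6f} seconds')
print(f'dict search: {dict_time:.6f} seconds')
```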

# Uses for dicts

It turns out that there are *many* programming problems that are elegantly solved using dicts:

- user IDs and user records in a database
- filenames and file attributes
- filenames and file contents
- directory names and file objects in that directory

The fact that a dict can use strings for keys allows us to use natural, human types of information to store and retrieve. We can even get input from the user and use that to search through our dict.

# To define a dict

- Use `{}` on the outside
- Each key-value pair has a `:` between the key and value
- The pairs are separated with `,`

```python
month_numbers = {'Jan':1, 'Feb':2, 'Mar':3, 'Apr':4}
month_names = {1:'Jan', 2:'Feb', 3:'Mar', 4:'Apr'}
letter_values = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6}
```

# Exercise: Restaurant

1. Define a dict, `menu`, which represents the menu at a restaurant. (You can decide what the restaurant sells, and what prices they have.)
2. Define `total`, an integer, and set it to 0.
3. Ask the user, repeatedly, to enter the name of the dish they want to order
    - If they enter an empty string, then stop asking and print the total
4. If the item is on the menu, then print the item, its price, and the new total.
5. If the item is *not* on the menu, then scold the user and let them try again.
6. In the end, print the total.

Example:

    Order: sandwich
    sandwich is 10, total is 10
    Order: tea
    tea is 8, total is 18
    Order: elephant
    we're fresh out of elephant today!
    Order: [ENTER]
    total is 18

Plan:
- Create a dict with keys (strings/items) and values (prices)
- Use a `while` loop to ask the user repeatedly to enter what they want to order
- Use `input` in the `while` loop to get the user's input
- Check if the user entered an empty string, and if so, use `break` to leave the loop
- Check if the user's input is in the dict with `in`
- Update `total` accordingly

In [15]:
menu = {'sandwich':10, 'tea':8, 'apple':3, 'cake':5} 
len(menu)   

4

In [18]:
menu = {'sandwich':10, 'tea':8, 'apple':3, 'cake':5} 
order = input('Order: ').strip()
total = 0

# is the user's order on the menu?
if order in menu:
    price = menu[order]   # get the price for the user's order
    total += price        # add the price to the total
    print(f'{order} costs {price}; total is now {total}')

else:
    print(f'Sorry, but we are out of {order} today.')

print(f'total is {total}')

Order:  elephant


Sorry, but we are out of elephant today.
total is 0


**DO NOT USE PYTHON TYPE NAMES AS VARIABLE NAMES!**

Don't call your variables:
- `int`
- `str`
- `list`
- `tuple`
- `dict`

This *will* actually work... until it doesn't. It makes for very odd, hard-to-understand bugs.
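
Here is a small sketch of the kind of bug this causes. The function `shadow_demo` is a made-up name, and the shadowing is kept inside it so that the rest of the program isn't affected:

```python
def shadow_demo():
    list = [10, 20, 30]     # BAD: this variable hides the built-in list type
    try:
        list('abc')         # we wanted ['a', 'b', 'c'], but list is now our data
    except TypeError as error:
        return str(error)

message = shadow_demo()
print(message)              # 'list' object is not callable

# Outside the function, the real list type is still available
print(list('abc'))          # ['a', 'b', 'c']
```

The assignment itself works without complaint. The error only shows up later, when some other line tries to use `list` as a type.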

In [19]:
type(menu)  # what data type is menu?

dict

In [20]:
# add the loop
# here, we're using the dict as a small read-only database in our program

menu = {'sandwich':10, 'tea':8, 'apple':3, 'cake':5} 
total = 0

while True:
    order = input('Order: ').strip()

    # did we get the empty string? Stop the loop
    if order == '':
        break
    
    # is the user's order on the menu?
    if order in menu:
        price = menu[order]   # get the price for the user's order
        total += price        # add the price to the total
        print(f'{order} costs {price}; total is now {total}')
    
    else:
        print(f'Sorry, but we are out of {order} today.')

print(f'total is {total}')

Order:  sandwich


sandwich costs 10; total is now 10


Order:  tea


tea costs 8; total is now 18


Order:  cake


cake costs 5; total is now 23


Order:  apple


apple costs 3; total is now 26


Order:  apple


apple costs 3; total is now 29


Order:  cookie


Sorry, but we are out of cookie today.


Order:  


total is 29


In [None]:
# AS

menu = {'burger':5, 'sandwich':10, 'coffee':4}
total = 0

while True:
    order = input("What do you want to order? ").strip()
    if order == "":      # empty input ends the loop
        break
    elif order in menu:
        total = total + menu[order]   # add the dish's price to the total
    else:
        print("sorry")
print(total)

# Rules for dicts

- Every key has one value, every value has one key
- Keys must be immutable and unique in the dict
- Values can be anything at all

On the face of it, each key can have one value, so you cannot have multiple values for a key.

*BUT* you can have a list or tuple as the value for a key, and then you have, in effect, multiple values.

When we retrieve from a dict, we retrieve one value at a time, based on one key. There is no such thing as a "dict slice."
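
Since there are no slices, getting several values means asking for them one key at a time. A minimal sketch, where `wanted_keys` and `found_values` are names invented for this example:

```python
d = {'a': 10, 'b': 20, 'c': 30, 'd': 40}

wanted_keys = ['a', 'c']     # hypothetical list of the keys we care about
found_values = []

for one_key in wanted_keys:
    found_values.append(d[one_key])   # one key gives one value, each time

print(found_values)    # [10, 30]
```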

# Modifying dicts

Dictionaries are mutable! 

- We can add a new key-value pair whenever we want
- We can replace a value with a new one
- We can remove key-value pairs

This means that a dict cannot be a key in a dict. *BUT* a dict can be a value in a dict. This is how we create complex data structures, such as trees, in Python.
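
As a sketch of a dict inside a dict, here are some made-up employee records keyed by ID number. The second employee and both ID numbers are hypothetical. The last part shows that the reverse, a dict as a key, raises `TypeError`.

```python
# Hypothetical employee records: the outer keys are ID numbers,
# and each value is itself a dict
employees = {
    101: {'first_name': 'Reuven', 'last_name': 'Lerner', 'shoe_size': 46},
    102: {'first_name': 'Dana', 'last_name': 'Cohen', 'shoe_size': 43},
}

# Read from left to right: first the outer key, then the inner key
print(employees[101]['first_name'])     # Reuven

# The inner dicts are mutable, too
employees[102]['shoe_size'] += 1
print(employees[102]['shoe_size'])      # 44

# A dict is mutable, so it cannot be used as a key
try:
    employees[{'id': 103}] = 'will not work'
except TypeError:
    print('A dict cannot be used as a key')
```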

In [21]:
d = {'a':10, 'b':20, 'c':30, 'd':40}

# I can update/modify an existing value by assigning to the same key
# (remember that every key in a dict is unique)

d['c'] = 1234     # now 1234 will replace 30 as the value for 'c'

d

{'a': 10, 'b': 20, 'c': 1234, 'd': 40}

In [22]:
# how do I add a new key-value pair to a dict?
# remember that in a list, we use the .append method to do so.
# well, in dicts, we just assign the key-value pair
# (yes, it's the same as modifying an existing key-value pair!)

d['x'] = 9876

d

{'a': 10, 'b': 20, 'c': 1234, 'd': 40, 'x': 9876}

In [23]:
# we can update an existing value using += 

d['x'] += 10    # this means: d['x'] = d['x'] + 10
d

{'a': 10, 'b': 20, 'c': 1234, 'd': 40, 'x': 9886}

In [24]:
# removing key-value pairs
# this is actually pretty rare, in my experience

d.pop('x')   # this removes the key-value pair with 'x' as the key, and returns the value

9886

In [25]:
d

{'a': 10, 'b': 20, 'c': 1234, 'd': 40}

In [26]:
# example of multiple values for a key

d = {'a':[10, 20, 30], 'b':[40, 50, 60], 'c':[70, 80, 90]}
d

{'a': [10, 20, 30], 'b': [40, 50, 60], 'c': [70, 80, 90]}

In [27]:
d['a']

[10, 20, 30]

In [28]:
# I can even do this:

d['a'].append(35)  # adds 35 to the list stored at d['a']

d

{'a': [10, 20, 30, 35], 'b': [40, 50, 60], 'c': [70, 80, 90]}

In [29]:
d['a']

[10, 20, 30, 35]

In [30]:
# this looks funny, but just read it from left to right
# - d, a dict
# - ['a'], retrieving from the dict
# - this gives back a list
# - on a list, we can use [0] to retrieve the first element

d['a'][0]  

10

In [31]:
# we can even put that on the left side of assignment!

d['a'][0]  = 99999   # assigns to the element at index 0 in the list at d['a']

d

{'a': [99999, 20, 30, 35], 'b': [40, 50, 60], 'c': [70, 80, 90]}

In [32]:
# you can get the keys with d.keys()
d.keys()

dict_keys(['a', 'b', 'c'])

In [33]:
# you can get the values with d.values()
d.values()

dict_values([[99999, 20, 30, 35], [40, 50, 60], [70, 80, 90]])

# Python does this, too!

If you store a value in a global variable `x`, Python is really taking the variable name, using it as a string (`'x'`), and using that string as a key in an internal dict of variable names and values! (CPython stores local variables inside functions differently, for speed.)

Python's core developers are always trying to find ways to make dicts faster, not just for us, but because the entire language then gets faster as a result.
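
You can peek at this dict yourself: the built-in `globals()` function returns the dict of global variables. The sketch below uses made-up names (`store_and_read` and `demo_value`). It stores a value through the dict and then reads it back as an ordinary variable. This is for exploration only, not something to do in real programs.

```python
def store_and_read():
    # Assign through the dict of global variables...
    globals()['demo_value'] = 100
    # ...and then read it back using an ordinary variable name
    return demo_value

result = store_and_read()
print(result)                        # 100
print('demo_value' in globals())     # True
```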

# Next up

1. Using a dict for accumulation (paradigm 2)
2. Using a dict to count from nothing (paradigm 3)



We can use a dict to accumulate information over the course of a program:

- We set up a dict with keys that we know and initial values (typically 0)
- As the program progresses, we add to the values, typically 1 at a time
- When the program is done, we look at the keys and values and know how often something happened

In [35]:
# Odds and evens

# let's write a program that goes through the numbers from 0 - 9, decides if each is odd or even, and 
# counts accordingly.

counts = {'odds':0, 'evens':0}    # this dict is where we'll keep track of things

for one_number in range(10):  # this will iterate from 0 - 9
    if one_number % 2 == 0:   # does the number have 0 remainder after dividing by 2? It must be even!
        print(f'\t{one_number} is even')
        counts['evens'] += 1  # add 1 to counts['evens']
    else:
        print(f'\t{one_number} is odd')
        counts['odds'] += 1   # add 1 to counts['odds']

print(counts)

	0 is even
	1 is odd
	2 is even
	3 is odd
	4 is even
	5 is odd
	6 is even
	7 is odd
	8 is even
	9 is odd
{'odds': 5, 'evens': 5}


In [37]:
# do we have more odds or evens?

if counts['odds'] > counts['evens']:
    print('More odds than evens')
elif counts['odds'] == counts['evens']:
    print('The same!')
else:
    print('More evens than odds')

The same!


# Exercise: Vowels, digits, and others (dict edition)

1. Set up a dict, `counts`, with three keys -- `vowels`, `digits`, and `others`, with 0 as the value for each.
2. Ask the user to enter text (a string)
3. Go through the text, one character at a time
    - If it's a vowel, then add 1 to `vowels`
    - If it's a digit, then add 1 to `digits`
    - If it's neither, then add 1 to `others`
4. Print the dict that results.

In [38]:
counts = {'vowels':0,
          'digits':0,
          'others':0}

# counts['vowels'] += 1    # this means: counts['vowels'] = counts['vowels'] + 1

text = input('Enter text: ').strip()

for one_character in text:
    if one_character in 'aeiou':   # vowel?
        counts['vowels'] += 1      #    add 1 to the vowels counter
    elif one_character.isdigit():  # digit?
        counts['digits'] += 1      #    add 1 to the digits counter
    else:
        counts['others'] += 1      # add 1 to the others counter

print(counts)  

Enter text:  hello!! 123


{'vowels': 2, 'digits': 3, 'others': 6}


In [40]:
# AI

counts = {'others':0, 'digits':0, 'vowels':0}
text = input('Type text:')

for one_character in text:
    if one_character in 'aeiou':
        counts['vowels'] += 1
    elif one_character.isdigit():
        counts['digits'] += 1
    else:
        counts['others'] += 1

print(counts)


Type text: hello!! 123


{'others': 6, 'digits': 3, 'vowels': 2}


In [None]:
# SS

counts = {'vowels':0 , 'digits':0 , 'others':0} 
input_text = input("Enter your text: ").strip()
vowelsList = ['a' , 'e' , 'i' , 'o' , 'u'] 

for text in input_text:
    if text.isdigit():
        counts['digits'] += 1 
    elif text in vowelsList:
        counts['vowels'] += 1 
    else :
        counts['others'] += 1 

print(counts)



In [41]:
# what if I don't want to count these characters, but store these characters?
# I can have my dict values be lists, rather than integers!

counts = {'vowels':[],
          'digits':[],
          'others':[]}

text = input('Enter text: ').strip()

for one_character in text:
    if one_character in 'aeiou':   # vowel?
        counts['vowels'].append(one_character)
    elif one_character.isdigit():  # digit?
        counts['digits'].append(one_character)     # add the character to the digits list
    else:
        counts['others'].append(one_character)     # add the character to the others list

print(counts)  

Enter text:  hello!! 123


{'vowels': ['e', 'o'], 'digits': ['1', '2', '3'], 'others': ['h', 'l', 'l', '!', '!', ' ']}


In [43]:
# let's say we want to ignore some characters

ignore_list = [' ']

counts = {'vowels':[],
          'digits':[],
          'others':[]}

text = input('Enter text: ').strip()

for one_character in text:
    if one_character in ignore_list:
        print(f'\tIgnoring {one_character}')   # \t means "tab", or "indent up to 8 spaces"
    elif one_character in 'aeiou':   # vowel?
        counts['vowels'].append(one_character)
    elif one_character.isdigit():  # digit?
        counts['digits'].append(one_character)     # add the character to the digits list
    else:
        counts['others'].append(one_character)     # add the character to the others list

print(counts)  

Enter text:  hello!! 123


	Ignoring  
{'vowels': ['e', 'o'], 'digits': ['1', '2', '3'], 'others': ['h', 'l', 'l', '!', '!']}


# Paradigm 3 for dicts: Starting with nothing

Sometimes, we want to count things, but we don't know what we'll be counting. We start with an empty dict, and then:

- If we encounter a key we've seen before (which we can know thanks to `in`), we add 1 to its count (value)
- If a key is new, then we add the key-value pair to the dict, with a value of 1 already


In [44]:
# example: Counting characters

counts = {}   # empty dict

text = input('Enter text: ').strip()  # get the text from the user

for one_character in text:
    if one_character in counts:     # have we seen this character before?
        counts[one_character] += 1  #     add 1 to its count
    else:
        counts[one_character] = 1   #     otherwise, add this character as a key, with a value of 1

print(counts)

Enter text:  this is a very interesting and important sentence


{'t': 6, 'h': 1, 'i': 5, 's': 4, ' ': 7, 'a': 3, 'v': 1, 'e': 6, 'r': 3, 'y': 1, 'n': 6, 'g': 1, 'd': 1, 'm': 1, 'p': 1, 'o': 1, 'c': 1}
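
The same pattern works for anything we can use as a key, not just characters. As a sketch, here it is counting words instead. The text is a hard-coded, made-up sentence rather than user input, and `str.split` breaks it into words.

```python
counts = {}   # empty dict

text = 'the cat and the hat and the bat'   # hypothetical text, instead of input()

for one_word in text.split():
    if one_word in counts:       # have we seen this word before?
        counts[one_word] += 1    #     add 1 to its count
    else:
        counts[one_word] = 1     #     otherwise, add the word with a count of 1

print(counts)   # {'the': 3, 'cat': 1, 'and': 2, 'hat': 1, 'bat': 1}
```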


# Exercise: Rainfall

We're going to keep track of how much rain fell in a variety of cities. We don't know what cities we'll be tracking when we write the program. The end result will be a dict whose keys are the city names (of what we tracked) and the values will be the total rain (in mm) that fell in that city.

1. Create an empty dict, `rainfall`.
2. Ask the user to enter a city name.
    - If they give you an empty string, stop asking; exit the loop.
3. If they gave you the name of a city, ask a second question: How much rain fell there?
    - We'll assume that they will give you a legal answer, with digits
4. If we have seen this city before, add the new rainfall to the old one.
5. If this is the first time we're seeing this city, set the key to the city and the value to the amount of rain we got.
6. At the end, print the dict.

Example:

    City: Jerusalem
    Rain: 5
    City: Tel Aviv
    Rain: 3
    City: Jerusalem
    Rain: 4
    City: [ENTER]
    {'Jerusalem':9, 'Tel Aviv':3}

This involves:
- A `while` loop (since we don't know how many inputs we'll get)
- Comparisons and `break`
- Check if the city name is in our dict as a key

    

In [46]:
for one_item in 'this is a string'.split():
    print(one_item)

this
is
a
string


In [47]:
rainfall = {}   # empty dict

while True:     # infinite loop -- because we don't know how many pieces of data we'll get

    city_name = input('Enter city name: ').strip()

    if city_name == '':    # exit from the loop if we got an empty city name
        break

    mm_rain = input('Enter mm rain: ').strip()
    mm_rain = int(mm_rain)

    if city_name in rainfall:        # have we seen this city before?
        rainfall[city_name] += mm_rain   # add the new amount of rain to whatever is in the dict
    else:
        rainfall[city_name] = mm_rain    # if it's the first time, then just assign the key-value pair

print(rainfall)

Enter city name:  Jerusalem
Enter mm rain:  5
Enter city name:  Tel Aviv
Enter mm rain:  4
Enter city name:  Jerusalem
Enter mm rain:  3
Enter city name:  


{'Jerusalem': 8, 'Tel Aviv': 4}
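
The exercise assumes that the user always enters digits. If we don't want to rely on that, one possible approach is to check with `str.isdigit` before calling `int`. The sketch below is a variation, not part of the exercise. To keep it easy to run, the made-up function `total_rainfall` takes a list of (city, rain) string pairs instead of calling `input`.

```python
def total_rainfall(entries):
    """Add up rainfall per city, skipping amounts that aren't digits."""
    rainfall = {}

    for city_name, mm_rain in entries:
        if not mm_rain.isdigit():           # would int() fail on this?
            print(f'Ignoring {mm_rain!r} for {city_name}')
            continue

        if city_name in rainfall:           # have we seen this city before?
            rainfall[city_name] += int(mm_rain)
        else:
            rainfall[city_name] = int(mm_rain)

    return rainfall


# Hypothetical inputs, including one bad amount
entries = [('Jerusalem', '5'), ('Tel Aviv', '4'),
           ('Jerusalem', 'abc'), ('Jerusalem', '3')]

print(total_rainfall(entries))    # {'Jerusalem': 8, 'Tel Aviv': 4}
```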


In [52]:
# AI

rainfall = {}

while True:
    city = input('City:')
    if city == "":
        break
    rain = input('How much rain?:')
    rain = int(rain)
    if city in rainfall:
        rainfall[city] += rain
    else:
        rainfall[city] = rain
print(rainfall)

City: a
How much rain?: 5
City: b
How much rain?: 4
City: a
How much rain?: 3
City: 


{'a': 8, 'b': 4}


# Next up

- Iterating over dicts (the right and wrong ways to do it)
- How dicts are implemented
- Then ... files!

Download the zipfile from here: https://files.lerner.co.il/exercise-files.zip

In [None]:
s = 'abcd'
s.upper()  # we're invoking a method, the str.upper method on s -- the "." indicates that we're using an internal dict!

In [53]:
# SS


rainfallRec = {} 

while True : 
    cityName = input('Enter city name : ').strip()
    if cityName == '':
        break
    rain_mm = input('Enter rainfall in mm:').strip()
    rain_mm = int(rain_mm)
    if cityName in rainfallRec:
        rainfallRec[cityName] += rain_mm
    else:
        rainfallRec[cityName] = rain_mm

print (rainfallRec)

Enter city name :  a
Enter rainfall in mm: 5
Enter city name :  b
Enter rainfall in mm: 4
Enter city name :  a
Enter rainfall in mm: 3
Enter city name :  


{'a': 8, 'b': 4}


# Variable name style

There are a few basic styles for naming variables and functions in programming languages:

- `snake_case`, all lowercase, with `_` between words
- `CamelCase`, with initial letters of words Capitalized
- `ALL_CAPS`
- `__dunder__`, this is a special Python-only thing, with two underscores before and after the word

In Python:
- We use `snake_case` for all variables and functions
- We use `CamelCase` for class names (in object-oriented programming)
- We use `ALL_CAPS` for constants, meaning variables that shouldn't be modified
- Dunder names are for special methods that you define in your objects
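
Here is a short sketch with all four styles in one place. Every name in it (`MAX_ORDERS`, `MenuItem`, `describe_item`, `first_item`) is invented for illustration. It uses a class only to show where `CamelCase` and dunder names appear.

```python
MAX_ORDERS = 3                      # ALL_CAPS: a constant we don't intend to change


class MenuItem:                     # CamelCase: a class name
    def __init__(self, dish_name, price):   # __init__ is a dunder method
        self.dish_name = dish_name
        self.price = price


def describe_item(menu_item):       # snake_case: a function name
    return f'{menu_item.dish_name} costs {menu_item.price}'


first_item = MenuItem('tea', 8)     # snake_case: a variable name
print(describe_item(first_item))    # tea costs 8
```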



# Iterating over dicts

We've seen that we can use a `for` loop over different data structures:

- Iterating over a string gives us characters
- Iterating over a list gives us elements
- Iterating over a tuple gives us elements

What happens if we iterate over a dict?

In [54]:
# iterating over a dict gives us the keys (and not the values)

d = {'a':10, 'b':20, 'c':30}

for one_item in d:
    print(one_item)

a
b
c
