# Agenda, week 3: Dictionaries and files

1. Recap (loops, lists, and tuples) + Q&A
2. Dictionaries
    - Defining them
    - Retrieving from them
    - Assigning to them / modifying them
    - Accumulating in dictionaries
    - Accumulating the unknown
    - Looping over dicts
    - How do dicts work?
3. Files
    - Reading from text files
    - Looping over text files
    - Writing to files and the `with` construct
  
Download files from: https://files.lerner.co.il/exercise-files.zip

# Extracting a value from a list in a list



In [2]:
mylist = [10, 20, 30]

# I want the item at index 1, aka 20
mylist[1]

20

In [3]:
mylist = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# this is a list of 3 elements
# each of those elements is itself a list of 3 integers

mylist[1]   # what will this return? A list -- [40, 50, 60]

[40, 50, 60]

In [4]:
# How can we retrieve 50 from that inner list?

inner_list = mylist[1]  # will this work?
inner_list[1]    # get index 1 from inner_list

50

In [5]:
inner_list[0]  # get index 0 from inner_list

40

In [7]:
# now, let's do that WITHOUT another variable
# we'll get index 0 from the list at mylist[1]

mylist[1][0]   # read the expression from left to right -- mylist, get its element [1]
               # that returns [40, 50, 60].  Apply [0] to that list, and we get 40

40

# Data structures so far

So far, we've talked about some simple data structures:

- Integers and floats
- Booleans (`True` and `False`)

Then we discussed three types of data which are *sequences*:

- Strings (immutable, and contain characters)
- Lists (mutable, and contain anything -- but traditionally, all elements are of the same type)
- Tuples (immutable and contain anything -- but traditionally, elements are of different types)

We saw that with all sequences, we can run a `for` loop or we can use `in` to search in the sequence.

Two big questions:
1. How long does it take to run that search for an element in a sequence (string, list, tuple)?
2. Wouldn't it be nice if we could use something other than integers (0, 1, 2, etc.) as indexes in our sequences?

Right now, we're able to store lots of interesting and important data in a list, but the indexes are 0, 1, 2, etc.  Which means that if I'm storing information about different people, and I have an ID number (SSN or national ID or employee ID) that uniquely identifies the person, I can't use that ID number easily as the index in my list. Rather, I have to use 0, 1, 2, 3, etc.

Dictionaries are especially useful because they let us use almost *anything* as the key (which is the term we use in dicts instead of "index"). We can use ID numbers as our keys. We can use names (yes, strings!) as our keys. We can actually use anything immutable as a key -- in practice that mostly means numbers and strings, but tuples of immutable values work, too.

# Dicts in a nutshell

A dictionary contains key-value pairs.  Dicts exist in many programming languages, not just Python, but often with other names:

- Hash tables
- Hashes
- Associative arrays
- Key-value stores
- Name-value stores
- Hashmaps
- Maps

Every key has one value, and every value is stored under a key. There is no such thing, in a dict, as a key without a value or a value without a key.

Keys are guaranteed to be unique -- they cannot repeat -- and must be immutable (again, basically numbers or strings).

Values can be absolutely anything at all in Python.

In this way, we have amazing flexibility when choosing our keys and values. Often, the keys will be a unique index into our data set.

In [10]:
# let's define a dict!

# 1. We use curly braces to define the dictionary
# 2. Each key-value pair has a : between the key and the value
# 3. We put commas between each key-value pair
# 4. Keys can be anything immutable, and values can be anything at all
# 5. Values can repeat if you want, but keys cannot.  Keys are guaranteed to be unique.

d = {'a':10, 'b':20, 'c':30}

In [11]:
type(d)   # what kind of data do we have here?

dict

In [12]:
# How do I retrieve from a dict?
# Just like with a string, list, or tuple with [] and the key we want inside of them

d['a']   # retrieve the value with the key 'a'

10

In [13]:
k = 'a'
d[k]    # will this work, using a variable that refers to a string?

10

In [14]:
# what if I try to retrieve the value for a key that doesn't exist?
d['q']

KeyError: 'q'

In [16]:
# we can avoid these problems if we first check whether a key is in the dict
# we can do that with the "in" operator

if 'a' in d:       # 'in' only checks the keys, not the values. 
    print(d['a'])
else:
    print(f'a is not a key in d')

10


In [17]:
if 'q' in d:       # 'in' only checks the keys, not the values. 
    print(d['q'])
else:
    print(f'q is not a key in d')

q is not a key in d


# Variable names and value types

You can give ANY NAME YOU WANT to ANY VALUE YOU WANT in Python.  No restrictions there, aside from reserved words.

I typically use `t` for tuples and `d` for dictionaries.  This is just my lazy teaching convention. You should normally aim to use longer names for all of your variables that make your code clearer.

Python couldn't care less what you call your variables. The names are for *you* and anyone who will be reading/maintaining your code.

# Paradigms for dict use

I like to talk about three paradigms for dictionary use. We're going to use paradigm #1 in this exercise, namely creating a dict and using it as a read-only database in our program.

We'll define it at the top of the program, then use it, but never modify it.

# Exercise: Restaurant

1. Define a dict called `menu` in which the keys are strings representing items on a restaurant menu, and the values are the prices of those items.
2. Define `total` to be 0.
3. Ask the user, repeatedly, to order something.
    - If they give you the empty string, then stop asking and print the total.
    - If they give you the name of something on the menu, then print the price and the new total. (And don't forget to add the price to the total.)
    - If they ask for something *not* on the menu, then tell them it's out of stock.
4. Print the total.

Example:

    Order: sandwich
    Sandwich is 10, total is 10
    Order: tea
    Tea is 5, total is 15
    Order: elephant
    We are fresh out of elephant today!
    Order: [ENTER]
    Total is 15

Some things to remember:
1. We can use a `while True` loop, which is infinite. Just don't forget to `break` when you want to exit from the loop, or it'll run forever.
2. When we use `input` to get input from the user, check to see if we got an empty string; if so, you can `break` out of the loop.
3. Ask the user for their input. If it's blank, `break`. If not, check to see if it's a key in the dict.

In [18]:
# define my dict, a variable called "menu"

menu = {'sandwich':10   ,   'tea':5   ,   'cake':7,     'apple':3  }

In [19]:
# what is the length of my dict?

len(menu)

4

In [20]:
menu['sandwich']

10

In [21]:
menu['tea']

5

In [22]:
# in my "menu" dict, the convention is that the keys are the menu items,
# and the values are the prices for the corresponding items.

menu = {'sandwich':10   ,   'tea':5   ,   'cake':7,     'apple':3  }
total = 0

while True:   # infinite loop -- be sure to have an escape plan with "break" later on
    order = input('Order: ').strip()   # ask the user for an order, remove whitespace, and assign to order

    if order == '':      # this is my escape hatch!
        break

    # not an empty string? Let's find out if the string is a key in our "menu" dict
    if order in menu:
        price = menu[order]    # grab the price from the menu dict
        total += price         # add the price to our total
        print(f'{order} costs {price}; new total is {total}')

    else:
        print(f'Sorry, we are all out of {order} today. Try something else.')

print(f'Total is {total}')

Order:  sandwich


sandwich costs 10; new total is 10


Order:  tea


tea costs 5; new total is 15


Order:  elephant


Sorry, we are all out of elephant today. Try something else.


Order:  


Total is 15


In [23]:
menu

{'sandwich': 10, 'tea': 5, 'cake': 7, 'apple': 3}

In [24]:
menu['sandwich']

10

In [25]:
price = menu['sandwich']   # this assigns the value 10 to the variable price
print(price)

10


In [26]:
total = 0
price = menu['sandwich']    # this assigns the value 10 to price
total += price               # this adds the value of price to total
total

10

In [27]:
# if you want, you can create a dict from a list of tuples (or even a list of lists)
# if each inner tuple/list contains two elements, that works:

mylist = [('a',10), ('b',20), ('c',30)]

dict(mylist)   # this returns a new dict based on mylist

{'a': 10, 'b': 20, 'c': 30}

In [28]:
# what if we don't use in?
# I assume you aren't asking: What if we just drop the word "in" from our query?

order = 'sandwich'

if order menu:  # this is not what you mean!
    print('Good!')

SyntaxError: invalid syntax (754900947.py, line 6)

In [29]:
# I think you were asking: What happens if we don't check, using "in", whether the order
# is in the dict? Then, what value do we get back?

price = menu['sandwich']
print(price)

10


In [30]:
price = menu['elephant']
print(price)

KeyError: 'elephant'

# Changing dictionaries

Dicts, like lists, are *mutable*. They can be changed!  Let's see.

In [31]:
d = {'a':10, 'b':20, 'c':30}

# what happens if I assign to an existing key?
d['b'] = 1234

In [34]:
# assigning to an existing key keeps the key, but replaces the value. 
# (note that it does not add the old value to the new one. It just replaces it.)

d

{'a': 10, 'b': 1234, 'c': 30}

In [37]:
# what if I assign to a key that does *not* exist?
# we add a new key-value pair to a dict
d['x'] = 5678

In [38]:
d

{'a': 10, 'b': 1234, 'c': 30, 'x': 5678}

# Lists vs. dicts

If you thought that there would be an `append` method for dicts, allowing us to add new elements, surprise! All we have to do is assign to a key. If the key is new, we get a new key-value pair. If the key already exists, then we replace the value.



In [39]:
# FD asks: can floats be in dicts?
# they can *definitely* be values in dicts, because ALL Python objects can be values

menu['water'] = 2.5
menu

{'sandwich': 10, 'tea': 5, 'cake': 7, 'apple': 3, 'water': 2.5}

In [40]:
price = menu['water']
print(price)

2.5


In [41]:
type(price)

float

In [42]:
# can floats be *keys* in a dict?
# answer: yes! They are immutable, just like integers

In [43]:
special_numbers = {3.14:'pi', 2.718281828459045:'e', -1:'i'}

In [44]:
special_numbers[3.14]

'pi'

# CH asks: More than 2 things?

Every key has exactly one value. Every value has exactly one key.

HOWEVER, a value can be a list, tuple, or dict. So a key could refer to a value that is a dict, whose values are lists, whose values are tuples, etc. etc.
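To make that concrete, here is a small sketch of a dict whose values are a list, a tuple, and another dict. All of the names and data (`person`, `languages`, and so on) are invented for illustration. Retrieval chains from left to right, just as `mylist[1][0]` did at the start of this session.

```python
# hypothetical data, invented for this example
person = {
    'name': 'Ann',
    'languages': ['Python', 'SQL'],                 # value is a list
    'birthday': (1990, 5, 17),                      # value is a tuple
    'address': {'city': 'Haifa', 'zip': '12345'},   # value is a dict
}

first_language = person['languages'][0]   # get the list, then its index 0
birth_year = person['birthday'][0]        # get the tuple, then its index 0
city = person['address']['city']          # get the inner dict, then its key 'city'

print(first_language, birth_year, city)
```

Each key still has exactly one value; that one value just happens to contain other things.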

# FJ : Can we make the values in a dict immutable?

If the values are immutable (e.g., strings), then we cannot change them. But that wasn't your question - you want to know if we can make it impossible to change the values in a dict. Answer: No.

There *might* be a "frozendict" in Python that is created once, and then not changeable. There is a "frozenset," and sets are second cousins of dictionaries.
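One option that does exist in the standard library is `MappingProxyType`, in the `types` module, which wraps a dict in a read-only view. The sketch below shows the idea; it is not a true frozen dict, because anyone who still holds the original dict can change it, and the view will reflect that change. It also uses `try`/`except`, which we otherwise aren't covering. The `prices` dict is invented for this example.

```python
from types import MappingProxyType

prices = {'sandwich': 10, 'tea': 5}
read_only = MappingProxyType(prices)    # read-only view onto prices

print(read_only['tea'])                 # retrieval works as usual

try:
    read_only['tea'] = 6                # assignment through the view is refused
    assignment_worked = True
except TypeError:
    assignment_worked = False

prices['tea'] = 6                       # the underlying dict is still mutable
print(read_only['tea'])                 # the view shows the new value
```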

In [45]:
# AK Formatting output, such as $2.50

# We can actually use F-strings for this!

price

2.5

In [53]:
# you can put lots of magic stuff after : in the {} in an f-string to modify
# how things are displayed.

print(f'the price is ${price:0.2f}.')  # :0.2f means: show 2 digits after the . for float values

the price is $2.50.


In [51]:
# keys cannot exist more than once in a dict 
d = {'a':10, 'b':20, 'c':30, 'a':40, 'b':50, 'c':60}

d  # did we have duplicate keys? No way

{'a': 40, 'b': 50, 'c': 60}

In [52]:
# what if I assign a key that already exists?
d['a'] = 999
d

{'a': 999, 'b': 50, 'c': 60}

In [54]:
# How do we handle case-sensitive keys?

# (1) are keys case sensitive? YES, definitely

# (2) how can we deal with that? Answer: Use lowercase when creating the dict (if you can)
# and then use .lower() on the user's input.

# if you have mIxEd cAsE as keys in your dict.... good luck!
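A minimal sketch of that advice, assuming all of the dict's keys were created in lowercase. The `user_input` variable stands in for whatever `input` would have returned, so that the example gives the same result every time.

```python
menu = {'sandwich': 10, 'tea': 5}     # all keys are lowercase

user_input = 'Tea'                    # stand-in for what input() might return
order = user_input.strip().lower()    # normalize before searching

if order in menu:
    message = f'{order} costs {menu[order]}'
else:
    message = f'{order} is not on the menu'

print(message)
```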

# What about removing items from a dict?

First: I rarely find myself removing items from a dict in a program.

Second: You can use the `.pop` method, which takes a key as an argument, and returns the value. Plus, it removes the key-value pair you specified.

In [55]:
d

{'a': 999, 'b': 50, 'c': 60}

In [56]:
d.pop('c')   # remove the key-value pair with the key 'c'

60

In [57]:
d

{'a': 999, 'b': 50}

In [58]:
d.pop('c')   # remove it again?!?

KeyError: 'c'
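Here is a sketch of two ways to avoid that `KeyError`: check with `in` first, or hand `.pop` a second argument, which it returns when the key is missing. The default of `None` is just a choice made for this example; any value works.

```python
d = {'a': 999, 'b': 50}

# option 1: check before removing
if 'c' in d:
    d.pop('c')

# option 2: give pop a default to return when the key is missing
removed = d.pop('c', None)     # no KeyError; removed is None
removed_b = d.pop('b', None)   # the key exists, so we get its value and the pair is removed

print(removed, removed_b, d)
```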

# Keys are primary in a dict

The values in a dict are kind of along for the ride -- you cannot search on them, they don't determine any of the back-end storage (which we'll discuss later), and they can repeat.

You always need to think about a dict as a one-way street -- via the key, you can get to the value. But via the value, you *cannot* easily get to the key.

# Handling bad keys

If you try to retrieve from a dict, and the key you specify doesn't exist in the dict, then you'll get an exception. There is something in Python known as "exception handling" that we won't cover in this course. If you encounter that exception, you can "trap" it with an exception handler, and then give the user a more normal error message.

We're going to stick to searching in the dict via `in` for a key, and thus avoiding such trouble.
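For the curious, trapping that exception looks roughly like this. It's a minimal sketch, not something we'll rely on in this course; the `menu` and `order` values echo the restaurant exercise.

```python
menu = {'sandwich': 10, 'tea': 5}
order = 'elephant'

try:
    price = menu[order]          # raises KeyError if order is not a key
    message = f'{order} costs {price}'
except KeyError:                 # runs only if the retrieval above failed
    message = f'Sorry, we are all out of {order} today.'

print(message)
```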



In [59]:
# variation on our exercise code, with quantities (blame this on AD)

menu = {'sandwich':10   ,   'tea':5   ,   'cake':7,     'apple':3  }
total = 0

while True:   # infinite loop -- be sure to have an escape plan with "break" later on
    order = input('Order: ').strip()   # ask the user for an order, remove whitespace, and assign to order

    if order == '':      # this is my escape hatch!
        break

    # not an empty string? Let's find out if the string is a key in our "menu" dict
    if order in menu:
        quantity_string = input('How many? ').strip()

        if quantity_string.isdigit():
            quantity = int(quantity_string)
        else:
            print(f'Non-numeric quantity; try again. ')
            continue   # go back to the top of the while loop
        
        price = menu[order]          # grab the price from the menu dict
        total += price * quantity    # add the price to our total
        print(f'{order} costs {price}; you ordered {quantity}; new total is {total}')

    else:
        print(f'Sorry, we are all out of {order} today. Try something else.')

print(f'Total is {total}')

Order:  sandwich
How many?  3


sandwich costs 10; you ordered 3; new total is 30


Order:  apple
How many?  10


apple costs 3; you ordered 10; new total is 60


Order:  


Total is 60


# Next up:

1. Accumulating in dicts
2. Accumulating the unknown

# Dicts so far

We've used paradigm #1, namely: We define a dict and use it in our program, never changing it. We just read from it.  This is pretty common:

1. Dict keys are month names, and dict values are month numbers.
2. Dict keys are month numbers, and dict values are month names.
3. Dict keys are country names, and dict values are international dial prefixes.
4. Dict keys are user IDs, and dict values are also dicts, with user names, e-mail addresses, and salary.
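The first two items in that list can be sketched as follows. Only three months are included, to keep it short. Notice that going in the opposite direction requires a second dict, because we can only look things up by key.

```python
# month names -> month numbers
month_numbers = {'January': 1, 'February': 2, 'March': 3}

# month numbers -> month names needs its own dict
month_names = {1: 'January', 2: 'February', 3: 'March'}

print(month_numbers['March'])
print(month_names[2])
```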

# Sometimes, we want to modify a dict!

In paradigm #2, we define a dict at the start of the program with keys and empty values -- 0, empty list, etc. Over the course of the program, we don't add/remove keys, but we do update/modify the values.

In this paradigm, we know what we want to count or keep track of, but we don't know in advance how many there will be.



In [60]:
# odds and evens

counts = {'odds':0, 'evens':0}

# let's create a list of integers
mylist = [10, 15, 20, 25]

for one_number in mylist:
    if one_number % 2 == 1:   # if the remainder after dividing by 2 is 1, we have an odd number!
        counts['odds'] += 1   # add 1 to the score of odd numbers
    else:
        counts['evens'] += 1  # add 1 to the score of even numbers



In [61]:
counts

{'odds': 2, 'evens': 2}

# Exercise: Digits, vowels, and others (dict edition)

1. Create a dict called `counts` in which you have three keys: `vowels`, `digits`, and `others`. The values should all be set to 0.
2. Ask the user to enter a string.
3. Go through the string, one character at a time:
    - If the character is a digit, add 1 to `digits`
    - If the character is a vowel, add 1 to `vowels`
    - In other cases, add 1 to `others`
4. Print the dict with the counts.

Example:

    Enter a string: hello! 123
    {'digits':3, 'vowels':2, 'others':5}

In [62]:
counts = {'digits':0, 
          'vowels':0,
          'others':0}
s = input('Enter a string: ').strip()

for one_character in s:
    if one_character.isdigit():
        counts['digits'] += 1    # counts['digits'] = counts['digits'] + 1
    elif one_character in 'aeiou':
        counts['vowels'] += 1
    else:
        counts['others'] += 1

print(counts)

Enter a string:  hello! 123


{'digits': 3, 'vowels': 2, 'others': 5}


In [63]:
# how do I check odd vs. even?

10 % 3     # this means: what's the remainder after dividing 10 / 3 ?   it'll be 3, remainder 1

1

In [64]:
# if I divide a number by 2 and get its remainder:
# - if the remainder is 0, the number is even
# - if the remainder is 1, the number is odd

10 % 2    # remainder is 0, 10 is even

0

In [65]:
13 % 2    # remainder is 1, 13 is odd

1

In [73]:
# AK extension -- also check for non-vowel letters

# we can use the "string" module in Python's standard library
# which provides pre-defined variables with all sorts of useful stuff

import string

counts = {'digits':0, 
          'vowels':0,
          'consonants':0,
          'others':0}
s = input('Enter a string: ').strip()

for one_character in s:
    if one_character.isdigit():
        counts['digits'] += 1    # counts['digits'] = counts['digits'] + 1
    elif one_character in 'aeiou':
        counts['vowels'] += 1
    # elif one_character in 'bcdfghjklmnpqrstvwxyz':

    # the string module contains many predefined variables with different characters
    # ascii_lowercase means: lowercase letters in the Latin/English alphabet
    # ASCII is an old standard for computers to turn numbers into letters (and vice versa)
    elif one_character in string.ascii_lowercase:
        counts['consonants'] += 1
    else:
        counts['others'] += 1

print(counts)

Enter a string:  hello! 123


{'digits': 3, 'vowels': 2, 'consonants': 3, 'others': 2}


# When .isdigit doesn't work

I've often seen two mistakes/problems with `isdigit`:

1. Remember that `isdigit` is a string method. It only works on strings.  So don't try to convert something to an integer, and then run `isdigit` on the integer. You'll get an error saying that integers don't have any attribute named `isdigit`

In [70]:
x = 5
if x.isdigit():    # isdigit is a string method, not an int method
    print('yes, it is numeric!')   # this code will fail

AttributeError: 'int' object has no attribute 'isdigit'

2. A second mistake is forgetting that you need to call a method with `()` after its name. If you leave the parentheses off, you aren't calling `isdigit`; you're just referring to the method itself, and a method is always considered `True` in an `if`. You won't get an error, which is what makes this bug so nasty: the condition succeeds regardless of whether the string contains digits.

In [71]:
x = '5'

if x.isdigit:   # notice: no (), which is A BIG BUG!
    print(f'Yes, {x} contains digits!')

Yes, 5 contains digits!


In [72]:
x = '!'

if x.isdigit:   # notice: no (), which is A BIG BUG!
    print(f'Yes, {x} contains digits!')

Yes, ! contains digits!


# Paradigm 3: Start with an empty dict, build it over time

In this third paradigm, we don't know what keys we'll get, and we don't know what values we'll get. But we know what we want to do with them over time.

As we get new keys, we add them to the dict. As we get values, we add them (as before) to the existing count.



In [74]:
# Example: How many times does each character appear in a user's input?

counts = {}   # empty dict!

s = input('Enter a string: ').strip()

for one_character in s:
    counts[one_character] += 1   # my ideal -- this will not work!

print(counts)

Enter a string:  hello out there!


KeyError: 'h'

In [75]:
# Example: How many times does each character appear in a user's input?
# now, let's do it the working way, making sure that a key is there before we add to / retrieve it

counts = {}   # empty dict!

s = input('Enter a string: ').strip()

for one_character in s:
    if one_character in counts:      # if we've seen this letter before,
        counts[one_character] += 1   #    add 1 to its count
    else:                            # if we have NOT seen this letter before,
        counts[one_character] = 1    #    set the new key-value pair with a value of 1
        
print(counts)

Enter a string:  hello out there


{'h': 2, 'e': 3, 'l': 2, 'o': 2, ' ': 2, 'u': 1, 't': 2, 'r': 1}
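The `if`/`else` above is the version to understand first. As an alternative, dicts also have a `.get` method, which returns the value for a key, or a default that you supply if the key isn't there. A sketch of the same counting with `.get`, using a fixed string instead of `input` so that the result is repeatable:

```python
counts = {}
s = 'hello out there'    # fixed string instead of input()

for one_character in s:
    # get returns the current count, or 0 if we have never seen this character
    counts[one_character] = counts.get(one_character, 0) + 1

print(counts)
```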


# Exercise: Rainfall

The goal of this exercise is to create a dict in which the keys are city names, and the values are quantities of rain (measured in mm).  Over time, we'll gather information about cities and rainfall, and then be able to produce a report showing the totals.

1. Create an empty dict, called `rainfall`.
2. Ask the user to enter the name of a city.  (We have no restrictions -- we don't have a list of cities in advance.)
    - If the user enters an empty string, we stop asking and print the entire `rainfall` dict.
3. If we got a city name, ask the user to enter how many mm of rain fell since the last report. We can assume that the user will give us digits here.
4. If this is the first time we're seeing a report from this city, add the key-value pair to `rainfall`.
5. If this is *not* the first time we're seeing a report from this city, then add the new value to the existing one in `rainfall`.
6. Print the entire dict.

Example:

    City: Jerusalem
    Rain: 3
    City: Tel Aviv
    Rain: 4
    City: Tel Aviv
    Rain: 5
    City: [ENTER]
    {'Jerusalem':3, 'Tel Aviv':9}

In [78]:
# empty dict in which we'll store city names + mm rain
rainfall = {}

while True:   # potentially infinite loop!
    city_name = input('City: ').strip()

    if city_name == '':     # this is our escape hatch -- if we got an empty string, break out of the loop
        break

    mm_rain = int(input('Rain: ').strip())

    if city_name in rainfall:            # is this city already a key in "rainfall"?
        rainfall[city_name] += mm_rain   #     add to the existing rainfall
    else:
        rainfall[city_name] = mm_rain    # if this is the first time with this city, we'll just assign

print(rainfall)

City:  a
Rain:  5
City:  b
Rain:  4
City:  a
Rain:  3
City:  


{'a': 8, 'b': 4}


# Next up

1. Looping and dicts
2. How dicts are implemented (behind the scenes)
3. Start with files!

In [80]:
# TC's code

rainfall = {}

while True:
    city = input('Enter a city: ').strip()
    
    if city == '':
        break
    
    mm = int(input('Rain: ').strip())
    
    if city in rainfall:        # if the city that the user entered is in the dict as a key...
        rainfall[city] += mm    #    add mm to the current rainfall
    else:
        rainfall[city] == mm    # BUG: == compares instead of assigning; the key isn't there yet, so we get a KeyError
        
print(rainfall)

Enter a city:  a
Rain:  5


KeyError: 'a'

# Loops and dicts

We've seen that we can do lots of things with dicts:

- Define them
- Retrieve from them via a key
- Update a value based on a key
- Remove a key-value pair

What happens if we run a `for` loop on our dict?

We know that every type of value works differently with `for` loops:

- integers *don't* work with them
- `range()` does work with loops, giving us one integer at a time
- strings give us one character at a time
- lists give us one element at a time
- tuples give us one element at a time

What about dicts?

In [82]:
d = {'a':10, 'b':20, 'c':30}

# iterating over a dictionary gives us the keys (not the values)!
for one_item in d:    
    print(one_item)

a
b
c


In [84]:
# what if I want to print a dict -- all of its keys and values?

for key in d:
    print(f'{key}: {d[key]}')   # print keys and values

a: 10
b: 20
c: 30


In [86]:
# there is a nicer/better way to print both keys and values
# this is my favorite way: the .items method
# items returns, with each iteration, one 2-element tuple (key, value)

for t in d.items():    # get the tuple, print the tuple
    print(t)

('a', 10)
('b', 20)
('c', 30)


In [88]:
for t in d.items():             # get the tuple,
    print(f'{t[0]}: {t[1]}')    # print t[0] (key) and t[1] (value)

a: 10
b: 20
c: 30


In [89]:
# let's turn t[0] and t[1] into variables with names!
# let's do this with unpacking!   there are 2 values in t, and we can assign them to 2 variables

for t in d.items():
    key, value = t    # unpacking here
    print(f'{key}: {value}')

a: 10
b: 20
c: 30


In [90]:
# now, let's show you the final way to do this

for key, value in d.items():
    print(f'{key}: {value}')

a: 10
b: 20
c: 30


In [91]:
d.items()

dict_items([('a', 10), ('b', 20), ('c', 30)])

In [92]:
# what if you want to search in the values?
# you can call the .values() method on a dict

d.values()

dict_values([10, 20, 30])

In [93]:
20 in d.values()   # this is relatively rare and VERY slow (comparatively)

True

In [94]:
# you can also say:

'a' in d.keys()   # this works, but calling .keys() is unnecessary -- 'a' in d does the same search with less typing

True

In [95]:
d = {'a':10, 'b':(20, 30, 40), 'c':{'x':100, 'y':200}}
d

{'a': 10, 'b': (20, 30, 40), 'c': {'x': 100, 'y': 200}}

In [97]:
for key, value in d.items():
    print(f'{key}: {value} -- value is {type(value)}')

a: 10 -- value is <class 'int'>
b: (20, 30, 40) -- value is <class 'tuple'>
c: {'x': 100, 'y': 200} -- value is <class 'dict'>
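Because a dict only goes from key to value, finding the key (or keys) for a given value means looping over every pair. A sketch of that, using a menu like the one from the restaurant exercise, with an invented `'juice'` item added so that two items share a price:

```python
menu = {'sandwich': 10, 'tea': 5, 'cake': 7, 'juice': 5}   # 'juice' is invented for this example
wanted_price = 5

matching_items = []
for item, price in menu.items():     # must visit every pair -- there is no shortcut by value
    if price == wanted_price:
        matching_items.append(item)

print(matching_items)
```

Since values can repeat, the answer has to be a list; there may be zero, one, or many matching keys.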


# Dicts: Behind the scenes

Dicts are also known as "hash tables," because a "hash function" is used to determine where things are stored in memory.

Let's start by talking about lists. If I want to search in a list for a value, how long will it take? In CS theory notation, we say `O(n)`, aka "linear time."  Meaning: The longer the list, the longer we might need to search for a value.

When Python searches in a list (with a `for` loop or with `in`), it might need to go through the entire list before discovering that the value is not there. 

How can we do better? Hash functions!

When Python needs to decide where to store a key-value pair in memory, it runs `hash(key)`. This function returns a number that is very hard to predict and looks random, but is always the same for the same key while the program is running. It tries to spread its results out evenly, so that the odds of two different keys having the same hash result are very small.

When you say `d['a'] = 10`, Python runs `hash('a')`, and uses the number it gets back to decide where to store the key-value pair. When you then ask `'a' in d`, Python again runs `hash('a')`, jumps to that location in memory, and checks if the key-value pair is there. If so, we get `True`. If not, we get `False` (or, if we were retrieving with `d['a']`, a `KeyError`).

This means that we can have any number of key-value pairs, and the search speed won't change. It'll always be what CS theory people call O(1), or "constant time." 
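To see the idea in action, here is a deliberately simplified toy version of a hash table. It is a sketch of the principle only, not how Python really lays out a dict in memory. The names `NUM_SLOTS`, `slots`, `toy_store`, and `toy_contains` are invented for this example, and it uses function definitions (`def`), which we haven't covered yet.

```python
NUM_SLOTS = 8                               # hypothetical fixed size
slots = [[] for _ in range(NUM_SLOTS)]      # each slot holds a list of [key, value] pairs

def toy_store(key, value):
    index = hash(key) % NUM_SLOTS           # the hash decides which slot to use
    for pair in slots[index]:
        if pair[0] == key:                  # key already stored: replace its value
            pair[1] = value
            return
    slots[index].append([key, value])       # new key: add a new pair

def toy_contains(key):
    index = hash(key) % NUM_SLOTS           # jump straight to the one slot that could hold the key
    for pair in slots[index]:
        if pair[0] == key:
            return True
    return False

toy_store('a', 10)
toy_store('b', 20)
toy_store('a', 999)                         # replaces the old value for 'a'

print(toy_contains('a'), toy_contains('q'))
```

The search only ever looks in one slot, no matter how many pairs are stored in total. That is where the speed comes from.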

This explains why only immutable data can be used as a key. If the data can change, then the hash function's result on the data will change, too. Then what? Do we move things around in the data structure? Do we keep a record of where it used to be? No. That's too complicated.  As a result, Python forbids us from using mutable data as keys, to avoid such problems.
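A short sketch of that rule: a tuple of integers is accepted as a key, while a list is refused with a `TypeError`. The `try`/`except` is only there so that the example keeps running after the failure.

```python
print(hash('a') == hash('a'))            # same key, same hash (within one run of the program)

d = {}
d[(1, 2)] = 'a tuple works as a key'     # tuples of immutable values are fine

try:
    d[[1, 2]] = 'a list does not'        # lists are mutable, so Python refuses
    list_key_worked = True
except TypeError as e:
    list_key_worked = False
    print(f'TypeError: {e}')

print(d)
```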

# Files

The point of a file is to avoid having to enter data each time we start a program, or turn on a computer. They provide us with storage for information that we might need later on.

There are many types of files, but they all come down to some numbers on the disk. Sometimes, those numbers can be interpreted as simple text. (Those are the files we'll be dealing with today.) Other times, those numbers have to be interpreted as complex documents such as Excel, PowerPoint, PDF, or other.  Those are typically quite complicated to work with.

We're just going to work with simple text files.

It used to be, long ago, that a computer program could just read from the disk on its own. Nowadays, that's very much not the case; the operating system has to protect the files on disk from unauthorized access. Also, it wants to make sure that we don't do anything truly dangerous to the files.

Thus, if we want to read from a file, we need to ask the OS for help in doing so. It returns a "file object," or a "file handle," through which we can actually work with a file.

To work with a file:
1. Get the name of the file we want to work with
2. `open` the filename, a string, for reading. This returns a new file object.
3. Read data from the file, via the file object.
4. Close the file object, so that we don't waste resources on the computer.
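The four steps above can be sketched as follows. To keep the example self-contained, it first creates a small throwaway file; the filename `example.txt` and its contents are invented here, and the writing part uses techniques we haven't discussed yet.

```python
import os
import tempfile

# create a small throwaway file so that the example can run anywhere
folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'example.txt')   # step 1: the name of the file

setup = open(filename, 'w')                      # 'w' means: open for writing
setup.write('first line\nsecond line\n')
setup.close()

f = open(filename)           # step 2: open for reading; returns a file object
contents = f.read()          # step 3: read the data, via the file object
f.close()                    # step 4: close the file object

print(contents)
print(f.closed)              # the file object knows that it has been closed
```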

In [98]:
# let's open a file.
# I'm going to read through the file in the zipfile I mentioned earlier,
# one called linux-etc-passwd.txt.  This is a "password file" from a Linux system
# I used to use.

In [99]:
# here, I use a Unix command to see the top of the file
!head -20 linux-etc-passwd.txt

# This is a comment
# You should ignore me
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin



news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin


In [100]:
# to read this file from Python, we'll need to 
# (1) open it for reading
# (2) get its data
# (3) close it

f = open('linux-etc-passwd.txt')   # same directory, so no need to use / (Unix) or \ (Windows)
print(f.read())                    # read() returns the entire contents of the file as a string
f.close()                          # we don't need the file open any more.

# This is a comment
# You should ignore me
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin



news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin

nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
syslog:x:101:104::/home/syslog:/bin/false
messagebus:x:102:106::/var/run/dbu

In [101]:
# read() pulls the entire file into memory at once, which can be a problem with big files.
# if we aren't going to use the read() method, then how can we read from our file
# in a safer way?

# answer: for loops!

# if you iterate over a file object, you'll get the file, one line at a time. Each iteration
# will give you the next line in the file, up to and including its trailing \n character.

f = open('linux-etc-passwd.txt')
for one_line in f:
    print(one_line)  # one \n from print, another \n from the line itself
f.close()    

# This is a comment

# You should ignore me

root:x:0:0:root:/root:/bin/bash

daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin

bin:x:2:2:bin:/bin:/usr/sbin/nologin

sys:x:3:3:sys:/dev:/usr/sbin/nologin

sync:x:4:65534:sync:/bin:/bin/sync

games:x:5:60:games:/usr/games:/usr/sbin/nologin

man:x:6:12:man:/var/cache/man:/usr/sbin/nologin

lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin

mail:x:8:8:mail:/var/mail:/usr/sbin/nologin







news:x:9:9:news:/var/spool/news:/usr/sbin/nologin

uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin

proxy:x:13:13:proxy:/bin:/usr/sbin/nologin

www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin

backup:x:34:34:backup:/var/backups:/usr/sbin/nologin

list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin

irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin

gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin



nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin

syslog:x:101:104::/home/syslog:/bin/false

messagebu

In [102]:

f = open('linux-etc-passwd.txt')
for one_line in f:
    print(one_line.strip())   # use strip to remove whitespace, then we only have \n from print
f.close()    

# This is a comment
# You should ignore me
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin



news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin

nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
syslog:x:101:104::/home/syslog:/bin/false
messagebus:x:102:106::/var/run/dbu
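
One caveat with `strip()`: it removes whitespace from both ends of the line, not only the trailing `\n`. If leading indentation matters, `rstrip('\n')` removes just the newline. The sketch below uses `io.StringIO` as a stand-in for an open file, with made-up sample text, so that it runs without `linux-etc-passwd.txt`.

```python
import io

# Stand-in for an open file (assumption: two short, made-up lines)
fake_file = io.StringIO('  indented line\nplain line\n')

stripped = []
newline_only = []
for one_line in fake_file:
    stripped.append(one_line.strip())            # removes whitespace on both ends
    newline_only.append(one_line.rstrip('\n'))   # removes only the trailing newline

print(stripped)
print(newline_only)
```

The first list loses the indentation on `'  indented line'`; the second keeps it.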

In [103]:
# let's now shorten it more:

for one_line in open('linux-etc-passwd.txt'):
    print(one_line.strip()) 

# This is a comment
# You should ignore me
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin



news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin

nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
syslog:x:101:104::/home/syslog:/bin/false
messagebus:x:102:106::/var/run/dbu

In [107]:
# how can I print only the usernames (i.e., the values before the first : on each line)
# from this file?

for one_line in open('linux-etc-passwd.txt'):  # file is in the same directory as Jupyter is running
    if one_line[0] == '#':
        continue     # go onto the next line if this line is a comment

    if one_line.strip() == '':
        continue     # go onto the next line if this line is empty
    
    # fields = one_line.split(':')   # this returns a list of strings
    # print(fields[0])

    print(one_line.split(':')[0])    # split the line, get a list based on it, and return index 0

root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy
www-data
backup
list
irc
gnats
nobody
syslog
messagebus
landscape
jci
sshd
user
reuven
postfix
colord
postgres
dovecot
dovenull
postgrey
debian-spamd
memcache
genadi
shira
atara
shikma
amotz
mysql
clamav
amavis
opendkim
gitlab-redis
gitlab-psql
git
opendmarc
dkim-milter-python
deploy
redis
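
A variation on the loop above, as a sketch: it collects the usernames in a list rather than printing them, strips each line once, and uses `str.startswith` for the comment test. The sample text is a shortened, made-up stand-in for `linux-etc-passwd.txt`, so the example runs on its own.

```python
import io

# Made-up stand-in for the passwd file: a comment, a blank line, and two records
sample = io.StringIO('''# This is a comment
root:x:0:0:root:/root:/bin/bash

daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
''')

usernames = []
for one_line in sample:
    s = one_line.strip()
    if s == '' or s.startswith('#'):   # skip blank lines and comments
        continue
    usernames.append(s.split(':')[0])  # the username is the first field

print(usernames)
```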


In [108]:
# what does "with" do? 
# let's look at some code for opening a file:

f = open('linux-etc-passwd.txt')
for one_line in f:
    print(one_line)  # one \n from print, another \n from the line itself
f.close()    

# This is a comment

# You should ignore me

root:x:0:0:root:/root:/bin/bash

daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin

bin:x:2:2:bin:/bin:/usr/sbin/nologin

sys:x:3:3:sys:/dev:/usr/sbin/nologin

sync:x:4:65534:sync:/bin:/bin/sync

games:x:5:60:games:/usr/games:/usr/sbin/nologin

man:x:6:12:man:/var/cache/man:/usr/sbin/nologin

lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin

mail:x:8:8:mail:/var/mail:/usr/sbin/nologin







news:x:9:9:news:/var/spool/news:/usr/sbin/nologin

uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin

proxy:x:13:13:proxy:/bin:/usr/sbin/nologin

www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin

backup:x:34:34:backup:/var/backups:/usr/sbin/nologin

list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin

irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin

gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin



nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin

syslog:x:101:104::/home/syslog:/bin/false

messagebu

In [109]:
# let's switch to using "with"
# here's the above code using "with"

# (1) we say with OBJECT as VARIABLE
# (2) everything you want to do with the file is indented, in the block
# (3) when you exit the "with" block, the file is automatically closed -- no need to do it yourself

with open('linux-etc-passwd.txt') as f:
    for one_line in f:
        print(one_line)  # one \n from print, another \n from the line itself

# This is a comment

# You should ignore me

root:x:0:0:root:/root:/bin/bash

daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin

bin:x:2:2:bin:/bin:/usr/sbin/nologin

sys:x:3:3:sys:/dev:/usr/sbin/nologin

sync:x:4:65534:sync:/bin:/bin/sync

games:x:5:60:games:/usr/games:/usr/sbin/nologin

man:x:6:12:man:/var/cache/man:/usr/sbin/nologin

lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin

mail:x:8:8:mail:/var/mail:/usr/sbin/nologin







news:x:9:9:news:/var/spool/news:/usr/sbin/nologin

uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin

proxy:x:13:13:proxy:/bin:/usr/sbin/nologin

www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin

backup:x:34:34:backup:/var/backups:/usr/sbin/nologin

list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin

irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin

gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin



nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin

syslog:x:101:104::/home/syslog:/bin/false

messagebu
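
To check point (3) for yourself, you can look at the file object's `closed` attribute inside and after the `with` block. This is a minimal sketch: it first creates a small throwaway file (the name `demo.txt` and its contents are made up) so that it runs anywhere. Writing to files is covered later in this session.

```python
import os
import tempfile

# Create a small throwaway file so the example doesn't depend on the course files
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w') as f:
    f.write('first\nsecond\n')

lines = []
with open(path) as f:
    for one_line in f:
        lines.append(one_line.strip())
    closed_inside = f.closed    # still open while we're in the block

closed_after = f.closed         # closed once the block has ended

print(closed_inside, closed_after)
```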

In [111]:
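# put r before the opening quote to get a "raw string", in which backslashes are taken literally
# (so you don't need to double them yourself)
f = open(r"C:\Users\sXXXX\Desktop\linux-etc-passwd.txt")
print(f.read())
f.close()    # note the parentheses -- without them, the method is never called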

FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\sXXXX\\Desktop\\linux-etc-passwd.txt'
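
Here is a small sketch of what the `r` prefix does, plus one way to avoid the `FileNotFoundError`: checking the path with `os.path.exists` before opening it. The path is the placeholder from the cell above and almost certainly doesn't exist on your computer, so expect the `else` branch to run.

```python
import os

raw_path = r"C:\Users\sXXXX\Desktop\linux-etc-passwd.txt"          # raw string
escaped_path = "C:\\Users\\sXXXX\\Desktop\\linux-etc-passwd.txt"   # doubled backslashes

same = raw_path == escaped_path   # two spellings of the identical string
print(same)

if os.path.exists(raw_path):
    with open(raw_path) as f:
        print(f.read())
else:
    print(f'No such file: {raw_path}')
```

The doubled backslashes in the error message are just how Python displays the string; the string itself contains single backslashes.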

# Next up

1. Practice with reading files
2. Make some reports from files
3. A little of writing to files

In [112]:
!ls *.txt

linux-etc-passwd.txt  mini-access-log.txt  nums.txt  shoe-data.txt  wcfile.txt


# Exercise: Sum numbers

1. The file `nums.txt` contains about 10 lines. Each line could be blank, or could contain an integer. If it contains an integer, there might be whitespace (spaces, tabs, etc.) around it.
2. Set `total` to 0.
3. Go through `nums.txt`, one line at a time.
    - If there is only whitespace, then ignore the line
    - If there is an integer (with potential whitespace around it), then invoke `int` on the line, and add it to `total`.
4. Print `total`, which should be 83.

In [113]:
!cat nums.txt

5
	10     
	20
  	3
		   	20        

 25


In [120]:
total = 0

for one_line in open('nums.txt'):
    s = one_line.strip()     # remove the outer whitespace from the string, assign to s
    if s.isdigit():          # does s only contain digits?
        total += int(s)      # if so, turn it into an int and add to total

print(total)

83


In [119]:
total = 0

for one_line in open('nums.txt'):
    if one_line.strip().isdigit():
        total += int(one_line)   # int() ignores surrounding whitespace, so no second strip is needed
print(total)

83
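
One limitation of `isdigit`: it returns `False` for a string such as `'-3'`, so negative numbers would be silently skipped. An alternative, sketched here, is to call `int` on every line and ignore the `ValueError` it raises for anything that isn't an integer. The data below is made up (it is not `nums.txt`), and `try`/`except` may be new to you; treat this as a preview rather than the expected solution.

```python
import io

# Made-up stand-in for a numbers file, including a negative number and a word
fake_nums = io.StringIO('5\n\t10   \n\n -3\nhello\n 25\n')

total = 0
for one_line in fake_nums:
    try:
        total += int(one_line)   # int() ignores surrounding whitespace
    except ValueError:
        pass                     # blank lines and non-integers end up here

print(total)
```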


In [121]:
s = '    a     b     c      '

s.strip()   # this will return a new string, without s's leading/trailing whitespace

'a     b     c'

In [123]:
# what if I want to remove all of the spaces, not just those on the outside?
# then I can use str.replace(' ', '')
# (note: this removes only space characters, not tabs or newlines)

s.replace(' ', '')   # every time you see ' ', replace it with ''

'abc'
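
`replace(' ', '')` only touches space characters. To remove every kind of whitespace (tabs and newlines too), one common idiom is to `split` on whitespace and `join` the pieces back together. A short sketch, using a made-up string:

```python
s = '  a \t b \n c  '

no_spaces = s.replace(' ', '')      # tabs and newlines survive
no_whitespace = ''.join(s.split())  # split on any whitespace, then glue the pieces together

print(repr(no_spaces))
print(repr(no_whitespace))
```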

In [124]:
!head mini-access-log.txt

67.218.116.165 - - [30/Jan/2010:00:03:18 +0200] "GET /robots.txt HTTP/1.0" 200 99 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
66.249.71.65 - - [30/Jan/2010:00:12:06 +0200] "GET /browse/one_node/1557 HTTP/1.1" 200 39208 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
65.55.106.183 - - [30/Jan/2010:01:29:23 +0200] "GET /robots.txt HTTP/1.1" 200 99 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.183 - - [30/Jan/2010:01:30:06 +0200] "GET /browse/one_model/2162 HTTP/1.1" 200 2181 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
66.249.71.65 - - [30/Jan/2010:02:07:14 +0200] "GET /browse/browse_applet_tab/2593 HTTP/1.1" 200 10305 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.71.65 - - [30/Jan/2010:02:10:39 +0200] "GET /browse/browse_files_tab/2499?tab=true HTTP/1.1" 200 446 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.65.12 - - [30/J

# Exercise: Count IP addresses

The goal of this exercise is to create a dict whose keys are IP addresses (strings), the numbers that represent locations on the Internet.  The values will be integers, telling us how many times each address appears in `mini-access-log.txt`.

1. Create an empty dict called `counts`.
2. Go through `mini-access-log.txt` one line at a time in a `for` loop.
3. Retrieve the IP address from the current line. Examine the lines to understand how you might do that.
4. If the address is new, add a new key-value pair to `counts`, with a value of 1.
5. If the address is already in `counts`, increase the count by 1.
6. Use a `for` loop to iterate over `counts`, showing each IP address and how often it appeared in the file.

In [125]:
# the current directory is wherever you ran Jupyter
# you can get it with the magic %pwd command

%pwd

'/Users/reuven/Courses/Current/oreilly-2023-07July'

In [135]:
!head -4 mini-access-log.txt

67.218.116.165 - - [30/Jan/2010:00:03:18 +0200] "GET /robots.txt HTTP/1.0" 200 99 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
66.249.71.65 - - [30/Jan/2010:00:12:06 +0200] "GET /browse/one_node/1557 HTTP/1.1" 200 39208 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
65.55.106.183 - - [30/Jan/2010:01:29:23 +0200] "GET /robots.txt HTTP/1.1" 200 99 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.183 - - [30/Jan/2010:01:30:06 +0200] "GET /browse/one_model/2162 HTTP/1.1" 200 2181 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"


In [129]:
counts = {}   # new, empty dict

for one_line in open('mini-access-log.txt'):
    ip_address = one_line.split()[0]

    if ip_address in counts:     # have we already seen this IP address?
        counts[ip_address] += 1
    else:
        counts[ip_address] = 1

In [131]:
for key, value in counts.items():
    print(f'{key}: {value}')

67.218.116.165: 2
66.249.71.65: 3
65.55.106.183: 2
66.249.65.12: 32
65.55.106.131: 2
65.55.106.186: 2
74.52.245.146: 2
66.249.65.43: 3
65.55.207.25: 2
65.55.207.94: 2
65.55.207.71: 1
98.242.170.241: 1
66.249.65.38: 100
65.55.207.126: 2
82.34.9.20: 2
65.55.106.155: 2
65.55.207.77: 2
208.80.193.28: 1
89.248.172.58: 22
67.195.112.35: 16
65.55.207.50: 3
65.55.215.75: 2


# How I approached this problem:

1. I saw that each line in the file starts with an IP address, followed by a space. Everything from that space onward can be thrown away; I only care about the IP address.
2. How can I get the IP address, if it comes before the first space on each line? I can use `str.split`, which returns a list of strings based on the line. If I call `split()` without any arguments, it splits on whitespace, and index 0 of the resulting list will be the IP address.
3. I then take the IP address I found and stick it into `ip_address`.
4. Then I check to see if `ip_address` is already a key in `counts`
    - If so, I can just increment the count by 1
    - If not, then I add a new key-value pair, with a count of 1
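
The `if`/`else` in step 4 can also be written in one line with `dict.get`, which returns a default value (here 0) when the key isn't in the dict yet. This is a sketch of that alternative, not a replacement for the solution above. The log lines are shortened and made up, so the example runs without `mini-access-log.txt`.

```python
import io

# Made-up stand-in for the access log; only the first field matters here
fake_log = io.StringIO(
    '67.218.116.165 - - "GET /robots.txt"\n'
    '66.249.71.65 - - "GET /browse/one_node/1557"\n'
    '67.218.116.165 - - "GET /index.html"\n'
)

counts = {}
for one_line in fake_log:
    ip_address = one_line.split()[0]
    # get() returns the current count, or 0 if we haven't seen this address yet
    counts[ip_address] = counts.get(ip_address, 0) + 1

print(counts)
```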


In [132]:
# if you want to see how to sort a dictionary (or anything else),
# I gave a talk at EuroPython 2020 called "How to sort anything"
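
As a quick taste (and not necessarily the approach taken in that talk), here is one way to show a dict of counts from most to least frequent. `sorted` accepts a `key` argument, and passing `counts.get` makes it sort the keys by their values. The counts below are a made-up subset.

```python
# Made-up counts, in the same shape as the dict built in the exercise
counts = {'66.249.71.65': 3, '66.249.65.38': 100, '65.55.207.71': 1}

# Sort the keys by their values, largest first
sorted_ips = sorted(counts, key=counts.get, reverse=True)

for ip_address in sorted_ips:
    print(f'{ip_address}: {counts[ip_address]}')
```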

In [134]:
for one_line in open('mini-access-log.txt'):
    print(one_line.split())


['67.218.116.165', '-', '-', '[30/Jan/2010:00:03:18', '+0200]', '"GET', '/robots.txt', 'HTTP/1.0"', '200', '99', '"-"', '"Mozilla/5.0', '(Twiceler-0.9', 'http://www.cuil.com/twiceler/robot.html)"']
['66.249.71.65', '-', '-', '[30/Jan/2010:00:12:06', '+0200]', '"GET', '/browse/one_node/1557', 'HTTP/1.1"', '200', '39208', '"-"', '"Mozilla/5.0', '(compatible;', 'Googlebot/2.1;', '+http://www.google.com/bot.html)"']
['65.55.106.183', '-', '-', '[30/Jan/2010:01:29:23', '+0200]', '"GET', '/robots.txt', 'HTTP/1.1"', '200', '99', '"-"', '"msnbot/2.0b', '(+http://search.msn.com/msnbot.htm)"']
['65.55.106.183', '-', '-', '[30/Jan/2010:01:30:06', '+0200]', '"GET', '/browse/one_model/2162', 'HTTP/1.1"', '200', '2181', '"-"', '"msnbot/2.0b', '(+http://search.msn.com/msnbot.htm)"']
['66.249.71.65', '-', '-', '[30/Jan/2010:02:07:14', '+0200]', '"GET', '/browse/browse_applet_tab/2593', 'HTTP/1.1"', '200', '10305', '"-"', '"Mozilla/5.0', '(compatible;', 'Googlebot/2.1;', '+http://www.google.com/bot.htm

In [136]:
# what txt files do you see in the current directory?

!dir *.txt

linux-etc-passwd.txt  mini-access-log.txt  nums.txt  shoe-data.txt  wcfile.txt


# Writing to files

You will almost certainly read from *many* more files than you write to. But you should still know how to write to a file.

When we `open` a file, the assumption is that we want to read from it. However, we can pass a second argument to `open`, the letter `w`, which means: We want to open this file for writing. And yes, computers require that you decide between reading from a file and writing to a file.

In other words, to open a file for reading, I say:

    open(filename, 'r')  # if you want to be very explicit
    open(filename)       # the default

To open a file for writing, you say:

    open(filename, 'w')   # must pass 'w'!

If you open a file for writing, then after you do so, one of two things has happened:

1. The file now exists, and you can write to it. Also, it currently contains 0 bytes.
2. You cannot open the file for some reason, and the program exited with an exception.

What if the file already existed? It now contains zero bytes.

Also: You can write to a file (when you've opened it for writing in 'w' mode) with the `write` method. This is similar to `print`, but doesn't add a `'\n'` after each line.

Finally: You really must close a file when you're done writing to it. Python buffers what you write, and closing the file ensures that everything is actually written to disk. Meaning: Use `with` when you're writing to a file, even if you don't use it when reading.

In [138]:
with open('myfile.txt', 'w') as f:    # open the file for writing, assign to f
    f.write('abcd\n')                 # write any text I want to the file
    f.write('another line!\n')
    f.write('end of the road\n')
    # at the end of the with block, the file is automatically closed

In [139]:
!cat myfile.txt

abcd
another line!
end of the road
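
Two of the claims above are easy to verify with a small experiment: `write` doesn't add a newline, and opening an existing file with `'w'` empties it. This sketch works in a temporary directory with a made-up filename, so it won't touch any of your real files.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'scratch.txt')   # throwaway file

with open(path, 'w') as f:
    f.write('abcd')        # no newline is added for us...
    f.write('efgh\n')      # ...so this continues on the same line

with open(path) as f:
    first_contents = f.read()

with open(path, 'w') as f:   # reopening with 'w' empties the file immediately
    pass                     # we don't write anything

size_after_reopen = os.path.getsize(path)

print(repr(first_contents), size_after_reopen)
```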


In [140]:
# AK : check for a file before writing to it

# open has a lot of options:
# - r (for reading)
# - w (for writing)
# - a (for appending -- write, but add to what's there already, at the end)
# - x (for writing -- but if the file already exists, give an error. This avoids 'w' destruction)

# you can also use os.path.exists to find out if a file already exists

import os
os.path.exists('/etc/passwd')

True

In [141]:
os.path.exists('unicorns')

False
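
Here is a sketch of the `'x'` and `'a'` modes in action. Opening with `'x'` a second time raises `FileExistsError` rather than destroying the file, and `'a'` adds to the end of what's already there. The filename is made up, and the file lives in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'notes.txt')   # throwaway file

with open(path, 'x') as f:      # the file doesn't exist yet, so 'x' succeeds
    f.write('first\n')

try:
    with open(path, 'x') as f:  # now it exists, so 'x' refuses to open it
        f.write('this is never written\n')
    refused = False
except FileExistsError:
    refused = True

with open(path, 'a') as f:      # append: keep the old contents, add at the end
    f.write('second\n')

with open(path) as f:
    contents = f.read()

print(refused)
print(contents)
```

Using `'x'` is generally safer than checking `os.path.exists` and then opening with `'w'`, because the check and the creation happen in a single step.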

# Next time: Functions

- Writing them
- Calling them
- Arguments and parameters
- A little about local vs. global variables