# Day 2

## Day 2 Agenda
* __`enumerate()/zip()`__
* list comprehensions
* tuples
* dictionaries
* explaining __`this.py`__
* sets
* file I/O

## "Pythonic"

In [1]:
cars = ['Tesla', 'Fisker', 'Rivian', 'Lordstown']

In [2]:
i = 0
for car in cars: # for thing in container
    print('index', i, 'is', car)
    i += 1

index 0 is Tesla
index 1 is Fisker
index 2 is Rivian
index 3 is Lordstown


## __`enumerate()`__
* a builtin function which associates an index with each item in an iterable
* returns an _enumerate_ object
* yields two things on each iteration: the index AND the item


In [13]:
import string

for index, car in enumerate(cars, start=5):
    print('car maker', string.ascii_uppercase[index], 'is', car)

car maker F is Tesla
car maker G is Fisker
car maker H is Rivian
car maker I is Lordstown


In [16]:
type(enumerate(cars))

enumerate

In [18]:
print(1, 2, 3, 4, 5, 'f')

1 2 3 4 5 f


## __`zip(*iterables)`__ # 0 or more containers
* builtin function which matches up each item in an iterable with the corresponding item in the other iterable(s)
* technically creates an iterator that aggregates elements from each iterable
* why is it called __`zip`__?
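
One way to think about the name: `zip()` interleaves its iterables like the teeth of a zipper, and the same function combined with the `*` unpacking operator undoes the pairing. A minimal sketch of that round trip (the names `pairs`, `firsts`, and `lasts` are illustrative and not part of the original notebook):

```python
first_names = ['Dave', 'Bruce', 'Taylor']
last_names = ['W-S', 'Lee', 'Swift']

# zip() is lazy; wrap it in list() to see the tuples it produces
pairs = list(zip(first_names, last_names))
print(pairs)

# "unzip": the * operator passes each pair as a separate argument to zip()
firsts, lasts = zip(*pairs)
print(firsts)
print(lasts)
```

Note that the "unzipped" results come back as tuples, not lists.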

In [19]:
first_names = ['Dave', 'Bruce', 'Taylor']
last_names = ['W-S', 'Lee', 'Swift']
employee_nums = [3456, 1, 2]

for first, last, num in zip(first_names, last_names, employee_nums):
    print(first, last, num)

Dave W-S 3456
Bruce Lee 1
Taylor Swift 2


In [25]:
stooges = ['Larry', 'Moe', 'Curly']
marxbros = ['Groucho', 'Harpo', 'Chico', 'Zeppo']

for stooge, marx in zip(stooges, marxbros):
    print(stooge, marx)

Larry Groucho
Moe Harpo
Curly Chico


In [27]:
import itertools # module that helps with iteration

stooges = ['Larry', 'Moe', 'Curly']
marxbros = ['Groucho', 'Harpo', 'Chico', 'Zeppo']

for stooge, marx in itertools.zip_longest(stooges, marxbros, fillvalue='***'):
    print(stooge, marx)

Larry Groucho
Moe Harpo
Curly Chico
*** Zeppo


# List Comprehensions

## List Comprehensions ("listcomps")
* quick/compact way to build a list
* "more readable"/faster
* which is easier to read?

In [9]:
fruits = 'apple lemon cherry fig lime watermelon abcd'.split() # Pythonic

fruit_lengths = [len(fruit) for fruit in fruits]

fruit_lengths

[5, 5, 6, 3, 4, 10, 4]

In [2]:
fruit_lengths = [] # empty to start

for fruit in fruits:
    fruit_lengths.append(len(fruit))
    
print(fruit_lengths)

[5, 5, 6, 3, 4, 10, 4]


In [29]:
sorted_fruits = [''.join(sorted(list(fruit))) for fruit in fruits]

print(sorted_fruits)

['aelpp', 'elmno', 'cehrry', 'fgi', 'eilm', 'aeelmnortw', 'abcd']


In [28]:
sorted_fruits = []

for fruit in fruits:
    letters = list(fruit)
    letters.sort()
    joined = ''.join(letters)
    sorted_fruits.append(joined)
    
sorted_fruits

['aelpp', 'elmno', 'cehrry', 'fgi', 'eilm', 'aeelmnortw', 'abcd']

In [13]:
list(fruits[0]) # => returns an "exploded" list of the characters in fruits[0]
sorted(list(fruits[0])) # => takes that list as input and returns (outputs) a sorted list of those characters

['a', 'e', 'l', 'p', 'p']

In [16]:
mylist = [1, 3, 2, -5]
mylist.sort() # mutator that sorts the list in place
mylist

[-5, 1, 2, 3]

In [18]:
[1, 3, 2, -5].sort() # sorts the temporary list in place and returns None, so nothing is displayed

In [19]:
sorted([1, 3, 2, -5])

[-5, 1, 2, 3]
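
The difference between the two is easy to trip over, so here is a small side-by-side sketch. The variable names `numbers`, `ordered`, and `result` are made up for this illustration:

```python
numbers = [1, 3, 2, -5]

# sorted() builds a new list and leaves the original alone
ordered = sorted(numbers)
print(numbers, ordered)

# list.sort() reorders the list in place and returns None,
# so assigning its result is a common mistake
result = numbers.sort()
print(result, numbers)
```

A rule of thumb: use `sorted()` when you need the original order preserved (or when the input is not a list), and `.sort()` when you want to reorder a list you already own.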

## List Comprehensions (cont'd)
* listcomps can generate a list from the Cartesian product of two or more iterables

In [31]:
colors = ['black', 'white']
sizes = ['S', 'M', 'L', 'XL']

In [40]:
tshirts = [[size, color] for size in sizes
                               for color in colors]
tshirts

[['S', 'black'],
 ['S', 'white'],
 ['M', 'black'],
 ['M', 'white'],
 ['L', 'black'],
 ['L', 'white'],
 ['XL', 'black'],
 ['XL', 'white']]
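
For comparison, the standard library's `itertools.product()` generates the same Cartesian product without nested `for` clauses. This is only a sketch of an alternative, not a replacement for the listcomp; `tshirts_product` is a name introduced here:

```python
import itertools

colors = ['black', 'white']
sizes = ['S', 'M', 'L', 'XL']

# nested listcomp, as above
tshirts = [[size, color] for size in sizes
                         for color in colors]

# itertools.product yields tuples, so convert each one to a list to compare
tshirts_product = [list(pair) for pair in itertools.product(sizes, colors)]

print(tshirts == tshirts_product)
print(len(tshirts_product))
```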

In [39]:
print(type(1))

<class 'int'>


In [42]:
string = 'alphabet soup tastes great!'

In [43]:
print(list(string))

['a', 'l', 'p', 'h', 'a', 'b', 'e', 't', ' ', 's', 'o', 'u', 'p', ' ', 't', 'a', 's', 't', 'e', 's', ' ', 'g', 'r', 'e', 'a', 't', '!']


In [52]:
letters_wo_vowels = [char for char in string if char not in 'aeiou!']
print(letters_wo_vowels)
print(''.join(letters_wo_vowels))

['l', 'p', 'h', 'b', 't', ' ', 's', 'p', ' ', 't', 's', 't', 's', ' ', 'g', 'r', 't']
lphbt sp tsts grt


In [53]:
joined = ''
for letter in letters_wo_vowels:
    print('concatenating', letter)
    joined = joined + letter
    
print(joined)

concatenating l
concatenating p
concatenating h
concatenating b
concatenating t
concatenating  
concatenating s
concatenating p
concatenating  
concatenating t
concatenating s
concatenating t
concatenating s
concatenating  
concatenating g
concatenating r
concatenating t
lphbt sp tsts grt


## Lab: List Comprehensions
*  Start with Cartesian product example (colors x sizes of t-shirts) and add a third list, __`sleeves = ['short', 'long']`__ then write a new listcomp which generates the Cartesian product __`colors x sizes x sleeves`__. __`tshirts`__ should look like this:<pre><b>
    [['black', 'S', 'short'],
     ['black', 'S', 'long'],
     ['black', 'M', 'short'],
     ['black', 'M', 'long'],
     ['black', 'L', 'short'],
     ['black', 'L', 'long'],
     ['black', 'XL', 'short'],
     ['black', 'XL', 'long'],
     ['white', 'S', 'short'],
     ['white', 'S', 'long'],
     ['white', 'M', 'short'],
     ['white', 'M', 'long'],
     ['white', 'L', 'short'],
     ['white', 'L', 'long'],
     ['white', 'XL', 'short'],
     ['white', 'XL', 'long']]
     
 </b></pre>
* Use a list comprehension to create a list of the squares of the integers from 1 to 25 (i.e., 1, 4, 9, 16, …, 625)
* Given a list of words, create a second list which contains all the words from the first list which do not end with a vowel
* Use a list comprehension to create a list of the integers from 1 to 100 which are not divisible by 5
* Use a list comprehension and __`zip()`__ to create a list of lists, where the list items are name and ID number that you grabbed from separate lists of names and ID numbers
  * start with a list of, say, 5 names ['John', 'Mary', 'Edward', 'Linda', 'Dinesh']
  * and a list of, say, 5 ID numbers [1003, 2043, 8762, 7862, 1093]
  * additional wrinkle: do not include any names whose corresponding ID is -1

In [1]:
colors = ['black', 'white']
sizes = ['S', 'M', 'L', 'XL']
sleeves = ['short', 'long']
           
tshirts = [[color, size, sleeve] for color in colors
                                    for size in sizes
                                        for sleeve in sleeves]
tshirts

[['black', 'S', 'short'],
 ['black', 'S', 'long'],
 ['black', 'M', 'short'],
 ['black', 'M', 'long'],
 ['black', 'L', 'short'],
 ['black', 'L', 'long'],
 ['black', 'XL', 'short'],
 ['black', 'XL', 'long'],
 ['white', 'S', 'short'],
 ['white', 'S', 'long'],
 ['white', 'M', 'short'],
 ['white', 'M', 'long'],
 ['white', 'L', 'short'],
 ['white', 'L', 'long'],
 ['white', 'XL', 'short'],
 ['white', 'XL', 'long']]

In [4]:
squares = [num ** 2 for num in range(1, 26)]
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625]


In [6]:
words = 'eggs cheese milk apple cherry pancakes banana'.split()
words_no_vowel = [word for word in words
                          if word[-1] not in 'aeiou']
print(words_no_vowel)

['eggs', 'milk', 'cherry', 'pancakes']


In [8]:
print(list(range(1, 101)))

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]


In [9]:
no_divis_by_5 = [num for num in range(1, 101)
                        if num % 5]
print(no_divis_by_5)

[1, 2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14, 16, 17, 18, 19, 21, 22, 23, 24, 26, 27, 28, 29, 31, 32, 33, 34, 36, 37, 38, 39, 41, 42, 43, 44, 46, 47, 48, 49, 51, 52, 53, 54, 56, 57, 58, 59, 61, 62, 63, 64, 66, 67, 68, 69, 71, 72, 73, 74, 76, 77, 78, 79, 81, 82, 83, 84, 86, 87, 88, 89, 91, 92, 93, 94, 96, 97, 98, 99]


In [11]:
names = ['John', 'Mary', 'Edward', 'Linda', 'Dinesh']
nums = [1003, 2043, 8762, 7862, 1093]
employees = [[name, num] for name, num in zip(names, nums)]
print(employees)
for name, num in employees:
    print(name, num)

[['John', 1003], ['Mary', 2043], ['Edward', 8762], ['Linda', 7862], ['Dinesh', 1093]]
John 1003
Mary 2043
Edward 8762
Linda 7862
Dinesh 1093


In [14]:
names = ['John', 'Mary', 'Edward', 'Linda', 'Dinesh']
nums = [1003, 2043, -8762, 7862, -1093]
employees = [[name, num] for name, num in zip(names, nums)
                            if num > 0]
print(employees)

[['John', 1003], ['Mary', 2043], ['Linda', 7862]]


In [None]:
word = 'apple'
# pitfall: ('a' or 'e' or 'i' or 'o' or 'u') evaluates to 'a', so this only compares against 'a'
word[-1] != ('a' or 'e' or 'i' or 'o' or 'u')

In [25]:
True or False, 1 or 15, 'hello' or ''

(True, 1, 'hello')
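
Because `or` returns its first truthy operand, the expression `('a' or 'e' or 'i' or 'o' or 'u')` is just `'a'`, so the comparison in the earlier cell only ever checks for `'a'`. The `in` operator is one way to test against all of the vowels. In this sketch, `ends_with_vowel` is a hypothetical helper name, and it assumes the word is non-empty:

```python
def ends_with_vowel(word):
    """Return True if the last letter of word is a vowel (word must be non-empty)."""
    return word[-1].lower() in 'aeiou'

word = 'apple'
print(word[-1] != ('a' or 'e' or 'i' or 'o' or 'u'))  # only compares against 'a'
print(not ends_with_vowel(word))                       # checks all five vowels
```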

In [28]:
x = 0
if x < 1:
    print('do something')

do something


In [29]:
if 5: # any nonzero number is "truthy", so this behaves like if True
    print('do this')

do this


In [31]:
True and False

False

In [34]:
if 5 and 0:
    print('ok')

In [35]:
5 and True

True

In [36]:
True and 5

5

In [38]:
'a' and 'e'

'e'

In [41]:
if 0 or 5:
    print('nope!')

nope!


In [42]:
'' or [1, 2, 3]

[1, 2, 3]

In [43]:
string = ''
if not string:
    print('empty string')

empty string


In [44]:
if 7 > 4:
    print('yep')

yep


In [48]:
if 7:
    print('yep')

yep


In [49]:
123 or 456

123

In [60]:
def f():
    print('run the function')
    return 100

In [63]:
x = 4
#...
if x == 4 or f() >= 100:
    print('do this if statement')

do this if statement


In [64]:
this = print('hi')

hi


In [71]:
0b1000 ^ 0b1101

5

## listcomps recap
* keep them short
* they are not _list incomprehensions_, so keep them simple
* use line breaks since they are ignored inside [] (and (), {}) and you therefore don't need the ugly '\\' line continuation character
* note that __`for`__ loops can do many things: scan a sequence to count or select items, compute aggregates (sums, averages), or any number of other processing tasks
  * in contrast, listcomps do ONE thing–generate lists!
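
To make the recap concrete, here is a short sketch that contrasts the two jobs. The word list is borrowed from the lab above; `long_words` and `total_letters` are names invented for this example:

```python
words = 'eggs cheese milk apple cherry pancakes banana'.split()

# a listcomp builds a list; line breaks inside [] need no backslash
long_words = [word.upper()
              for word in words
              if len(word) > 5]
print(long_words)

# an aggregate is a different job: a plain loop (or a builtin such as sum()) fits better
total_letters = 0
for word in words:
    total_letters += len(word)
print(total_letters)
```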

In [83]:
%%python2
print(range(10))
print(xrange(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
xrange(10)


In [86]:
print(range(10))

range(0, 10)


In [84]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [78]:
cities = ['Tokyo', 'Frankfurt', 'Chennai', 'Beijing']
names = ['Smith', 'Jones', 'Gupta', 'Chen']
depts = ['accounting', 'engineering', 'sales', 'human resources']
for thing in zip(cities, names, depts):
    print(thing)

('Tokyo', 'Smith', 'accounting')
('Frankfurt', 'Jones', 'engineering')
('Chennai', 'Gupta', 'sales')
('Beijing', 'Chen', 'human resources')


In [80]:
x, y, z = 5, 7, -2
print(x, y, z)

5 7 -2


In [75]:
for thing in enumerate(cities):
    print(thing)

(0, 'Tokyo')
(1, 'Frankfurt')
(2, 'Chennai')
(3, 'Beijing')


# Tuples

## Tuples
* immutable data type
* typically heterogeneous (cf. lists)
* generally imply some structure
 * tuples typically represent a single object, but multiple aspects/attributes of it
 * if lists are typically used like the columns of a spreadsheet...
   * then tuples are typically the rows...

In [87]:
t = () # empty tuple (cf. empty list...[])
t

()

In [88]:
type(t)

tuple

In [96]:
t = (1,) # singleton tuple

In [95]:
t

(1,)

In [97]:
t = 'Jones', 'John', 1023, True # no parens
t

('Jones', 'John', 1023, True)

In [98]:
# tuple unpacking
last_name, first_name, employee_num, full_time = t

In [99]:
employee_num # type(employee_num)

1023

In [100]:
something = input('Enter something: ')
as_a_list = something.split() # split() always returns a list
as_a_tuple = tuple(as_a_list) # tuple() always returns a tuple

Enter something:  Swift Taylor 1 True


In [101]:
print(as_a_list, as_a_tuple, sep='\n')

['Swift', 'Taylor', '1', 'True']
('Swift', 'Taylor', '1', 'True')


In [103]:
person = 'Sara Breedlove', 1867, 'Louisiana'

In [104]:
person[1]

1867

In [105]:
person[1] = 1868

TypeError: 'tuple' object does not support item assignment

In [107]:
# a tuple may contain a mutable object...

person = 'Curie', 'Marie', 1867, []
person

('Curie', 'Marie', 1867, [])

In [4]:
person[-1].extend(['physicist', 'chemist'])
person

('Curie', 'Marie', 1867, ['physicist', 'chemist'])

In [5]:
('Curie', 'Marie', 1867, []) + ('Polish',)

('Curie', 'Marie', 1867, [], 'Polish')

## Lab: Tuples
* Create a tuple representing a city w/fields of your own choosing (e.g., city name, state/country, population, elevation, etc.)
* "Add" a field to the tuple–since tuples are immutable, you will have to do this by concatenating tuples
* Using the _in_ operator, check to see if a particular value is in the tuple
* Using the __`.index()`__ method, find the position of a particular value in the tuple
* Since we've already looked a bit at reading from files, try reading from a text file which is laid out like this: __`last_name,first_name,employee_num,home_office`__
  * assume 1+ lines like that and put each line into a tuple and then print it out
  * so... __`Roche,Vincent,1,California`__ would become __`('Roche', 'Vincent', 1, 'California')`__ and be printed out as such
  * perhaps make a function called __`process_line_from_file()`__ which processes a line and returns a tuple with the info from the line
   

In [26]:
city = 'Kochi', 'Kerala', 602_046, 0 # city, state, population, elevation
sq_miles = 36.63 # square miles
city += (sq_miles,) # create a new tuple from city and the fields of the second tuple
city

('Kochi', 'Kerala', 602046, 0, 36.63)

In [16]:
tuple1 = 'foo', 'bar', 19, True
tuple2 = 'baz', 'quux', 23, False
tuple1 + tuple2

('foo', 'bar', 19, True, 'baz', 'quux', 23, False)

In [28]:
'Kerala' in city

True

In [36]:
city.index(602_046)

2

In [42]:
query = input('Enter the thing you want to search for: ')

Enter the thing you want to search for:  Kerala


In [45]:
if query in city:
    print(city.index(query))

1


In [34]:
def process_line(line):
    result = line.strip().split(',') # strip surrounding whitespace (incl. the newline) and split into a (mutable) list
    result[2] = int(result[2]) # convert employee number from string to int
    
    return tuple(result)

employee_file = open('employees.csv') # 'r' is the default
for line in employee_file: # read is done automagically here
    print(process_line(line))
employee_file.close() # release the file once we're done with it

('Roche', 'Vincent', 1, 'California')
('Swift', 'Taylor', 2, 'Frankfurt')
('Lee', 'Bruce', 3, 'Hong Kong')


## Recap: Tuples
* not just "constant lists" 
 (see http://jtauber.com/blog/2006/04/15/python_tuples_are_not_just_constant_lists)
* remember that lists are (typically) ordered sequences of homogeneous values (i.e., Excel/DB column)
* and tuples typically imply some structure and refer to multiple attributes of ONE item (person, country, building, etc.)
 * i.e., database/Excel row

# Dictionaries

## Dictionaries
* grouping of key/value pairs, traditionally described as "unordered" (since Python 3.7, dicts preserve insertion order)
* sometimes called a "map", "hashmap", or "associative array"

In [46]:
d = {} # empty dict

In [47]:
d = { 'X': 10, 'V': 5, 'I': 1 } # can be initialized when declared

In [48]:
d

{'X': 10, 'V': 5, 'I': 1}

In [49]:
d['L'] = 50 # add something to the dict
print(d)

{'X': 10, 'V': 5, 'I': 1, 'L': 50}


In [None]:
# iterating through a dict iterates through the keys 
for key in d: # for thing in container
    print(key, end=' ')

In [None]:
# ...of course we can print the values while iterating
for thing in d:
    print(thing, d[thing])

In [61]:
sbux_dict = {'venti': 20, 'tall': 12, 'grande': 16}
print(sbux_dict)

{'venti': 20, 'tall': 12, 'grande': 16}


In [52]:
print(sbux_dict.keys(), sbux_dict.values(),
      sbux_dict.items(), sep='\n')

dict_keys(['venti', 'tall', 'grande'])
dict_values([20, 12, 16])
dict_items([('venti', 20), ('tall', 12), ('grande', 16)])


In [53]:
total_ounces = 0
for amount in sbux_dict.values():
    total_ounces += amount

total_ounces

48

In [54]:
sum(sbux_dict.values())

48

In [59]:
%%python2
from __future__ import print_function
sbux_dict = {'venti': 20, 'tall': 12, 'grande': 16}
keys = sbux_dict.keys()
print(keys)
sbux_dict['trenta'] = 31
print(keys)

['tall', 'venti', 'grande']
['tall', 'venti', 'grande']


## Dictionaries: View Objects
* __`keys()`__, __`values()`__, and __`items()`__ return view objects
* view objects provide a dynamic window into the dictionary

In [62]:
keys = sbux_dict.keys()
keys

dict_keys(['venti', 'tall', 'grande'])

In [63]:
# keys will change automagically after we add to the dict
print(keys)
sbux_dict['trenta'] = 31
print(keys)

dict_keys(['venti', 'tall', 'grande'])
dict_keys(['venti', 'tall', 'grande', 'trenta'])


In [58]:
sbux_dict

{'venti': 20, 'tall': 12, 'grande': 16, 'trenta': 31}
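
A view stays tied to its dict. When a fixed snapshot is wanted instead, one option is to copy the view with `list()`. A small sketch of the difference (the names `keys_view` and `keys_snapshot` are illustrative):

```python
sbux_dict = {'venti': 20, 'tall': 12, 'grande': 16}

keys_view = sbux_dict.keys()            # live view onto the dict
keys_snapshot = list(sbux_dict.keys())  # independent copy, frozen at this moment

sbux_dict['trenta'] = 31

print(keys_view)       # includes 'trenta'
print(keys_snapshot)   # does not
```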

## __`get()`__: Dealing with missing dict keys

In [2]:
d = {'foo': 'bar'}
d

{'foo': 'bar'}

In [3]:
d['foo']

'bar'

In [4]:
d['foot']

KeyError: 'foot'

In [6]:
if 'foot' in d: # is 'foot' a key in this dict
    print(d['foot'])
# or just... d.get('foot')

In [14]:
print(d.get('foot'))

None
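
`get()` also accepts an optional second argument, which is returned in place of `None` when the key is missing. The sketch below shows that, plus a common counting idiom built on it; the `counts` dict and the word `'banana'` are just example data:

```python
d = {'foo': 'bar'}

print(d.get('foo'))             # key present: its value
print(d.get('foot'))            # key missing: None
print(d.get('foot', 'n/a'))     # key missing: the supplied default

# a common use: counting, where a missing key should count as 0
counts = {}
for letter in 'banana':
    counts[letter] = counts.get(letter, 0) + 1
print(counts)
```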


In [13]:
def fruit_size(fruit):
    fruit_size_dict = { 'apple' : 6,
                        'pear': 5,
                        'banana' : 3,
                        'fig': 1,
                        'watermelon' : 48,
                      }
    return fruit_size_dict[fruit]
    
fruits = 'apple pear banana fig watermelon'.split()
sorted(fruits, key=fruit_size)

['fig', 'banana', 'pear', 'apple', 'watermelon']

In [None]:
# what if we sort a dict?
for key in sorted(sbux_dict):
    print(key, sbux_dict[key])

In [9]:
sbux_dict = {'venti': 20, 'tall': 12, 'grande': 16}
sbux_dict

{'venti': 20, 'tall': 12, 'grande': 16}

In [10]:
sorted(sbux_dict)

['grande', 'tall', 'venti']

In [15]:
sorted(sbux_dict, key=sbux_dict.get)

['tall', 'grande', 'venti']

In [16]:
# To iterate in order of value (as opposed to key), pass a key function.
# By default, sorted() sorts a dict by its keys--
# usually not what we want here!

for k in sorted(sbux_dict, key=sbux_dict.get):
    print(k, '=>', sbux_dict[k])

tall => 12
grande => 16
venti => 20


In [20]:
id(sbux_dict.get), id(sorted)

(140275997090992, 140276494898672)

In [22]:
type(sbux_dict.get), type(fruit_size)

(builtin_function_or_method, function)

## Removing items from a dict
* __`del`__ = remove an item from the dict
* __`dict.pop(key)`__ = remove item and return value
* __`dict.clear()`__ = empty out the dict

In [23]:
mydict = {'trenta': 31, 'grande': 16, 'venti': 20,
          'tall': 12}
print(mydict)

{'trenta': 31, 'grande': 16, 'venti': 20, 'tall': 12}


In [24]:
del mydict['trenta']
print(mydict)

{'grande': 16, 'venti': 20, 'tall': 12}


In [25]:
print(mydict.pop('venti'))

20


In [26]:
print(mydict)

{'grande': 16, 'tall': 12}


In [27]:
mydict.clear()
mydict

{}
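
One detail the cells above do not show is what happens when the key is not there. As a hedged sketch: `del` raises a `KeyError`, while `pop()` takes an optional default that is returned instead. The variable `size` is introduced only for this example:

```python
mydict = {'grande': 16, 'tall': 12}

# del raises KeyError when the key is absent
try:
    del mydict['trenta']
except KeyError as err:
    print('no such key:', err)

# pop() accepts an optional default, which is returned instead of raising
size = mydict.pop('trenta', None)
print(size)

# with a key that exists, pop() removes the item and returns its value
print(mydict.pop('tall'), mydict)
```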

In [29]:
print("{:b}".format(15))

1111


In [37]:
val = 15.54377433

In [38]:
# printf-style (%) formatted printing, typical of Python 2 code (still works in Python 3)
print('val is %.3f, and two times val is %.5f' % (val, 2*val))

val is 15.544, and two times val is 31.08755


In [43]:
a, b = 1, 2
# str.format() style of formatted printing
print('{0:5d} + {0:5d} = {1}'.format(a, b))

    1 +     1 = 2


In [49]:
# Python 3.6 f-string formatted printing
print(f'{a:5d} + {a:5d} = {a+a}')

    1 +     1 = 2


In [46]:
val = 15
print(f'{val:b}')

1111


In [47]:
bin(val)

'0b1111'

## Lab: dictionary
* use a dict to translate Roman numerals into their Arabic equivalents
1. load the dict with Roman numerals M (1000), D (500), C (100), L (50), X (10), V (5), I (1)
2. read in a Roman numeral
3. print Arabic equivalent
4. try it with MCLX = 1000 + 100 + 50 + 10 = 1160
5. __If you have time, deal with the case where a smaller number precedes a larger number, e.g., XC = 100 - 10 = 90, or MCM = 1000 + (1000-100) = 1900__
6. __MCMXCIX = 1999__

In [None]:
roman_to_arabic = {
    'M': 1000,
    'D': 500,
    'C': 100,
    'L': 50,
    'X': 10,
    'V': 5,
    'I': 1,
}
    
# 1. get a Roman numeral from user
# 2. take each digit and plug into dict to get Arabic value
# 3. put Arabic value into a list
# e.g., MCLX ... [1000, 100, 50, 10]
# 4. add them up = 1160, i.e., the built-in sum() function

# for part 2, where we consider subtraction
# e.g., MCMXCIX ... [1000, 100, 1000, 10, 100, 1, 10]
# 5. for each number in the list
# 6. if the number is less than its neighbor (i.e., the number to the right), then
# 7. make that number negative
# then we have... [1000, -100, 1000, -10, 100, -1, 10] = 1999

# step 1
roman = input('Enter a Roman numeral: ')

# steps 2 and 3: look up each digit; unknown digits become 0
arabic_vals = [roman_to_arabic.get(digit, 0) for digit in roman]

if 0 in arabic_vals:
    print('Bad digit, no biscuit!')
    
print(arabic_vals)
print('first attempt:', sum(arabic_vals)) # Step 4

# Part 2: Deal with subtraction

# Here is a case where we DO need the index of the item in the list...
# ...because we have to look at the i-th item and the (i+1)-th item
# so for num in arabic_vals won't work...

# Step 5: iterate through the list and stop one short of end (otherwise we will "fall off")
for index in range(len(arabic_vals) - 1):
    # Step 6: if digit is LESS THAN digit which follows...
    if arabic_vals[index] < arabic_vals[index + 1]:
        arabic_vals[index] = -arabic_vals[index] # Step 7: make it negative

print(arabic_vals)
print('final attempt:', sum(arabic_vals)) # Step 4

Enter a Roman numeral:  MCAX


Bad digit, no biscuit!
[1000, 100, 0, 10]
first attempt: 1110
[1000, 100, 0, 10]
final attempt: 1110
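
As the `MCAX` run shows, the script above reports the bad digit but still goes on to print a total. One possible variation, sketched here under the assumption that a bad digit should stop the conversion, wraps the same steps in a function that raises `ValueError`. The name `roman_to_int`, the uppercase constant, and the lowercase-tolerant `.upper()` call are choices made for this sketch, and it does not check that the numeral is well formed (e.g., it accepts `IIII`):

```python
ROMAN_TO_ARABIC = {'M': 1000, 'D': 500, 'C': 100, 'L': 50,
                   'X': 10, 'V': 5, 'I': 1}

def roman_to_int(roman):
    """Convert a Roman numeral string to an int; raise ValueError on a bad digit."""
    values = []
    for digit in roman.upper():
        if digit not in ROMAN_TO_ARABIC:
            raise ValueError(f'bad Roman digit: {digit!r}')
        values.append(ROMAN_TO_ARABIC[digit])

    total = 0
    # pair each value with its right-hand neighbor (0 after the last one)
    for value, neighbor in zip(values, values[1:] + [0]):
        # a smaller value in front of a larger one is subtracted
        total += -value if value < neighbor else value
    return total

print(roman_to_int('MCLX'))
print(roman_to_int('MCMXCIX'))
```

Using `zip()` to pair each value with its neighbor avoids the index arithmetic, at the cost of building one extra list.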


## Dict Comprehension
* like a listcomp, a dictcomp creates a dict quickly

In [58]:
names = ['Sally', 'Bob', 'Martha', 'Dirk']
employee_ids = [345, 286, 453, 119]
id_dict = { name: emp_id + 1000
                   for name, emp_id in zip(names, employee_ids)}
print(id_dict)

{'Sally': 1345, 'Bob': 1286, 'Martha': 1453, 'Dirk': 1119}


In [59]:
d = { 'foo': 4, 'bar': -1, 'baz': -1, 'blah': 3, 'what': 2 }
print(d)

{'foo': 4, 'bar': -1, 'baz': -1, 'blah': 3, 'what': 2}


In [63]:
d.items()

dict_items([('foo', 4), ('bar', -1), ('baz', -1), ('blah', 3), ('what', 2)])

In [62]:

d = { key: val for key, val in d.items() 
                  if val != -1 }
print(d)

{'foo': 4, 'blah': 3, 'what': 2}


In [66]:
id_dict_inverse = { val : key for key, val in id_dict.items() }

In [65]:
id_dict.items()

dict_items([('Sally', 1345), ('Bob', 1286), ('Martha', 1453), ('Dirk', 1119)])

In [67]:
id_dict_inverse

{1345: 'Sally', 1286: 'Bob', 1453: 'Martha', 1119: 'Dirk'}

## Now we understand this code!

In [74]:
s = """Gur Mra bs Clguba, ol Gvz Crgref

Ornhgvshy vf orggre guna htyl.
Rkcyvpvg vf orggre guna vzcyvpvg.
Fvzcyr vf orggre guna pbzcyrk.
Pbzcyrk vf orggre guna pbzcyvpngrq.
Syng vf orggre guna arfgrq.
Fcnefr vf orggre guna qrafr."""

d = {}
for c in (65, 97):
    for i in range(26):
        d[chr(i+c)] = chr((i+13) % 26 + c)

print("".join([d.get(c, c) for c in s]))

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.


In [71]:
ord('A')

65

In [73]:
chr(65), chr(78)

('A', 'N')

# Sets

## Sets
* unordered collection, no duplicates
* kind of a one-trick pony–remove duplicates

In [None]:
s = { 'Annie', 'Betty', 'Cathy', 'Donna' }
print(s)

In [None]:
s.add('Ellen')
print(s)

In [None]:
s.add('Annie')
print(s)

In [None]:
# we can use the 'in' operator
if 'Annie' in s:
    print('Yep!')

## Deleting from a Set
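* __`remove(item)`__: remove an item from the set; raises __`KeyError`__ if the item is not in the set
* __`discard(item)`__: remove an item if it's in the set; does nothing (no error) if it isn't
* __`pop()`__: removes and returns an arbitrary element of the set (raises __`KeyError`__ if the set is empty)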
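
The practical difference between `remove()` and `discard()` only shows up when the item is missing. A minimal sketch, reusing the names from the set above:

```python
s = {'Annie', 'Betty', 'Cathy', 'Donna'}

s.discard('Loren')        # not in the set: silently does nothing
try:
    s.remove('Loren')     # not in the set: raises KeyError
except KeyError as err:
    print('remove() failed for', err)

s.remove('Betty')         # in the set: removed
print(sorted(s))          # sorted() gives a predictable display order
```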

In [None]:
print(s)

In [None]:
s.remove('Betty')

In [None]:
print(s)

In [None]:
s.discard('Loren')

In [None]:
print(s)

In [None]:
print(s.pop())
print(s)

In [None]:
while s: # while the set is non-empty
    print(s.pop())

## Lab: Sets
* Use a set to find all of the unique words in the input and print them out in sorted order
* If the user entered __There is no there there__, your program should print out 
   <pre><b>
   is
   no
   there
   </b></pre>
* Note that `There` and `there` should be counted as the same word.
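
One possible approach to this lab, shown only as a sketch: lowercase the text first so that `There` and `there` collapse into one word, then let `set()` drop the duplicates. The function name `unique_words` is hypothetical, and the text is passed in as a string here (in the lab it would come from `input()`):

```python
def unique_words(text):
    """Return the distinct words in text, lowercased and sorted."""
    return sorted(set(text.lower().split()))

for word in unique_words('There is no there there'):
    print(word)
```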

## Sets Recap
* unordered
* no duplicates
* use __`in`__ to test for membership


# File I/O

## File I/O
* __`fileobj = open(filename, mode)`__
* mode is one or two letters
  * r = read
  * r+ = open for reading and writing
  * w = write (create/overwrite)
  * x = write, but only if file does not already exist
  * a = append (creates the file if it does not exist); a+ = append and read
* second letter =
  * t = text file (default)
  * b = binary
* __`fileobj.close()`__

## File I/O: Open/Close

In [None]:
f = open('test.txt', 'r')

In [None]:
f = open('test.txt', 'w')
f.close()

In [None]:
!ls -l test.txt

In [None]:
f = open('test.txt', 'x')

## File I/O: Read/Write

In [None]:
poem = """TWO roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference."""

len(poem)

In [None]:
f = open('poem.txt', 'w')
f.write(poem)

In [None]:
f.close()

In [None]:
f = open('poem.txt')
poem2 = f.read()
f.close()

In [None]:
poem == poem2

## File I/O: __`write()`__ vs. __`print()`__


In [None]:
f = open('poem.txt', 'w')
# another example of why print being a function is good
print(poem, file=f, end='') 
f.close()

In [None]:
f = open('poem.txt')
poem2 = f.read()
f.close()

In [None]:
poem == poem2

In [None]:
len(poem2)

## __`print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)`__
* __`sep`__ = separator (default is space)
* __`end`__ = what to print at end (default is newline)
* __`file`__ = where to print, default is screen
* __`flush`__ = whether to flush output buffer, default is no
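
A quick way to see `sep`, `end`, and `file` working together is to print into an in-memory text buffer. `io.StringIO` stands in for a real file in this sketch, and the names `buffer` and `output` are illustrative:

```python
import io

buffer = io.StringIO()   # in-memory text "file"

print('a', 'b', 'c', sep='-', end='!\n', file=buffer)
print('no newline after this', end='', file=buffer)

output = buffer.getvalue()
print(repr(output))
```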

## File I/O: How to Read Data
* __`read()`__: slurps up entire file at once
  * __`read(x)`__ reads at most __`x`__ characters (bytes in binary mode)
* __`readline()`__: reads a line at a time
* __`readlines()`__: reads all of the lines and returns them as a list of strings
* or use an iterator…
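
The four approaches can be compared on the same small file. This is a sketch that writes a three-line scratch file via `tempfile` (an assumption made to keep the example self-contained); the names `whole`, `first`, `rest`, and `stripped` are made up, and the `with` statement it uses is explained in the next part:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'lines.txt')
    with open(path, 'w') as f:
        f.write('Line 1\nLine 2\nLine 3\n')

    with open(path) as f:
        whole = f.read()          # entire file as one string

    with open(path) as f:
        first = f.readline()      # one line, newline included
        rest = f.readlines()      # the remaining lines as a list of strings

    with open(path) as f:
        # iterating over the file object yields one line at a time
        stripped = [line.rstrip('\n') for line in f]

print(repr(whole))
print(repr(first), rest)
print(stripped)
```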

In [None]:
poem = ''
f = open('poem.txt')
for line in f: # Python reads each line
    poem += line
f.close()

In [None]:
print(poem)

## File I/O: __`with`__ statement
* the __`with`__ statement sets up a temporary "context" and closes the file automatically so we don't have to bother with closing it

In [None]:
with open('poem.txt') as f1: # ~ f1 = open('poem.txt')
    poem2 = f1.read()
    # at this point file is open
    print('in with statement, f1.closed =', f1.closed)

In [None]:
poem == poem2

In [None]:
f1.closed

## Quick Lab: File I/O
* write a Python program which prompts the user for a filename, then opens that file and writes the contents of the file to a new file, in reverse order, i.e.,

<pre><b>
    Original file       Reversed file
    Line 1              Line 4
    Line 2              Line 3
    Line 3              Line 2
    Line 4              Line 1
</b></pre>
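
One way this lab might be approached, as a sketch and not the only solution: read all of the lines, reverse them, and write them back out. The function name `reverse_file` and the scratch file names are hypothetical, and the demo passes file names directly where the lab would use `input()`:

```python
import os
import tempfile

def reverse_file(src_name, dest_name):
    """Write the lines of src_name to dest_name in reverse order."""
    with open(src_name) as src:
        lines = src.readlines()
    # make sure the last line ends with a newline so lines don't run together
    if lines and not lines[-1].endswith('\n'):
        lines[-1] += '\n'
    with open(dest_name, 'w') as dest:
        dest.writelines(reversed(lines))

# demo in a throwaway directory
with tempfile.TemporaryDirectory() as tmpdir:
    src = os.path.join(tmpdir, 'original.txt')
    dest = os.path.join(tmpdir, 'reversed.txt')
    with open(src, 'w') as f:
        f.write('Line 1\nLine 2\nLine 3\nLine 4')
    reverse_file(src, dest)
    with open(dest) as f:
        reversed_text = f.read()

print(reversed_text)
```

Reading everything with `readlines()` is fine for small files like these; a very large file would call for a different strategy.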

## Lab: File I/O + dicts
* write a Python program to read a file and count the number of occurrences of each word in the file
* use a __`dict`__, indexed by word, to count the occurrences
* remember that __`d.get(key)`__ returns __`None`__ if there is no such key in the dict (vs. __`d[key]`__, which raises a __`KeyError`__); the __`in`__ operator can also test whether a key is present
  * or use a __`collections.defaultdict`__ if we've covered it
* treat __The__ and __the__ as the same word when counting
* print out words and counts, from most common to least common
* EXTRA: remove punctuation, so __Hamlet,__ == __Hamlet__ # refer back to "import this"
* Road Not Taken and Hamlet are in your materials
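
A rough sketch of the counting step, using `dict.get()` with a default of 0. To stay self-contained it counts words in a short made-up string; in the lab, `text` would come from reading the file. The function name `count_words` is hypothetical, and stripping `string.punctuation` from the ends of each word is one simple way to handle the EXTRA item:

```python
import string

def count_words(text):
    """Return a dict mapping each lowercased word to its number of occurrences."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(string.punctuation)   # EXTRA: 'hamlet,' -> 'hamlet'
        if word:
            counts[word] = counts.get(word, 0) + 1
    return counts

text = 'The cat saw the dog. The dog, alas, saw nothing.'
counts = count_words(text)

# most common first: sort the keys by their counts, largest to smallest
for word in sorted(counts, key=counts.get, reverse=True):
    print(word, counts[word])
```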

## File I/O: recap
* __`open()`__ returns file object
* __`close()`__ closes the file
* __`read()`__ reads the entire file (or at most a given number of characters)
* __`readline()`__ reads a line at a time
* __`readlines()`__ reads all lines into a list at once, so avoid it for large files
* can also iterate through a file object a line at a time
* __`with`__ statement sets up a temporary context (block) for file I/O and automatically closes file when block is exited

# End of Day 2