# Day 2

## Day 2 Agenda
* __`enumerate()/zip()`__
* list comprehensions
* tuples
* dictionaries
* explaining __`this.py`__
* sets
* file I/O

## "Pythonic"

In [None]:
stooges = ['Shemp', 'Moe', 'Larry', 'Curly']

In [None]:
i = 0
for stooge in stooges:
    print('index', i, 'is', stooge)
    i += 1

## __`enumerate()`__
* a builtin function which returns an _enumerate_ object from any iterable


In [None]:
for index, stooge in enumerate(stooges):
    print('index', index, 'is', stooge)

In [None]:
type(enumerate(stooges))

## __`zip(*iterables)`__
* builtin function which creates an iterator that aggregates elements from each iterable
* why is it called zip?

In [None]:
stooges = ['Larry', 'Moe', 'Curly']
marxbros = ['Groucho', 'Harpo', 'Chico']
for stooge, marx in zip(stooges, marxbros):
    print(stooge, marx)

In [None]:
stooges = ['Larry', 'Moe', 'Curly']
marxbros = ['Groucho', 'Harpo', 'Chico', 'Zeppo']
for stooge, marx in zip(stooges, marxbros):
    print(stooge, marx)

In [None]:
import itertools
stooges = ['Larry', 'Moe', 'Curly']
marxbros = ['Groucho', 'Harpo', 'Chico', 'Zeppo']
for stooge, marx in itertools.zip_longest(stooges, marxbros,
                                          fillvalue='****'):
    print(stooge, marx)
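
A small sketch (not part of the original notebook) of two related points: `zip()` stops at the shortest iterable, and `zip(*pairs)` reverses the pairing. The names `pairs`, `names1`, and `names2` are made up for illustration.

```python
stooges = ['Larry', 'Moe', 'Curly']
marxbros = ['Groucho', 'Harpo', 'Chico', 'Zeppo']

# zip() stops as soon as the shortest iterable is exhausted,
# so 'Zeppo' is silently dropped
pairs = list(zip(stooges, marxbros))
print(pairs)

# unpacking the pairs with * "unzips" them back into separate tuples
names1, names2 = zip(*pairs)
print(names1)
print(names2)
```

This may also hint at the answer to "why is it called zip?": the two sequences are meshed together pairwise, like the two sides of a zipper.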

# List Comprehensions

## List Comprehensions ("listcomps")
* quick way to build a list
* often more readable, and usually faster, than the equivalent __`for`__ loop
* which is easier to read?

In [None]:
string = 'ABCabc*'
ascii_codes = []
for char in string:
    ascii_codes.append(ord(char))
    
ascii_codes

In [None]:
string = 'ABCabc*'
ascii_codes = [ord(char) for char in string]

ascii_codes

## List Comprehensions (cont'd)
* listcomps can generate a list from the Cartesian product of two or more iterables

In [None]:
colors = ['black', 'white']
sizes = ['S', 'M', 'L']
tshirts = [[color, size] for color in colors
                         for size in sizes]

tshirts

In [None]:
# generate a list of all the consonants in a string,
# discarding the vowels and spaces
string = 'alphabet soup tastes great'
consonants = [char for char in string
                          if char not in 'aeiou ']

consonants

## Lab: List Comprehensions
*  Start with the Cartesian product example (colors x sizes of t-shirts) and add a third list, sleeves = ['short', 'long']. Then write a new listcomp which generates the Cartesian product colors x sizes x sleeves. tshirts should look like this:<pre><b>
    [['black', 'S', 'short'],
     ['black', 'S', 'long'],
     ['black', 'M', 'short'],
     ['black', 'M', 'long'],
     ['black', 'L', 'short'],
     ['black', 'L', 'long'],
     ['white', 'S', 'short'],
     ['white', 'S', 'long'],
     ['white', 'M', 'short'],
     ['white', 'M', 'long'],
     ['white', 'L', 'short'],
     ['white', 'L', 'long']]
     
 </b></pre>
*  Use a list comprehension to create a list of the squares of the integers from 1 to 25 (i.e., 1, 4, 9, 16, …, 625)

## listcomps recap
* keep them short
* use line breaks since they are ignored inside [] (and (), {}) and you therefore don't need the ugly '\' line continuation character
* note that __`for`__ loops do many things: scanning a sequence to count or select items, computing aggregates (sums, averages), or any number of other processing tasks
  * in contrast, listcomps do ONE thing: generate lists!
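
The recap points can be sketched as follows. This is only an illustration; the `words` list and the variable names are assumptions, not taken from the labs.

```python
words = ['alphabet', 'soup', 'tastes', 'great']

# a listcomp builds a list; line breaks inside [] need no backslash
lengths = [len(word)
           for word in words
           if len(word) > 4]

# aggregates are a job for a builtin (or a for loop), not a listcomp
total = sum(lengths)
average = total / len(lengths)

print(lengths)   # [8, 6, 5]
print(total)     # 19
```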

# Tuples

## Tuples
* immutable data type
* typically heterogeneous (cf. lists)
* generally imply some structure

In [None]:
t = () # empty tuple
t

In [None]:
type(t)

In [3]:
t = ('hello',) # singleton tuple

In [4]:
t

('hello',)

In [5]:
t = 'Jones', 'Mary', 1023, True # no parens
t

('Jones', 'Mary', 1023, True)

In [12]:
last_name, first_name, employee_num, full_time = t # tuple unpacking

In [8]:
quot, remain = divmod(5, 2)

In [10]:
quot, remain

(2, 1)

In [None]:
employee_num

In [13]:
something = input()
t1 = list(something.split())
t2 = tuple(something.split())

hello


In [15]:
t1, t2

(['hello'], ('hello',))

In [None]:
print(t1, t2, sep='\n')

In [16]:
person = ('Gutzon Borglum', 1867, 'Idaho')

In [17]:
person[1]

1867

In [18]:
person[1] = 1868

TypeError: 'tuple' object does not support item assignment

In [19]:
# a tuple may contain a mutable object...

person = ('Curie', 'Marie', 1867, [])

In [20]:
person[3].extend('physicist chemist'.split())

In [21]:
person

('Curie', 'Marie', 1867, ['physicist', 'chemist'])

## Lab: Tuples
* Given a list of words, sort them by length of word, rather than alphabetically.
* To do this, first create a list of tuples of the form (len, word), where the first element is the length of the word.
* Next, sort the tuples.
* Finally, extract the words from the list of tuples into a new list which is now sorted by length of word. Try to use a list comprehension if you can.

## Recap: Tuples
* not just "constant lists" 
 (see http://jtauber.com/blog/2006/04/15/python_tuples_are_not_just_constant_lists)
* remember that lists are (typically) ordered sequences of homogeneous values
* and tuples typically imply some structure and refer to multiple attributes of ONE item (person, country, building, etc.)
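
A minimal sketch of the distinction, assuming a made-up `temperatures` list and a `people` list built from the names used earlier: the list holds many values of the same kind, while each tuple is one record whose positions have meaning.

```python
# a list of homogeneous values: every item is the same kind of thing
temperatures = [68.5, 71.0, 69.2]

# a list of tuples: each tuple is ONE record of (name, birth year, birthplace)
people = [('Gutzon Borglum', 1867, 'Idaho'),
          ('Marie Curie', 1867, 'Warsaw')]

# tuple unpacking in the loop header names each attribute
for name, year, birthplace in people:
    print(name, 'was born in', birthplace, 'in', year)

# unpacking also works inside a listcomp; _ marks a field we ignore
born_1867 = [name for name, year, _ in people if year == 1867]
print(born_1867)
```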

# Dictionaries


## Dictionaries
* grouping of key/value pairs (keys keep their insertion order since Python 3.7)
* sometimes called a "hash", "hashmap", or "associative array"

In [22]:
d = {} # empty dict

In [24]:
d = { 'X': 10, 'V': 5, 'I': 1 } # can be initialized when declared

In [25]:
d

{'I': 1, 'V': 5, 'X': 10}

In [26]:
d['L'] = 50
d

{'I': 1, 'L': 50, 'V': 5, 'X': 10}

In [27]:
# iterating through a dict iterates through the keys 
for thing in d:
    print(thing, end=' ')

X V I L 

In [28]:
# ...of course we can print the values while iterating
for thing in d:
    print(thing, d[thing])

X 10
V 5
I 1
L 50


In [29]:
mydict = {'trenta': 31, 'grande': 16, 'venti': 20}
mydict

{'grande': 16, 'trenta': 31, 'venti': 20}

In [30]:
print(mydict.keys(), mydict.values(), mydict.items(), sep='\n')

dict_keys(['trenta', 'grande', 'venti'])
dict_values([31, 16, 20])
dict_items([('trenta', 31), ('grande', 16), ('venti', 20)])


In [38]:
for key, value in mydict.items():
    print("{}:{}".format(key, value))
    print("%s: %s" % (key, value))

trenta:31
trenta: 31
grande:16
grande: 16
venti:20
venti: 20


In [None]:
total = 0
for amount in mydict.values():
    total += amount

total

## Dictionaries: View Objects
* __`keys()`__, __`values()`__, and __`items()`__ are view objects
* unlike lists, they provide a dynamic window into the dictionary
* view objects are new to Python 3

In [None]:
keys = mydict.keys()
keys

In [None]:
# keys will change automagically after we add to the dict
mydict['tall'] = 12
mydict, keys
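
To make the "dynamic window" idea concrete, here is a sketch comparing a view with a list copy of the keys. The names `keys_view`, `keys_snapshot`, and `common` are illustrative only.

```python
mydict = {'trenta': 31, 'grande': 16, 'venti': 20}

keys_view = mydict.keys()             # a view: tracks the dict
keys_snapshot = list(mydict.keys())   # a list: frozen copy of the keys

mydict['tall'] = 12

print(keys_view)       # includes 'tall'
print(keys_snapshot)   # does not include 'tall'

# key views also support set-like operations
common = keys_view & {'tall', 'short'}
print(common)          # {'tall'}
```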

In [42]:
list1 = ["a", "b", "c", 10]
for idx, i in enumerate(list1):
    print("{}:{}".format(idx, i))

0:a
1:b
2:c
3:10


## Dictionaries: __`enumerate()`__
* positions in a dict carry little meaning (keys simply come back in insertion order), so __`enumerate()`__ isn't all that useful

In [None]:
for index, val in enumerate(mydict):
    print('index', index, 'is', val)

In [None]:
# We can iterate through the dict items, but remember they come back in insertion order, not sorted...
for key, val in mydict.items():
    print(key, '=>', val)

In [None]:
# To iterate in order of value, sort the keys using each key's value as the sort key
# By default, sorted(mydict) sorts by key, which is not what we want here

for k in sorted(mydict, key=mydict.get):
    print(k, '=>', mydict[k])
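
A sketch of the sorting options side by side, using the same drink sizes. The variable names `by_key`, `by_value`, and `largest_first` are assumptions for illustration.

```python
mydict = {'trenta': 31, 'grande': 16, 'venti': 20, 'tall': 12}

# sorted() on a dict sorts its keys alphabetically
by_key = sorted(mydict)
print(by_key)          # ['grande', 'tall', 'trenta', 'venti']

# key=mydict.get sorts the keys by their values instead
by_value = sorted(mydict, key=mydict.get)
print(by_value)        # ['tall', 'grande', 'venti', 'trenta']

# sorting the (key, value) pairs, largest value first
largest_first = sorted(mydict.items(),
                       key=lambda item: item[1],
                       reverse=True)
print(largest_first)
```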

## __`get()`__/__`setdefault()`__: Dealing with missing dict keys

In [43]:
d = {'foo': 'bar'}

In [44]:
d['foo']

'bar'

In [45]:
d['foot']

KeyError: 'foot'

In [48]:
d.get('foot')

In [49]:
if 'foot' in d:
    d['foot']
# or just... d.get('foot')

In [None]:
d.setdefault('foo', 23) # return d['foo'] if the key exists;
# otherwise add 'foo' to the dict with value 23 and return 23
# equivalent to:
# if 'foo' in d:
#     val = d['foo']
# else:
#     d['foo'] = 23
#     val = 23

In [None]:
d

In [50]:
d.setdefault('foot', 23)
d

{'foo': 'bar', 'foot': 23}

In [52]:
d['foobar'] = 'foobar'
d.setdefault('foobar', 23)
d

{'foo': 'bar', 'foobar': 'foobar', 'foot': 23}
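
One common use of `setdefault()` is grouping items under a key, sketched below. The `words` list and the `by_letter` name are made up for this example.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

by_letter = {}
for word in words:
    # the first time a letter is seen, setdefault() stores an empty list;
    # either way it returns the list stored under that letter
    by_letter.setdefault(word[0], []).append(word)

print(by_letter)

# get() with a default reads a missing key WITHOUT adding it to the dict
count_d = len(by_letter.get('d', []))
print(count_d, 'd' in by_letter)   # 0 False
```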

## Removing items from a dict
* __`del`__ = remove an item from the dict
* __`dict.pop(key)`__ = remove item and return value
* __`dict.clear()`__ = empty out the dict

In [53]:
mydict = {'trenta': 31, 'grande': 16, 'venti': 20, 'tall': 12}
mydict

{'grande': 16, 'tall': 12, 'trenta': 31, 'venti': 20}

In [54]:
del mydict['trenta']
mydict

{'grande': 16, 'tall': 12, 'venti': 20}

In [55]:
mydict.pop('venti')

20

In [56]:
mydict

{'grande': 16, 'tall': 12}

In [None]:
mydict.clear()
mydict
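
A sketch of how these behave when the key is missing, which the cells above do not show. The key `'short'` is just an example of a key that is not in the dict.

```python
mydict = {'grande': 16, 'tall': 12}

# pop() accepts a default, returned when the key is missing
missing = mydict.pop('short', None)
print(missing)            # None

# del (and pop() without a default) raise KeyError for a missing key
try:
    del mydict['short']
except KeyError as err:
    error_key = err.args[0]
    print('no such key:', error_key)

removed = mydict.pop('tall')
print(removed, mydict)    # 12 {'grande': 16}
```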

## Lab: dictionary
* use a dict to translate Roman numerals into their Arabic equivalents
1. load the dict with Roman numerals M (1000), D (500), C (100), L (50), X (10), V (5), I (1)
2. read in a Roman numeral
3. print Arabic equivalent
4. try it with MCLX = 1000 + 100 + 50 + 10 = 1160
5. __If you have time, deal with the case where a smaller number precedes a larger number, e.g., XC = 100 - 10 = 90, or MCM = 1000 + (1000-100) = 1900__

## Dict Comprehension
* like a listcomp, a dictcomp creates a dict quickly

In [58]:
names = ['Sally', 'Bob', 'Martha', 'Dirk']
employee_ids = [345, 286, 453, 119]
list(zip(names, employee_ids))

[('Sally', 345), ('Bob', 286), ('Martha', 453), ('Dirk', 119)]

In [59]:
id_dict = { name: emp_id + 1000
                   for name, emp_id in zip(names, employee_ids)}
id_dict

{'Bob': 1286, 'Dirk': 1119, 'Martha': 1453, 'Sally': 1345}

In [None]:
# sketch only: assumes employees is an iterable of objects with a .name attribute
{ emp.name: emp for emp in employees}

In [60]:
d = { 'foo': 4, 'bar': -1, 'baz': -1, 'blah': 3, 'what': 2 }
d

{'bar': -1, 'baz': -1, 'blah': 3, 'foo': 4, 'what': 2}

In [None]:
d = { key: value for key, value in d.items()
               if value != -1 }
d

## Now we understand this code!

In [None]:
s = """Gur Mra bs Clguba, ol Gvz Crgref

Ornhgvshy vf orggre guna htyl.
Rkcyvpvg vf orggre guna vzcyvpvg.
Fvzcyr vf orggre guna pbzcyrk.
Pbzcyrk vf orggre guna pbzcyvpngrq.
Syng vf orggre guna arfgrq.
Fcnefr vf orggre guna qrafr."""

d = {}
for c in (65, 97):
    for i in range(26):
        d[chr(i+c)] = chr((i+13) % 26 + c)

print("".join([d.get(c, c) for c in s]))

In [None]:
list_of_letters = [d.get(c, c) for c in s]
print(''.join(list_of_letters))
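
The code above is a ROT13 decoder: the dict maps each letter to the letter 13 places further along, wrapping around the alphabet. The sketch below wraps the same logic in a helper named `rot13` (a name invented here) and checks it against the standard library's `rot_13` codec.

```python
import codecs

# same table-building loop as above: 65 is ord('A'), 97 is ord('a')
d = {}
for c in (65, 97):
    for i in range(26):
        d[chr(i + c)] = chr((i + 13) % 26 + c)

def rot13(text):
    """Illustrative helper: rotate letters by 13, leave everything else alone."""
    return ''.join(d.get(ch, ch) for ch in text)

secret = 'Ornhgvshy vf orggre guna htyl.'
plain = rot13(secret)
print(plain)                                      # Beautiful is better than ugly.

# 13 + 13 = 26, so applying the rotation twice gives back the original
print(rot13(plain) == secret)                     # True

# the standard library ships the same transformation as a codec
print(codecs.encode(secret, 'rot_13') == plain)   # True
```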

# Sets

## Sets
* unordered collection, no duplicates
* kind of a one-trick pony–remove duplicates

In [61]:
s = { 'Annie', 'Betty', 'Cathy', 'Cathy', 'Donna' }
s

{'Annie', 'Betty', 'Cathy', 'Donna'}

In [62]:
s.add('Ellen')
s

{'Annie', 'Betty', 'Cathy', 'Donna', 'Ellen'}

In [63]:
s.add('Annie')
s

{'Annie', 'Betty', 'Cathy', 'Donna', 'Ellen'}

In [None]:
# we can use the 'in' operator
if 'Annie' in s:
    print('Yep!')

## Deleting from a Set
* __`remove(item)`__: remove an item; raises __`KeyError`__ if it's not in the set
* __`discard(item)`__: remove an item if it's in the set; does nothing otherwise
* __`pop()`__: removes and returns an arbitrary element of the set

In [64]:
s

{'Annie', 'Betty', 'Cathy', 'Donna', 'Ellen'}

In [65]:
s.remove('Betty')

In [66]:
s

{'Annie', 'Cathy', 'Donna', 'Ellen'}

In [67]:
s.discard('Loren')

In [68]:
s.remove('Sally')

KeyError: 'Sally'

In [None]:
s

In [None]:
s.pop()
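
The difference between the three can be sketched in one cell. The variable `popped` is illustrative, and which element `pop()` returns is not something to rely on.

```python
s = {'Annie', 'Cathy', 'Donna', 'Ellen'}

s.discard('Loren')            # not in the set: silently does nothing

try:
    s.remove('Sally')         # not in the set: raises KeyError
except KeyError:
    print('remove() complained, discard() did not')

popped = s.pop()              # removes and returns an arbitrary element
print(popped in s, len(s))    # False 3
```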

## sets (cont'd)

In [69]:
even = set(range(2, 11, 2))
odd = set(range(1, 10, 2))
print(even, odd, sep='\n')

{2, 4, 6, 8, 10}
{1, 3, 5, 7, 9}


In [70]:
prime = {2, 3, 5, 7}
prime & odd

{3, 5, 7}

In [71]:
prime & even

{2}

In [72]:
odd | even

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

In [None]:
prime - even

In [73]:
prime ^ odd

{1, 2, 9}
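
Each operator also has a method form, sketched below. One practical difference is that the methods accept any iterable, while the operators require both sides to be sets. The name `both` is made up for the example.

```python
odd = set(range(1, 10, 2))
prime = {2, 3, 5, 7}

print(prime.intersection(odd) == prime & odd)           # True
print(prime.union(odd) == prime | odd)                  # True
print(prime.difference(odd) == prime - odd)             # True
print(prime.symmetric_difference(odd) == prime ^ odd)   # True

# methods take any iterable; prime & [3, 4, 5] would raise TypeError
both = prime.intersection([3, 4, 5])
print(both)                                             # {3, 5}
```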

## sets + dicts

In [None]:
movies = {
    'Die Hard': { 'Bruce Willis', 'Alan Rickman', 'Bonnie Bedelia' },
    'The Sixth Sense' : { 'Toni Collette', 'Bruce Willis', 'Donnie Wahlberg' },
    'The Hunt for Red October' : { 'Sean Connery', 'Alec Baldwin' },
    'The Highlander': { 'Christopher Lambert', 'Sean Connery' },
    '16 Blocks': { 'Bruce Willis', 'Yasiin Bey', 'David Morse' }
}

In [None]:
for title, stars in movies.items():
    if 'Bruce Willis' in stars:
        print(title)

In [None]:
for title, stars in movies.items():
    if stars & { 'Alan Rickman', 'Sean Connery' }:
        print(title)

## Subsets

In [74]:
set1 = { 1, 2, 3 }
set2 = { 1, 2, 3, 5, 7, 9 }

In [75]:
set1 <= set2 # <= means "subset"

True

In [76]:
set1 <= set1 # a set is always a subset of itself

True

In [77]:
set1 < set1 # but a set is never a proper subset of itself

False

In [78]:
set1 < set2 # set1 is a proper subset of set2 because set2 has all of set1 *and more*

True

## Lab: Sets
* Use a set to find all of the unique words in the input and print them out in sorted order
* If the user entered __There is no there there__, your program should print out 
   <pre><b>
   is
   no
   there
   </b></pre>
* Note that There and there should be counted as the same word.

## Sets Recap
* unordered
* no duplicates
* operators &, |, -, ^
* use __`in`__ to test for membership
* subset vs. proper subset



# File I/O

## File I/O
* __`fileobj = open(filename, mode)`__
* mode is one or two letters
  * r = read
  * r+ = open for reading and writing
  * w = write (create/overwrite)
  * x = write, but only if file does not already exist
  * a = append (creates the file if it does not exist); a+ = append and read
* second letter =
  * t = text file (default)
  * b = binary
* __`fileobj.close()`__
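
A sketch of the modes in action. It works in a throwaway directory created with `tempfile` (an assumption of this example, so that no real files are touched), and the file name `modes.txt` is made up.

```python
import os
import tempfile

# the directory and everything in it is deleted when this block ends
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'modes.txt')

    f = open(path, 'w')       # w: create the file (or overwrite it)
    f.write('first\n')
    f.close()

    f = open(path, 'a')       # a: add to the end of the file
    f.write('second\n')
    f.close()

    x_failed = False
    try:
        open(path, 'x')       # x: refuses to touch an existing file
    except FileExistsError:
        x_failed = True

    f = open(path, 'r')       # r (text is the default): read() returns str
    contents = f.read()
    f.close()

    f = open(path, 'rb')      # second letter b: read() returns bytes
    raw = f.read()
    f.close()

print(repr(contents))         # 'first\nsecond\n'
print(type(raw), x_failed)
```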

## File I/O: Open/Close

In [80]:
f = open('/tmp/test.txt', 'r')

In [81]:
f = open('/tmp/test.txt', 'w')
f.close()

In [82]:
f = open('/tmp/test.txt', 'x')

FileExistsError: [Errno 17] File exists: '/tmp/test.txt'

## File I/O: Read/Write

In [83]:
poem = '''TWO roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.'''

len(poem)

729

In [84]:
f = open('/tmp/poem.txt', 'w')
f.write(poem)

729

In [85]:
f.close()

In [86]:
f = open('/tmp/poem.txt', 'r')
poem2 = f.read()
f.close()

In [87]:
poem == poem2

True

In [88]:
f = open('/tmp/poem.txt', 'r')
poem3 = f.readlines()
f.close()

In [89]:
poem3

['TWO roads diverged in a yellow wood,\n',
 'And sorry I could not travel both\n',
 'And be one traveler, long I stood\n',
 'And looked down one as far as I could\n',
 'To where it bent in the undergrowth;\n',
 '\n',
 'Then took the other, as just as fair,\n',
 'And having perhaps the better claim,\n',
 'Because it was grassy and wanted wear;\n',
 'Though as for that the passing there\n',
 'Had worn them really about the same,\n',
 '\n',
 'And both that morning equally lay\n',
 'In leaves no step had trodden black.\n',
 'Oh, I kept the first for another day!\n',
 'Yet knowing how way leads on to way,\n',
 'I doubted if I should ever come back.\n',
 '\n',
 'I shall be telling this with a sigh\n',
 'Somewhere ages and ages hence:\n',
 'Two roads diverged in a wood, and I—\n',
 'I took the one less traveled by,\n',
 'And that has made all the difference.']

In [90]:
f = open('/tmp/poem.txt', 'r')
line = f.readline()
while line: # readline() returns '' at end of file
    print(line, end='')
    line = f.readline()
f.close()

In [93]:
help(f.readlines)

Help on built-in function readlines:

readlines(hint=-1, /) method of _io.TextIOWrapper instance
    Return a list of lines from the stream.
    
    hint can be specified to control the number of lines read: no more
    lines will be read if the total size (in bytes/characters) of all
    lines so far exceeds hint.



## File I/O: __`write()`__ vs. __`print()`__


In [94]:
f = open('/tmp/poem.txt', 'w')
print(poem, file=f)
f.close()

In [98]:
poem

'TWO roads diverged in a yellow wood,\nAnd sorry I could not travel both\nAnd be one traveler, long I stood\nAnd looked down one as far as I could\nTo where it bent in the undergrowth;\n\nThen took the other, as just as fair,\nAnd having perhaps the better claim,\nBecause it was grassy and wanted wear;\nThough as for that the passing there\nHad worn them really about the same,\n\nAnd both that morning equally lay\nIn leaves no step had trodden black.\nOh, I kept the first for another day!\nYet knowing how way leads on to way,\nI doubted if I should ever come back.\n\nI shall be telling this with a sigh\nSomewhere ages and ages hence:\nTwo roads diverged in a wood, and I—\nI took the one less traveled by,\nAnd that has made all the difference.'

In [95]:
f = open('/tmp/poem.txt', 'r')
poem2 = f.read()
f.close()

In [99]:
poem2

'TWO roads diverged in a yellow wood,\nAnd sorry I could not travel both\nAnd be one traveler, long I stood\nAnd looked down one as far as I could\nTo where it bent in the undergrowth;\n\nThen took the other, as just as fair,\nAnd having perhaps the better claim,\nBecause it was grassy and wanted wear;\nThough as for that the passing there\nHad worn them really about the same,\n\nAnd both that morning equally lay\nIn leaves no step had trodden black.\nOh, I kept the first for another day!\nYet knowing how way leads on to way,\nI doubted if I should ever come back.\n\nI shall be telling this with a sigh\nSomewhere ages and ages hence:\nTwo roads diverged in a wood, and I—\nI took the one less traveled by,\nAnd that has made all the difference.\n'

In [96]:
poem == poem2

False

In [100]:
len(poem)

729

In [97]:
len(poem2)

730

## __`print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)`__
* __`sep`__ = separator (default is space)
* __`end`__ = what to print at end (default is newline)
* __`file`__ = where to print, default is screen
* __`flush`__ = whether to flush output buffer, default is no
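
The keyword arguments can be tried without creating a file by printing into an `io.StringIO` object, which behaves like an open text file held in memory. Using `StringIO` here is a choice made for this sketch, not something the notebook does.

```python
import io

buffer = io.StringIO()    # in-memory stand-in for a file object

# sep goes between the objects, end replaces the trailing newline
print('a', 'b', 'c', sep='-', end='!\n', file=buffer)
print('no newline', end='', file=buffer)

result = buffer.getvalue()
print(repr(result))       # 'a-b-c!\nno newline'
```

The default `end='\n'` is also why the file written with `print(poem, file=f)` above came out one character longer than the one written with `f.write(poem)`.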

## File I/O: How to Read Data
* __`read()`__: slurps up entire file at once
  * __`read(x)`__ reads at most __`x`__ characters (bytes in binary mode)
* __`readline()`__: reads a line at a time
* __`readlines()`__: reads all remaining lines and returns them as a list of strings
* or use an iterator…

In [None]:
poem = ''
f = open('/tmp/poem.txt', 'r')
for line in f:
    poem += line
f.close()

In [None]:
len(poem)
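
A sketch showing how the reading methods share one position in the file: each call picks up where the previous one stopped. The three-line file and its temporary location are assumptions made for the example.

```python
import os
import tempfile

# throwaway directory so the example leaves nothing behind
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'lines.txt')
    f = open(path, 'w')
    f.write('Line 1\nLine 2\nLine 3\n')
    f.close()

    f = open(path, 'r')
    first_four = f.read(4)        # at most 4 characters
    rest_of_line = f.readline()   # the remainder of the current line
    remaining = f.readlines()     # every line that is left, as a list
    at_eof = f.read()             # nothing left: empty string
    f.close()

print(repr(first_four))     # 'Line'
print(repr(rest_of_line))   # ' 1\n'
print(remaining)            # ['Line 2\n', 'Line 3\n']
print(repr(at_eof))         # ''
```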

## File I/O: __`with`__ statement
* the __`with`__ statement sets up a temporary "context" and closes the file automatically when the block ends, so we don't have to close it ourselves

In [101]:
with open('/tmp/poem.txt', 'r') as f:
    poem2 = f.read()
    print('in with, f.closed =', f.closed)

in with, f.closed = False


In [None]:
poem == poem2

In [102]:
f.closed

True
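
Two further points, sketched under the assumption of a throwaway temporary directory and made-up file names: the file is closed even if an exception interrupts the block, and a single `with` can manage more than one file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo.txt')
    with open(path, 'w') as f:
        f.write('hello\n')

    # the file is closed even when the block exits via an exception
    try:
        with open(path, 'r') as f:
            first = f.readline()
            raise ValueError('something went wrong mid-block')
    except ValueError:
        pass
    closed_after_error = f.closed

    # one with statement can open several files at once
    copy_path = os.path.join(tmpdir, 'copy.txt')
    with open(path, 'r') as src, open(copy_path, 'w') as dst:
        for line in src:
            dst.write(line.upper())

    with open(copy_path, 'r') as f:
        copied = f.read()

print(closed_after_error)   # True
print(repr(copied))         # 'HELLO\n'
```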

## Lab: File I/O
* write a Python program which prompts the user for a filename, then opens that file and writes the contents of the file to a new file, in reverse order, i.e.,

<pre><b>
    Original file       Reversed file
    Line 1              Line 4
    Line 2              Line 3
    Line 3              Line 2
    Line 4              Line 1
</b></pre>

## Lab: File I/O + dicts
* write a Python program to read a file and count the number of occurrences of each word in the file
* use a dict, indexed by word, to count the occurrences
* remember that __`d.get(key)`__ returns __`None`__ if there is no such key in the dict (vs. __`d[key]`__, which raises a __`KeyError`__); the __`in`__ operator is another way to check for a key
* treat __The__ and __the__ as the same word when counting
* print out words and counts, from most common to least common
* EXTRA: remove punctuation, so __Hamlet,__ == __Hamlet__
* Road Not Taken and Hamlet are available at __`https://github.com/davewadestein/Python-Core`__

## File I/O: recap
* __`open()`__ returns file object
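* __`read()`__ reads the whole file (or at most the requested number of characters)
* __`readline()`__ reads a line at a time
* __`readlines()`__ reads all lines into a list; avoid it for large files, since everything is loaded into memory at once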
* can also iterate through a file object a line at a time
* __`with`__ statement sets up a temporary context (block) for file I/O and automatically closes file when block is exited

# End of Day 2