# Day 2

## Day 2 Agenda
* __`enumerate()/zip()`__
* list comprehensions
* tuples
* dictionaries
* explaining __`this.py`__
* sets
* file I/O

## "Pythonic"

In [None]:
cars = ['Tesla', 'Fisker', 'Rivian', 'Lordstown']

In [None]:
i = 0
for car in cars: # for thing in container
    print('index', i, 'is', car)
    i += 1

## __`enumerate()`__
* a builtin function which associates an index with each item in an iterable
* returns an _enumerate_ object
* each iteration yields two things: the index AND the item


In [None]:
for index, car in enumerate(cars, start=1):
    print('car maker', index, 'is', car)

In [None]:
type(enumerate(cars))
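
To see what an _enumerate_ object actually produces, it can be converted to a list. A minimal sketch reusing the `cars` list from above (the name `pairs` is introduced here only for illustration):

```python
cars = ['Tesla', 'Fisker', 'Rivian', 'Lordstown']

# enumerate() is lazy; list() forces it to produce all of its (index, item) tuples
pairs = list(enumerate(cars, start=1))
print(pairs)

# each tuple can be unpacked into two names, just like in the for loop
index, car = pairs[0]
print(index, car)
```

Each element is an ordinary tuple, which is why `for index, car in enumerate(cars)` can unpack it into two names.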

## __`zip(*iterables)`__ # 0 or more containers
* builtin function which matches up each item in an iterable with the corresponding item in the other iterable(s)
* technically creates an iterator that aggregates elements from each iterable
* why is it called __`zip`__?

In [None]:
first_names = ['Dave', 'Bruce', 'Taylor']
last_names = ['W-S', 'Lee', 'Swift']
employee_nums = [3456, 1, 2]

for first, last, num in zip(first_names, last_names, employee_nums):
    print(first, last, num)

In [None]:
stooges = ['Larry', 'Moe', 'Curly']
marxbros = ['Groucho', 'Harpo', 'Chico', 'Zeppo']

for stooge, marx in zip(stooges, marxbros):
    print(stooge, marx)

In [None]:
from itertools import zip_longest # module that helps with iteration
stooges = ['Larry', 'Moe', 'Curly']
marxbros = ['Groucho', 'Harpo', 'Chico', 'Zeppo']

for stooge, marx in zip_longest(stooges, marxbros, fillvalue='***'):
    print(stooge, marx)
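
The two behaviors above (stopping at the shortest iterable, and pairing items up like the teeth of a zipper) can be seen more directly by turning the result into a list. This is a small sketch; the names `pairs`, `stooges_again`, and `marxes_again` are made up for the example:

```python
stooges = ['Larry', 'Moe', 'Curly']
marxbros = ['Groucho', 'Harpo', 'Chico', 'Zeppo']

# zip() stops at the shortest iterable, so 'Zeppo' is dropped
pairs = list(zip(stooges, marxbros))
print(pairs)

# zip(*pairs) "unzips" the pairs back into separate tuples
stooges_again, marxes_again = zip(*pairs)
print(stooges_again)
print(marxes_again)
```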

# List Comprehensions

## List Comprehensions ("listcomps")
* quick/compact way to build a list
* "more readable"/faster
* which is easier to read?

In [None]:
fruits = 'apple lemon cherry fig lime watermelon'.split() # Pythonic
fruits

In [None]:
fruit_lengths = [] # empty to start

for fruit in fruits:
    fruit_lengths.append(len(fruit))
    
print(fruit_lengths)

In [None]:
fruit_lengths = [len(fruit) for fruit in fruits]

print(fruit_lengths)

## List Comprehensions (cont'd)
* listcomps can generate a list from the Cartesian product of two or more iterables

In [None]:
colors = ['black', 'white']
sizes = ['S', 'M', 'L', 'XL']

In [None]:
tshirts = [[color, size] for color in colors
                             for size in sizes]
tshirts
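
A listcomp with two `for` clauses behaves like two nested loops, with the first `for` as the outer loop. The sketch below checks that equivalence and, as an aside, compares against `itertools.product` from the standard library (the names `tshirts_loop` and `tshirts_product` are illustrative only):

```python
from itertools import product

colors = ['black', 'white']
sizes = ['S', 'M', 'L', 'XL']

# the listcomp from the cell above
tshirts = [[color, size] for color in colors
                         for size in sizes]

# the same thing as nested for loops: the first "for" is the outer loop
tshirts_loop = []
for color in colors:
    for size in sizes:
        tshirts_loop.append([color, size])

# itertools.product yields tuples, so convert each one to a list to compare
tshirts_product = [list(pair) for pair in product(colors, sizes)]

print(tshirts == tshirts_loop == tshirts_product)
```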

In [None]:
string = 'alphabet soup tastes great!'

In [None]:
print(list(string))

In [None]:
# generate a list of all the consonants in a
# string, discarding vowels and spaces

consonants = [char for char in string 
                       if char not in 'aeiou! ']
print(consonants)

## Lab: List Comprehensions
*  Start with Cartesian product example (colors x sizes of t-shirts) and add a third list, __`sleeves = ['short', 'long']`__ then write a new listcomp which generates the Cartesian product __`colors x sizes x sleeves`__. __`tshirts`__ should look like this:<pre><b>
    [['black', 'S', 'short'],
     ['black', 'S', 'long'],
     ['black', 'M', 'short'],
     ['black', 'M', 'long'],
     ['black', 'L', 'short'],
     ['black', 'L', 'long'],
     ['black', 'XL', 'short'],
     ['black', 'XL', 'long'],
     ['white', 'S', 'short'],
     ['white', 'S', 'long'],
     ['white', 'M', 'short'],
     ['white', 'M', 'long'],
     ['white', 'L', 'short'],
     ['white', 'L', 'long'],
     ['white', 'XL', 'short'],
     ['white', 'XL', 'long']]
     
 </b></pre>
* Use a list comprehension to create a list of the squares of the integers from 1 to 25 (i.e., 1, 4, 9, 16, …, 625)
* Given a list of words, create a second list which contains all the words from the first list which do not end with a vowel
* Use a list comprehension to create a list of the integers from 1 to 100 which are not divisible by 5
* Use a list comprehension and __`zip()`__ to create a list of lists, where the list items are name and ID number that you grabbed from separate lists of names and ID numbers
  * start with a list of, say, 5 names ['John', 'Mary', 'Edward', 'Linda', 'Dinesh']
  * and a list of, say, 5 ID numbers [1003, 2043, 8762, 7862, 1093]
  * additional wrinkle: do not include any names whose corresponding ID is -1 (change one of the IDs to -1 to test this)

## listcomps recap
* keep them short
* they are not _list incomprehensions_, so keep them simple
* use line breaks since they are ignored inside [] (and (), {}) and you therefore don't need the ugly '\\' line continuation character
* note that __`for`__ loops can do many things: scan a sequence to count or select items, compute aggregates (sums, averages), or any number of other processing tasks
  * in contrast, listcomps do ONE thing: generate lists!

# Tuples

## Tuples
* immutable data type
* typically heterogeneous (cf. lists)
* generally imply some structure
 * tuples typically represent a single object, but multiple aspects/attributes of it
 * if lists are typically used like the columns of a spreadsheet...
   * then tuples are typically the rows...

In [None]:
t = () # empty tuple (cf. empty list...[])
t

In [None]:
type(t)

In [None]:
t = ('foo',) # singleton tuple

In [None]:
t

In [None]:
t = 'Jones', 'John', 1023, True # no parens
t

In [None]:
# tuple unpacking
last_name, first_name, employee_num, full_time = t

In [None]:
employee_num # type(employee_num)

In [None]:
something = input('Enter something: ')
as_a_list = something.split() # split() always returns a list
as_a_tuple = tuple(as_a_list) # tuple() always returns a tuple

In [None]:
print(as_a_list, as_a_tuple, sep='\n')

In [None]:
person = 'Sara Breedlove', 1867, 'Louisiana'

In [None]:
person[1]

In [None]:
person[1] = 1868 # TypeError: tuples are immutable

In [None]:
# a tuple may contain a mutable object...

person = 'Curie', 'Marie', 1867, []
person

In [None]:
person[-1].extend(['physicist', 'chemist'])
person

## Lab: Tuples
* Create a tuple representing a city with fields of your own choosing (e.g., city name, state/country, population, elevation, etc.)
* "Add" a field to the tuple–since tuples are immutable, you will have to do this by concatenating tuples
* Using the _in_ operator, check to see if a particular value is in the tuple
* Using the __`.index()`__ method, find the position of a particular value in the tuple

## Recap: Tuples
* not just "constant lists" 
 (see http://jtauber.com/blog/2006/04/15/python_tuples_are_not_just_constant_lists)
* remember that lists are (typically) ordered sequences of homogeneous values (i.e., Excel/DB column)
* and tuples typically imply some structure and refer to multiple attributes of ONE item (person, country, building, etc.)
 * i.e., database/Excel row
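
The row/column analogy can be made concrete with a list of tuples. The sketch below reuses the employee data from the `zip()` example earlier; the variable names `rows` and `column` are introduced only for illustration:

```python
# each tuple is one "row": several attributes of ONE employee
rows = [
    ('W-S', 'Dave', 3456),
    ('Lee', 'Bruce', 1),
    ('Swift', 'Taylor', 2),
]

# zip(*rows) regroups the rows into "columns": one attribute across ALL employees
last_names, first_names, employee_nums = (list(column) for column in zip(*rows))

print(last_names)
print(employee_nums)

# a column is homogeneous, so aggregate operations make sense on it
print(max(employee_nums))
```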

# Dictionaries



## Dictionaries
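* "unordered" grouping of key/value pairs (since Python 3.7 a dict remembers insertion order, but items are still looked up by key, not by position)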
* sometimes called a "map", "hashmap", or "associative array"

In [None]:
d = {} # empty dict

In [None]:
d = { 'X': 10, 'V': 5, 'I': 1 } # can be initialized when declared

In [None]:
d

In [None]:
d['L'] = 50 # add something to the dict
print(d)

In [None]:
# iterating through a dict iterates through the keys 
for key in d: # for thing in container
    print(key, end=' ')

In [None]:
# ...of course we can print the values while iterating
for thing in d:
    print(thing, d[thing])

In [None]:
sbux_dict = {'venti': 20, 'tall': 12, 'grande': 16}
print(sbux_dict)

In [None]:
print(sbux_dict.keys(), sbux_dict.values(),
      sbux_dict.items(), sep='\n')

In [None]:
total_ounces = 0
for amount in sbux_dict.values():
    total_ounces += amount

total_ounces

In [None]:
sum(sbux_dict.values())

## Dictionaries: View Objects
* __`keys()`__, __`values()`__, and __`items()`__ are view objects
* view objects provide a dynamic window into the dictionary

In [None]:
keys = sbux_dict.keys()
keys

In [None]:
# keys will change automagically after we add to the dict
print(keys)
sbux_dict['trenta'] = 31
print(keys)

In [None]:
keys
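
To contrast a live view with a one-time copy, here is a short sketch. The name `snapshot` is an assumption made for this example, and the set intersection at the end is an extra property of key views that goes slightly beyond the cells above:

```python
sbux_dict = {'venti': 20, 'tall': 12, 'grande': 16}

keys = sbux_dict.keys()      # a live view, not a copy
snapshot = list(keys)        # list() makes an independent copy

sbux_dict['trenta'] = 31

print(len(keys))       # the view sees the new key
print(len(snapshot))   # the copy does not

# key views also support set operations such as & (intersection)
print(keys & {'tall', 'short'})
```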

## __`get()`__: Dealing with missing dict keys

In [None]:
d = {'foo': 'bar'}

In [None]:
d['foo']

In [None]:
d['foot'] # KeyError: 'foot' is not a key in this dict

In [None]:
if 'foot' in d: # is 'foot' a key in this dict
    print(d['foot'])
# or just... d.get('foot')

In [None]:
print(d.get('foot'))
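
`get()` also accepts an optional second argument, a default to return instead of `None`. A minimal sketch (the default `'n/a'` and the `counts` dict are illustrative choices, not part of the cells above):

```python
d = {'foo': 'bar'}

print(d.get('foot'))            # missing key, no default: None
print(d.get('foot', 'n/a'))     # missing key, with a default
print(d.get('foo', 'n/a'))      # existing key: the default is ignored

# get() does not add the missing key to the dict
print('foot' in d)

# a common idiom: counting with a default of 0
counts = {}
for letter in 'banana':
    counts[letter] = counts.get(letter, 0) + 1
print(counts)
```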

In [None]:
# what if we sort a dict?
for key in sorted(sbux_dict):
    print(key, sbux_dict[key])

In [None]:
# sorted() on a dict sorts its keys (alphabetically here),
# which is usually not what we want!
# To iterate in order of size, sort the keys by their
# values instead, using key=sbux_dict.get

for k in sorted(sbux_dict, key=sbux_dict.get):
    print(k, '=>', sbux_dict[k])

## Removing items from a dict
* __`del`__ = remove an item from the dict
* __`dict.pop(key)`__ = remove item and return value
* __`dict.clear()`__ = empty out the dict

In [None]:
mydict = {'trenta': 31, 'grande': 16, 'venti': 20,
          'tall': 12}
print(mydict)

In [None]:
del mydict['trenta']
print(mydict)

In [None]:
print(mydict.pop('venti'))

In [None]:
print(mydict)

In [None]:
mydict.clear()
mydict
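
One thing the cells above do not show is what happens when the key is missing. The sketch below assumes a small dict like `mydict`; the names `size` and `removed` are made up for the example:

```python
mydict = {'grande': 16, 'tall': 12}

# del raises KeyError if the key is not there
try:
    del mydict['trenta']
except KeyError as err:
    print('no such key:', err)

# pop() also raises KeyError for a missing key...
# ...unless a default is supplied as the second argument
size = mydict.pop('trenta', None)
print(size)

# for an existing key, pop() removes the item and returns its value
removed = mydict.pop('tall')
print(removed, mydict)
```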

## Lab: dictionary
* use a dict to translate Roman numerals into their Arabic equivalents
1. load the dict with Roman numerals M (1000), D (500), C (100), L (50), X (10), V (5), I (1)
2. read in a Roman numeral
3. print Arabic equivalent
4. try it with MCLX = 1000 + 100 + 50 + 10 = 1160
5. __If you have time, deal with the case where a smaller number precedes a larger number, e.g., XC = 100 - 10 = 90, or MCM = 1000 + (1000-100) = 1900__
6. __MCMXCIX = 1999__

## Dict Comprehension
* like a listcomp, a dictcomp creates a dict quickly

In [None]:
names = ['Sally', 'Bob', 'Martha', 'Dirk']
employee_ids = [345, 286, 453, 119]
id_dict = { name: emp_id + 1000
                   for name, emp_id in zip(names, employee_ids)}
print(id_dict)

In [None]:
d = { 'foo': 4, 'bar': -1, 'baz': -1, 'blah': 3, 'what': 2 }
print(d)

In [None]:
d.items()

In [None]:
d = { key: val for key, val in d.items()
               if val != -1 }
print(d)

In [None]:
id_dict_inverse = { val : key for key, val in id_dict.items() }

In [None]:
id_dict_inverse
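
Inverting a dict this way works cleanly only when the values are unique. The sketch below reuses the data from the cells above to show what happens otherwise (the names `inverse`, `scores`, and `scores_inverse` are illustrative):

```python
id_dict = {'Sally': 1345, 'Bob': 1286, 'Martha': 1453, 'Dirk': 1119}

# every value is unique, so nothing is lost when keys and values swap
inverse = {val: key for key, val in id_dict.items()}
print(inverse[1286])

# here two keys share the value -1; dict keys must be unique,
# so the later pair ('baz') silently overwrites the earlier one ('bar')
scores = {'foo': 4, 'bar': -1, 'baz': -1}
scores_inverse = {val: key for key, val in scores.items()}
print(scores_inverse)
```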

## Now we understand this code!

In [None]:
s = """Gur Mra bs Clguba, ol Gvz Crgref

Ornhgvshy vf orggre guna htyl.
Rkcyvpvg vf orggre guna vzcyvpvg.
Fvzcyr vf orggre guna pbzcyrk.
Pbzcyrk vf orggre guna pbzcyvpngrq.
Syng vf orggre guna arfgrq.
Fcnefr vf orggre guna qrafr."""

d = {}
for c in (65, 97):
    for i in range(26):
        d[chr(i+c)] = chr((i+13) % 26 + c)

print("".join([d.get(c, c) for c in s]))
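
The code above builds a ROT13 table: each letter maps to the letter 13 places further along, wrapping around at the end of the alphabet, and `d.get(c, c)` leaves anything that is not a letter unchanged. The sketch below inspects a few entries and, as a cross-check, compares the result with the standard library's `rot13` codec (the names `line` and `decoded` are assumptions for the example):

```python
import codecs

# build the same lookup table as above
d = {}
for c in (65, 97):            # 65 is ord('A'), 97 is ord('a')
    for i in range(26):
        d[chr(i + c)] = chr((i + 13) % 26 + c)

# 'A' maps to 'N' and 'N' maps back to 'A': applying ROT13 twice is a no-op
print(d['A'], d['N'], d['a'], d['z'])

line = 'Ornhgvshy vf orggre guna htyl.'
# spaces and the period are not keys in d, so get() returns them unchanged
decoded = ''.join([d.get(ch, ch) for ch in line])
print(decoded)

print(decoded == codecs.decode(line, 'rot13'))
```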

# Sets

## Sets
* unordered collection, no duplicates
* kind of a one-trick pony–remove duplicates

In [None]:
s = { 'Annie', 'Betty', 'Cathy', 'Donna' }
print(s)

In [None]:
s.add('Ellen')
print(s)

In [None]:
s.add('Annie')
print(s)

In [None]:
# we can use the 'in' operator
if 'Annie' in s:
    print('Yep!')

## Deleting from a Set
* __`remove(item)`__: remove an item; raises __`KeyError`__ if it's not in the set
* __`discard(item)`__: remove an item if it's in the set; does nothing (no error) if it's not
* __`pop()`__: removes and returns an arbitrary element of the set

In [None]:
print(s)

In [None]:
s.remove('Betty')

In [None]:
print(s)

In [None]:
s.discard('Loren')

In [None]:
print(s)

In [None]:
print(s.pop())
print(s)

In [None]:
while s: # while the set is non-empty
    print(s.pop())
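
The difference between `remove()` and `discard()` only shows up when the item is missing. A minimal sketch, assuming a small set like the one above (the flag `remove_failed` exists only for the example):

```python
s = {'Annie', 'Cathy', 'Donna'}

s.discard('Loren')          # not in the set: nothing happens

remove_failed = False
try:
    s.remove('Loren')       # not in the set: KeyError
except KeyError:
    remove_failed = True
print(remove_failed)

s.remove('Annie')           # in the set: removed without complaint
print(sorted(s))            # sorted() returns a list, so the output order is predictable
```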

## Lab: Sets
* Use a set to find all of the unique words in the input and print them out in sorted order
* If the user entered __There is no there there__, your program should print out 
   <pre><b>
   is
   no
   there
   </b></pre>
* Note that `There` and `there` should be counted as the same word.

## Sets Recap
* unordered
* no duplicates
* use __`in`__ to test for membership


# File I/O

## File I/O
* __`fileobj = open(filename, mode)`__
* mode is one or two letters
  * r = read
  * r+ = open for reading and writing
  * w = write (create/overwrite)
  * x = write, but only if file does not already exist
  * a = append to the end of the file (the file is created if it does not exist; a+ also allows reading)
* second letter =
  * t = text file (default)
  * b = binary
* __`fileobj.close()`__
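
The modes above can be tried out safely in a throwaway directory. This is only a sketch: the filename `modes.txt` and the flag `x_failed` are made up, and `tempfile`/`shutil` are used just to avoid touching real files.

```python
import os
import shutil
import tempfile

tmpdir = tempfile.mkdtemp()                  # temporary directory for the demo
path = os.path.join(tmpdir, 'modes.txt')     # hypothetical filename

f = open(path, 'w')      # 'w' creates the file (or empties an existing one)
f.write('first\n')
f.close()

f = open(path, 'a')      # 'a' adds to the end of the file
f.write('second\n')
f.close()

f = open(path, 'r')      # 'r' reads; the file must already exist
contents = f.read()
f.close()

try:
    f = open(path, 'x')  # 'x' refuses to overwrite an existing file
    f.close()
    x_failed = False
except FileExistsError:
    x_failed = True

shutil.rmtree(tmpdir)    # clean up the temporary directory

print(repr(contents))
print(x_failed)
```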

## File I/O: Open/Close

In [None]:
f = open('test.txt', 'r')

In [None]:
f = open('test.txt', 'w')
f.close()

In [None]:
!ls -l test.txt

In [None]:
f = open('test.txt', 'x') # FileExistsError: test.txt was created above

## File I/O: Read/Write

In [None]:
poem = """TWO roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference."""

len(poem)

In [None]:
f = open('poem.txt', 'w')
f.write(poem)

In [None]:
f.close()

In [None]:
f = open('poem.txt')
poem2 = f.read()
f.close()

In [None]:
poem == poem2

## File I/O: __`write()`__ vs. __`print()`__


In [None]:
f = open('poem.txt', 'w')
# another example of why print being a function is good
print(poem, file=f, end='') 
f.close()

In [None]:
f = open('poem.txt')
poem2 = f.read()
f.close()

In [None]:
poem == poem2

In [None]:
len(poem2)

## __`print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)`__
* __`sep`__ = separator (default is space)
* __`end`__ = what to print at end (default is newline)
* __`file`__ = where to print, default is screen
* __`flush`__ = whether to flush output buffer, default is no
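
A quick way to see the effect of these parameters is to print into an in-memory text buffer and look at exactly what was written. The sketch uses `io.StringIO` from the standard library as a stand-in for a real file; the name `buffer` is illustrative:

```python
import io

buffer = io.StringIO()   # an in-memory text "file"

print('a', 'b', 'c', sep='-', file=buffer)   # sep goes between the objects
print('no newline', end='', file=buffer)     # end='' suppresses the newline
print('!', file=buffer)                      # default end is '\n'

written = buffer.getvalue()
print(repr(written))
```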

## File I/O: How to Read Data
* __`read()`__: slurps up entire file at once
  * __`read(x)`__ reads at most __`x`__ characters (bytes, if the file was opened in binary mode)
* __`readline()`__: reads a line at a time
* __`readlines()`__: reads all of the remaining lines and returns them as a list of strings
* or use an iterator…

In [None]:
poem = ''
f = open('poem.txt')
for line in f: # Python reads each line
    poem += line
f.close()

In [None]:
print(poem)
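
The reading methods can be mixed on the same file object, because each call picks up where the previous one stopped. A small sketch using a three-line temporary file (the filename `lines.txt` and the variable names are assumptions for the example):

```python
import os
import shutil
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'lines.txt')     # hypothetical filename

f = open(path, 'w')
f.write('one\ntwo\nthree\n')
f.close()

f = open(path)
first_chars = f.read(2)       # at most 2 characters
rest_of_line = f.readline()   # the rest of the current line, including '\n'
remaining = f.readlines()     # all remaining lines, as a list
at_end = f.read()             # nothing left: empty string
f.close()

shutil.rmtree(tmpdir)         # clean up

print(repr(first_chars), repr(rest_of_line), remaining, repr(at_end))
```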

## File I/O: __`with`__ statement
* the __`with`__ statement sets up a temporary "context" and closes the file automatically so we don't have to bother with closing it

In [None]:
with open('poem.txt') as f1: # ~ f1 = open('poem.txt')
    poem2 = f1.read()
    # at this point file is open
    print('in with statement, f1.closed =', f1.closed)

In [None]:
poem == poem2

In [None]:
f1.closed
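
The `with` statement also closes the file when the block is left because of an exception, which is hard to guarantee with a manual `close()`. A sketch under that assumption, using a temporary file and a deliberately raised `ValueError` (the filename `demo.txt` and the error message are made up):

```python
import os
import shutil
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'demo.txt')      # hypothetical filename

with open(path, 'w') as f:
    f.write('hello\n')

try:
    with open(path) as f1:
        first = f1.readline()
        raise ValueError('something went wrong')   # leave the block abruptly
except ValueError:
    pass

# the file was closed even though the block ended with an exception
print(f1.closed)

shutil.rmtree(tmpdir)                        # clean up
```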

## Quick Lab: File I/O
* write a Python program which prompts the user for a filename, then opens that file and writes the contents of the file to a new file, in reverse order, i.e.,

<pre><b>
    Original file       Reversed file
    Line 1              Line 4
    Line 2              Line 3
    Line 3              Line 2
    Line 4              Line 1
</b></pre>

## Lab: File I/O + dicts
* write a Python program to read a file and count the number of occurrences of each word in the file
* use a __`dict`__, indexed by word, to count the occurrences
* remember that __`d.get(key)`__ returns __`None`__ if there is no such key in the dict (vs. __`d[key]`__, which throws an exception); the __`in`__ operator can also be used to check for a key
  * or use a __`collections.defaultdict`__ if we've covered it
* treat __The__ and __the__ as the same word when counting
* print out words and counts, from most common to least common
* EXTRA: remove punctuation, so __Hamlet,__ == __Hamlet__ # refer back to "import this"
* Road Not Taken and Hamlet are in your materials

## File I/O: recap
* __`open()`__ returns file object
* __`close()`__ closes the file
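* __`read()`__ reads the entire file (or at most __`n`__ characters with __`read(n)`__)
* __`readline()`__ reads a line at a time
* __`readlines()`__ reads all lines into a list; avoid it for large files and iterate instead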
* can also iterate through a file object a line at a time
* __`with`__ statement sets up a temporary context (block) for file I/O and automatically closes file when block is exited

# End of Day 2