# Lab 7: Standard Library

## Overview

The goal of this lab is to become familiar with the tools provided by Python's standard library. We want you to gain practice with the most common utilities of the standard library and also to be aware of the rest of the tools in case you ever need them.

**We expect that most of the time in this lab will be spent _reading_ documentation for the modules in the standard library that you find intriguing or applicable to your interests.** If you have any questions about any standard library features, please ask us!

## Read

We get it. At first, reading documentation doesn't sound like a fun way to spend an afternoon. However, this is one of a rare few times when you will have dedicated class time to take a deep dive into a library tool. Python's standard library is huge, and although your interests may not span the whole library, we're willing to bet that you can find something you enjoy in the library.

Remember that you can follow along with the documentation's examples in the interactive interpreter - we recommend this approach, so that you're both reading about and practicing with the modules you like.

Several of the documentation pages have links to the module's source code - if you're interested in seeing examples of well-crafted Python modules, there's no better place to look than the standard library!

Above all, explore and ask questions! You should plan to spend **around the first half of lab time in this section**, reading about features of the standard library.

If you don't know which modules to look at, we have a list of some of our favorite modules that *weren't* covered in lecture, based on common general interests. Ask us about what you'd like to learn more about, and we'll point you in the right general direction.

The top-level categories of tools in the standard library are:

- Built-in [Functions](https://docs.python.org/3/library/functions.html), [Constants](https://docs.python.org/3/library/constants.html), [Types](https://docs.python.org/3/library/stdtypes.html), and [Exceptions](https://docs.python.org/3/library/exceptions.html)
- [Text Processing Services](https://docs.python.org/3/library/text.html)
- [Binary Data Services](https://docs.python.org/3/library/binary.html)
- [Data Types](https://docs.python.org/3/library/datatypes.html)
- [Numeric and Mathematical Modules](https://docs.python.org/3/library/numeric.html)
- [Functional Programming Modules](https://docs.python.org/3/library/functional.html)
- [File and Directory Access](https://docs.python.org/3/library/filesys.html)
- [Data Persistence](https://docs.python.org/3/library/persistence.html)
- [Data Compression and Archiving](https://docs.python.org/3/library/archiving.html)
- [File Formats](https://docs.python.org/3/library/fileformats.html)
- [Cryptographic Services](https://docs.python.org/3/library/crypto.html)
- [Generic Operating System Services](https://docs.python.org/3/library/allos.html)
- [Concurrent Execution](https://docs.python.org/3/library/concurrency.html)
- [Context Variables](https://docs.python.org/3/library/contextvars.html)
- [Networking and Interprocess Communication](https://docs.python.org/3/library/ipc.html)
- [Internet Data Handling](https://docs.python.org/3/library/netdata.html)
- [Structured Markup Processing Tools](https://docs.python.org/3/library/markup.html)
- [Internet Protocols and Support](https://docs.python.org/3/library/internet.html)
- [Multimedia Services](https://docs.python.org/3/library/mm.html)
- [Internationalization](https://docs.python.org/3/library/i18n.html)
- [Program Frameworks](https://docs.python.org/3/library/frameworks.html)
- [Graphical User Interfaces with Tk](https://docs.python.org/3/library/tk.html)
- [Development Tools](https://docs.python.org/3/library/development.html)
- [Debugging and Profiling](https://docs.python.org/3/library/debug.html)
- [Software Packaging and Distribution](https://docs.python.org/3/library/distribution.html)
- [Python Runtime Services](https://docs.python.org/3/library/python.html)
- [Custom Python Interpreters](https://docs.python.org/3/library/custominterp.html)
- [Importing Modules](https://docs.python.org/3/library/modules.html)
- [Python Language Services](https://docs.python.org/3/library/language.html)

### [Take Me To The Standard Library (Click Me!)](https://docs.python.org/3/library/)

## Write

In this section, you'll gain practice with some of the common modules in the Python standard library.

### Manipulating `collections`

**Before continuing, read the [`collections` documentation](https://docs.python.org/3/library/collections.html) at least through the section on `namedtuple()`.**

#### Working with `collections.namedtuple` (10 points)

In this section, we modify code that prints out a message about each of a bunch of animals.

Rewrite the following code to be more Pythonic by using `collections.namedtuple` to add readable attribute references. The attributes for these animals are `'name'`, `'species'`, `'color'`, and `'age'`.

In [1]:
lassie = ('Lassie', 'dog', 'black', 12)
buddy = ('Buddy', 'pupper', 'red', 0.5)  # Woof! Follow me on insta @buddypelu
astro = ('Astro', 'doggo', 'grey', 15)
mrpb = ('Mr. Peanutbutter', 'dog', 'golden', 35)
bojack = ('BoJack Horseman', 'horse', 'brown', 52)
pc = ('Princess Carolyn', 'cat', 'pink', 34)
tinkles = ('Mr. Tinkles', 'cat', 'white', 7)
pupper = ('Bella', 'pupper', 'brown', 0.5)
doggo = ('Max', 'doggo', 'brown', 5)
seuss = ('The Cat in the Hat', 'cat', 'stripey', 27)
pluto = ('Pluto (Disney)', 'dog', 'orange', 3)
plu2o = ('Pluto (space)', 'planet', 'brownish', 4500000000)
yertle = ('Yertle', 'turtle', 'green', 130)
horton = ('Horton', 'elephant', 'blue', 79)
for animal in [lassie, buddy, astro, mrpb, bojack, pc, tinkles, pupper, doggo, seuss, pluto, plu2o, yertle, horton]:
    if animal[1] == 'dog' or animal[1] == 'doggo' or animal[1] == 'pupper':
        if animal[3] > 5:
            print(animal[0] + ' is an old ' + animal[2] + ' ' + animal[1] + ' who is ' + str(animal[3]) + ' years old.')
        else:
            print(animal[0] + ' is a young ' + animal[2] + ' ' + animal[1] + ' who is ' + str(animal[3]) + ' years old.')
    else:
        print(animal[0] + ' is a ' + str(animal[3]) + '-year-old non-canine ' + animal[2] + ' ' + animal[1] + '.')
        
# Prints out:
# Lassie is an old black dog who is 12 years old.
# Buddy is a young red pupper who is 0.5 years old.
# Astro is an old grey doggo who is 15 years old.
# Mr. Peanutbutter is an old golden dog who is 35 years old.
# BoJack Horseman is a 52-year-old non-canine brown horse.
# Princess Carolyn is a 34-year-old non-canine pink cat.
# Mr. Tinkles is a 7-year-old non-canine white cat.
# Bella is a young brown pupper who is 0.5 years old.
# Max is a young brown doggo who is 5 years old.
# The Cat in the Hat is a 27-year-old non-canine stripey cat.
# Pluto (Disney) is a young orange dog who is 3 years old.
# Pluto (space) is a 4500000000-year-old non-canine brownish planet.
# Yertle is a 130-year-old non-canine green turtle.
# Horton is a 79-year-old non-canine blue elephant.

Lassie is an old black dog who is 12 years old.
Buddy is a young red pupper who is 0.5 years old.
Astro is an old grey doggo who is 15 years old.
Mr. Peanutbutter is an old golden dog who is 35 years old.
BoJack Horseman is a 52-year-old non-canine brown horse.
Princess Carolyn is a 34-year-old non-canine pink cat.
Mr. Tinkles is a 7-year-old non-canine white cat.
Bella is a young brown pupper who is 0.5 years old.
Max is a young brown doggo who is 5 years old.
The Cat in the Hat is a 27-year-old non-canine stripey cat.
Pluto (Disney) is a young orange dog who is 3 years old.
Pluto (space) is a 4500000000-year-old non-canine brownish planet.
Yertle is a 130-year-old non-canine green turtle.
Horton is a 79-year-old non-canine blue elephant.


In [2]:
# Rewrite the above program to be more Pythonic!
from collections import namedtuple

# Field names match the attributes listed in the prompt.
Animal = namedtuple('Animal', ['name', 'species', 'color', 'age'])
CANINES = {'dog', 'doggo', 'pupper'}

animals = [Animal(*record) for record in [lassie, buddy, astro, mrpb, bojack, pc, tinkles,
                                          pupper, doggo, seuss, pluto, plu2o, yertle, horton]]
for animal in animals:
    if animal.species in CANINES:
        maturity = 'an old' if animal.age > 5 else 'a young'
        print(f'{animal.name} is {maturity} {animal.color} {animal.species} who is {animal.age} years old.')
    else:
        print(f'{animal.name} is a {animal.age}-year-old non-canine {animal.color} {animal.species}.')

Lassie is an old black dog who is 12 years old.
Buddy is a young red pupper who is 0.5 years old.
Astro is an old grey doggo who is 15 years old.
Mr. Peanutbutter is an old golden dog who is 35 years old.
BoJack Horseman is a 52-year-old non-canine brown horse.
Princess Carolyn is a 34-year-old non-canine pink cat.
Mr. Tinkles is a 7-year-old non-canine white cat.
Bella is a young brown pupper who is 0.5 years old.
Max is a young brown doggo who is 5 years old.
The Cat in the Hat is a 27-year-old non-canine stripey cat.
Pluto (Disney) is a young orange dog who is 3 years old.
Pluto (space) is a 4500000000-year-old non-canine brownish planet.
Yertle is a 130-year-old non-canine green turtle.
Horton is a 79-year-old non-canine blue elephant.
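
Beyond attribute access, a named tuple carries a few helper methods that are described in the `collections` documentation. The sketch below is only an illustration: the `Animal` type and the `describe` helper are names chosen here, not part of the lab's starter code.

```python
from collections import namedtuple

# Hypothetical record type; the field names follow the attributes in the prompt.
Animal = namedtuple('Animal', ['name', 'species', 'color', 'age'])

raw = ('Lassie', 'dog', 'black', 12)
lassie = Animal._make(raw)        # build a record from any iterable
older = lassie._replace(age=13)   # returns a new record; the original is unchanged


def describe(animal):
    """Format one record using attribute names instead of positional indices."""
    return f'{animal.name} is a {animal.age}-year-old {animal.color} {animal.species}.'


print(describe(lassie))
print(older._asdict())            # a field-name -> value mapping
```

Because a named tuple is still a tuple, existing code that indexes or unpacks the records keeps working while new code can switch to attribute names.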


#### Using `collections.defaultdict` and `collections.Counter` (10 points)

Download the [word list](https://drive.google.com/open?id=15TXj7aeFM2WAPBRajFseV2H-ChTWsFQj) and use it as a data source. What are the three most common word lengths in the English language? Remember to strip off trailing whitespace.

In [3]:
# Change me to another file location if you've downloaded a copy of the word list.
# Recall that this file has one word per line.
from collections import Counter

FILENAME = 'words'
with open(FILENAME) as f:
    # strip() removes the trailing newline (and any other surrounding whitespace).
    length_counts = Counter(len(line.strip()) for line in f)

# Print the three most common word lengths, most frequent first.
print([length for length, _ in length_counts.most_common(3)])

[9, 10, 8]
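
The section title mentions both `defaultdict` and `Counter`, so here is a small side-by-side sketch of the two approaches. It runs on a made-up in-memory sample rather than the real word list, and `sample_lines` and `length_counts` are names invented for this illustration.

```python
from collections import Counter, defaultdict

# A tiny made-up sample standing in for the word list (one word per line).
sample_lines = ['cat\n', 'dog\n', 'horse\n', 'zebra\n', 'ox\n', 'emu\n']


def length_counts(lines):
    """Tally word lengths by hand; missing keys start at int() == 0."""
    counts = defaultdict(int)
    for line in lines:
        counts[len(line.strip())] += 1
    return counts


by_hand = length_counts(sample_lines)
by_counter = Counter(len(line.strip()) for line in sample_lines)

print(dict(by_hand))
print(by_counter.most_common(2))
```

Both produce the same tallies; `Counter` additionally provides `most_common`, which is why it tends to be the shorter choice for this exercise.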


#### Working Together (20 points)

Use tools from the `collections` module to implement an `Employee` database, which maintains organizational relationships among employees. Suppose that your data is provided in a tab-separated file:

```
employee_name    employee_manager    salary    department    title
employee_name    employee_manager    salary    department    title
...
employee_name    employee_manager    salary    department    title
```

If you'd like sample data to work with, you can use the following:
```
sredmond	poohbear	0	CS	Instructor
poohbear	sahami	500	CS	Lecturer
tigger	poohbear	100	CS	Tiger
htiek	sahami	500	CS	Lecturer
sahami	mtl	5000	CS	Professor
guido	guido	50000	PSF	BDFL
```
Save the above text to a file, making sure that your text editor doesn't automatically replace the tabs with spaces!
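
If your editor insists on converting tabs, one way around it is to write the file from Python. This is a minimal sketch, assuming a file named `employees.tsv` in the system temporary directory; the file name, `ROWS`, and `write_sample` are choices made for this example only.

```python
import os
import tempfile

# Sample rows copied from the handout above.
ROWS = [
    ('sredmond', 'poohbear', '0', 'CS', 'Instructor'),
    ('poohbear', 'sahami', '500', 'CS', 'Lecturer'),
    ('tigger', 'poohbear', '100', 'CS', 'Tiger'),
    ('htiek', 'sahami', '500', 'CS', 'Lecturer'),
    ('sahami', 'mtl', '5000', 'CS', 'Professor'),
    ('guido', 'guido', '50000', 'PSF', 'BDFL'),
]


def write_sample(path):
    """Write one employee per line, with fields separated by real tab characters."""
    with open(path, 'w') as f:
        for row in ROWS:
            f.write('\t'.join(row) + '\n')


# Hypothetical location; point this wherever you keep your lab files.
path = os.path.join(tempfile.gettempdir(), 'employees.tsv')
write_sample(path)

with open(path) as f:
    first = f.readline().rstrip('\n').split('\t')
print(first)
```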

After writing code to load this information from a file, implement the following functions.

```Python
def directly_reports_to(employee, manager):
    """Return whether or not employee directly reports to manager"""
    pass

def indirectly_reports_to(employee, manager):
    """Return whether or not employee indirectly reports to manager"""
    pass
    
def in_department(dept):
    """Return a collection of all employees of a given department"""
    pass
    
def cost_of(dept):
    """Return the sum total of salaries for all employees of a given department"""
    pass
```

The primary portion of this section is parsing the file and storing the employees in a data structure of your choice, keyed by some of the employees' information.

In [4]:
import collections

# Replace me with the name of a file containing employment data.
FILENAME = 'replace-me.txt'
Employee = collections.namedtuple('Employee', ['name', 'manager', 'salary', 'department', 'title'])

# Read the tab-separated data file into a dict keyed by employee name.
employees = {}
with open(FILENAME) as f:
    for line in f:
        line = line.rstrip('\n')
        if not line:
            continue  # Skip blank lines.
        name, manager, salary, department, title = line.split('\t')
        employees[name] = Employee(name, manager, int(salary), department, title)


def directly_reports_to(employee, manager):
    """Return whether or not employee directly reports to manager"""
    return employee in employees and employees[employee].manager == manager


def indirectly_reports_to(employee, manager):
    """Return whether or not employee indirectly reports to manager"""
    # Walk up the management chain, starting from the employee's direct manager.
    # The `seen` set guards against cycles such as guido managing himself.
    seen = {employee}
    current = employees[employee].manager if employee in employees else None
    while current in employees and current not in seen:
        seen.add(current)
        current = employees[current].manager
        if current == manager:
            return True
    return False


def in_department(dept):
    """Return a collection of all employees of a given department"""
    return [e.name for e in employees.values() if e.department == dept]


def cost_of(dept):
    """Return the sum total of salaries for all employees of a given department"""
    return sum(e.salary for e in employees.values() if e.department == dept)

print(directly_reports_to('tigger', 'poohbear')) # => True
print(directly_reports_to('tigger', 'sahami'))   # => False
print(indirectly_reports_to('sredmond', 'sahami'))  # => True
print(indirectly_reports_to('sredmond', 'tigger'))  # => False
print(', '.join(in_department('CS'))) # => sredmond, poohbear, tigger, htiek, sahami
print(cost_of('CS')) # => 6100


True
False
True
False
sredmond, poohbear, tigger, htiek, sahami
6100


### Extracting data with `re`

If you're fairly new to regular expressions, we recommend you read through [the official Python HOWTO](https://docs.python.org/3/howto/regex.html) and walk through those examples first.

Otherwise, **read through the official [`re` documentation](https://docs.python.org/3/library/re.html) through "Match Objects"** (although the next section provides some neat examples).

#### Wordplay (10 points)

Using the [word list](https://drive.google.com/open?id=15TXj7aeFM2WAPBRajFseV2H-ChTWsFQj), determine all words that have all five vowels in order. That is, words that contain an `'a'`, `'e'`, `'i'`, `'o'`, and `'u'` in order, with any number (including 0) of non-vowel word characters before the `'a'`, between the vowels, and after the `'u'`.

For example, your list should contain both `"abstemious"` and `"facetious"`. We found a total of 14 matches.
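
One possible reading of "non-vowel word characters" is the character class `[^\Waeiou]`: a word character that is not a lowercase vowel. The sketch below tries that idea on a handful of hand-picked words rather than the full word list; `GAP`, `PATTERN`, and `sample` are names made up for this example, and this is not necessarily the pattern behind the count quoted above.

```python
import re

# Zero or more word characters that are not lowercase vowels.
GAP = r'[^\Waeiou]*'

# GAP.join('aeiou') interleaves the gap between the vowels: a GAP e GAP i GAP o GAP u.
PATTERN = re.compile(GAP + GAP.join('aeiou') + GAP)

# Hand-picked sample; the last two words have their vowels out of order.
sample = ['abstemious', 'facetious', 'education', 'sacrilegious']

matches = [word for word in sample if PATTERN.fullmatch(word)]
print(matches)  # => ['abstemious', 'facetious']
```

Using `fullmatch` removes the need for explicit `^` and `$` anchors, provided each word has been stripped of its trailing newline first.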

In [1]:
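import re

# Change me to another file location if you've downloaded a copy of the word list.
# Recall that this file has one word per line.
FILENAME = 'words'

# Each vowel in order, separated by runs of characters that are not lowercase vowels.
# Note: [^aeiou] also accepts uppercase letters and punctuation, which is looser than
# the "non-vowel word characters" in the prompt and may affect the count.
pattern = re.compile(r'^[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*$')

with open(FILENAME) as f:
    # Count the lines (words) that have all five vowels in order.
    count = sum(1 for line in f if pattern.match(line))
print(count)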

16


#### Regex Crossword Checker (10 points)

Take a moment to play one round of [Regex Crossword](https://regexcrossword.com/) (a highly entertaining site, if you've got hours to spare).

In the spirit of Regex Crossword, we will write a function that checks arbitrary regex crosswords. Your function should take in two lists, one representing horizontal clues and one representing vertical clues, as well as the potential solution to the crossword in the form of a list of lists in row-major order (i.e. the elements are lists representing rows of the crossword). You should return whether or not the potential solution is in fact valid.

```Python
def regex_crossword_check(horizontal_patterns, vertical_patterns, candidate):
    pass  # Your implementation here
```

For example, the call corresponding to the first "Beginner" puzzle (it's called "Beatles") would look like:

```Python
horiz = [r'HE|LL|O+', r'[PLEASE]+']
vert = [r'[^SPEAK]+', r'EP|IP|EF']
candidate = [
    ['H', 'E'],
    ['L', 'P']
]
regex_crossword_check(horiz, vert, candidate)  # => True
```

and the call corresponding to the second "Experienced" puzzle (it's called "Royal Dinner") would look like:

```Python
horiz = [r'(Y|F)(.)\2[DAF]\1', r'(U|O|I)*T[FRO]+', r'[KANE]*[GIN]*']
vert = [r'(FI|A)+', r'(YE|OT)K', r'(.)[IF]+', r'[NODE]+', r'(FY|F|RG)+']
candidate = [
    ['F', 'O', 'O', 'D', 'F'],
    ['I', 'T', 'F', 'O', 'R'],
    ['A', 'K', 'I', 'N', 'G']
]
regex_crossword_check(horiz, vert, candidate)  # => True
```

Some implementation notes:

* You may want to use `re.fullmatch` instead of `re.match` or `re.search`. The former matches a pattern against an entire string, whereas the latter two check whether any prefix or any substring, respectively, matches the pattern.
* You can get the width and height of the crossword from the length of the vertical and horizontal clue lists, respectively.
* Remember your friend, `zip`!
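
To make the notes above concrete, here is a short sketch contrasting the three matching functions on the first "Beatles" clue, followed by the `zip` trick for reading columns out of a row-major grid. The test strings `'HEX'` and `'XHE'` are made up for the demonstration.

```python
import re

pattern = r'HE|LL|O+'

print(bool(re.match(pattern, 'HEX')))      # True: the prefix 'HE' matches
print(bool(re.search(pattern, 'XHE')))     # True: the substring 'HE' matches
print(bool(re.fullmatch(pattern, 'HEX')))  # False: the whole string must match
print(bool(re.fullmatch(pattern, 'HE')))   # True

# zip(*rows) transposes a row-major grid, yielding one tuple per column.
rows = [
    ['H', 'E'],
    ['L', 'P'],
]
columns = [''.join(column) for column in zip(*rows)]
print(columns)  # => ['HL', 'EP']
```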

In [13]:
import re


def regex_crossword_check(horizontal_patterns, vertical_patterns, candidate):
    # Each horizontal clue must fully match its row.
    for pattern, row in zip(horizontal_patterns, candidate):
        if re.fullmatch(pattern, ''.join(row)) is None:
            return False

    # zip(*candidate) transposes the grid, yielding one tuple per column.
    for pattern, column in zip(vertical_patterns, zip(*candidate)):
        if re.fullmatch(pattern, ''.join(column)) is None:
            return False

    return True


# Quick tests.
horiz = [r'HE|LL|O+', r'[PLEASE]+']
vert = [r'[^SPEAK]+', r'EP|IP|EF']
candidate = [
    ['H', 'E'],
    ['L', 'P']
]
print(regex_crossword_check(horiz, vert, candidate))  # => True


horiz = [r'(Y|F)(.)\2[DAF]\1', r'(U|O|I)*T[FRO]+', r'[KANE]*[GIN]*']
vert = [r'(FI|A)+', r'(YE|OT)K', r'(.)[IF]+', r'[NODE]+', r'(FY|F|RG)+']
candidate = [
    ['F', 'O', 'O', 'D', 'F'],
    ['I', 'T', 'F', 'O', 'R'],
    ['A', 'K', 'I', 'N', 'G']
]
print(regex_crossword_check(horiz, vert, candidate))  # => True

True
True


### Working with `itertools`

**Before continuing, make sure you read all of the [`itertools` documentation](https://docs.python.org/3/library/itertools.html).**

#### Tabulation (5 points)

Write a `tabulate` function to generate a computation lookup table. `tabulate` should take in three arguments: a function, a start number (default 0), and a step size (default 1).

```Python
def tabulate(f, start=0, step=1):
    pass
```

This function can be used as follows:

```Python
sqgen = tabulate(lambda x: x ** 2)
next(sqgen)  # => 0 (which is equal to f(0))
next(sqgen)  # => 1 (which is equal to f(1))
next(sqgen)  # => 4 (which is equal to f(2))
next(sqgen)  # => 9 (which is equal to f(3))
```

For reference, our implementation is one line and 43 characters.

Hint: take a look at the `itertools.count` function!
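
As a quick illustration of the hint, `itertools.count` yields an endless arithmetic sequence, so it is usually paired with something that stops consuming it, such as `itertools.islice`. The start and step values below are arbitrary choices for the demonstration.

```python
import itertools

# An infinite iterator: 10, 12, 14, ...
evens = itertools.count(start=10, step=2)

# islice takes a finite prefix without materializing the whole sequence.
first_five = list(itertools.islice(evens, 5))
print(first_five)        # => [10, 12, 14, 16, 18]

# The iterator remembers its position, so the next value continues the sequence.
following = next(evens)
print(following)         # => 20
```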

In [15]:
import itertools


def tabulate(f, start=0, step=1):
    return map(f, itertools.count(start, step=step))


sqgen = tabulate(lambda x: x ** 2)
print(next(sqgen))  # => 0 (which is equal to f(0))
print(next(sqgen))  # => 1 (which is equal to f(1))
print(next(sqgen))  # => 4 (which is equal to f(2))
print(next(sqgen))  # => 9 (which is equal to f(3))

0
1
4
9


### Cute Modules

#### `turtle` - Turtle graphics

Run the following code. A graphical window should appear that shows your new turtle friend! What other interesting shapes can you make?

In [None]:
import turtle

turtle.left(180)
turtle.forward(200)
turtle.left(180)

turtle.color('red', 'yellow')
turtle.begin_fill()

for _ in range(36):
    turtle.forward(400)
    turtle.left(170)
    if abs(turtle.pos()) < 1:
        break

turtle.end_fill()
turtle.done()

#### `unicodedata` - Unicode Database

Think about your favorite emoji. Can you guess its official name?

In [None]:
import unicodedata

print(unicodedata.lookup('SLICE OF PIZZA'))  # => '🍕'

print(unicodedata.name('👌'))  # => 'OK HAND SIGN'

#### `this` and `antigravity`

Just run the following lines of code.

In [None]:
import this

In [None]:
import antigravity

> With <3 by @sredmond