# Chapter 2. A Crash Course in Python

## The Zen of Python

In [45]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


## Virtual Environments

<div class="alert alert-block alert-info">
<b>Note:</b> The following code works only with the <b>Anaconda</b> distribution installed.
</div>

Create a Python 3.6 environment called "dsfs":
> conda create -n dsfs python=3.6

To activate this environment, use:
> source activate dsfs

To deactivate an active environment, use:
> source deactivate

In [None]:
# Use 'python -m pip install ipython' when using the command line
%pip install ipython

## Whitespace Formatting

In [47]:
# The pound sign marks the start of a comment. Python itself
# ignores the comments, but they're helpful for anyone reading the code.

for i in [1, 2, 3, 4, 5]:
    print(i)                    # first line in "for i" block
    for j in [1, 2, 3, 4, 5]:
        print(j)                # first line in "for j" block
        print(i + j)            # last line in "for j" block
    print(i)                    # last line in "for i" block
print("done looping")

1
1
2
2
3
3
4
4
5
5
6
1
2
1
3
2
4
3
5
4
6
5
7
2
3
1
4
2
5
3
6
4
7
5
8
3
4
1
5
2
6
3
7
4
8
5
9
4
5
1
6
2
7
3
8
4
9
5
10
5
done looping


In [48]:
# Whitespace is ignored inside parentheses and brackets
long_winded_computation = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 +
                           13 + 14 + 15 + 16 + 17 + 18 + 19 + 20)

list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

easier_to_read_list_of_lists = [[1, 2, 3],
                                [4, 5, 6],
                                [7, 8, 9]]

# You can use backslashes to indicate that a statement continues onto the next line, although we'll rarely do this
two_plus_three = 2 + \
                 3

for i in [1, 2, 3, 4, 5]:

    # notice the blank line
    print(i)

# To paste code into the Python interpreter, you can use the %paste magic function from IPython

1
2
3
4
5


## Modules

In [49]:
import re

my_regex = re.compile("[0-9]+", re.I)
my_regex

re.compile(r'[0-9]+', re.IGNORECASE|re.UNICODE)

In [50]:
import re as regex

my_regex = regex.compile("[0-9]+", regex.I)
my_regex

re.compile(r'[0-9]+', re.IGNORECASE|re.UNICODE)

In [51]:
import matplotlib.pyplot as plt

# plt.plot(...)

In [52]:
from collections import defaultdict, Counter

lookup = defaultdict(int)
my_counter = Counter()

In [53]:
match = 10
from re import *    # uh oh, re has a match function
print(match)        # "<function re.match>"

<function match at 0x0000021291A968E0>


## Functions

In [54]:
def double(x):
    """
    This is where you put an optional docstring that explains what the function does.
    For example, this function multiplies its input by 2.
    """
    return x * 2

double(1)

2

In [55]:
def apply_to_one(f):
    """Calls the function f with 1 as its argument"""
    return f(1)

my_double = double              # refers to the previously defined function
x = apply_to_one(my_double)     # equals 2

x

2

In [56]:
y = apply_to_one(lambda x: x + 4)   # equals 5

y

5

A **lambda** function in Python is a small anonymous function defined using the `lambda` keyword. It can have any number of arguments but only one expression. The expression is evaluated and returned. Lambda functions are often used for short, simple operations that are passed as arguments to higher-order functions like `map`, `filter`, and `sorted`.

Syntax:
```python
lambda arguments: expression
```
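
As a brief illustration of the higher-order uses mentioned above, the sketch below passes lambdas to `sorted`, `map`, and `filter`. The `words` list is made up for this example and is not part of the chapter's data.

```python
# Hypothetical sample data, used only for illustration
words = ["banana", "fig", "cherry", "kiwi"]

# sorted: order by length; ties keep their original order (the sort is stable)
by_length = sorted(words, key=lambda w: len(w))

# map: apply the lambda to every element (wrapped in list because map is lazy)
lengths = list(map(lambda w: len(w), words))

# filter: keep only the elements for which the lambda returns a truthy value
short_words = list(filter(lambda w: len(w) <= 4, words))

print(by_length)    # ['fig', 'kiwi', 'banana', 'cherry']
print(lengths)      # [6, 3, 6, 4]
print(short_words)  # ['fig', 'kiwi']
```

In each case the lambda is used once and never needs a name, which is the situation where it tends to read better than a separate `def`.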

In [57]:
another_double = lambda x: 2 * x    # Don't do this

print(another_double(2))

def another_double(x):
    """Do this instead"""
    return 2 * x

print(another_double(2))

4
4


In [58]:
def my_print(message="my default message"):
    print(message)

my_print("hello")   # prints 'hello'
my_print()          # prints 'my default message'

hello
my default message


In [59]:
def full_name(first = "What's-his-name", last = "Something"):
    return first + " " + last

print(full_name("Joel", "Grus"))     # "Joel Grus"
print(full_name("Joel"))             # "Joel Something"
print(full_name(last="Grus"))        # "What's-his-name Grus"

Joel Grus
Joel Something
What's-his-name Grus


## Strings

In [60]:
single_quoted_string = 'data science'
double_quoted_string = "data science"

In [61]:
tab_string = "\t"       # represents the tab character
len(tab_string)         # is 1

1

In [62]:
not_tab_string = r"\t"  # represents the characters '\' and 't'
len(not_tab_string)     # is 2

2

In [63]:
# Double backslashes can also be used to escape the backslash character
not_tab_string = "\\t"  # represents the characters: '\' and 't'
len(not_tab_string)     # is 2

2

In [64]:
multi_line_string = """This is the first line.
and this is the second line
and this is the third line"""

In [65]:
first_name = "Joel"
last_name = "Grus"

In [66]:
full_name1 = first_name + " " + last_name             # string addition
full_name1

'Joel Grus'

In [67]:
full_name2 = "{0} {1}".format(first_name, last_name)  # string.format
full_name2

'Joel Grus'

In [68]:
full_name3 = f"{first_name} {last_name}"              # f-string
full_name3

'Joel Grus'

## Exceptions

In [1]:
try:
    print(0 / 0)
except ZeroDivisionError:
    print("cannot divide by zero")

cannot divide by zero


## Lists

In [70]:
integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [integer_list, heterogeneous_list, []]

list_length = len(integer_list)     # equals 3
list_sum = sum(integer_list)        # equals 6

print(list_length)
print(list_sum)

3
6


In [71]:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

zero = x[0]         # equals 0, lists are 0-indexed
print(zero)

one = x[1]          # equals 1
print(one)

nine = x[-1]        # equals 9, 'Pythonic' for last element
print(nine)

eight = x[-2]       # equals 8, 'Pythonic' for next-to-last element
print(eight)

x[0] = -1           # now x is [-1, 1, 2, 3, ..., 9]
print(x)

0
1
9
8
[-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [72]:
first_three = x[:3]                 # [-1, 1, 2]
print(first_three)

three_to_end = x[3:]                # [3, 4, ..., 9]
print(three_to_end)

one_to_four = x[1:5]                # [1, 2, 3, 4]
print(one_to_four)

last_three = x[-3:]                 # [7, 8, 9]
print(last_three)

without_first_and_last = x[1:-1]    # [1, 2, ..., 8]
print(without_first_and_last)

copy_of_x = x[:]                    # [-1, 1, 2, ..., 9]
print(copy_of_x)

[-1, 1, 2]
[3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4]
[7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8]
[-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [73]:
every_third = x[::3]                # [-1, 3, 6, 9]
print(every_third)

five_to_three = x[5:2:-1]           # [5, 4, 3]
print(five_to_three)

[-1, 3, 6, 9]
[5, 4, 3]


In [74]:
1 in [1, 2, 3]      # True

True

In [75]:
0 in [1, 2, 3]      # False

False

In [76]:
x = [1, 2, 3]
x.extend([4, 5, 6])     # x is now [1, 2, 3, 4, 5, 6]

x

[1, 2, 3, 4, 5, 6]

In [77]:
x = [1, 2, 3]
y = x + [4, 5, 6]       # y is [1, 2, 3, 4, 5, 6]; x is unchanged

print(y)
print(x)

[1, 2, 3, 4, 5, 6]
[1, 2, 3]


In [78]:
x = [1, 2, 3]
x.append(0)     # x is now [1, 2, 3, 0]
y = x[-1]       # equals 0
z = len(x)      # equals 4

print(x)
print(y)
print(z)

[1, 2, 3, 0]
0
4


In [79]:
x, y = [1, 2]       # now x is 1, y is 2

print(x)
print(y)

1
2


In [80]:
_, y = [1, 2]       # now y == 2, didn't care about the first element

y

2

## Tuples

In [81]:
my_list = [1, 2]
my_tuple = (1, 2)
other_tuple = 3, 4

my_list[1] = 3       # my_list is now [1, 3]

try:
    my_tuple[1] = 3
except TypeError:
    print("cannot modify a tuple")

cannot modify a tuple


In [82]:
def sum_and_product(x, y):
    return (x + y), (x * y)

sp = sum_and_product(2, 3)      # equals (5, 6)
s, p = sum_and_product(5, 10)   # s is 15, p is 50

print(sp)
print(s)
print(p)

(5, 6)
15
50


In [83]:
x, y = 1, 2     # now x is 1, y is 2

print(x)
print(y)

x, y = y, x     # Pythonic way to swap variables; now x is 2, y is 1

print(x)
print(y)

1
2
2
1


## Dictionaries

In [84]:
empty_dict = {}                     # Pythonic
empty_dict2 = dict()                # less Pythonic
grades = {"Joel": 80, "Tim": 95}    # dictionary literal

joels_grade = grades["Joel"]        # equals 80

joels_grade

80

In [85]:
try:
    kates_grade = grades["Kate"]
except KeyError:
    print("no grade for Kate!")

no grade for Kate!


In [86]:
joel_has_grade = "Joel" in grades     # True
kate_has_grade = "Kate" in grades     # False

print(joel_has_grade)
print(kate_has_grade)

True
False


In [87]:
joels_grade = grades.get("Joel", 0)   # equals 80
kates_grade = grades.get("Kate", 0)   # equals 0
no_ones_grade = grades.get("No One")  # default is None

print(joels_grade)
print(kates_grade)
print(no_ones_grade)

80
0
None


In [88]:
grades["Tim"] = 99          # replaces the old value
grades["Kate"] = 100        # adds a third entry
num_students = len(grades)  # equals 3

print(grades)
print(num_students)

{'Joel': 80, 'Tim': 99, 'Kate': 100}
3


In [89]:
tweet = {
    "user": "joelgrus",
    "text": "Data Science is Awesome",
    "retweet_count": 100,
    "hashtags": ["#data", "#science", "#datascience", "#awesome", "#yolo"]
}

tweet

{'user': 'joelgrus',
 'text': 'Data Science is Awesome',
 'retweet_count': 100,
 'hashtags': ['#data', '#science', '#datascience', '#awesome', '#yolo']}

In [90]:
tweet_keys = tweet.keys()       # iterable for the keys
tweet_values = tweet.values()   # iterable for the values
tweet_items = tweet.items()     # iterable for the (key, value) tuples

In [91]:
"user" in tweet_keys            # True, but not Pythonic

True

In [92]:
"user" in tweet                 # Pythonic way to check for keys

True

In [93]:
"joelgrus" in tweet_values      # True (slow but the only way to check)

True

In [94]:
document = []       # placeholder; in practice this is a list of words

word_counts = {}
for word in document:
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1

In [95]:
# Forgiveness is better than permission

word_counts = {}
for word in document:
    try:
        word_counts[word] += 1
    except KeyError:
        word_counts[word] = 1

In [96]:
word_counts = {}
for word in document:
    previous_count = word_counts.get(word, 0)
    word_counts[word] = previous_count + 1

In [97]:
from collections import defaultdict

word_counts = defaultdict(int)  # int() produces 0
for word in document:
    word_counts[word] += 1

In [98]:
dd_list = defaultdict(list)     # list() produces an empty list
dd_list[2].append(1)            # now dd_list contains {2: [1]}

dd_list

defaultdict(list, {2: [1]})

In [99]:
dd_dict = defaultdict(dict)     # dict() produces an empty dict
dd_dict["Joel"]["City"] = "Seattle"  # {"Joel": {"City": "Seattle"}}

dd_dict

defaultdict(dict, {'Joel': {'City': 'Seattle'}})

In [100]:
dd_pair = defaultdict(lambda: [0, 0])
dd_pair[2][1] = 1               # now dd_pair contains {2: [0, 1]}

dd_pair

defaultdict(<function __main__.<lambda>()>, {2: [0, 1]})

```python
# This is an anonymous function (lambda function) that returns a list [0, 0].
lambda: [0, 0]
```

This creates a `defaultdict` named `dd_pair` whose default factory is `lambda: [0, 0]`. If you access a key that does not exist in the dictionary, that key is created automatically with the value `[0, 0]`.

```python
dd_pair[2][1] = 1
```

Here, you are accessing the key `2` in `dd_pair`. Since `2` does not exist yet, the `defaultdict` will use the lambda function to create it with the value `[0, 0]`. Then, it modifies the second element (index `1`) of the list associated with key `2` to `1`. So, `dd_pair` now contains `{2: [0, 1]}`.
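
To make the factory behavior concrete, here is a small sketch that builds on the same `dd_pair`. It shows that each missing key gets its own fresh list, that merely reading a missing key inserts it, and that `.get()` bypasses the factory. The keys `3`, `"new"`, and `"other"` are arbitrary choices for this illustration.

```python
from collections import defaultdict

dd_pair = defaultdict(lambda: [0, 0])

dd_pair[2][1] = 1       # key 2 is missing, so the factory creates [0, 0] first
dd_pair[3][0] += 5      # key 3 gets its own fresh [0, 0], independent of key 2

print(dict(dd_pair))    # {2: [0, 1], 3: [5, 0]}

# Merely reading a missing key also inserts it
_ = dd_pair["new"]
print("new" in dd_pair)         # True

# .get() does not invoke the factory and does not insert the key
print(dd_pair.get("other"))     # None
print("other" in dd_pair)       # False
```

Because lookups with square brackets can silently add keys, `in` or `.get()` is the safer choice when you only want to test for membership.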

## Counters

In [101]:
from collections import Counter

c = Counter([0, 1, 2, 0])   # c is (basically) {0: 2, 1: 1, 2: 1}

c

Counter({0: 2, 1: 1, 2: 1})

In [102]:
# Recall, document is a list of words
word_counts = Counter(document)

In [103]:
# Print the 10 most common words and their counts
for word, count in word_counts.most_common(10):
    print(word, count)
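
Since `document` is empty in this notebook, the loop above prints nothing. The following sketch runs the same idea on a made-up sentence; `sample_document` and `sample_counts` are hypothetical names introduced only for this illustration.

```python
from collections import Counter

# Hypothetical document, already split into a list of words
sample_document = "the cat and the dog and the bird".split()

sample_counts = Counter(sample_document)

# most_common(n) returns the n highest counts as (word, count) tuples
for word, count in sample_counts.most_common(2):
    print(word, count)
# the 3
# and 2

# A word that never appeared has count 0 instead of raising KeyError
print(sample_counts["fish"])    # 0
```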

## Sets

In [104]:
primes_below_10 = {2, 3, 5, 7}

s = set()
s.add(1)       # s is now {1}
s.add(2)       # s is now {1, 2}
s.add(2)       # s is still {1, 2}
x = len(s)     # equals 2
y = 2 in s     # equals True
z = 3 in s     # equals False

print(s)
print(x)
print(y)
print(z)

{1, 2}
2
True
False


In [105]:
hundreds_of_other_words = []     # Example of a list

stopwords_list = ["a", "an", "at"] + hundreds_of_other_words + ["yet", "you"]

"zip" in stopwords_list     # False, but have to check every element

False

In [106]:
stopwords_set = set(stopwords_list)
"zip" in stopwords_set      # very fast to check

False

In [107]:
item_list = [1, 2, 3, 1, 2, 3]
num_items = len(item_list)          # 6
item_set = set(item_list)           # {1, 2, 3}
num_distinct_items = len(item_set)  # 3
distinct_item_list = list(item_set) # [1, 2, 3]

print(num_items)
print(item_set)
print(num_distinct_items)
print(distinct_item_list)

6
{1, 2, 3}
3
[1, 2, 3]


## Control Flow

In [108]:
if 1 > 2:
    message = "if only 1 were greater than 2..."
elif 1 > 3:
    message = "elif stands for 'else if'"
else:
    message = "when all else fails use else (if you want to)"

message

'when all else fails use else (if you want to)'

In [109]:
x = 10

parity = "even" if x % 2 == 0 else "odd"

parity

'even'

In [110]:
x = 0
while x < 10:
    print(f"{x} is less than 10")
    x += 1

0 is less than 10
1 is less than 10
2 is less than 10
3 is less than 10
4 is less than 10
5 is less than 10
6 is less than 10
7 is less than 10
8 is less than 10
9 is less than 10


In [111]:
# range(10) is the numbers 0, 1, ..., 9
for x in range(10):
    print(f"{x} is less than 10")

0 is less than 10
1 is less than 10
2 is less than 10
3 is less than 10
4 is less than 10
5 is less than 10
6 is less than 10
7 is less than 10
8 is less than 10
9 is less than 10


In [112]:
for x in range(10):
    if x == 3:
        continue  # go immediately to the next iteration
    if x == 5:
        break     # quit the loop entirely
    print(x)

0
1
2
4


## Truthiness

In [113]:
one_is_less_than_two = 1 < 2      # equals True

one_is_less_than_two

True

In [114]:
true_equals_false = True == False  # equals False

true_equals_false

False

In [115]:
x = None
assert x == None, "this is not the Pythonic way to check for None"
assert x is None, "this is the Pythonic way to check for None"

In [116]:
# falsy values
False
None
[] # an empty list
{} # an empty dict
""
set()
0
0.0

0.0

In [117]:
def some_function_that_returns_a_string():  # Example function
    return 'text'

s = some_function_that_returns_a_string()
if s:
    first_char = s[0]
else:
    first_char = ""

first_char

't'

In [118]:
first_char = s and s[0]

first_char

't'

In [119]:
safe_x = x or 0

safe_x

0

In [120]:
safe_x = x if x is not None else 0

safe_x

0

In [121]:
all([True, 1, {3}])   # True, all are truthy

True

In [122]:
all([True, 1, {}])    # False, {} is falsy

False

In [123]:
any([True, 1, {}])    # True, True is truthy

True

In [124]:
all([])               # True, no falsy elements in the list

True

In [125]:
any([])               # False, no truthy elements in the list

False

## Sorting

In [126]:
x = [4, 1, 2, 3]
y = sorted(x)          # y is [1, 2, 3, 4], x is unchanged

print(y)
print(x)

[1, 2, 3, 4]
[4, 1, 2, 3]


In [127]:
x.sort()               # now x is [1, 2, 3, 4]

print(x)

[1, 2, 3, 4]


In [128]:
# Sort the list by absolute value from largest to smallest
x = sorted([-4, 1, -2, 3], key=abs, reverse=True)  # is [-4, 3, -2, 1]

print(x)

[-4, 3, -2, 1]


In [129]:
# Sort the words and counts from highest count to lowest
wc = sorted(word_counts.items(),
            key=lambda word_and_count: word_and_count[1],
            reverse=True)

## List Comprehensions

In [130]:
even_numbers = [x for x in range(5) if x % 2 == 0]  # [0, 2, 4]
squares = [x * x for x in range(5)]                 # [0, 1, 4, 9, 16]
even_squares = [x * x for x in even_numbers]        # [0, 4, 16]

print(even_numbers)
print(squares)
print(even_squares)

[0, 2, 4]
[0, 1, 4, 9, 16]
[0, 4, 16]


In [131]:
square_dict = {x: x * x for x in range(5)}  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
square_set = {x * x for x in [1, -1]}       # {1}

print(square_dict)
print(square_set)

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
{1}


In [132]:
zeros = [0 for _ in even_numbers]      # Has the same length as even_numbers

print(zeros)
print(len(even_numbers))
print(len(zeros))

[0, 0, 0]
3
3


In [134]:
pairs = [(x, y)
         for x in range(10)
         for y in range(10)]  # 100 pairs (0,0) (0,1) ... (9,8), (9,9)

increasing_pairs = [(x, y)                      # only pairs with x < y,
                    for x in range(10)          # range(lo, hi) equals
                    for y in range(x + 1, 10)]  # [lo, lo + 1, ..., hi - 1]

## Automated Testing and assert

In [135]:
assert 1 + 1 == 2
assert 1 + 1 == 2, "1 + 1 should equal 2 but didn't"

In [136]:
def smallest_item(xs):
    return min(xs)

assert smallest_item([10, 20, 5, 40]) == 5
assert smallest_item([1, 0, -1, 2]) == -1

In [137]:
def smallest_item(xs):
    assert xs, "empty list has no smallest item"
    return min(xs)

## Object-Oriented Programming

In [146]:
class CountingClicker:
    """A class can/should have a docstring, just like a function"""

    def __init__(self, count=0):
        self.count = count

    def __repr__(self):
        return f"CountingClicker(count={self.count})"

    def click(self, num_times=1):
        """Click the clicker some number of times."""
        self.count += num_times

    def read(self):
        return self.count

    def reset(self):
        self.count = 0

In [147]:
clicker1 = CountingClicker()            # Initialized to 0
clicker2 = CountingClicker(100)         # Starts with count=100
clicker3 = CountingClicker(count=100)   # More explicit way of doing the same

In [153]:
clicker = CountingClicker()
assert clicker.read() == 0, "clicker should start with count 0"
clicker.click()
clicker.click()
assert clicker.read() == 2, "after two clicks, clicker should have count 2"
clicker.reset()
assert clicker.read() == 0, "after reset, clicker should be back to 0"

In [154]:
# A subclass inherits all the behavior of its parent class.
class NoResetClicker(CountingClicker):
    # This class has all the same methods as CountingClicker,
    # except that it has a reset method that does nothing.
    def reset(self):
        pass

In [155]:
clicker2 = NoResetClicker()
assert clicker2.read() == 0
clicker2.click()
assert clicker2.read() == 1
clicker2.reset()
assert clicker2.read() == 1, "reset shouldn't do anything"

## Iterables and Generators

In [156]:
def generate_range(n):
    i = 0
    while i < n:
        yield i   # every call to yield produces a value of the generator
        i += 1

for i in generate_range(10):
    print(f"i: {i}")

i: 0
i: 1
i: 2
i: 3
i: 4
i: 5
i: 6
i: 7
i: 8
i: 9


In [159]:
def natural_numbers():
    """returns 1, 2, 3, ..."""
    n = 1
    while True:
        yield n
        n += 1

In [165]:
evens_below_20 = (i for i in generate_range(20) if i % 2 == 0)

for i in evens_below_20:
    print(i)

0
2
4
6
8
10
12
14
16
18


In [166]:
# None of these computations does anything until we iterate
data = natural_numbers()
evens = (x for x in data if x % 2 == 0)
even_squares = (x ** 2 for x in evens)
even_squares_ending_in_six = (x for x in even_squares if x % 10 == 6)
# And so on
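
One way to see the laziness in action is to pull just a few values from the end of the pipeline. The sketch below uses `itertools.islice` from the standard library, which is a choice made for this illustration and not something the cell above relies on. The generator definitions are repeated so the block runs on its own.

```python
from itertools import islice

def natural_numbers():
    """Yields 1, 2, 3, ... without end"""
    n = 1
    while True:
        yield n
        n += 1

data = natural_numbers()
evens = (x for x in data if x % 2 == 0)
even_squares = (x ** 2 for x in evens)
even_squares_ending_in_six = (x for x in even_squares if x % 10 == 6)

# islice pulls only the first four values, so the infinite pipeline stops there
first_four = list(islice(even_squares_ending_in_six, 4))
print(first_four)   # [16, 36, 196, 256]
```

Calling `list(...)` directly on `even_squares_ending_in_six` would never finish, because the underlying stream of natural numbers is infinite.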

In [167]:
names = ["Alice", "Bob", "Charlie", "Debbie"]

# Not Pythonic
for i in range(len(names)):
    print(f"name {i} is {names[i]}")

name 0 is Alice
name 1 is Bob
name 2 is Charlie
name 3 is Debbie


In [168]:
# Also not Pythonic
i = 0
for name in names:
    print(f"name {i} is {name}")
    i += 1

name 0 is Alice
name 1 is Bob
name 2 is Charlie
name 3 is Debbie


In [169]:
# Pythonic
for i, name in enumerate(names):
    print(f"name {i} is {name}")

name 0 is Alice
name 1 is Bob
name 2 is Charlie
name 3 is Debbie


## Randomness

In [172]:
import random
random.seed(10)  # This ensures we get the same results every time

four_uniform_randoms = [random.random() for _ in range(4)]

# [0.5714025946899135,       # random.random() produces numbers
#  0.4288890546751146,       # uniformly between 0 and 1.
#  0.5780913011344704,       # It's the random function we'll use
#  0.20609823213950174]      # most often.

four_uniform_randoms

[0.5714025946899135,
 0.4288890546751146,
 0.5780913011344704,
 0.20609823213950174]

In [183]:
random.seed(10)         # set the seed to 10
print(random.random())  # 0.5714025946899135
random.seed(10)         # reset the seed to 10
print(random.random())  # 0.5714025946899135 again

0.5714025946899135
0.5714025946899135


In [184]:
random.randrange(10)    # choose randomly from range(10) = [0, 1, ..., 9]

6

In [185]:
random.randrange(3, 6)  # choose randomly from range(3, 6) = [3, 4, 5]

4

In [187]:
up_to_ten = [x for x in range(1, 11)]
random.shuffle(up_to_ten)
print(up_to_ten)
# [7, 2, 6, 8, 9, 4, 10, 1, 3, 5]   (your results will probably be different)

[7, 9, 5, 8, 1, 3, 10, 4, 2, 6]


In [189]:
my_best_friend = random.choice(["Alice", "Bob", "Charlie"])     # "Bob" for me

my_best_friend

'Bob'

In [191]:
lottery_numbers = range(60)
winning_numbers = random.sample(lottery_numbers, 6)  # [16, 36, 10, 6, 25, 9]

winning_numbers

[43, 19, 42, 23, 8, 29]

In [193]:
four_with_replacement = [random.choice(range(10)) for _ in range(4)]
print(four_with_replacement)  # [9, 4, 4, 2]

[0, 9, 0, 3]


## Regular Expressions

In [195]:
import re

re_examples = [                             # All of these are True, because
    not re.match("a", "cat"),               # 'cat' doesn't start with 'a'
    re.search("a", "cat"),                  # 'cat' has an 'a' in it
    not re.search("c", "dog"),              # 'dog' doesn't have a 'c' in it
    3 == len(re.split("[ab]", "carbs")),    # split on a or b to ['c','r','s']
    "R-D-" == re.sub("[0-9]", "-", "R2D2")  # replace digits with dashes
]

assert all(re_examples), "all the regex examples should be True"

## zip and Argument Unpacking

In [207]:
list1 = ['a', 'b', 'c']
list2 = [1, 2, 3]

# zip is lazy, so you have to do something like the following
[pair for pair in zip(list1, list2)]  # is [('a', 1), ('b', 2), ('c', 3)]

[('a', 1), ('b', 2), ('c', 3)]

In [209]:
pairs = [('a', 1), ('b', 2), ('c', 3)]
letters, numbers = zip(*pairs)

print(letters)  # ('a', 'b', 'c')
print(numbers)  # (1, 2, 3)

('a', 'b', 'c')
(1, 2, 3)


The asterisk `*` performs argument unpacking: it passes the elements of `pairs` as individual arguments to `zip`.

In [210]:
letters, numbers = zip(('a', 1), ('b', 2), ('c', 3))

print(letters)  # ('a', 'b', 'c')
print(numbers)  # (1, 2, 3)

('a', 'b', 'c')
(1, 2, 3)
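
The same unpacking idiom can be used to transpose a list of rows into a list of columns. The sketch below uses a made-up `rows` grid and also shows one caveat: `zip` stops at its shortest input.

```python
# Hypothetical 2 x 3 grid, used only for illustration
rows = [[1, 2, 3],
        [4, 5, 6]]

# zip(*rows) is the same as zip([1, 2, 3], [4, 5, 6])
columns = [list(column) for column in zip(*rows)]
print(columns)      # [[1, 4], [2, 5], [3, 6]]

# zip stops at the shortest input, so extra elements are silently dropped
ragged = list(zip([1, 2, 3], ["a", "b"]))
print(ragged)       # [(1, 'a'), (2, 'b')]
```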


In [215]:
def add(a, b): return a + b
add(1, 2)  # returns 3

3

In [216]:
try:
    add([1, 2])
except TypeError:
    print("add expects two inputs")

add(*[1, 2])  # returns 3

add expects two inputs


3

## args and kwargs

In [223]:
def doubler(f):
    # Here we define a new function that keeps a reference to f
    def g(x):
        return 2 * f(x)
    
    # And return that new function
    return g

def f1(x):
    return x + 1

g = doubler(f1)
assert g(3) == 8, "(3 + 1) * 2 should equal 8"
assert g(-1) == 0, "(-1 + 1) * 2 should equal 0"

In [224]:
def f2(x, y):
    return x + y

g = doubler(f2)
try:
    g(1, 2)
except TypeError:
    print("as defined, g only takes one argument")

as defined, g only takes one argument


In [225]:
def magic(*args, **kwargs):
    print("unnamed args:", args)
    print("keyword args:", kwargs)

magic(1, 2, key="word", key2="word2")

# prints
# unnamed args: (1, 2)
# keyword args: {'key': 'word', 'key2': 'word2'}

unnamed args: (1, 2)
keyword args: {'key': 'word', 'key2': 'word2'}


In [226]:
def other_way_magic(x, y, z):
    return x + y + z

x_y_list = [1, 2]
z_dict = {"z": 3}
assert other_way_magic(*x_y_list, **z_dict) == 6, "1 + 2 + 3 should be 6"

In [227]:
def doubler_correct(f):
    """works no matter what kind of inputs f expects"""
    def g(*args, **kwargs):
        """whatever arguments g is supplied, pass them through to f"""
        return 2 * f(*args, **kwargs)
    return g

g = doubler_correct(f2)
assert g(1, 2) == 6, "doubler should work now"

## Type Annotations

In [1]:
def add(a, b):
    return a + b

assert add(10, 5) == 15,                    "+ is valid for numbers"
assert add([1, 2], [3]) == [1, 2, 3],       "+ is valid for lists"
assert add("hi ", "there") == "hi there",   "+ is valid for strings"

try:
    add(10, "five")
except TypeError:
    print("cannot add an int to a string")

cannot add an int to a string


In [5]:
def add(a: int, b: int) -> int:
    return a + b

add(10, 5)              # You'd like this to be OK
add("hi ", "there")     # You'd like this to be not OK

'hi there'

<div class="alert alert-block alert-success">
<li>Types are an important form of documentation.</li>
</div>

```python
def dot_product(x, y): ...
```

We have not yet defined `Vector`, but imagine we had:
```python
def dot_product(x: Vector, y: Vector) -> float: ...
```

<div class="alert alert-block alert-success">
<li>There are external tools (the most popular is <b><i>mypy</i></b>) that will read your code, inspect the type annotations, and let you know about type errors <i>before you ever run your code</i>.</li>
</div>
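
The reason an external tool is needed is that Python itself records annotations without enforcing them. The sketch below, which reuses the annotated `add` from earlier, shows the annotations being stored on the function while a mismatched call still runs. A type checker such as mypy would be expected to report the second call, although its exact message is not reproduced here.

```python
def add(a: int, b: int) -> int:
    return a + b

# Python stores the annotations but does not check them when the function runs
print(add.__annotations__)
# {'a': <class 'int'>, 'b': <class 'int'>, 'return': <class 'int'>}

result = add("hi ", "there")    # runs fine; a type checker would flag this call
print(result)                   # hi there
```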

<div class="alert alert-block alert-success">
<li>Having to think about the types in your code forces you to design cleaner functions and interfaces.</li>
</div>

```python
from typing import Union

def secretly_ugly_function(value, operation): ...

def ugly_function(value: int,
                  operation: Union[str, int, float, bool]) -> int: ...
```

<div class="alert alert-block alert-success">
<li>Using types allows your editor to help you with things like autocomplete and to get angry at type errors.</li>
</div>

In [8]:
def total(xs: list) -> float:
    return sum(xs)

In [9]:
from typing import List     # note capital L

def total(xs: List[float]) -> float:
    return sum(xs)

In [10]:
# This is how to type-annotate variables when you define them.
# But this is unnecessary; it's "obvious" x is an int.
x: int = 5

In [11]:
values = []         # What's my type?
best_so_far = None  # What's my type?

In [12]:
from typing import Optional

values: List[int] = []
best_so_far: Optional[float] = None  # allowed to be either a float or None

In [14]:
# The type annotations in this snippet are all unnecessary
from typing import Dict, Iterable, Tuple

# keys are strings, values are ints
counts: Dict[str, int] = {'data': 1, 'science': 2}

# lists and generators are both iterable
lazy = True     # Defined for if statement
if lazy:
    evens: Iterable[int] = (x for x in range(10) if x % 2 == 0)
else:
    evens = [0, 2, 4, 6, 8]

# tuples specify a type for each element
triple: Tuple[int, float, int] = (10, 2.3, 5)

In [15]:
from typing import Callable

# The type hint says that repeater is a function that takes
# two arguments, a string and an int, and returns a string.
def twice(repeater: Callable[[str, int], str], s: str) -> str:
    return repeater(s, 2)

def comma_repeater(s: str, n: int) -> str:
    n_copies = [s for _ in range(n)]
    return ', '.join(n_copies)

assert twice(comma_repeater, "type hints") == "type hints, type hints"

In [16]:
Number = int
Numbers = List[Number]

def total(xs: Numbers) -> Number:
    return sum(xs)