# Effective Python

## CH 1: Pythonic Thinking

### Item 6: Prefer Multiple Assignment Unpacking Over Indexing

In [1]:
# instead of indexing a tuple...
item = ('Peanut butter', 'Jelly')
print(item[0])

# unpack the items
first, second = item

print(first)

# unpacking is less visual noise



Peanut butter
Peanut butter


In [2]:
# Unpacking syntax also allows you to swap values in a single line

def bubble_sort(a):
    for _ in range(len(a)):
        for i in range(1, len(a)):
            if a[i] < a[i-1]:
                a[i-1], a[i] = a[i], a[i-1]
                # print(f'{_:<3}{i:<3}{a}')

names = [4,3,9,0,3,8,2,3,1,4,1]

bubble_sort(names)

print(names)

[0, 1, 1, 2, 3, 3, 3, 4, 4, 8, 9]


In [29]:
snacks = [('bacon',350), ('donut', 240), ('muffin', 190)]

def nested_bubble_sort(a):
    # sorting for the snacks var with nested tuple numbers
    for _ in range(len(a)):
        for i in range(1, len(a)):
            if a[i][1] < a[i-1][1]:
                a[i-1], a[i] = a[i], a[i-1]

nested_bubble_sort(snacks)

for rank, (name, calories) in enumerate(snacks, 1):
    print(f'#{rank}: {name} has {calories} calories')

#1: muffin has 190 calories
#2: donut has 240 calories
#3: bacon has 350 calories


### Item 7: Prefer `enumerate` over `range`

`enumerate` is a built-in that wraps any iterator with a lazy generator. It yields pairs of the loop index and the next value from the iterator.

In [30]:
flavors = ['chocolate', 'vanilla','strawberry','grape']

it = enumerate(flavors)

# manually advancing the generator
print(next(it))
print(next(it))

(0, 'chocolate')
(1, 'vanilla')


In [32]:
# second arg of enumerate lets you specify which number to begin the count. Default is zero index.
for i, flavor in enumerate(flavors, 1):
    print(f'{i}: {flavor}')

1: chocolate
2: vanilla
3: strawberry
4: grape


### Item 8: Use `zip` to process iterators in parallel

In [33]:
names = ['Coraline', 'Jethro', 'Bill', 'Melody', 'Snerf']
counts = [len(n) for n in names]
longest_name = None
max_len = 0

for name, count in zip(names, counts):
    if count > max_len:
        longest_name = name
        max_len = count

print(f'longest name: {longest_name}, {max_len} letters')



longest name: Coraline, 8 letters


In [34]:

names.append('Rosalind')

for name, count in zip(names, counts):
    print(name, count)

Coraline 8
Jethro 6
Bill 4
Melody 6
Snerf 5


Note that `zip` truncates its output to the shortest input, which is why nothing is printed for Rosalind. `zip` yields tuples until any one of the wrapped iterators is exhausted, so it works well only when you know the iterators have the same length.

If the inputs may differ in length, use `zip_longest` from the `itertools` module instead. It fills missing values with `None` by default; pass the `fillvalue` argument to choose a different placeholder.

In [36]:
import itertools

for name, count in itertools.zip_longest(names, counts, fillvalue="unknown"):
    print(name, count)

Coraline 8
Jethro 6
Bill 4
Melody 6
Snerf 5
Rosalind unknown


### Item 9: Avoid `else` blocks after `for` and `while` loops
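
The `else` block of a loop runs only when the loop finishes without hitting `break`, which is the opposite of what most readers expect from `if`/`else`. The sketch below shows that behavior and then a helper function with early returns as a clearer alternative; the `coprime` name and the sample numbers are illustrative.

```python
# for/else: the else block runs only if the loop never hits break
a, b = 4, 9
for i in range(2, min(a, b) + 1):
    if a % i == 0 and b % i == 0:
        print('Not coprime')
        break
else:
    print('Coprime')


# Clearer alternative: a helper function that returns early
def coprime(a, b):
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False
    return True


print(coprime(4, 9))
print(coprime(3, 6))
```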

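### Item 10: Prevent Repetition with Assignment Expressions

Assignment expressions use the `:=` operator, also called the walrus operator, to assign a value and evaluate it in a single expression.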

In [57]:
fresh_fruit = {
    'apple':3,
    'banana':4,
    'lemon':5
}

def make_lemonade(count):
    print(f'Making {count} lemons into lemonade')
    
def make_cider(count):
    print(f'Making cider with {count} apples')
    
def make_smoothies(count):
    if count < 1:
        raise OutOfBananas  # lets callers catch a shortage explicitly
    print(f"Making smoothies with {count} banana slices")
    
def slice_bananas(count):
    
    print(f'Slicing {count} bananas')
    return count * 4
    
def out_of_stock():
    print('Out of stock!')
    

    

class OutOfBananas(Exception):
    pass

In [58]:
# old way
# count = fresh_fruit.get('lemon', 0)
# the walrus operator combines the assignment and the test into a single expression, hence the name "assignment expression"

if count := fresh_fruit.get('lemon', 0):  # walrus operator condenses both lines
    make_lemonade(count)
else:
    out_of_stock()
    
    
if (count:= fresh_fruit.get('apple',0)) >=4:
    make_cider(count)
else:
    out_of_stock()

Making 5 lemons into lemonade
Out of stock!


In [59]:
pieces = 0
if (count:= fresh_fruit.get('banana',0)) >=2:
    pieces = slice_bananas(count)
    
try:
    smoothies = make_smoothies(pieces)
except OutOfBananas:
    out_of_stock()

Slicing 4 bananas
Making smoothies with 16 banana slices


In [53]:
# Python had no switch/case statement before `match` arrived in 3.10, but the walrus keeps an if/elif chain flat

if (count:= fresh_fruit.get('banana',0)) >= 2:
    pieces = slice_bananas(count)
    to_enjoy = make_smoothies(pieces)
elif (count:= fresh_fruit.get('apple',0)) >=4:
    to_enjoy = make_cider(count)
elif (count:= fresh_fruit.get('lemon', 0)):
    to_enjoy = make_lemonade(count)
else:
    to_enjoy='Nothing'

Making 5 lemons into lemonade


In [54]:
# do/while loops are also not available in Python, but the walrus operator avoids repeating the fetch before and inside the loop

FRUIT_TO_PICK = [
    {'apple': 1, 'banana': 3, 'papaya':5},
    {'lemon': 2, 'lime': 5},
    {'orange': 3, 'melon': 2},
]


def pick_fruit():
    if FRUIT_TO_PICK:
        return FRUIT_TO_PICK.pop(0)
    else:
        return []


def make_juice(fruit, count):
    return [(fruit, count)]

bottles = []
while next_fruit:= pick_fruit():
    for fruit, count in next_fruit.items():
        batch = make_juice(fruit, count)
        bottles.extend(batch)

print(bottles)
    

[('apple', 1), ('banana', 3), ('papaya', 5), ('lemon', 2), ('lime', 5), ('orange', 3), ('melon', 2)]


## CH 2: Lists and Dictionaries

### Item 13: Prefer Catch-All Unpacking Over Slicing

Python supports catch-all unpacking through a starred expression (`*name`), which collects every value not matched by the other names in the unpacking pattern.

In [61]:
car_ages = [0, 9, 4, 8, 7, 20, 19, 1, 6, 15]
car_ages_descending = sorted(car_ages, reverse=True)

oldest, second_oldest, *others = car_ages_descending  # the starred name receives all remaining values
print(oldest, second_oldest, others)

# code is shorter, easier to read, and does not rely on brittle index arithmetic

# a starred expression can appear in any position
oldest, *middle, youngest = car_ages_descending  # here the starred name receives everything between the first and last values
print(oldest, middle, youngest)

20 19 [15, 9, 8, 7, 6, 4, 1, 0]
20 [19, 15, 9, 8, 7, 6, 4, 1] 0


In [64]:
def generate_csv():
    yield ('Date', 'Make' , 'Model', 'Year', 'Price')
    for i in range(100):
        yield ('2019-03-25', 'Honda', 'Fit' , '2010', '$3400')
        yield ('2019-03-26', 'Ford', 'F150' , '2008', '$2400')

csv = generate_csv()

header, *rows = csv

print('csv header: ', header)
print('rowcount: ', len(rows))

csv header:  ('Date', 'Make', 'Model', 'Year', 'Price')
rowcount:  200
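
Two properties of starred expressions are worth keeping in mind: the starred name always receives a list, even when nothing is left over, and unpacking an iterator this way consumes it fully into memory. A brief sketch with made-up data:

```python
short_list = [1, 2]
first, second, *rest = short_list
print(first, second, rest)   # rest is an empty list, not an error

it = iter(range(1, 6))
head, *tail = it             # the iterator is fully consumed here
print(head, tail)
print(list(it))              # nothing is left to iterate
```

Because the whole iterator is loaded into a list, this form is only safe when the remaining data is known to fit in memory.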


### Item 14: Sort by Complex Criteria Using the `key` Parameter

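`sort()` orders a list's contents in natural ascending order by default. That works for built-in types such as strings and numbers, but it fails for a custom class that does not define comparison methods. The `key` parameter tells `sort()` which value to compare instead.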

In [1]:
class Tool:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight
    def __repr__(self): # Item 75: repr method used for debugging
        return f'Tool({self.name!r}, {self.weight})'
    
tools = [
    Tool('level', 3.5),
    Tool('hammer',1.25),
    Tool('screwdriver', 0.5),
    Tool('chisel', 0.25)
]

In [2]:
tools.sort()

TypeError: '<' not supported between instances of 'Tool' and 'Tool'

In [3]:
print('Unsorted:', repr(tools))
tools.sort(key=lambda x: x.name)
print('Sorted:', tools)

Unsorted: [Tool('level', 3.5), Tool('hammer', 1.25), Tool('screwdriver', 0.5), Tool('chisel', 0.25)]
Sorted: [Tool('chisel', 0.25), Tool('hammer', 1.25), Tool('level', 3.5), Tool('screwdriver', 0.5)]


In [4]:
tools.sort(key=lambda x: x.weight)
print('By weight:', tools)

By weight: [Tool('chisel', 0.25), Tool('screwdriver', 0.5), Tool('hammer', 1.25), Tool('level', 3.5)]


In [7]:
# sorting by multiple criteria? Use tuple

power_tools = [
    Tool('drill', 4),
    Tool('circular saw', 5),
    Tool('jackhammer', 40),
    Tool('sander', 4),
]

saw = (5, 'circular saw')
jackhammer = (40, 'jackhammer')
assert not (jackhammer < saw)

drill = (4, 'drill')
sander = (4, 'sander')
assert drill[0] == sander[0]  # Same weight
assert drill[1] < sander[1]   # Alphabetically less
assert drill < sander         # Thus, drill comes first

power_tools.sort(key= lambda x: (x.weight, x.name))
print(power_tools)

[Tool('drill', 4), Tool('sander', 4), Tool('circular saw', 5), Tool('jackhammer', 40)]
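
When the criteria need different directions, one option is to negate a numeric key with the unary minus operator; another is to rely on the stability of `sort` and apply several sorts, lowest-priority key first. The sketch below uses plain `(name, weight)` tuples in place of the `Tool` class to stay self-contained; the variable names are illustrative.

```python
power_tools = [
    ('drill', 4),
    ('circular saw', 5),
    ('jackhammer', 40),
    ('sander', 4),
]

# Weight descending, then name ascending: negate the numeric key
by_negation = sorted(power_tools, key=lambda t: (-t[1], t[0]))

# Same ordering via two stable sorts, lowest-priority key first
by_two_sorts = list(power_tools)
by_two_sorts.sort(key=lambda t: t[0])                # name ascending
by_two_sorts.sort(key=lambda t: t[1], reverse=True)  # weight descending

print(by_negation)
print(by_two_sorts == by_negation)
```

Negation only works for keys that support it (numbers, not strings), which is when the multiple-sort approach becomes the fallback.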


## CH 3: Functions


### Item 19: Never unpack more than 3 vars when functions return multiple values


Reasons: it is easy to mix up the order of the variables, and the line becomes long and awkward to wrap under PEP 8.

Use instead: catch-all starred expressions

In [1]:
def multi_return():
    return 'first', 'middle1', 'middle2', 'middle3', 'last'

first, *middle, last = multi_return()

print(first, middle, last)

first ['middle1', 'middle2', 'middle3'] last
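
When a function genuinely needs to return many values, a lightweight alternative is to return a `namedtuple` so that callers read results by name rather than by position. This is a sketch only; the `Stats` type and `get_stats` function are hypothetical names.

```python
from collections import namedtuple

Stats = namedtuple('Stats', ('minimum', 'maximum', 'average', 'count'))


def get_stats(numbers):
    """Return summary statistics as a named result instead of a bare tuple."""
    return Stats(
        minimum=min(numbers),
        maximum=max(numbers),
        average=sum(numbers) / len(numbers),
        count=len(numbers),
    )


stats = get_stats([63, 73, 72, 60, 67])
print(stats.minimum, stats.maximum, stats.average, stats.count)
```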


### Item 22: Reduce Visual Noise with Variable Positional Arguments

`*args` in a function signature lets callers pass any number of positional arguments, which the function receives as a tuple

In [2]:
def log(message, *values):
    if not values:
        print(message)
    else:
        values_str = ', '.join(str(x) for x in values)
        print(f'{message}: {values_str}')

log('My numbers are', 1, 2 ,3)
log('My numbers are', 4, 6)
log('Hello')

My numbers are: 1, 2, 3
My numbers are: 4, 6
Hello
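
An existing sequence can be passed to a `*args` parameter by prefixing it with `*` at the call site. The sketch below uses a variant of `log` that returns the string instead of printing it, purely so the result is easy to inspect; `favorites` and `my_generator` are illustrative names.

```python
def log(message, *values):
    if not values:
        return message
    values_str = ', '.join(str(x) for x in values)
    return f'{message}: {values_str}'


favorites = [7, 33, 99]
print(log('Favorite numbers', *favorites))  # list is expanded into positional args


def my_generator():
    for i in range(3):
        yield i


# The generator is exhausted and turned into a tuple before the call
print(log('From generator', *my_generator()))
```

Because the starred argument is always converted to a tuple first, passing a very large generator this way can use a lot of memory.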


### Item 23: Provide Optional Behavior with `kwargs`

Keyword arguments (kwargs) can be passed to a function by placing `**` before a dict object, like so:

In [4]:
def remainder(numerator, denominator):
    return numerator % denominator  # modulo, so the behavior matches the name

my_kwargs = {'numerator':6, 'denominator':6}

remainder(**my_kwargs)

0

In [6]:
# You can also use the ** operator multiple times if you know the dicts don't have repeated keys

my_kwargs = {'numerator':6}
other_kwargs = {'denominator':6}

remainder(**my_kwargs, **other_kwargs)

0

In [7]:
# Your function can also accept kwargs from the outset

def print_params(**kwargs):
    for key, value in kwargs.items():
        print(f'{key} = {value}')

print_params(alpha=1.5, beta=2.4, gamma=73)

alpha = 1.5
beta = 2.4
gamma = 73


### Item 24: Use `None` and Docstrings to Specify Dynamic Default Args

Default argument values are evaluated only once, when the function is defined, so a default such as `datetime.now()` cannot produce a fresh value on each call

In [11]:
from datetime import datetime
from time import sleep

def print_datetime(message, when=datetime.now()):
    print(message, when)

print_datetime(message='now')
sleep(2)
print_datetime(message='not later')

now 2023-09-29 15:35:11.448852
not later 2023-09-29 15:35:11.448852


The issue above is that the default value is evaluated once, when the `def` statement runs (typically when the module is loaded), and that same `datetime` is reused for every later call. It's not dynamic.

To make it dynamic, set the default for `when` to `None`, assign the real value inside the function body, and document the behavior in the docstring

The same approach should be used for mutable defaults such as `dict` or `list`, because a single default object would otherwise be shared across calls

In [15]:
def print_datetime(message, when=None):
    '''Log message with a timestamp

    Args:
        message: string value to print
        when: timestamp of message defaults to current time
    '''

    if when is None:
        when = datetime.now()
    print(f'{when} - {message}')


print_datetime(message='now')
sleep(2)
print_datetime(message='later')

2023-09-29 16:20:59.186356 - now


2023-09-29 16:21:01.188372 - later
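
The remark about `dict` defaults can be sketched as follows. The `decode_shared` and `decode` function names and the sample strings are illustrative assumptions; the point is only that a mutable default is created once and shared by every call that falls back to it.

```python
import json


def decode_shared(data, default={}):
    # BUG: the same dict object is reused across calls
    try:
        return json.loads(data)
    except ValueError:
        return default


def decode(data, default=None):
    """Load JSON data from a string.

    Args:
        data: JSON data to decode.
        default: Value to return if decoding fails.
            Defaults to a new empty dictionary.
    """
    try:
        return json.loads(data)
    except ValueError:
        if default is None:
            default = {}
        return default


foo = decode_shared('bad data')
bar = decode_shared('also bad')
print(foo is bar)      # both names point at the one shared default

baz = decode('bad data')
qux = decode('also bad')
print(baz is qux)      # each call gets its own dict
```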


### Item 26: Define Function Decorators with `functools.wraps`

A decorator can run additional code before and after each call to the function it wraps.

In [11]:
from functools import wraps # use this fcn when creating decorators to avoid issues

def trace(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print(f'{func.__name__}({args!r}, {kwargs!r}) '
              f'-> {result!r}'
              )
        return result
    return wrapper

@trace
def fibonacci(n):
    '''Return the nth Fib #'''
    if n in (0,1):
        return n
    return (fibonacci(n - 2) + fibonacci(n - 1))

fibonacci(6)

fibonacci((0,), {}) -> 0
fibonacci((1,), {}) -> 1
fibonacci((2,), {}) -> 1
fibonacci((1,), {}) -> 1
fibonacci((0,), {}) -> 0
fibonacci((1,), {}) -> 1
fibonacci((2,), {}) -> 1
fibonacci((3,), {}) -> 2
fibonacci((4,), {}) -> 3
fibonacci((1,), {}) -> 1
fibonacci((0,), {}) -> 0
fibonacci((1,), {}) -> 1
fibonacci((2,), {}) -> 1
fibonacci((3,), {}) -> 2
fibonacci((0,), {}) -> 0
fibonacci((1,), {}) -> 1
fibonacci((2,), {}) -> 1
fibonacci((1,), {}) -> 1
fibonacci((0,), {}) -> 0
fibonacci((1,), {}) -> 1
fibonacci((2,), {}) -> 1
fibonacci((3,), {}) -> 2
fibonacci((4,), {}) -> 3
fibonacci((5,), {}) -> 5
fibonacci((6,), {}) -> 8


8

In [12]:
help(fibonacci)

Help on function fibonacci in module __main__:

fibonacci(n)
    Return the nth Fib #
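
To see what `wraps` contributes, compare the metadata of a function decorated with and without it. This is a small sketch; `trace_plain`, `trace_wrapped`, `plain`, and `wrapped` are hypothetical names chosen for the comparison.

```python
from functools import wraps


def trace_plain(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def trace_wrapped(func):
    @wraps(func)  # copies __name__, __doc__, and other metadata onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@trace_plain
def plain(n):
    '''Docstring of plain'''
    return n


@trace_wrapped
def wrapped(n):
    '''Docstring of wrapped'''
    return n


print(plain.__name__, plain.__doc__)      # metadata of the inner wrapper
print(wrapped.__name__, wrapped.__doc__)  # metadata of the original function
```

Without `wraps`, tools that rely on this metadata, such as `help`, report on the inner `wrapper` instead of the decorated function.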



## CH 4: Comprehensions and Generators


### Item 29: Control Subexpression in Comprehensions


In [19]:
stock = {
    'nails': 125,
    'screws': 35,
    'wingnuts': 8,
    'washers': 24,
}

order = ['screws', 'wingnuts', 'nails']

def get_batches(count, size):
    return count // size


result = {name: batches for name in order if (batches := get_batches(stock.get(name,0),8))}

print(result)

{'screws': 4, 'wingnuts': 1, 'nails': 15}


### Item 30: Consider Generators Instead of Returning Lists

In [29]:
# Goal: find the index of every word in a string:

address = ' Four score and seven years ago...'

# Typical Solution
def index_words(text):
    result = []

    for index, letter in enumerate(text):
        if (index == 0) and (letter != ' '):
            result.append(index)
        elif letter == ' ':
            result.append(index+1)
    return result

result = index_words(address)
print(result)


# Solution with a generator
def index_words_iter(text):
    for index, letter in enumerate(text):
        if (index == 0) and (letter != ' '):
            yield 0
        elif letter == ' ':
            yield index+1

it = index_words_iter(address)
print(next(it))
print(next(it))
print(next(it))
print(next(it))

result = list(index_words_iter(address))
print(result)

[1, 6, 12, 16, 22, 28]
1
6
12
16
[1, 6, 12, 16, 22, 28]


### Item 31: Be Defensive When Iterating Over Arguments

In [2]:
# Goal iterate through a list of cities to get tourism visits and normalize their values
# Most times if a fcn takes in a list it will need to iterate multiple times over that list
# Example:
def normalize(numbers):
    total = sum(numbers) # first time
    result = []
    for value in numbers: # second time
        percent = 100 * value / total
        result.append(percent)
    return result

visits = [15, 35, 80]

percentages = normalize(visits)
print(percentages)

[11.538461538461538, 26.923076923076923, 61.53846153846154]


In [3]:
# When dealing with a file of unknown size, the way we would scale this up is to use a generator

def read_visits(data_path):
    with open(data_path) as f:
        for line in f:
            yield int(line)


path = 'my_numbers.txt'

it = read_visits(path)
percentages = normalize(it)
print(percentages)

# However this returns an empty list because once a generator is finished generating, it won't run again
# So once it runs for the sum() it doesn't run for the for loop



[]


In [5]:
# To solve this, you could copy the generated items into a list, but that would defeat the purpose of streaming a large file. Instead...
# create a new container class that implements the iterator protocol via __iter__()

class ReadVisits:
    def __init__(self, data_path):
        self.data_path = data_path

    def __iter__(self):
        with open(self.data_path) as f:
            for line in f:
                yield int(line)

visits = ReadVisits(path)
percentages = normalize(visits)
print(percentages)

assert abs(sum(percentages) - 100.0) < 1e-9  # tolerate floating-point rounding

[0.9036144578313253, 5.240963855421687, 3.9156626506024095, 2.1686746987951806, 1.5060240963855422, 40.8433734939759, 1.5060240963855422, 0.6626506024096386, 1.3855421686746987, 2.0481927710843375, 2.710843373493976, 3.3132530120481927, 9.036144578313253, 7.289156626506024, 2.5903614457831323, 0.3614457831325301, 4.578313253012048, 4.518072289156627, 5.421686746987952]
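
Another defensive option is to have the function reject one-shot iterators outright, so the empty-result failure seen earlier becomes a loud error. This is a sketch under the assumption that an `isinstance` check against `collections.abc.Iterator` is acceptable; `normalize_defensive` is an illustrative name.

```python
from collections.abc import Iterator


def normalize_defensive(numbers):
    # Iterators can only be consumed once, so reject them up front
    if isinstance(numbers, Iterator):
        raise TypeError('Must supply a container')
    total = sum(numbers)
    return [100 * value / total for value in numbers]


visits = [15, 35, 80]
print(normalize_defensive(visits))

try:
    normalize_defensive(iter(visits))
except TypeError as exc:
    print(f'Rejected: {exc}')
```

Lists and container classes that define `__iter__` as a generator pass the check, because each call to `iter()` on them produces a fresh iterator.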


### Item 32: Consider Generator Expressions for Large List Comprehensions

List comprehensions are fine for small inputs, but for very large inputs they can consume all available memory because every element is materialized at once. This is where generator expressions come in: they produce one item at a time

In [10]:
value = [int(x) for x in open(path)]
value

[15, 87, 65, 36, 25, 678, 25, 11, 23, 34, 45, 55, 150, 121, 43, 6, 76, 75, 90]

In [12]:
# to turn the above into a generator expression, replace the square brackets with parentheses

value = (int(x) for x in open(path))

print(value)
print(next(value))
print(next(value))

<generator object <genexpr> at 0x7fca16de1eb0>
15
87


In [13]:
# generator expressions can also be chained together

roots = ((x, x**0.5) for x in value)

print(next(roots))
print(next(roots))

# notice that roots pulls from value, which had already yielded 15 and 87, so it starts at 65

(65, 8.06225774829855)
(36, 6.0)
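
The memory difference between the two forms can be observed with `sys.getsizeof`, which reports the size of the container object itself (not of the elements it refers to). A rough sketch with an arbitrary input size; exact byte counts vary by Python version, so none are shown.

```python
import sys

n = 100_000
as_list = [x * 2 for x in range(n)]   # every element is built immediately
as_gen = (x * 2 for x in range(n))    # elements are produced on demand

print(sys.getsizeof(as_list))  # grows with n
print(sys.getsizeof(as_gen))   # small and independent of n

print(sum(as_gen))             # consuming the generator
print(sum(as_gen))             # a second pass yields nothing
```

The last line illustrates the trade-off: a generator expression is stateful and can be consumed only once.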


### Item 33: Compose Multiple Generators with `yield from`

`yield from` lets one generator delegate to another, so generators can be layered on top of generators. It replaces the nested `for` loop and `yield` boilerplate that would otherwise be needed

In [4]:
def move(period, speed):
    for _ in range(period):
        yield speed

def pause(delay):
    for _ in range(delay):
        yield 0

In [6]:

def animate_composed():
    yield from move(4, 5.0)
    yield from pause(3)
    yield from move(2,3.0)
    

def render(delta):
    print(f'Delta: {delta:.1f}')

def run(func):
    for delta in func():
        render(delta)

run(animate_composed)

Delta: 5.0
Delta: 5.0
Delta: 5.0
Delta: 5.0
Delta: 0.0
Delta: 0.0
Delta: 0.0
Delta: 3.0
Delta: 3.0
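
For comparison, here is a sketch of the boilerplate that `yield from` removes. The `animate` function is a hand-written equivalent added for illustration; `move` and `pause` are repeated so the block runs on its own.

```python
def move(period, speed):
    for _ in range(period):
        yield speed


def pause(delay):
    for _ in range(delay):
        yield 0


def animate():
    # Without yield from: one loop and one yield per nested generator
    for delta in move(4, 5.0):
        yield delta
    for delta in pause(3):
        yield delta
    for delta in move(2, 3.0):
        yield delta


def animate_composed():
    yield from move(4, 5.0)
    yield from pause(3)
    yield from move(2, 3.0)


print(list(animate()) == list(animate_composed()))
```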


### Item 36: Consider `itertools` for Working with Iterators and Generators

#### Linking Iterators Together

`chain` combines multiple iterators sequentially

In [8]:
import itertools

it = itertools.chain([1,2,3], [4,5,6])
print(list(it))

[1, 2, 3, 4, 5, 6]


`repeat` outputs a single value forever, or at most the number of times given by the second parameter

In [9]:
it = itertools.repeat('hello', 3)
print(list(it))

['hello', 'hello', 'hello']


`cycle` repeats an iterator's items forever

In [12]:
it = itertools.cycle([1,2,3])
result = [next(it) for _ in range(10)]

print(result)
print(len(result))

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1]
10


`tee` splits a single iterator into the number of parallel iterators specified by the second parameter. (Memory can be an issue here, since pending items must be buffered when the iterators advance at different speeds.)

In [17]:
it1, it2, it3, it4 = itertools.tee(['first', 'second', 'third'], 4)

print(list(it1))
print(list(it2))
print(list(it3))
print(list(it4))

['first', 'second', 'third']
['first', 'second', 'third']
['first', 'second', 'third']
['first', 'second', 'third']


`zip_longest` is a variant of the built-in `zip` function that returns a placeholder value when an iterator is exhausted, which happens when the iterators have different lengths

In [23]:
keys = ['one', 'two', 'three']
values = [1,2]

normal = list(zip(keys, values))
print('zip:        ', normal)

it = itertools.zip_longest(keys, values, fillvalue=None)
longest = list(it)
print('zip_longest:', longest)

zip:         [('one', 1), ('two', 2)]
zip_longest: [('one', 1), ('two', 2), ('three', None)]


#### Filtering Items from an Iterator
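
A few `itertools` functions that filter items can be sketched as follows. The sample data and the predicate names `less_than_seven` and `is_even` are illustrative.

```python
import itertools

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


def less_than_seven(x):
    return x < 7


def is_even(x):
    return x % 2 == 0


# islice: slice an iterator by position without copying it
first_five = list(itertools.islice(values, 5))
# takewhile: yield items until the predicate first returns False
taken = list(itertools.takewhile(less_than_seven, values))
# dropwhile: skip items until the predicate first returns False
dropped = list(itertools.dropwhile(less_than_seven, values))
# filterfalse: keep items for which the predicate returns False
odds = list(itertools.filterfalse(is_even, values))

print(first_five)
print(taken)
print(dropped)
print(odds)
```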

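#### Producing Combinations of Items from Iterators

`combinations` returns the unordered combinations of length N with unrepeated items from an iterator

In [31]: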
it = itertools.combinations([1,2,3,4,5], 4)

print(list(it))

[(1, 2, 3, 4), (1, 2, 3, 5), (1, 2, 4, 5), (1, 3, 4, 5), (2, 3, 4, 5)]


## CH 5: Classes and Interfaces

### Item 37: Compose Classes Instead of Nesting Built-in Types

Suppose a gradebook tracks student, subject, grade, and the weight of each grade. Storing all of that in nested dictionaries eventually becomes hard to read, mutate, and keep track of. The solution below breaks the bookkeeping into several small classes that are composed together

In [33]:
# Using namedtuple to create tiny immutable data structures
from collections import namedtuple, defaultdict

Grade = namedtuple('Grade', ('score', 'weight'))

# class to represent a single subject that contains a set of Grades/weights
class Subject:
    def __init__(self):
        self._grades = []

    def report_grade(self,score, weight):
        self._grades.append(Grade(score, weight))

    def average_grade(self):
        total, total_weight  = 0,0
        for grade in self._grades:
            total += grade.score * grade.weight
            total_weight += grade.weight
        return total / total_weight
    

# class to represent the set of subjects studied by a single student
class Student:
    def __init__(self):
        self._subjects = defaultdict(Subject)
    
    def get_subject(self, name):
        return self._subjects[name]
    
    def average_grade(self):
        total, count = 0,0
        for subject in self._subjects.values():
            total += subject.average_grade()
            count += 1
        return total / count

# container class for all of the students keyed by name
class Gradebook:
    def __init__(self):
        self._students = defaultdict(Student)
    
    def get_student(self, name):
        return self._students[name]

In [35]:
book = Gradebook()

albert = book.get_student('Albert Einstein')
math = albert.get_subject("Math")
math.report_grade(75, 0.05)
math.report_grade(65, 0.15)
math.report_grade(70, 0.8)
gym = albert.get_subject('Gym')
gym.report_grade(100, 0.4)
gym.report_grade(85, 0.6)
print(albert.average_grade())

80.25


### Item 38: Accept Functions Instead of Classes for Simple Interfaces