## Item 27: Use Comprehensions Instead of `map` and `filter` ##

Python uses **list comprehensions** to provide a compact syntax for deriving a new `list` from another sequence or iterable.  Because they can map and filter in a single expression, list comprehensions are often cleaner than the `map` and `filter` built-in functions, which typically require `lambda` expressions.

In [14]:
# naive way (for loop and list.append)
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
squares = []
for x in a:
    squares.append(x**2)
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


In [15]:
# slightly better way (using map built-in function)

alt_squares = map(lambda x: x**2, a)
print(list(alt_squares))

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


In [17]:
# best way (list comprehensions)

alt_squares2 = [x**2 for x in a]
print(alt_squares2)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


Unlike `map`, list comprehensions let you easily filter items from the input `list`:

In [18]:
even_squares = [x**2 for x in a if x % 2 == 0]
print(even_squares)

[4, 16, 36, 64, 100]


The `filter` built-in function can be used along with `map` to achieve the same result, but the combination is much harder to read:

In [19]:
alt = map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))
print(list(alt))

[4, 16, 36, 64, 100]
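
The same syntax carries over to dictionaries and sets.  Here is a minimal sketch that reuses the list `a` from above; the names `even_squares_dict` and `threes_cubed_set` are made up for illustration:

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Dictionary comprehension: map each even number to its square.
even_squares_dict = {x: x**2 for x in a if x % 2 == 0}

# Set comprehension: collect the cubes of the multiples of three.
threes_cubed_set = {x**3 for x in a if x % 3 == 0}

print(even_squares_dict)
print(threes_cubed_set)
```

Doing the same with `map` and `filter` would mean wrapping the result in a `dict` or `set` constructor call, which adds even more noise.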


## Item 28: Avoid More Than Two Control Subexpressions in Comprehensions ##

Beyond basic usage, comprehensions support multiple levels of looping and multiple conditions, but they can quickly become unreadable.

In [20]:
matrix = [[1, 2 , 3], [4, 5, 6], [7, 8, 9]]
flat = [x for row in matrix for x in row]
print(flat)

[1, 2, 3, 4, 5, 6, 7, 8, 9]


In [23]:
# this can get abused quickly, though (barf):
my_lists = [
    [[1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]]
]

flat = [x for sublist1 in my_lists
        for sublist2 in sublist1
        for x in sublist2]

print(flat)


[1, 2, 3, 4, 5, 6, 7, 8, 9]
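
One way to keep this readable is to fall back to ordinary loops once a comprehension needs more than two control subexpressions (two loops, two conditions, or one of each).  A rough sketch, reusing `my_lists` and `matrix` from the cells above:

```python
my_lists = [
    [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
]

# Three levels of nesting read more naturally as plain loops.
flat = []
for sublist1 in my_lists:
    for sublist2 in sublist1:
        flat.extend(sublist2)
print(flat)

# Two loops plus two conditions is legal in a comprehension,
# but it is already hard to scan.
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
filtered = [[x for x in row if x % 3 == 0]
            for row in matrix if sum(row) >= 10]
print(filtered)
```

The loop version is longer, but the indentation makes the nesting obvious.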


## Item 29: Avoid Repeated Work in Comprehensions by Using Assignment Expressions ##

This one is pretty easy.  Remember the walrus operator `:=`?  An assignment expression lets a comprehension compute a value once and reuse it, so let's use it when we can.

In [26]:
stock = {
    'nails': 125,
    'screws': 35,
    'wingnuts': 8,
    'washers': 24
}

order = ['screws', 'wingnuts', 'clips']

def get_batches(count, size):
    # note: the // operator is the floor divide operator
    return count // size

result = {}

for name in order:
    count = stock.get(name, 0)
    batches = get_batches(count, 8)
    if batches:
        result[name] = batches
        
print(result)

{'screws': 4, 'wingnuts': 1}


We can clean this up a bit using a dictionary comprehension:

In [28]:
found = {name: get_batches(stock.get(name, 0), 8)
         for name in order
         if get_batches(stock.get(name, 0), 8)}
print(found)

{'screws': 4, 'wingnuts': 1}


This is much cleaner, but we are still repeating the `get_batches(stock.get(name, 0), 8)` call.  An easy solution to this is to use an assignment expression!

In [31]:
found = {name: batches for name in order
         if (batches := get_batches(stock.get(name, 0), 8))}

print(found)

{'screws': 4, 'wingnuts': 1}
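
One thing to watch for is evaluation order: the condition of a comprehension runs before the value expression, so the walrus has to live in the condition.  The sketch below illustrates this with the same `stock` data; the names `tenth` and `tenth_late` are made up for the example:

```python
stock = {'nails': 125, 'screws': 35, 'wingnuts': 8, 'washers': 24}

# Works: the condition runs first, so `tenth` is bound before the
# value expression reads it.
ok = {name: tenth for name, count in stock.items()
      if (tenth := count // 10) > 0}
print(ok)

# Fails: the condition reads `tenth_late` before the value expression
# has had a chance to assign it.
error = None
try:
    bad = {name: (tenth_late := count // 10)
           for name, count in stock.items() if tenth_late > 0}
except NameError as exc:
    error = exc
    print(f'NameError: {exc}')
```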


## Item 30: Consider Generators Instead of Returning Lists ##  

Using generators can be clearer than the alternative of building up and returning a `list` of accumulated results.  
Generators can produce a sequence of outputs for arbitrarily large inputs because their working memory doesn't include all inputs and outputs.  

A generator function is any function that produces values with the `yield` keyword instead of `return`.  Calling it doesn't run the body right away; it returns an iterator, and each call to `next` runs the function up to its next `yield`.



In [32]:
def index_words(text):
    result = []
    if text:
        result.append(0)
    for index, letter in enumerate(text):
        if letter == ' ':
            result.append(index + 1)
    return result

address = 'Four score and seven years ago...'
result = index_words(address)
print(result[:10])

[0, 5, 11, 15, 21, 27]


There are two problems with using this approach:
1. The code is a bit dense and noisy.
2. `index_words` requires all results to be stored in the list before being returned.  For huge inputs, this can cause a program to run out of memory and crash.
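
Both problems go away if the function is rewritten as a generator.  A minimal sketch of that rewrite (the name `index_words_iter` is my own choice for the example):

```python
def index_words_iter(text):
    # Yield the starting index of each word instead of collecting
    # the indexes in a list.
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1

address = 'Four score and seven years ago...'

it = index_words_iter(address)
print(next(it))   # first word starts at index 0
print(next(it))   # second word starts at index 5

# Wrap the generator in list() only when a list is really needed.
all_indexes = list(index_words_iter(address))
print(all_indexes)
```

The interactions with the result list are gone, and the caller decides whether to materialize the results.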

In [34]:
import itertools

address_lines = """Four score and seven years
ago our fathers brought forth on this
continent a new nation, conceived in liberty,
and dedicated to the proposition that all men
are created equal."""

with open('address.txt', 'w') as f:
    f.write(address_lines)

def index_file(handle):
    offset = 0
    for line in handle:
        if line:
            yield offset
        for letter in line:
            offset += 1
            if letter == ' ':
                yield offset
                
with open('address.txt', 'r') as f:
    it = index_file(f)
    results = itertools.islice(it, 0, 10)
    print(list(results))

[0, 5, 11, 15, 21, 27, 31, 35, 43, 51]


## Item 31: Be Defensive When Iterating Over Arguments ##



You can create your own iterable object by implementing the `__iter__` special method.  

When Python sees a statement like `for x in foo`, it actually calls `iter(foo)`.  The `iter` built-in function calls the `foo.__iter__` special method in turn.  The `__iter__` method must return an iterator object (which itself implements the `__next__` special method).  Then, the `for` loop repeatedly calls the `next` built-in function on the iterator object until it's exhausted (indicated by raising a `StopIteration` exception).  You can satisfy this protocol in your own class by implementing the `__iter__` special method as a **generator**.
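
To make the protocol concrete, here is a rough sketch of what a `for` loop does by hand.  The `Countdown` class is a hypothetical container invented just for this illustration:

```python
class Countdown:
    """Hypothetical iterable container used only for illustration."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Written as a generator, so each call returns a fresh iterator.
        current = self.start
        while current > 0:
            yield current
            current -= 1

foo = Countdown(3)

# Roughly what `for x in foo: collected.append(x)` does under the hood.
collected = []
iterator = iter(foo)          # calls foo.__iter__()
while True:
    try:
        x = next(iterator)    # calls iterator.__next__()
    except StopIteration:
        break                 # the iterator is exhausted
    collected.append(x)

print(collected)
print(list(foo))              # a second pass works: __iter__ runs again
```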

### Motivating example:
Say that I want to analyze tourism numbers for Texas.  I have a data set containing the tourism numbers for each city, and I'd like to figure out what percentage of overall tourism each city receives.  To do this I need a normalization function that sums the inputs to determine the total number of tourists per year, and then divides each city's individual visitor count by the total to find that city's contribution to the whole:

In [41]:
def normalize(numbers):
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result

visits = [15, 35, 80]
percentages = normalize(visits)
print(percentages)
print(sum(percentages))

[11.538461538461538, 26.923076923076923, 61.53846153846154]
100.0


Now, we see that this works on small lists that we provide, but what if we want to scale things up and load in data from a file that contains every city in Texas?

In [43]:
# let's write data to a file:
path = 'my_numbers.txt'
with open(path, 'w') as f:
    for i in (15, 35, 80):
        f.write(f'{i}\n')   

def read_visits(data_path):
    with open(data_path) as f:
        for line in f:
            yield int(line)
                
it = read_visits(path)
percentages = normalize(it)
print(percentages)

[]


What the heck?  Calling `normalize` on the `read_visits` generator's return value produces no results!  

This occurs because an iterator produces its results only a single time.  If you iterate over an iterator or generator that has already raised a `StopIteration` exception, you won't get any results the second time around.


In [45]:
it = read_visits(path)
print(list(it))
print(list(it))

[15, 35, 80]
[]


The problem is that you won't get errors when you iterate over an already exhausted iterator.  `for` loops, the `list` constructor and many other functions throughout the Python standard library expect the `StopIteration` exception to be raised during normal operation.  These functions can't tell the difference between an iterator that has no output and an iterator that had output and is now exhausted.  

To work around this, we can try a few different things:
1. Explicitly exhaust the iterator and keep a copy of its entire contents in a list.  The problem here is that the copy of the input iterator's contents could be extremely large, which is why we went with an iterator to begin with.
2. Accept a function that returns a new iterator each time it's called.
3. A better way is to provide a new container class that implements the **iterator protocol**.
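
Option 2 could look something like the sketch below.  The name `normalize_func` and the use of a temporary file are assumptions made to keep the example self-contained:

```python
import os
import tempfile

def read_visits(data_path):
    with open(data_path) as f:
        for line in f:
            yield int(line)

def normalize_func(get_iter):
    # get_iter is a zero-argument callable that returns a NEW iterator.
    total = sum(get_iter())            # first pass over the data
    result = []
    for value in get_iter():           # second, independent pass
        result.append(100 * value / total)
    return result

# Write the sample data to a temporary file for the demonstration.
with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_path = os.path.join(tmp_dir, 'my_numbers.txt')
    with open(tmp_path, 'w') as f:
        f.write('15\n35\n80\n')
    # The lambda builds a fresh generator every time it is called.
    func_percentages = normalize_func(lambda: read_visits(tmp_path))

print(func_percentages)
```

This works, but having to pass a `lambda` around like this is clumsy, which is why the container class is the nicer option.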


In [38]:
class ReadVisits:
    """ Iterable container"""
    def __init__(self, data_path):
        self.data_path = data_path
        
    def __iter__(self):
        print('iter called')
        with open(self.data_path) as f:
            for line in f:
                yield int(line)

            
                
visits = ReadVisits(path)
percentages = normalize(visits)
print(percentages)
assert sum(percentages) == 100.0

iter called
iter called
[11.538461538461538, 26.923076923076923, 61.53846153846154]
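
To actually be defensive, the function can reject plain iterators up front instead of silently returning an empty result.  A minimal sketch, assuming a check based on `collections.abc.Iterator` (testing `iter(numbers) is numbers` would be an alternative); `normalize_defensive` and `visit_counts` are names chosen for the example:

```python
from collections.abc import Iterator

def normalize_defensive(numbers):
    # An iterator can only be consumed once, so refuse it outright.
    if isinstance(numbers, Iterator):
        raise TypeError('Must supply a container')
    total = sum(numbers)
    return [100 * value / total for value in numbers]

visit_counts = [15, 35, 80]
defensive_percentages = normalize_defensive(visit_counts)
print(defensive_percentages)

rejected = False
try:
    normalize_defensive(iter(visit_counts))
except TypeError as exc:
    rejected = True
    print(f'TypeError: {exc}')
```

Lists and iterable containers like `ReadVisits` pass the check, while generators and other one-shot iterators fail loudly.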


## Item 32: Consider Generator Expressions for Large List Comprehensions ##

Say that I want to read a file and return the number of characters on each line.  Doing this with a list comprehension would require holding the length of every line of the file in memory.  If the file is enormous, or the input is a never-ending network socket, using list comprehensions would be problematic.  

A better way would be to use **generator expressions (generator comprehensions)**, which are a generalization of list comprehensions and generators.  Generator expressions don't materialize the whole output sequence when they're run.  Instead, generator expressions evaluate to an iterator that yields one item at a time from the expression.  

You can create a generator expression by putting list-comprehension-like syntax between `()` characters.

In [51]:
# create a file of random numbers to use in this example
import random 

path = 'my_file.txt'

with open(path, 'w') as f:
    for _ in range(10):
        f.write('a' * random.randint(0, 100))
        f.write('\n')

In [56]:
# list comprehension:
value = [len(x) for x in open(path)]
print(value)

# generator expression:
it = (len(x) for x in open(path))
print(it)

for x in it:
    print(x)

[35, 99, 48, 10, 54, 60, 13, 96, 18, 29]
<generator object <genexpr> at 0x7f83445e1820>
35
99
48
10
54
60
13
96
18
29


Another powerful feature of generator expressions is that they can be composed together.  See this example where we take the iterator returned by the generator expression above and use it as the input for another generator expression:  

Each time I advance this iterator, it also advances the interior iterator, creating a domino effect of looping, evaluating conditional expressions, and passing around inputs and outputs, all while being as memory efficient as possible.

In [59]:
it = (len(x) for x in open(path))

roots = ((x, x**0.5) for x in it)

#print(next(roots))
# remember, the for loop will keep calling next() on the iterable object
for r in roots:
    print(r)

(35, 5.916079783099616)
(99, 9.9498743710662)
(48, 6.928203230275509)
(10, 3.1622776601683795)
(54, 7.3484692283495345)
(60, 7.745966692414834)
(13, 3.605551275463989)
(96, 9.797958971132712)
(18, 4.242640687119285)
(29, 5.385164807134504)


Chaining generators together like this executes very quickly in Python.  When you're looking for a way to compose functionality that's operating on a large stream of input, generator expressions are a great choice.  The only gotcha is that the iterators returned by generator expressions are stateful, so you must be careful not to use these iterators more than once.
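
A quick sketch of that gotcha, using a small in-memory list of made-up words instead of a file:

```python
words = ['alpha', 'be', 'gamma']
lengths = (len(word) for word in words)

first_total = sum(lengths)    # consumes the generator
second_total = sum(lengths)   # nothing left, so the sum is 0
print(first_total, second_total)

# If the values are needed twice, rebuild the generator expression or,
# memory permitting, materialize the values once.
cached = [len(word) for word in words]
print(sum(cached), max(cached))
```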

## Item 33: Compose Multiple Generators with `yield from` ##

The `yield from` expression allows you to yield all values from a nested generator before returning control to the parent generator.  

Let's say I have a graphical program that's using generators to animate the movement of images onscreen.  I can define two generators that yield the expected onscreen time deltas for each part of the animation that I'm looking to create:

In [68]:
def move(period, speed):
    for _ in range(period):
        yield speed

def pause(delay):
    for _ in range(delay):
        yield 0
        
def render(delta):
    print(f'Delta: {delta:.1f}')
    # Do whatever else is needed to move the image onscreen
    
def run(func):
    for delta in func():
        render(delta)
        
def animate():
    for delta in move(4, 5.0):
        yield delta
    for delta in pause(3):
        yield delta
    for delta in move(2, 3.0):
        yield delta
        
run(animate)

Delta: 5.0
Delta: 5.0
Delta: 5.0
Delta: 5.0
Delta: 0.0
Delta: 0.0
Delta: 0.0
Delta: 3.0
Delta: 3.0


The problem with the above code is the repetitive nature of the `animate` function.  The redundancy of the `for` statements and `yield` expressions for each generator adds noise and reduces readability.  This would get pretty unreadable if we had a bunch more generators that we were working with.

A better way is to use the `yield from` expression, which allows you to yield all values from a nested generator before returning control to the parent generator.  `yield from` essentially causes the Python interpreter to handle the nested `for` loop and `yield` expressions boilerplate for you, which results in better performance.

In [69]:
def animate_composed():
    yield from move(4, 5.0)
    yield from pause(3)
    yield from move(2, 3.0)
    
run(animate_composed)

Delta: 5.0
Delta: 5.0
Delta: 5.0
Delta: 5.0
Delta: 0.0
Delta: 0.0
Delta: 0.0
Delta: 3.0
Delta: 3.0


## Item 34: Avoid Injecting Data into Generators with `send` ##

`yield` expressions provide generator functions with a simple way to produce an iterable series of output values.  However, this channel appears to be unidirectional: there's no immediately obvious way to simultaneously stream data in and out of a generator as it runs.  

Python generators support the `send` method, which upgrades `yield` expressions into a two-way channel.  The `send` method can be used to provide streaming inputs to a generator at the same time it's yielding outputs.  

When I call the `send` method instead of iterating the generator with a `for` loop or the `next` built-in function, the supplied parameter becomes the value of the `yield` expression when the generator is resumed.  

In [70]:
def my_generator():
    received = yield 1
    print(f'received = {received}')
    
it = iter(my_generator())
output = next(it)
print(f'Output = {output}')

try:
    next(it)
except StopIteration:
    pass

Output = 1
received = None


In [75]:
it = iter(my_generator())
output = it.send(None)
print(f'output = {output}')

try:
    it.send('Hello!')
except StopIteration:
    pass


output = 1
received = Hello!


I can take advantage of this send behavior in order to modulate the amplitude of a sine wave based on an input signal, though this can quickly get messy and return unexpected results as shown below:

In [82]:
import math


def transmit(output):
    if output is None:
        print(f'Output is None')
    else:
        print(f'Output: {output:>5.1f}')

def wave(amplitude, steps):
    step_size = 2 * math.pi / steps
    for step in range(steps):
        radians = step * step_size
        fraction = math.sin(radians)
        output = amplitude * fraction
        yield output

def wave_modulating(steps):
    step_size = 2 * math.pi / steps
    amplitude = yield             # Receive initial amplitude
    for step in range(steps):
        radians = step * step_size
        fraction = math.sin(radians)
        output = amplitude * fraction
        amplitude = yield output  # Receive next amplitude


def run_modulating(it):
    amplitudes = [
        None, 7, 7, 7, 2, 2, 2, 2, 10, 10, 10, 10, 10]
    for amplitude in amplitudes:
        output = it.send(amplitude)
        transmit(output)
        
print('Run the modulating wave function with yield')
run_modulating(wave_modulating(12))
print('\n')

'''
def complex_wave():
    yield from wave(7.0, 3)
    yield from wave(2.0, 4)
    yield from wave(10.0, 5)

run(complex_wave())
'''


def complex_wave_modulating():
    yield from wave_modulating(3)
    yield from wave_modulating(4)
    yield from wave_modulating(5)

print('Run the modulating wave function with yield from')
run_modulating(complex_wave_modulating())
print('Woah, lots of unexpected Nones')

Run the modulating wave function with yield
Output is None
Output:   0.0
Output:   3.5
Output:   6.1
Output:   2.0
Output:   1.7
Output:   1.0
Output:   0.0
Output:  -5.0
Output:  -8.7
Output: -10.0
Output:  -8.7
Output:  -5.0


Run the modulating wave function with yield from
Output is None
Output:   0.0
Output:   6.1
Output:  -6.1
Output is None
Output:   0.0
Output:   2.0
Output:   0.0
Output: -10.0
Output is None
Output:   0.0
Output:   9.5
Output:   5.9
Woah, lots of unexpected Nones


A better way to do this is to avoid the `send` method altogether and instead provide an input iterator to a set of composed generators.

In [74]:
import math

def transmit(output):
    if output is None:
        print(f'Output is None')
    else:
        print(f'Output: {output:>5.1f}')

def wave_cascading(amplitude_it, steps):
    step_size = 2 * math.pi / steps
    for step in range(steps):
        radians = step * step_size
        fraction = math.sin(radians)
        amplitude = next(amplitude_it)  # Get next input
        output = amplitude * fraction
        yield output
        
def complex_wave_cascading(amplitude_it):
    yield from wave_cascading(amplitude_it, 3)
    yield from wave_cascading(amplitude_it, 4)
    yield from wave_cascading(amplitude_it, 5)
    
def run_cascading():
    amplitudes = [7, 7, 7, 2, 2, 2, 2, 10, 10, 10, 10, 10]
    it = complex_wave_cascading(iter(amplitudes))
    for amplitude in amplitudes:
        output = next(it)
        transmit(output)
        
run_cascading()        

Output:   0.0
Output:   6.1
Output:  -6.1
Output:   0.0
Output:   2.0
Output:   0.0
Output:  -2.0
Output:   0.0
Output:   9.5
Output:   5.9
Output:  -5.9
Output:  -9.5


## Item 35: Avoid Causing State Transitions in Generators with `throw` ##

In addition to `yield from` expressions and the `send` method, another advanced generator feature is the `throw` method for re-raising `Exception` instances within generator functions.  The way this works is simple: when the method is called, the generator resumes at the `yield` expression where it was paused, but that `yield` re-raises the provided `Exception` instance instead of continuing normally:

In [84]:
class MyError(Exception):
    pass

def my_generator():
    yield 1
    yield 2
    yield 3
    
it = my_generator()
print(next(it))
print(next(it))
print(it.throw(MyError('test error')))

1
2


MyError: test error

Using `throw` harms readability because it requires additional nesting and boilerplate in order to raise and catch exceptions.
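
To see how much boilerplate is involved, here is a rough sketch of a resettable countdown timer driven by `throw`.  The `Reset` exception, the `timer` generator, and the `reset_steps` parameter are all invented for this illustration:

```python
class Reset(Exception):
    pass

def timer(period):
    current = period
    while current:
        current -= 1
        try:
            yield current
        except Reset:
            # The caller used throw(Reset()) to restart the countdown.
            current = period

def run_with_throw(reset_steps):
    # reset_steps: set of loop steps at which a reset is requested.
    # Step 0 must not be in it, because throwing into a generator that
    # has not started yet raises the exception before any try block.
    it = timer(4)
    seen = []
    step = 0
    while True:
        try:
            if step in reset_steps:
                current = it.throw(Reset())
            else:
                current = next(it)
        except StopIteration:
            break
        else:
            seen.append(current)
        step += 1
    return seen

print(run_with_throw(reset_steps={2}))
```

Every caller has to repeat the `while`/`try`/`except StopIteration` dance, and the generator itself needs a `try` block around its `yield`.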

A better way to provide exceptional behavior in generators is to use a class that implements the `__iter__` method along with methods to cause exceptional state transitions.

In [86]:
class Timer:
    def __init__(self, period):
        self.current = period
        self.period = period
        
    def reset(self):
        self.current = self.period
        
    def __iter__(self):
        while self.current:
            self.current -= 1
            yield self.current
            
def check_for_reset():
    # poll for external event
    pass

def announce(remaining):
    print(f'{remaining} ticks remaining')
    
def run():
    timer = Timer(4)
    for current in timer:
        if check_for_reset():
            timer.reset()
        announce(current)
        
run()

3 ticks remaining
2 ticks remaining
1 ticks remaining
0 ticks remaining


## Item 36: Consider `itertools` when Working with Iterators and Generators ##

The `itertools` built-in module contains a large number of functions that are useful for organizing and interacting with iterators.  Whenever you find yourself dealing with tricky iteration code, it's worth looking at the `itertools` documentation again to see if there's anything in there for you to use.



### Linking iterators together

In [98]:
import itertools

# Use chain to combine multiple iterators into a single sequential iterator:
print('--itertools.chain()')
it = itertools.chain([1, 2, 3],  [4, 5, 6])
print(list(it))

# use repeat to output a single value forever, or use the second parameter
# to specify a maximum number of times:
print('\n--itertools.repeat()')
it = itertools.repeat('hello', 3)
print(list(it))

# Use cycle to repeat an iterator's items forever:
print('\n--itertools.cycle()')
it = itertools.cycle([1, 2])
result = [next(it) for _ in range(10)]
print(result)

# Use tee to split a single iterator into the number of parallel iterators
# specified by the second parameter
print('\n--itertools.tee()')
it1, it2, it3 = itertools.tee(['first', 'second'], 3)
print(list(it1))
print(list(it2))
print(list(it3))

# Use zip_longest to return a placeholder value when an iterator is exhausted
# which may happen if iterators have different lengths:
print('\n--itertools.zip_longest()')
keys = ['one', 'two', 'three']
values = [1, 2]

normal = list(zip(keys, values))
print('zip:        ', normal)

it = itertools.zip_longest(keys, values, fillvalue='nope')
longest = list(it)
print('zip_longest:', longest)

--itertools.chain()
[1, 2, 3, 4, 5, 6]

--itertools.repeat()
['hello', 'hello', 'hello']

--itertools.cycle()
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

--itertools.tee()
['first', 'second']
['first', 'second']
['first', 'second']

--itertools.zip_longest()
zip:         [('one', 1), ('two', 2)]
zip_longest: [('one', 1), ('two', 2), ('three', 'nope')]


### Filtering Items from an Iterator

In [105]:
# Use islice to slice an iterator by numerical indices without copying.
print('\n--itertools.islice()')

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

first_five = itertools.islice(values, 5)
print('First five: ', list(first_five))

middle_odds = itertools.islice(values, 2, 8, 2)
print('Middle odds:', list(middle_odds))


# Use takewhile to return items from an iterator until a predicate function
# returns False for an item
print('\n--itertools.takewhile(less_than_seven)')

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
less_than_seven = lambda x: x < 7
it = itertools.takewhile(less_than_seven, values)
print(list(it))


# Use dropwhile, which is the opposite of takewhile, to skip items
# from an iterator until the predicate function returns False for the first time
print('\n--itertools.dropwhile(less_than_seven)')

it = itertools.dropwhile(less_than_seven, values)
print(list(it))


# Use filterfalse, which is the opposite of the filter built-in function,
# to return all items from an iterator where a predicate function returns False
print('\n--filter(evens)')

evens = lambda x: x % 2 == 0

filter_result = filter(evens, values)
print('Filter:      ', list(filter_result))

print('\n--itertools.filterfalse(evens)')
filter_false_result = itertools.filterfalse(evens, values)
print('Filter false:', list(filter_false_result))


--itertools.islice()
First five:  [1, 2, 3, 4, 5]
Middle odds: [3, 5, 7]

--itertools.takewhile(less_than_seven)
[1, 2, 3, 4, 5, 6]

--itertools.dropwhile(less_than_seven)
[7, 8, 9, 10]

--filter(evens)
Filter:       [2, 4, 6, 8, 10]

--itertools.filterfalse(evens)
Filter false: [1, 3, 5, 7, 9]


### Producing Combinations of Items from Iterators

In [114]:
# import the pretty print module to pretty print arbitrary python
# data structures in a form that can be used as input to the interpreter
from pprint import pprint

# Use accumulate to fold an item from the iterator into a running
# value by applying a function that takes two parameters.
# It outputs the current accumulated result for each input value:
print('\n--itertools.accumulate(values)')

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
sum_reduce = itertools.accumulate(values)
print('Sum:   ', list(sum_reduce))

def sum_modulo_20(first, second):
    output = first + second
    return output % 20

modulo_reduce = itertools.accumulate(values, sum_modulo_20)
print('Modulo:', list(modulo_reduce))


# Use product to return the Cartesian product of items from
# one or more iterators, which is a nice alternative to using deeply
# nested list comprehensions:
print('\n--itertools.product()')

single = itertools.product([1, 2], repeat=2)
print('Single:  ', list(single))

multiple = itertools.product([1, 2], ['a', 'b'])
print('Multiple:', list(multiple))


# Use permutations to return the unique ordered permutations
# of length N with items from an iterator:
print('\n--itertools.permutations()')

it = itertools.permutations([1, 2, 3, 4], 2)
pprint(list(it))


# Use combinations to return the unordered combinations
# of length N with unrepeated items from an iterator:
print('\n--itertools.combinations()')

it = itertools.combinations([1, 2, 3, 4], 2)
print(list(it))


# combinations_with_replacement is the same as combinations
# but repeated values are allowed
print('\n--itertools.combinations_with_replacement()')

it = itertools.combinations_with_replacement([1, 2, 3, 4], 2)
pprint(list(it))


--itertools.accumulate(values)
Sum:    [1, 3, 6, 10, 15, 21, 28, 36, 45, 55]
Modulo: [1, 3, 6, 10, 15, 1, 8, 16, 5, 15]

--itertools.product()
Single:   [(1, 1), (1, 2), (2, 1), (2, 2)]
Multiple: [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]

--itertools.permutations()
[(1, 2),
 (1, 3),
 (1, 4),
 (2, 1),
 (2, 3),
 (2, 4),
 (3, 1),
 (3, 2),
 (3, 4),
 (4, 1),
 (4, 2),
 (4, 3)]

--itertools.combinations()
[(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

--itertools.combinations_with_replacement()
[(1, 1), (1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]
