### Selecting and Filtering

The *filter* function

You should already be familiar with the *filter* function -> *filter*(predicate, iterable)  
-> returns all elements of iterable for which predicate(element) is **truthy**  
The predicate can be **None**, in which case the identity function is used -> f(x) = x  
-> in other words, only truthy elements are retained!

*filter* returns a **lazy iterator**

The same result can be achieved using a generator expression:  
(item for item in iterable if pred(item)) -> predicate is not None  
(item for item in iterable if item) -> predicate is None  
or  
(item for item in iterable if bool(item)) -> predicate is None
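
As a quick check of the equivalence above, the sketch below compares *filter* with the matching generator expressions. The predicate name `is_small` and the sample lists are only illustrative choices, not part of any library.

```python
def is_small(x):
    # Hypothetical predicate used only for this illustration
    return x < 4

values = [1, 10, 2, 10, 3, 10]
mixed = [0, '', 'hello', 100, False]

# Predicate is not None: filter vs. generator expression
with_pred_filter = list(filter(is_small, values))
with_pred_genexpr = list(item for item in values if is_small(item))

# Predicate is None: only truthy items survive
none_filter = list(filter(None, mixed))
none_genexpr = list(item for item in mixed if item)

print(with_pred_filter == with_pred_genexpr)  # True
print(none_filter == none_genexpr)            # True
```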


In [5]:
#Example:

print(list(filter(lambda x: x < 4, [1, 10, 2, 10, 3, 10])))
print(list(filter(None, [0,'','hello', 100, False])))

[1, 2, 3]
['hello', 100]


itertools.*filterfalse*  
This works the same way as the *filter* function, but instead of retaining elements for which the predicate returns a **truthy** value, it retains elements for which the predicate returns a **falsy** value

In [8]:
#Example:
from itertools import filterfalse

print(list(filterfalse(lambda x: x < 4, [1, 10, 2, 10, 3, 10])))
print(list(filterfalse(None, [0,'','hello', 100, False])))

[10, 10, 10]
[0, '', False]
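
Because *filter* and *filterfalse* are complements, using both with the same predicate splits an iterable into two groups. A minimal sketch, assuming a hypothetical helper named `partition` (it is not an itertools function) and an input small enough to materialise into a list:

```python
from itertools import filterfalse

def partition(pred, iterable):
    # Hypothetical helper: copy the items into a list so that both
    # passes see the same elements, even if `iterable` is an iterator.
    items = list(iterable)
    return list(filter(pred, items)), list(filterfalse(pred, items))

kept, rejected = partition(lambda x: x < 4, [1, 10, 2, 10, 3, 10])
print(kept)      # [1, 2, 3]
print(rejected)  # [10, 10, 10]
```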


itertools.*compress*  
It filters one iterable using the truthiness of the corresponding items in another iterable, and stops as soon as either iterable is exhausted

In [11]:
#Example:
from itertools import compress

data = ['a', 'b', 'c', 'd', 'e']
# What about 'e'? selectors is shorter than data, and compress stops at
# the shorter of the two, so 'e' is never yielded.
selectors = [True, False, 1, 0]

list(compress(data, selectors))

['a', 'c']
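
One way to picture *compress* is as a *zip* of the two iterables followed by a truthiness test on the selector. The sketch below uses a hypothetical function name, `compress_sketch`, to show that idea; it is a rough equivalent for illustration, not the actual implementation.

```python
from itertools import compress

def compress_sketch(data, selectors):
    # Pair each item with its selector; zip stops at the shorter input,
    # which is why trailing items without a selector are dropped.
    return (item for item, selector in zip(data, selectors) if selector)

data = ['a', 'b', 'c', 'd', 'e']
selectors = [True, False, 1, 0]

sketch_result = list(compress_sketch(data, selectors))
real_result = list(compress(data, selectors))

print(sketch_result)                 # ['a', 'c']
print(sketch_result == real_result)  # True
```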

Note that both filterfalse and compress return lazy iterators too, as do the other functions in itertools (tee returns a tuple of lazy iterators).
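
To see the laziness in action, the sketch below records every call made to the predicate. The names `calls` and `noisy_is_even` are made up for this illustration.

```python
from itertools import filterfalse

calls = []

def noisy_is_even(x):
    # Hypothetical predicate that logs each value it is asked about
    calls.append(x)
    return x % 2 == 0

lazy = filterfalse(noisy_is_even, [1, 2, 3, 4])
calls_before = list(calls)       # nothing has been evaluated yet

first = next(lazy)               # evaluates just enough to yield one item
calls_after_first = list(calls)

rest = list(lazy)                # consumes the remaining items

print(calls_before)       # []
print(first)              # 1
print(calls_after_first)  # [1]
print(rest)               # [3]
```

Creating the iterator does no work; the predicate only runs as items are requested.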

itertools.*takewhile*  
takewhile(pred, iterable)  
The *takewhile* function returns an iterator that yields items while pred(item) is truthy  
-> as soon as pred(item) is falsy, the iterator is exhausted, even if later items in the iterable would satisfy the predicate

In [20]:
#Example:
from itertools import takewhile

takewhile_iter = takewhile(lambda x: x < 5, [1, 3, 5, 2, 1])

In [21]:
type(takewhile_iter)

itertools.takewhile

In [22]:
next(takewhile_iter)

1

In [23]:
next(takewhile_iter)

3

In [24]:
next(takewhile_iter)

StopIteration: 

Notice that 2 and 1, at indices 3 and 4 of the list passed to takewhile, satisfy the condition but are not returned. This is the difference from the filter function: the iterator is exhausted as soon as the predicate returns a falsy value for a single item.

In [26]:
list(takewhile(lambda x: x < 5, [1, 3, 5, 2, 1]))

[1, 3]
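
The contrast with *filter* is easiest to see side by side. The sketch below also uses `itertools.count` as an endless source, an extra assumption chosen to show that *takewhile* can terminate where *filter* never would.

```python
from itertools import count, takewhile

values = [1, 3, 5, 2, 1]

def below_five(x):
    return x < 5

filtered = list(filter(below_five, values))   # tests every item
taken = list(takewhile(below_five, values))   # stops at the first failure

print(filtered)  # [1, 3, 2, 1]
print(taken)     # [1, 3]

# On an infinite iterator, takewhile still terminates.
# list(filter(below_five, count(1))) would never return.
from_infinite = list(takewhile(below_five, count(1)))
print(from_infinite)  # [1, 2, 3, 4]
```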

itertools.*dropwhile*  
dropwhile(pred, iterable)  
This is similar to takewhile, except that the iterator skips items while pred(item) is truthy; once pred(item) is falsy, it yields that item and all remaining items without checking the predicate again

In [27]:
#Example:
from itertools import dropwhile

list(dropwhile(lambda x: x < 5, [1, 3, 5, 2, 1]))

[5, 2, 1]
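
With the same predicate, *takewhile* and *dropwhile* split a sequence into a head and a tail. The sketch below shows that, along with one pitfall: when both are applied to a single shared iterator, *takewhile* has already consumed the first failing item. Variable names are illustrative only.

```python
from itertools import dropwhile, takewhile

values = [1, 3, 5, 2, 1]

def below_five(x):
    return x < 5

# Applied to the list itself, each call starts from the beginning.
head = list(takewhile(below_five, values))
tail = list(dropwhile(below_five, values))
print(head, tail)             # [1, 3] [5, 2, 1]
print(head + tail == values)  # True

# Applied to one shared iterator, takewhile reads the 5 in order to
# test it, so the 5 is gone by the time we look at what is left.
shared = iter(values)
head_shared = list(takewhile(below_five, shared))
leftover = list(shared)
print(head_shared, leftover)  # [1, 3] [2, 1]
```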

#### Code Examples

In [28]:
def gen_cubes(n):
    for i in range(n):
        print(f'yielding {i}')
        yield i**3

In [29]:
def is_odd(x):
    return x % 2 == 1

In [30]:
is_odd(4), is_odd(81)

(False, True)

In [31]:
filtered = filter(is_odd, gen_cubes(10))

In [32]:
list(filtered)

yielding 0
yielding 1
yielding 2
yielding 3
yielding 4
yielding 5
yielding 6
yielding 7
yielding 8
yielding 9


[1, 27, 125, 343, 729]

In [33]:
def is_even(x):
    return x % 2 == 0

In [34]:
filtered = filter(is_even, gen_cubes(10))

In [35]:
list(filtered)

yielding 0
yielding 1
yielding 2
yielding 3
yielding 4
yielding 5
yielding 6
yielding 7
yielding 8
yielding 9


[0, 8, 64, 216, 512]

In [36]:
from itertools import filterfalse

In [37]:
filtered = filterfalse(is_odd, gen_cubes(10))

In [38]:
list(filtered)

yielding 0
yielding 1
yielding 2
yielding 3
yielding 4
yielding 5
yielding 6
yielding 7
yielding 8
yielding 9


[0, 8, 64, 216, 512]

In [39]:
filtered = filterfalse(is_even, gen_cubes(10))

In [40]:
list(filtered)

yielding 0
yielding 1
yielding 2
yielding 3
yielding 4
yielding 5
yielding 6
yielding 7
yielding 8
yielding 9


[1, 27, 125, 343, 729]

In [41]:
from itertools import dropwhile, takewhile

In [43]:
from math import sin, pi

def sine_wave(n):
    start = 0
    max_ = 2 * pi
    step = (max_ - start) / (n-1)
    
    for _ in range(n):
        yield round(sin(start), 2)
        start += step

In [53]:
list(sine_wave(15))

[0.0,
 0.43,
 0.78,
 0.97,
 0.97,
 0.78,
 0.43,
 0.0,
 -0.43,
 -0.78,
 -0.97,
 -0.97,
 -0.78,
 -0.43,
 -0.0]

In [54]:
result = takewhile(lambda x: 0 <= x <= 0.9, sine_wave(15))

In [55]:
list(result)

[0.0, 0.43, 0.78]

In [56]:
next(result)

StopIteration: 

In [57]:
result = dropwhile(lambda x: 0 <= x <= 0.9, sine_wave(15))

In [58]:
list(result)

[0.97, 0.97, 0.78, 0.43, 0.0, -0.43, -0.78, -0.97, -0.97, -0.78, -0.43, -0.0]

In [59]:
next(result)

StopIteration: 

In [60]:
from itertools import compress

In [61]:
data = list('yeetyeet')
selectors = [1, 0, 1, 0, None, '', False, True, True, True, True, True]

In [62]:
list(zip(data, selectors))

[('y', 1),
 ('e', 0),
 ('e', 1),
 ('t', 0),
 ('y', None),
 ('e', ''),
 ('e', False),
 ('t', True)]

In [66]:
[item for item, truth_value in zip(data, selectors) if truth_value]

['y', 'e', 't']

In [65]:
compress(data, selectors)

<itertools.compress at 0x23a1b3a2e88>

In [67]:
list(compress(data, selectors))

['y', 'e', 't']

In [68]:
list(compress(selectors, data))

[1, 0, 1, 0, None, '', False, True]

With the arguments swapped, every character in data is a non-empty string and therefore truthy, so compress yields the first eight items of selectors unchanged and stops when data is exhausted.