# Built in heroes

<!-- > "Awesome summary" -->

- hide: false
- toc: false
- badges: true
- comments: true
- categories: [python, pandas]

Title inspired by [this](https://www.youtube.com/watch?v=lyDLAutA88s) David Beazley talk. My notes on useful built-in functions and standard-library helpers.

In [4]:
import functools
import itertools
import operator

## Operator

### `itemgetter()`

[Docs](https://docs.python.org/3/library/operator.html)

In [25]:
from operator import itemgetter

Basic use:

In [33]:
print(itemgetter(1, 3, 5)('Watermelon'))
print(itemgetter(slice(5, None))('Watermelon'))
print(itemgetter('name')(dict(name='Paul', age=44)))

('a', 'e', 'm')
melon
Paul
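
As a sanity check on what `itemgetter` does, here is a small sketch comparing it with plain indexing. As far as I understand it, a single key behaves like `obj[key]`, while several keys return a tuple of lookups. The variable names are my own.

```python
from operator import itemgetter

word = 'Watermelon'

# With a single key, itemgetter behaves like plain indexing.
single = itemgetter(1)(word)

# With several keys, it returns a tuple of the looked-up items.
several = itemgetter(1, 3, 5)(word)

# A roughly equivalent lambda, for comparison.
by_hand = (lambda obj: (obj[1], obj[3], obj[5]))(word)

print(single, several, by_hand)
```

The advantage over the lambda is mostly readability, plus the getter can be built once and reused as a `key` function.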


Application (from docs):

In [40]:
inventory = [('apple', 3), ('banana', 2), ('pear', 5), ('orange', 1)]

getcount = itemgetter(1)

# get second item from list
print(getcount(inventory))

# get second item from each element in list
list(map(getcount, inventory))

('banana', 2)


[3, 2, 5, 1]

Application: sorting a list of dictionaries (from Python Cookbook recipe 1.13)

In [46]:
data = [
    {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
    {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
    {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
    {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}
]

# get third row (index 2) from data
print( itemgetter(2)(data) )

# get ids from all rows
print( list(map(itemgetter('uid'), data)) )

# sort data by fname and lname
sorted(data, key=itemgetter('fname', 'lname'))

{'fname': 'John', 'lname': 'Cleese', 'uid': 1001}
[1003, 1002, 1001, 1004]


[{'fname': 'Big', 'lname': 'Jones', 'uid': 1004},
 {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
 {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
 {'fname': 'John', 'lname': 'Cleese', 'uid': 1001}]
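
The same kind of key function should also work with `min()` and `max()`, and the order of the keys decides the sort priority. A minimal sketch, reusing the rows from above under the name `rows` (my naming):

```python
from operator import itemgetter

rows = [
    {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
    {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
    {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
    {'fname': 'Big', 'lname': 'Jones', 'uid': 1004},
]

# Row with the smallest and the largest uid.
lowest = min(rows, key=itemgetter('uid'))
highest = max(rows, key=itemgetter('uid'))

# Sort by last name first, then by first name to break ties.
by_lname_fname = sorted(rows, key=itemgetter('lname', 'fname'))

print(lowest['fname'], highest['fname'])
print([row['uid'] for row in by_lname_fname])
```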

### `attrgetter()`

In [26]:
from operator import attrgetter

Basic use:

In [11]:
def greeter():
    print('Hello')
    
f = operator.attrgetter('__name__')
f(greeter)

'greeter'

Application: sorting objects without native comparison support (from Python Cookbook recipe 1.14)

In [24]:
class User:
    def __init__(self, name, user_id):
        self.user_id = user_id
        self.name = name
        
    def __repr__(self):
        return 'User({!r}, {!r})'.format(self.name, self.user_id)

    
users = [User('Paul', 1), User('Petra', 3), User('Petra', 4), User('Paul', 5)]

users.sort(key=operator.attrgetter('name', 'user_id'))
users

[User('Paul', 1), User('Paul', 5), User('Petra', 3), User('Petra', 4)]
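
`attrgetter` also accepts dotted names, which reach into nested attributes. A small sketch with made-up records (the `orders` objects and their fields are hypothetical, built with `SimpleNamespace` only for illustration):

```python
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical nested records, used only for illustration.
orders = [
    SimpleNamespace(id=1, customer=SimpleNamespace(name='Petra')),
    SimpleNamespace(id=2, customer=SimpleNamespace(name='Paul')),
]

# The dotted name is resolved step by step: order.customer.name
get_customer_name = attrgetter('customer.name')

names = [get_customer_name(order) for order in orders]
ordered_ids = [order.id for order in sorted(orders, key=get_customer_name)]

print(names, ordered_ids)
```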

## Functools

### `partial`

In [9]:
print(operator.mul(2, 3))
# freeze the first argument of mul at 3
triple = functools.partial(operator.mul, 3)
triple(2)

6


6
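
To see what `partial` actually stores, here is a small sketch that inspects the resulting object. The `basetwo` helper follows the example in the `functools` docs; the other names are mine.

```python
from functools import partial
from operator import mul

# Freeze the first positional argument of mul.
triple = partial(mul, 3)
result = triple(2)

# A partial object remembers the wrapped function and the frozen arguments.
wrapped = triple.func
frozen_args = triple.args

# Keyword arguments can be frozen as well.
basetwo = partial(int, base=2)
parsed = basetwo('10010')

print(result, frozen_args, basetwo.keywords, parsed)
```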

## Iterable unpacking

When looping over a list of records (possibly of unequal length), we can unpack each record's elements directly in the loop header using a star expression. (From Python Cookbook recipe 1.2.)

In [11]:
records = [
    ('foo', 1, 2),
    ('bar', 'hello')
]

In [13]:
# conventional loop

for record in records:
    print(record)

('foo', 1, 2)
('bar', 'hello')


In [14]:
# accessing items

for a, *b in records:
    print(a)
    print(b)

foo
[1, 2]
bar
['hello']


In [15]:
# example use

def do_foo(x, y):
    print(f'foo: args are {x} and {y}.')
    
def do_bar(x):
    print(f'bar: arg is {x}.')

for tag, *args in records:
    if tag == 'foo':
        do_foo(*args)
    elif tag == 'bar':
        do_bar(*args)

foo: args are 1 and 2.
bar: arg is hello.
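
The starred name does not have to come last, and it always collects into a list, even when nothing is left over. A short sketch with made-up values:

```python
# The starred name can sit in the middle of the targets ...
first, *middle, last = [10, 8, 7, 1, 9, 5]

# ... or at the front; any iterable works, including strings.
*head, tail = 'abc'

# With nothing left to collect, the starred name is an empty list.
name, *rest = ('solo',)

print(first, middle, last)
print(head, tail)
print(name, rest)
```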


## Creating a `callable_iterator`

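Called with two arguments, `iter(callable, sentinel)` returns an iterator that calls `callable` repeatedly and stops as soon as it returns `sentinel`. For example, to roll a die until a 6 is rolled: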

In [38]:
import random

def roll():
    return random.randint(1, 6)

roll_iter = iter(roll, 6)
roll_iter

<callable_iterator at 0x112c75850>

In [39]:
for r in roll_iter:
    print(r)

1
2
4
2


In [41]:
list(iter(roll, 4))

[3, 1, 2, 5, 3, 5, 3]
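
Because the die is random, the outputs above differ on every run. A deterministic sketch of the same idea, using a scripted sequence of "rolls" (the `next_roll` helper is my own stand-in for `roll`):

```python
# A scripted sequence of rolls instead of random ones.
rolls = iter([3, 1, 6, 2, 6])

def next_roll():
    """Return the next scripted roll."""
    return next(rolls)

# Each callable_iterator stops when the callable returns the sentinel (6).
# The sentinel itself is not part of the output.
first_run = list(iter(next_roll, 6))
second_run = list(iter(next_roll, 6))

print(first_run, second_run)
```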

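To read a file line by line until the end of the file (`readline()` returns an empty string `''` only at EOF; a blank line is returned as `'\n'`):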

In [None]:
with open('filepath') as f:
    for line in iter(f.readline, ''):
        process_line(line)
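
A self-contained sketch of the sentinel behaviour, using `io.StringIO` as a stand-in for a real file (the sample text is made up). `readline()` returns `''` only at end of file, so a blank line in the middle does not stop the iteration unless `'\n'` is used as the sentinel:

```python
import io

text = 'first\n\nthird\n'

# Sentinel '' is only returned at end of file, so all lines are read.
all_lines = list(iter(io.StringIO(text).readline, ''))

# Sentinel '\n' stops at the first blank line instead.
until_blank = list(iter(io.StringIO(text).readline, '\n'))

print(all_lines)
print(until_blank)
```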

## `itertools.starmap()`

Using `starmap()` to calculate a running average:

In [12]:
numbers = range(1, 11, 4)
list(itertools.starmap(lambda a, b: b / a, 
                       enumerate(itertools.accumulate(numbers), 1)))

[1.0, 3.0, 5.0]

In [13]:
list(itertools.accumulate(numbers))

[1, 6, 15]

In [14]:
list(enumerate(itertools.accumulate(numbers), 1))

[(1, 1), (2, 6), (3, 15)]
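
For comparison, `starmap(f, pairs)` behaves roughly like `(f(*pair) for pair in pairs)`, so the running average can also be written as a comprehension. A sketch with my own variable names:

```python
import itertools

numbers = range(1, 11, 4)  # 1, 5, 9

# starmap unpacks each (count, running_total) pair into the lambda's arguments.
with_starmap = list(itertools.starmap(
    lambda count, total: total / count,
    enumerate(itertools.accumulate(numbers), 1),
))

# The same computation, unpacking the pairs in a comprehension.
with_comprehension = [
    total / count
    for count, total in enumerate(itertools.accumulate(numbers), 1)
]

print(with_starmap, with_comprehension)
```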

Repeating each character by its position:

In [63]:
name = 'Emily'
list(itertools.starmap(operator.mul, enumerate(name, 1)))

['E', 'mm', 'iii', 'llll', 'yyyyy']

## `functools.reduce()`

In [2]:
import pandas as pd

df = pd.DataFrame({'AAA': [4, 5, 6, 7],
                   'BBB': [10, 20, 30, 40],
                   'CCC': [100, 50, -30, -50]})

What I usually do

In [3]:
crit1 = df.AAA > 5
crit2 = df.BBB > 30
crits = crit1 & crit2

df[crits]

|   | AAA | BBB | CCC |
|---|-----|-----|-----|
| 3 | 7   | 40  | -50 |


Alternative using `functools.reduce()`

In [9]:
import functools
crit1 = df.AAA > 5
crit2 = df.BBB > 30
crits = [crit1, crit2]

mask = functools.reduce(lambda x, y: x & y, crits)
df[mask]


|   | AAA | BBB | CCC |
|---|-----|-----|-----|
| 3 | 7   | 40  | -50 |
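
The payoff of `reduce()` shows once there are more than two criteria, and `operator.and_` (the function form of `&`) can replace the lambda. To keep the sketch free of dependencies, plain Python sets of row labels stand in for the boolean masks, and the third criterion (`CCC < 0`) is a hypothetical addition of mine. Since `operator.and_(a, b)` is just `a & b`, the same call should work on pandas boolean Series.

```python
import functools
import operator

# Column values keyed by row label, mirroring the frame above.
aaa = {0: 4, 1: 5, 2: 6, 3: 7}
bbb = {0: 10, 1: 20, 2: 30, 3: 40}
ccc = {0: 100, 1: 50, 2: -30, 3: -50}

# Each criterion is the set of row labels that satisfy it.
crit1 = {label for label, value in aaa.items() if value > 5}
crit2 = {label for label, value in bbb.items() if value > 30}
crit3 = {label for label, value in ccc.items() if value < 0}  # hypothetical extra criterion

# reduce applies & pairwise: (crit1 & crit2) & crit3
matching = functools.reduce(operator.and_, [crit1, crit2, crit3])

print(matching)
```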


## Understanding `dropwhile()`

From [docs](https://docs.python.org/3/library/itertools.html#itertools.dropwhile)

In [22]:
def dropwhile(predicate, iterable):
    iterable = iter(iterable)
    for x in iterable:
        if not predicate(x):
            yield x
            break
    for x in iterable:
        yield x

predicate = lambda x: x < 5
iterable = [1, 2, 3, 6, 7, 3] 
list(dropwhile(predicate, iterable))

[6, 7, 3]

What happens here?

1. `iter()` turns the iterable into an iterator, which is consumed as it is iterated over: every element it hands out is gone for good.

2. The first `for` loop consumes elements until one fails the condition in `predicate`. That element is yielded, and `break` ends the loop.

3. Because of step 1, the iterator now holds only the elements after the one that ended the first loop. The second `for` loop yields all of them without checking `predicate` again.
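
One consequence of the steps above is that the predicate stops mattering after its first failure, which is what separates `dropwhile()` from a plain filter. A small sketch using the library version and the same numbers (variable names are mine):

```python
import itertools

values = [1, 2, 3, 6, 7, 3]

def is_small(x):
    return x < 5

# Drops the leading run of small values, then yields everything else.
dropped = list(itertools.dropwhile(is_small, values))

# The mirror image: keeps only the leading run of small values.
taken = list(itertools.takewhile(is_small, values))

# A filter checks every element, so the trailing 3 is removed as well.
filtered = [x for x in values if not is_small(x)]

print(dropped, taken, filtered)
```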

In [29]:
def sensemaker(predicate, iterable):
    iterable = iter(iterable)
    for x in iterable:
        if not predicate(x):
            print('First loop')
            print(x)
            break
    print('Second loop')
    for x in iterable:
        print(x)

sensemaker(predicate, iterable)

First loop
6
Second loop
7
3


In [30]:
def sensemaker(predicate, iterable):
#     iterable = iter(iterable)
    for x in iterable:
        if not predicate(x):
            print('First loop')
            print(x)
            break
    print('Second loop')
    for x in iterable:
        print(x)

sensemaker(predicate, iterable)

First loop
6
Second loop
1
2
3
6
7
3


If we don't turn the iterable into an iterator, nothing gets exhausted: each `for` loop implicitly calls `iter()` on the list and receives a fresh iterator, so the second loop starts again from the first element.
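
The difference between a list and an iterator can be checked directly. A minimal sketch with the same numbers (variable names are mine):

```python
values = [1, 2, 3, 6, 7, 3]

# A list hands out a new, independent iterator on every iter() call.
fresh_a = iter(values)
fresh_b = iter(values)
next(fresh_a)
restarted = next(fresh_b)  # starts from the first element again

# An iterator returns itself from iter(), so its position is shared.
it = iter(values)
same_object = iter(it) is it
consumed = [next(it), next(it)]
remaining = list(it)
after_exhaustion = list(it)

print(restarted, same_object, consumed, remaining, after_exhaustion)
```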

From [more-itertools](https://more-itertools.readthedocs.io/en/stable/index.html):

In [43]:
import more_itertools

more_itertools.take(4, more_itertools.pairwise(itertools.count()))

[(0, 1), (1, 2), (2, 3), (3, 4)]
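
The same result can be had from the standard library alone. The sketch below uses the classic `tee`-based recipe for `pairwise` plus `itertools.islice` in place of `take`. The function is my own rendering of that recipe, not the `more_itertools` implementation, and Python 3.10+ also ships `itertools.pairwise()` directly.

```python
import itertools

def pairwise(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = itertools.tee(iterable)
    next(b, None)  # advance the second copy by one element
    return zip(a, b)

# islice takes the first four pairs from the infinite stream.
first_four = list(itertools.islice(pairwise(itertools.count()), 4))

print(first_four)
```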

## Sources

- [Fluent Python](https://www.oreilly.com/library/view/fluent-python/9781491946237/)
- [Pandas cookbook](https://pandas.pydata.org/pandas-docs/stable/user_guide/cookbook.html#cookbook)