<h3>Ch 7: Function Decorators and Closures</h3>

Function decorators let us "mark" functions in the source code to enhance their behavior in some way. They are powerful, but mastering them requires understanding closures.

One of the newest reserved keywords in Python is `nonlocal`, introduced in Python 3.0. If you want to implement your own function decorators, you must know closures inside and out, and the need for `nonlocal` becomes obvious.

Closures are also essential for effective asynchronous programming with callbacks, and for coding in a functional style whenever it makes sense.



<h3>Decorators 101</h3>

A decorator is a callable that takes another function as argument (the decorated function). The decorator may perform some processing with the decorated function, and returns or replaces it with another function or callable object.

In [4]:
def decorate(func):
    return func  # return the function unchanged so that target stays callable

@decorate
def target():
    print("running target()")

# This has the same effect as writing this:
def target():
    print("running target()")

target = decorate(target)

The end result is the same: at the end of either snippet, the `target` name does not necessarily refer to the original `target` function, but to whatever function is returned by `decorate(target)`.

In [5]:
def deco(func):
    def inner():
        print('running inner()')
    return inner

@deco
def target():
    print('running target()')

target()

running inner()


In [6]:
# target is now a reference to inner

target

<function __main__.deco.<locals>.inner()>

The first crucial fact about decorators is that they have the power to replace the decorated function with a different one. The second crucial fact is that they are executed immediately when a module is loaded.

A key feature of decorators is that they run right after the decorated function is defined. That is usually at <i>import time</i> (i.e., when the module is loaded by Python). 

In [7]:
registry = []

def register(func):
    print('running register(%s)' % func)
    registry.append(func)
    return func

@register
def f1():
    print('running f1()')

@register
def f2():
    print('running f2()')

def f3():
    print('running f3()')

def main():
    print('running main()')
    print('registry ->', registry)
    f1()
    f2()
    f3()

if __name__ == '__main__':
    main()

running register(<function f1 at 0x7f882038adc0>)
running register(<function f2 at 0x7f88203ae040>)
running main()
registry -> [<function f1 at 0x7f882038adc0>, <function f2 at 0x7f88203ae040>]
running f1()
running f2()
running f3()


In [8]:
registry = []

def register(func):
    print('running register(%s)' % func)
    registry.append(func)
    return func

@register
def f1():
    print('running f1()')

@register
def f2():
    print('running f2()')

def f3():
    print('running f3()')

def main():
    print('running main()')
    print('registry ->', registry)
    f1()
    f2()
    f3()

# if __name__ == '__main__':
#     main()

running register(<function f1 at 0x7f882038aa60>)
running register(<function f2 at 0x7f882038adc0>)


Note that register runs twice before any other function in the module. Function decorators are executed as soon as the module is imported, but the decorated functions only run when they are explicitly invoked. This is the difference between <i>import time</i> and <i>runtime</i>.

Note that a decorator function is usually defined in one module and applied to functions in other modules (not defined in the same module like we did here). The `register` decorator returns the same function passed as an argument. In practice, most decorators define an inner function and return it.

In [9]:
# Same promos example from the refactoring in last chapter.
# Decorator adds promos to the list to avoid missing new ones if not manually added to the promos list.

promos = []

def promotion(promo_func):
    promos.append(promo_func)
    return promo_func

@promotion
def fidelity(order):
    """5% discount for customers with 1000 or more fidelity points"""
    return order.total() * .05 if order.customer.fidelity >= 1000 else 0

@promotion
def bulk_item(order):
    """10% discount for each LineItem with 20 or more units"""
    discount = 0
    for item in order.cart:
        if item.quantity >= 20:
            discount += item.total() * .1
    return discount

def best_promo(order):
    """Select best discount available"""
    return max(promo(order) for promo in promos)


Now the promos don't need to use the `_promo` suffix. It makes it easy to temporarily disable a promotion: just comment out the decorator. Promo discount strategies can now be defined in other modules, anywhere in the system, as long as the `@promotion` decorator is applied to them.

<h3>Variable Scope Rules</h3>

Let's look at these before going into closures. Most decorators do change the decorated function. They usually do it by defining an inner function and returning it to replace the decorated function. Code that uses inner functions almost always depends on closures to operate correctly. 

In [10]:
def f1(a):
    print(a)
    print(b)

f1(3)

3


NameError: name 'b' is not defined

In [11]:
# Assign a value of 6 to global b, then call f1.

b = 6
f1(3)

3
6


In [12]:
b = 6
def f2(a):
    print(a)
    print(b)
    b = 9

In [13]:
f2(3)

3


UnboundLocalError: local variable 'b' referenced before assignment

Note that the output starts with 3, which proves that the `print(a)` statement was executed. But `print(b)` never runs. This might be surprising since there is a global variable b and the assignment to the local b is made after `print(b)`.

However, when Python compiles the body of the function, it decides that b is a local variable because it is assigned within the function. The generated bytecode reflects this decision and will try to fetch b from the local environment. Later, when the call `f2(3)` is made, the body of `f2` fetches and prints the value of the local variable a, but when trying to fetch the value of local variable b, it discovers that b is unbound.
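The compile-time decision described above can be observed on the code objects themselves. The following is a minimal sketch using fresh copies of the two functions; the names `reads_global` and `assigns_local` are placeholders chosen for this illustration, not part of the notebook. Neither function is called, so no global `b` is needed.

```python
def reads_global(a):
    print(a)
    print(b)      # b is never assigned here, so it is compiled as a global lookup

def assigns_local(a):
    print(a)
    print(b)      # b is assigned below, so it is compiled as a local variable
    b = 9

# co_varnames lists local names (arguments first);
# co_names lists names looked up globally (or as attributes)
local_names_1 = reads_global.__code__.co_varnames
global_names_1 = reads_global.__code__.co_names
local_names_2 = assigns_local.__code__.co_varnames
global_names_2 = assigns_local.__code__.co_names

print(local_names_1, global_names_1)
print(local_names_2, global_names_2)
```

In the first function `b` shows up only among the global names; in the second it shows up only among the locals, which is why the lookup fails before the assignment has run.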

This is not a bug, but a design choice. Python does not require you to declare variables, but assumes that a variable assigned in the body of a function is local. This is much better than the behavior of JavaScript, which does not require variable declarations either, but if you forget to declare that a variable is local (with var), you may clobber a global variable without knowing.

If we want the interpreter to treat b as a global variable in spite of the assignment within the function, we use the `global` declaration.

In [14]:
def f3(a):
    global b
    print(a)
    print(b)
    b = 9

In [15]:
f3(3)

3
6


In [16]:
b

9

In [17]:
f3(3)

3
9


<h3>Closures</h3>

Closures are sometimes confused with anonymous functions. Defining functions inside functions is not so common, until you start using anonymous functions. And closures only matter when you have nested functions.

A closure is a function with an extended scope that encompasses nonglobal variables referenced in the body of the function but not defined there. It does not matter whether the function is anonymous or not; what matters is that it can access nonglobal variables that are defined outside of its body.

In [18]:
class Averager():

    def __init__(self):
        self.series = []

    def __call__(self, new_value):
        self.series.append(new_value)
        total = sum(self.series)
        return total / len(self.series)

In [19]:
avg = Averager()
avg(10)

10.0

In [20]:
avg(11)

10.5

In [21]:
avg(12)

11.0

In [22]:
def make_averager():
    series = []

    def averager(new_value):
        series.append(new_value)
        total = sum(series)
        return total / len(series)

    return averager

In [23]:
avg = make_averager()
avg(10)

10.0

In [24]:
avg(11)

10.5

In [25]:
avg(12)

11.0

Note the similarities in these two approaches. We can call `Averager()` or `make_averager()` to get a callable object `avg` that will update the historical series and calculate the current mean. 

The avg of the `Averager` class keeps the history in the `self.series` instance attribute. But where does the avg function in the second example find the series? The series is a local variable of `make_averager()` because the series = [] initialization happens in the body of that function. But when avg(10) is called, `make_averager` has already returned, and its local scope is long gone. 

Within averager, series is a <i>free variable</i>. This is a technical term meaning a variable that is not bound in the local scope.

Inspecting the returned `averager` object shows how Python keeps the names of local and free variables in the `__code__` attribute that represents the compiled body of the function.



In [26]:
avg.__code__.co_varnames

('new_value', 'total')

In [27]:
avg.__code__.co_freevars

('series',)

The binding for series is kept in the `__closure__` attribute of the returned function avg. Each item in `avg.__closure__` corresponds to a name in `avg.__code__.co_freevars`. These items are `cells`, and they have an attribute `cell_contents` where the actual value can be found.

In [28]:
avg.__closure__

(<cell at 0x7f87f054f3d0: list object at 0x7f87f05733c0>,)

In [29]:
avg.__closure__[0].cell_contents

[10, 11, 12]

In summary, a closure is a function that retains the bindings of the free variables that exist when the function is defined, so that they can be used later when the function is invoked and the defining scope is no longer available.
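To make the point about retained bindings concrete, here is a small sketch (the names `avg_a` and `avg_b` are illustrative only) showing that every call to `make_averager` produces a closure with its own cell, so two averagers never share history.

```python
def make_averager():
    series = []  # free variable captured by averager

    def averager(new_value):
        series.append(new_value)
        return sum(series) / len(series)

    return averager

avg_a = make_averager()
avg_b = make_averager()

avg_a(10)
avg_a(20)
avg_b(100)

# Each closure has its own cell, so the histories do not mix
history_a = avg_a.__closure__[0].cell_contents
history_b = avg_b.__closure__[0].cell_contents
print(history_a)
print(history_b)
```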

Note that the only situation in which a function may need to deal with external variables that are nonglobal is when it is nested in another function.

<h3>The nonlocal Declaration</h3>

The previous implementation of `make_averager` was not efficient. A better implementation would just store the total and the number of items so far, and compute the mean from these two numbers.

First, a broken implementation, just to make a point:

In [30]:
def make_averager():
    count = 0
    total = 0

    def averager(new_val):
        count += 1
        total += new_val
        return total / count

    return averager

In [31]:
avg = make_averager()
avg(10)

UnboundLocalError: local variable 'count' referenced before assignment

The problem is that the statement `count += 1` means the same as `count = count + 1` when `count` is a number or any other immutable type. So we are assigning to `count` in the body of `averager`, and that makes it a local variable. The same problem affects the `total` variable.

We didn't have this problem earlier because we never assigned to the `series` name, we called `series.append`. So we took advantage of the fact that lists are mutable.

But with immutable types like numbers, strings, tuples, etc., all you can do is read, but never update. If you try to rebind them, as in `count = count + 1`, then you are implicitly creating a local variable `count`. It is no longer a free variable, and therefore it is not saved in the closure.

To work around this, the `nonlocal` declaration was introduced in Python 3. It lets you flag a variable as a free variable even when it is assigned to a new value within the function. If a new value is assigned to a `nonlocal` variable, the binding stored in the closure is changed. A correct implementation of our newest `make_averager` looks like this:

In [32]:
def make_averager():
    count = 0
    total = 0

    def averager(new_val):
        nonlocal count, total
        count += 1
        total += new_val
        return total / count

    return averager

In [33]:
avg = make_averager()
avg(10)

10.0

In [34]:
avg(12)

11.0

The workaround for not having `nonlocal` in Python 2 is to store the variables the inner functions need to change (count, total in this example) as items or attributes of some mutable object, like a dict or simple instance, and bind that object to a free variable.
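A minimal sketch of that workaround, assuming a plain dict named `state` (a name chosen here for illustration) as the mutable container. It is written to run under Python 3; under Python 2 the division would also need a float operand to avoid integer division.

```python
def make_averager():
    # Mutable container bound to a free variable; its items are updated in place
    state = {'count': 0, 'total': 0}

    def averager(new_val):
        state['count'] += 1        # item assignment, not a rebinding of `state`
        state['total'] += new_val
        return state['total'] / state['count']

    return averager

avg = make_averager()
first = avg(10)
second = avg(12)
print(first, second)
```

Because `state` itself is never rebound inside `averager`, it stays a free variable and no `nonlocal` declaration is needed.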

In [35]:
# Implement a simple decorator
# Clock every invocation of the decorated function and print elapsed time, args, and result
import time

def clock(func):
    def clocked(*args):
        t0 = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - t0
        name = func.__name__
        arg_str = ', '.join(repr(arg) for arg in args)
        print('[%0.8fs] %s(%s) -> %r' % (elapsed, name, arg_str, result))
        return result
    return clocked

In [36]:
@clock
def snooze(seconds):
    time.sleep(seconds)

@clock
def factorial(n):
    return 1 if n < 2 else n*factorial(n-1)

if __name__ == '__main__':
    print('*' * 40, 'Calling snooze(.123)')
    snooze(.123)
    print('*' * 40, 'Calling factorial(6)')
    print('6! =', factorial(6))

**************************************** Calling snooze(.123)
[0.12347717s] snooze(0.123) -> None
**************************************** Calling factorial(6)
[0.00000046s] factorial(1) -> 1
[0.00017333s] factorial(2) -> 2
[0.00018550s] factorial(3) -> 6
[0.00019279s] factorial(4) -> 24
[0.00020258s] factorial(5) -> 120
[0.00021996s] factorial(6) -> 720
6! = 720


<h3>How It Works</h3>

Recall that the `@clock` decorator above `def factorial(n)` is actually equivalent to `factorial = clock(factorial)`. `clock` gets the factorial function as its `func` argument. It then creates the `clocked` function, which the Python interpreter assigns to `factorial` behind the scenes. You can confirm this by checking the `__name__` of factorial:

In [37]:
factorial.__name__

'clocked'

So you can see here that factorial actually holds a reference to the clocked function. Each time factorial(n) is called, clocked(n) gets executed.

`clocked` does the following:
1. Records the initial time t0
2. Calls the original factorial, saving the result
3. Computes the elapsed time
4. Formats and prints the collected data
5. Returns the result saved in step 2

This is the typical behavior of a decorator: it replaces the decorated function with a new function that accepts the same arguments and (usually) returns whatever the decorated function was supposed to return, while also doing some extra processing.
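One side effect of this replacement is that the decorated function's `__name__` and `__doc__` are masked by those of `clocked`, and the simple `clock` above does not accept keyword arguments. The sketch below is one possible improvement, not the only one: it uses the standard library's `functools.wraps` to copy the metadata and adds `**kwargs` handling.

```python
import functools
import time

def clock(func):
    @functools.wraps(func)          # copy __name__, __doc__, etc. from func onto clocked
    def clocked(*args, **kwargs):   # also accept keyword arguments
        t0 = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - t0
        arg_list = [repr(arg) for arg in args]
        arg_list.extend('%s=%r' % (k, v) for k, v in kwargs.items())
        print('[%0.8fs] %s(%s) -> %r' % (elapsed, func.__name__, ', '.join(arg_list), result))
        return result
    return clocked

@clock
def factorial(n):
    """Return n!"""
    return 1 if n < 2 else n * factorial(n - 1)

value = factorial(3)
print(factorial.__name__, factorial.__doc__)
```

With `functools.wraps` applied, `factorial.__name__` reports `'factorial'` again, and the original function remains reachable as `factorial.__wrapped__`.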

<h3>Decorators in the Standard Library</h3>

Python has 3 built-in functions that are designed to decorate methods: `property, classmethod, staticmethod`. Another frequently seen decorator is `functools.wraps`, a helper function for building well-behaved decorators. Two of the most interesting decorators in the standard library are `lru_cache` and the (new in Python 3.4) `singledispatch`.

`functools.lru_cache` is a very practical decorator. It implements memoization: an optimization technique that works by saving the results of previous invocations of an expensive function, avoiding repeat computations on previously used arguments. The letters LRU stand for Least Recently Used, meaning that the growth of the cache is limited by discarding the entries that have not been read for a while.

A good demonstration is to apply `lru_cache` to the painfully slow recursive function to generate the <i>n</i>th number in the Fibonacci sequence:

In [38]:
@clock
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-2) + fibonacci(n-1)

if __name__ == '__main__':
    print(fibonacci(6))

[0.00000037s] fibonacci(0) -> 0
[0.00000025s] fibonacci(1) -> 1
[0.00006792s] fibonacci(2) -> 1
[0.00000025s] fibonacci(1) -> 1
[0.00000033s] fibonacci(0) -> 0
[0.00000017s] fibonacci(1) -> 1
[0.00003742s] fibonacci(2) -> 1
[0.00007379s] fibonacci(3) -> 2
[0.00017650s] fibonacci(4) -> 3
[0.00000017s] fibonacci(1) -> 1
[0.00000021s] fibonacci(0) -> 0
[0.00000017s] fibonacci(1) -> 1
[0.00001700s] fibonacci(2) -> 1
[0.00003954s] fibonacci(3) -> 2
[0.00000017s] fibonacci(0) -> 0
[0.00000017s] fibonacci(1) -> 1
[0.00001321s] fibonacci(2) -> 1
[0.00000017s] fibonacci(1) -> 1
[0.00000029s] fibonacci(0) -> 0
[0.00000017s] fibonacci(1) -> 1
[0.00001725s] fibonacci(2) -> 1
[0.00004271s] fibonacci(3) -> 2
[0.00007242s] fibonacci(4) -> 3
[0.00012875s] fibonacci(5) -> 5
[0.00032463s] fibonacci(6) -> 8
8


There is obvious waste: fibonacci(1) is called 8 times, fibonacci(2) 5 times, etc. If we add lru_cache, performance is much improved.

In [40]:
import functools

@functools.lru_cache() # stacked decorator: @lru_cache() is applied on the function returned by @clock
@clock
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-2) + fibonacci(n-1)

if __name__ == '__main__':
    print(fibonacci(6))

[0.00000025s] fibonacci(0) -> 0
[0.00000029s] fibonacci(1) -> 1
[0.00005179s] fibonacci(2) -> 1
[0.00000042s] fibonacci(3) -> 2
[0.00007042s] fibonacci(4) -> 3
[0.00000037s] fibonacci(5) -> 5
[0.00026142s] fibonacci(6) -> 8
8


As an example of the performance difference, fibonacci(30) makes 31 calls with the `lru_cache()`, and over 2.6M calls without it (enough to freeze and crash jupyter notebook).

Besides making silly recursive algorithms viable, `lru_cache` really shines in applications that need to fetch information from the web. It can be tuned by passing two optional arguments: `maxsize` and `typed`. `maxsize` sets how many call results are stored. Once the cache is full, older results are discarded to make room. For the best performance, `maxsize` should be a power of 2. The `typed` argument, if set to `True`, stores results of different argument types separately, i.e., distinguishing between float and integer arguments that are normally considered equal, like 1 and 1.0. Since `lru_cache` uses a dict to store results, and the keys are made from the positional and keyword arguments used in the calls, all the arguments taken by the decorated function must be <i>hashable</i>.
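A short sketch of these options in action. The `square` function is a stand-in invented for this example; `cache_info()` is the introspection method that `lru_cache` attaches to the decorated function.

```python
import functools

@functools.lru_cache(maxsize=128, typed=True)
def square(x):
    return x * x

square(1)      # miss: first call with an int
square(1.0)    # miss: typed=True keeps 1.0 separate from 1
square(1)      # hit: same argument and type as the first call

info = square.cache_info()   # named tuple: hits, misses, maxsize, currsize
print(info)

try:
    square([1, 2])           # lists are unhashable, so they cannot be cache keys
except TypeError as exc:
    unhashable_error = exc
    print('TypeError:', exc)
```

With `typed=False` (the default), the call with `1.0` would have been served from the entry stored for `1`.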

In [41]:
import html

def htmlize(obj):
    content = html.escape(repr(obj))
    return '<pre>{}</pre>'.format(content)

In [42]:
htmlize(abs)

'<pre>&lt;built-in function abs&gt;</pre>'

In [43]:
htmlize('Heimlich & Co.\n- a game')

'<pre>&#x27;Heimlich &amp; Co.\\n- a game&#x27;</pre>'

In [44]:
htmlize(42)

'<pre>42</pre>'

In [45]:
print(htmlize(['alpha', 66, {3, 2, 1}]))

<pre>[&#x27;alpha&#x27;, 66, {1, 2, 3}]</pre>


Because we don't have method or function overloading in Python, we can't create variations of `htmlize` with different signatures for each data type we want to handle differently. A common solution would be to turn `htmlize` into a dispatch function with a chain of `if/elif/elif` calling specialized functions like `htmlize_str`, `htmlize_int`, etc. Over time, the `htmlize` dispatcher would become too big, and the coupling between it and the specialized functions would be very tight.

The new `functools.singledispatch` decorator in Python 3.4 allows each module to contribute to the overall solution, and lets you easily provide a specialized function even for classes that you can't edit. If you decorate a plain function with `@singledispatch`, it becomes a <i>generic function</i>: a group of functions to perform the same operation in different ways, depending on the type of the first argument. Example:

In [49]:
from functools import singledispatch
from collections import abc
import numbers
import html

@singledispatch
def htmlize(obj):
    content = html.escape(repr(obj))
    return '<pre>{}</pre>'.format(content)

@htmlize.register(str)
def _(text):
    content = html.escape(text).replace('\n', '<br>\n')
    return '<p>{0}</p>'.format(content)

@htmlize.register(numbers.Integral)
def _(n):
    return '<pre>{0} (0x{0:x})</pre>'.format(n)

@htmlize.register(tuple)
@htmlize.register(abc.MutableSequence)
def _(seq):
    inner = '</li>\n<li>'.join(htmlize(item) for item in seq)
    return '<ul>\n<li>' + inner + '</li>\n</ul>'

In [50]:
htmlize('abcd')

'<p>abcd</p>'

`@singledispatch` marks the base function, which handles the generic `object` type.

Each specialized function is decorated with `@<base_function>.register(<type>)`.

The name of the specialized functions is irrelevant; _ is a good choice to make this clear.

For each additional type to receive special treatment, register a new function. `numbers.Integral` is a virtual superclass of int.

You can stack several `register` decorators to support different types with the same function.

When possible, register the specialized functions to handle ABCs (abstract classes) such as numbers.Integral and abc.MutableSequence instead of concrete implementations like int and list. This allows your code to support a greater variety of compatible types. 
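The benefit of registering ABCs can be sketched with a tiny generic function. `describe` is a hypothetical name used only for this illustration; the point is that types never mentioned in the registrations, such as `bool` and `collections.deque`, are still dispatched correctly.

```python
from collections import abc, deque
from functools import singledispatch
import numbers

@singledispatch
def describe(obj):
    return 'object'            # fallback for any type without a registration

@describe.register(numbers.Integral)
def _(n):
    return 'integral'

@describe.register(abc.MutableSequence)
def _(seq):
    return 'mutable sequence'

results = [
    describe(3),               # int is a virtual subclass of numbers.Integral
    describe(True),            # bool is a subclass of int
    describe([1, 2]),          # list is a MutableSequence
    describe(deque([1, 2])),   # deque is registered as a MutableSequence too
    describe(3.5),             # float matches neither ABC, so the base function runs
]
print(results)
```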

<h3>Stacked Decorators</h3>

When two decorators `@d1` and `@d2` are applied to a function `f` in that order, the result is the same as

`f = d1(d2(f))`

In [54]:
def d1(x):
    return x  # return the function so that f stays callable

def d2(x):
    return x

@d1
@d2
def f():
    print('f')
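To see the order of application, here is a sketch in which each decorator records when it runs and wraps the return value. The `calls` list and the `wrapper` functions are additions made for this illustration.

```python
calls = []

def d1(func):
    calls.append('d1 applied')
    def wrapper():
        return 'd1(' + func() + ')'
    return wrapper

def d2(func):
    calls.append('d2 applied')
    def wrapper():
        return 'd2(' + func() + ')'
    return wrapper

@d1
@d2
def f():
    return 'f'

result = f()
print(calls)    # the decorator closest to def is applied first
print(result)   # the outermost decorator wraps everything else
```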

<h3>Parameterized Decorators</h3>

When parsing a decorator in source code, Python takes the decorated function and passes it as the first arg to the decorator function. So how do you make a decorator accept other arguments? The answer is: make a decorator factory that takes those args and returns a decorator, which is then applied to the function to be decorated. 

In [55]:
registry = []

def register(func):
    print('running register(%s)' % func)
    registry.append(func)
    return func

@register
def f1():
    print('running f1()')

print('running main()')
print('registry ->', registry)
f1()

running register(<function f1 at 0x7f882038adc0>)
running main()
registry -> [<function f1 at 0x7f882038adc0>]
running f1()


In [56]:
registry = set()

def register(active=True):
    def decorate(func):
        print('running register(active=%s)->decorate(%s)' % (active, func))
        if active:
            registry.add(func)
        else:
            registry.discard(func)

        return func
    return decorate

@register(active=False)
def f1():
    print('running f1()')

@register()
def f2():
    print('running f2()')

def f3():
    print('running f3()')

running register(active=False)->decorate(<function f1 at 0x7f88203aeee0>)
running register(active=True)->decorate(<function f2 at 0x7f882038aa60>)


In order to enable/disable the function registration performed by `register`, make it accept an optional parameter which, if False, skips registering the decorated function. Conceptually the new `register` function is not a decorator but a decorator factory. When called, it returns the actual decorator that will be applied to the target function.
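Because `register` is a factory, it can also be used as a regular function: call it to get the decorator, then call the decorator on the function. The sketch below repeats the factory without the `print` call and uses stub functions that return strings, purely for illustration.

```python
registry = set()

def register(active=True):
    def decorate(func):
        if active:
            registry.add(func)
        else:
            registry.discard(func)
        return func
    return decorate

def f1():
    return 'f1'

def f2():
    return 'f2'

# Equivalent to placing @register() above each def
f1 = register()(f1)
f2 = register()(f2)
both_registered = registry == {f1, f2}

# Calling the factory with active=False removes f2 from the registry
f2 = register(active=False)(f2)
print(registry == {f1})
```

Note that the parentheses in `@register()` are required even when no argument is passed, since it is the call that returns the actual decorator.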

<h3>The Parameterized Clock Decorator</h3>

Add a feature to the previous clock decorator: users may pass a format string to control the output of the decorated function.

`clock` is the parameterized decorator factory. `decorate` is the actual decorator. `clocked` wraps the decorated function.

`_result` is the actual result of the decorated function.

`_args` holds the actual arguments of `clocked`, while `args` is the `str` used for display.

`result` is the `str` representation of `_result`, for display.

In [60]:
import time

DEFAULT_FMT = '[{elapsed:0.8f}s] {name}({args}) -> {result}'

def clock(fmt=DEFAULT_FMT):
    def decorate(func):
        def clocked(*_args):
            t0 = time.time()
            _result = func(*_args)
            elapsed = time.time() - t0
            name = func.__name__
            args = ', '.join(repr(arg) for arg in _args)
            result = repr(_result)
            print(fmt.format(**locals())) # **locals allows any var of clocked to be referenced in the fmt
            return _result
        return clocked
    return decorate
    
if __name__ == '__main__':
    
    @clock()
    def snooze(seconds):
        time.sleep(seconds)
        
    for i in range(3):
        snooze(.123)

[0.12457395s] snooze(0.123) -> None
[0.12810707s] snooze(0.123) -> None
[0.12450004s] snooze(0.123) -> None


In [61]:
# Another example

@clock('{name}: {elapsed}s')
def snooze(seconds):
    time.sleep(seconds)

for i in range(3):
    snooze(.123)

snooze: 0.12806129455566406s
snooze: 0.12380599975585938s
snooze: 0.12804412841796875s


In [62]:
@clock('{name}({args}) dt={elapsed:0.3f}s')
def snooze(seconds):
    time.sleep(seconds)

for i in range(3):
    snooze(.123)

snooze(0.123) dt=0.123s
snooze(0.123) dt=0.128s
snooze(0.123) dt=0.123s


In conclusion, decorators are arguably best coded as classes implementing `__call__`, and not as functions as we have done here. That approach is certainly better for non-trivial decorators, but functions are the easier way to explain the basic idea.
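For comparison, here is a rough sketch of the parameterized clock written as a class. The class name `Clock` and the use of explicit keyword arguments to `format` (instead of `**locals()`) are choices made for this illustration, not a definitive design.

```python
import time

class Clock:
    """Class-based sketch of the parameterized clock decorator."""

    DEFAULT_FMT = '[{elapsed:0.8f}s] {name}({args}) -> {result}'

    def __init__(self, fmt=DEFAULT_FMT):
        self.fmt = fmt             # plays the role of the factory argument

    def __call__(self, func):      # plays the role of decorate(func)
        def clocked(*_args):
            t0 = time.perf_counter()
            _result = func(*_args)
            elapsed = time.perf_counter() - t0
            print(self.fmt.format(
                elapsed=elapsed,
                name=func.__name__,
                args=', '.join(repr(arg) for arg in _args),
                result=repr(_result),
            ))
            return _result
        return clocked

@Clock('{name}({args}) dt={elapsed:0.3f}s')
def snooze(seconds):
    time.sleep(seconds)
    return seconds

returned = snooze(0.01)
```

The instance stores the format string as ordinary state, which removes one level of function nesting compared with the factory version.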

Registration decorators, though simple in essence, have real applications in advanced Python frameworks.

Parameterized decorators almost always involve at least two nested functions, maybe more if you want to use `@functools.wraps` to produce a decorator that provides better support for more advanced techniques. One such technique is stacked decorators, which we saw a couple of examples of.

We covered the difference between import time and runtime, variable scoping, closures, and the `nonlocal` declaration. Mastering closures and `nonlocal` is valuable not only for decorators, but also for coding event-oriented programs for GUIs or asynchronous I/O with callbacks.