# What Does It Take to Be An Expert At Python

### [Notebook based off James Powell's talk at PyData 2017](https://www.youtube.com/watch?v=7lmCu8wz8ro)

<b>Definitions</b> 

Python is a language oriented around protocols: for some behavior, syntax, bytecode or top-level function, there is a way to tell Python how to implement it on an arbitrary object via underscore methods. The exact correspondence is usually guessable, but if you can't guess it, you can look it up: search for "Python data model".

<b>Metaclass mechanism:</b> A hook into the class construction process. It lets you ask questions such as "do you have these methods implemented?" The point is the relationship between library code and user code: how do you enforce a constraint from one on the other?

<b>Decorators:</b> Hook into the idea that everything is constructed at run time. They wrap sets of functions with before and after behavior.

<b>Generators:</b> Take a single computation that would otherwise run eagerly, from the injection of its parameters to the final result, and interleave it with other code by adding yield points. At each yield point you can yield an intermediate result or one small piece of the computation, and also yield control back to the caller. Think of a generator as a way to take one long piece of computation and break it up into small parts.

<b>Context managers:</b> A structure that lets you tie two actions together, a setup action and a teardown action, and makes sure they always happen in concordance with each other.

In [1]:
# some behavior that I want to implement -> write some __ function __
# top-level function or top-level syntax -> corresponding __
# x + y -> __add__
# init x -> __init__
# repr(x) --> __repr__
# x() -> __call__

class Polynomial:
    pass

p1 = Polynomial()
p2 = Polynomial()

p1.coeffs = 1, 2, 3 # x^2 + 2x + 3
p2.coeffs = 3, 4, 3 # 3x^2 + 4x + 3

In [2]:
class Polynomial:
    def __init__(self, *coeffs):
        self.coeffs = coeffs

p1 = Polynomial(1, 2, 3) # x^2 + 2x + 3
p2 = Polynomial(3, 4, 3) # 3x^2 + 4x + 3
print(repr(p1))

<__main__.Polynomial object at 0x7f94dc268ed0>


In [3]:
class Polynomial:
    def __init__(self, *coeffs):
        self.coeffs = coeffs
        
    def __repr__(self):
        return 'Polynomial(*{!r})'.format(self.coeffs)

p1 = Polynomial(1, 2, 3) # x^2 + 2x + 3
p2 = Polynomial(3, 4, 3) # 3x^2 + 4x + 3
print(repr(p1))

Polynomial(*(1, 2, 3))


In [4]:
class Polynomial:
    def __init__(self, *coeffs):
        self.coeffs = coeffs
        
    def __repr__(self):
        return 'Polynomial(*{!r})'.format(self.coeffs)

    def __add__(self, other):
        return Polynomial(*(x+y for x, y in zip(self.coeffs, other.coeffs)))

p1 = Polynomial(1, 2, 3) # x^2 + 2x + 3
p2 = Polynomial(3, 4, 3) # 3x^2 + 4x + 3
print(p1 + p2)

Polynomial(*(4, 6, 6))


In [5]:
class Polynomial:
    def __init__(self, *coeffs):
        self.coeffs = coeffs
        
    def __repr__(self):
        return 'Polynomial(*{!r})'.format(self.coeffs)

    def __add__(self, other):
        return Polynomial(*(x+y for x, y in zip(self.coeffs, other.coeffs)))
    
    def __len__(self):
        return len(self.coeffs)

p1 = Polynomial(1, 2, 3) # x^2 + 2x + 3
p2 = Polynomial(3, 4, 3) # 3x^2 + 4x + 3
print(len(p1))

3


This is the protocol-oriented data model: each built-in operation (`repr`, `+`, `len`) dispatches to the matching dunder method.

In [6]:
class Polynomial:
    def __init__(self, *coeffs):
        self.coeffs = coeffs
        
    def __repr__(self):
        return 'Polynomial(*{!r})'.format(self.coeffs)

    def __add__(self, other):
        return Polynomial(*(x+y for x, y in zip(self.coeffs, other.coeffs)))
    
    def __len__(self):
        return len(self.coeffs)
    
    def __call__(self):
        pass

p1 = Polynomial(1, 2, 3) # x^2 + 2x + 3
p2 = Polynomial(3, 4, 3) # 3x^2 + 4x + 3
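
The `__call__` stub above is left empty. One plausible way to fill it in, assuming the coefficients are stored from the highest power down to the constant (as the comments suggest), is to evaluate the polynomial at a point using Horner's rule. The class name `CallablePolynomial` and the parameter `x` are illustrative choices, not taken from the talk.

```python
class CallablePolynomial:
    """Sketch: a polynomial whose instances can be called like functions."""

    def __init__(self, *coeffs):
        # Assumption: coefficients run from the highest power to the constant term.
        self.coeffs = coeffs

    def __repr__(self):
        return 'CallablePolynomial(*{!r})'.format(self.coeffs)

    def __call__(self, x):
        # Horner's rule: (a*x + b)*x + c for coefficients (a, b, c).
        result = 0
        for coeff in self.coeffs:
            result = result * x + coeff
        return result


p = CallablePolynomial(1, 2, 3)  # x^2 + 2x + 3
print(p(0))  # 3, the constant term
print(p(2))  # 11, i.e. 4 + 4 + 3
```

With this in place, `p(x)` goes through `__call__` in the same way that `len(p)` goes through `__len__`.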

### Metaclasses

In [7]:
# library.py

class Base:
    def foo(self):
        return 'foo'    

Where could this code break? If the library's `Base` no longer has a `foo` method.

So the derived class (user code) needs to enforce a constraint on the base class (library code).

In [8]:
# user.py
# from library import Base

assert hasattr(Base, 'foo'), "you broke it, you fool!"

class  Derived(Base):
    def bar(self):
        return self.foo()

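A `try`/`except` around the call can only catch the bug when that line actually runs; the `assert` above fails as soon as the module is imported.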

In [9]:
# Use Case 2
# library.py

class Base:
    def foo(self):
        return self.bar()    

In [10]:
# user.py
# from library import Base

class  Derived(Base):
    def bar(self):
        return 'bar'

In Python, class definitions are executable code that runs at runtime.

In [11]:
for _ in range(10):
    class Base: pass

In [12]:
class Base:
    for _ in range(10):
        def bar(self):
            pass

In [13]:
def _():
    class Base:
        pass

In [14]:
from dis import dis
dis(_)

  2           0 LOAD_BUILD_CLASS
              2 LOAD_CONST               1 (<code object Base at 0x7f94dc26bed0, file "<ipython-input-13-fadda37b635f>", line 2>)
              4 LOAD_CONST               2 ('Base')
              6 MAKE_FUNCTION            0
              8 LOAD_CONST               2 ('Base')
             10 CALL_FUNCTION            2
             12 STORE_FAST               0 (Base)
             14 LOAD_CONST               0 (None)
             16 RETURN_VALUE

Disassembly of <code object Base at 0x7f94dc26bed0, file "<ipython-input-13-fadda37b635f>", line 2>:
  2           0 LOAD_NAME                0 (__name__)
              2 STORE_NAME               1 (__module__)
              4 LOAD_CONST               0 ('_.<locals>.Base')
              6 STORE_NAME               2 (__qualname__)

  3           8 LOAD_CONST               1 (None)
             10 RETURN_VALUE


In [15]:
import builtins
import importlib
importlib.reload(builtins)

<module 'builtins' (built-in)>

In [16]:
# library.py

class Base:
    def foo(self):
        return self.bar()    

old_bc = __build_class__

def my_bc(*a, **kw):
    print('my buildclass ->', a, kw)
    return old_bc(*a, **kw)

builtins.__build_class__ = my_bc

In [17]:
# user.py

class Derived(Base):
    def bar(self):
        return 'bar'

my buildclass -> (<function Derived at 0x7f94dc26b200>, 'Derived', <class '__main__.Base'>) {}


How do we make sure the user implements the `bar` method?

In practice, people do not solve this problem the way it is implemented below.

In [18]:
class Base:
    def foo(self):
        return self.bar()    

old_bc = __build_class__

def my_bc(fun, name, base=None, **kw):
    if base == Base:
        print('Check if bar method defined')
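    if base is not None:
        # Pass the base class through; dropping it would silently build
        # the class without its parent.
        return old_bc(fun, name, base, **kw)

    return old_bc(fun, name, **kw)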

builtins.__build_class__ = my_bc

my buildclass -> (<function Base at 0x7f94dc26b950>, 'Base') {}
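
Because the hook replaces a builtin for the whole process, it seems prudent to restore it once the experiment is over. The following is a minimal sketch of that idea, assuming CPython's `builtins.__build_class__` hook; the names `logging_build_class`, `created`, `LoggedBase` and `LoggedDerived` are hypothetical.

```python
import builtins

created = []  # (class name, base names) for every class built while hooked
original_build_class = builtins.__build_class__


def logging_build_class(func, name, *bases, **kw):
    # Record the class being built, then defer to the real implementation.
    created.append((name, tuple(base.__name__ for base in bases)))
    return original_build_class(func, name, *bases, **kw)


builtins.__build_class__ = logging_build_class
try:
    class LoggedBase:
        pass

    class LoggedDerived(LoggedBase):
        pass
finally:
    # Always put the original builtin back, even if a class body raised.
    builtins.__build_class__ = original_build_class

print(created)
```

Accepting `*bases` rather than a single `base` keeps the hook working for classes with several parents.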


#### Second Approach

Metaclasses: classes that derive from `type` and define special methods on it.

Fundamentally, they allow you to intercept the construction of derived types.

In [1]:
#library.py
import builtins
import importlib
importlib.reload(builtins)

class BaseMeta(type):
    def __new__(cls, name, bases, body):
        print('BaseMeta.__new__', cls, name, bases, body)
        return super().__new__(cls, name, bases, body)

class Base(metaclass=BaseMeta):
    def foo(self):
        return self.bar()

BaseMeta.__new__ <class '__main__.BaseMeta'> Base () {'__module__': '__main__', '__qualname__': 'Base', 'foo': <function Base.foo at 0x7f8ac82f68c0>}
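
The arguments printed above (`name`, `bases`, `body`) are the same three things `type` itself accepts when called directly. As a rough illustration, a class can be assembled by hand this way; the names `Dynamic`, `Meta` and `Made` below are hypothetical.

```python
def foo(self):
    return 'foo'


# Roughly what a `class Dynamic:` statement with a foo method would produce.
Dynamic = type('Dynamic', (), {'foo': foo})


class Meta(type):
    pass


# Calling a metaclass with the same three arguments also builds a class.
Made = Meta('Made', (), {})

print(type(Dynamic))    # ordinary classes are instances of type
print(type(Made))       # Made is an instance of Meta
print(Dynamic().foo())
```

Seen this way, a metaclass is simply the callable that receives the class name, bases and body once the class body has finished executing.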


#### Third Approach

In [2]:
#library.py

class BaseMeta(type):
    def __new__(cls, name, bases, body):
        if name != 'Base' and 'bar' not in body:
            raise TypeError("bad user class")
        return super().__new__(cls, name, bases, body)

class Base(metaclass=BaseMeta):
    def foo(self):
        return self.bar()
    # Method to hook into when a subclass is initialized
    def __init_subclass__(*a, **kw):
        print('init_subclass', a, kw)
        return super().__init_subclass__(*a, **kw)
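
To see the constraint fire, here is a small self-contained variation. It assumes that any class with at least one base counts as user code (instead of comparing against the name `'Base'`); `CheckedMeta`, `CheckedBase`, `Good` and `Bad` are illustrative names.

```python
class CheckedMeta(type):
    def __new__(cls, name, bases, body):
        # Assumption: the root class has no bases; everything else is user code.
        if bases and 'bar' not in body:
            raise TypeError('{} must define bar()'.format(name))
        return super().__new__(cls, name, bases, body)


class CheckedBase(metaclass=CheckedMeta):
    def foo(self):
        return self.bar()


class Good(CheckedBase):
    def bar(self):
        return 'bar'


try:
    class Bad(CheckedBase):  # no bar(): rejected while the class is being built
        pass
except TypeError as exc:
    error = str(exc)

print(Good().foo())
print(error)
```

The failure happens at class definition time, before any instance exists. Note that this check only looks at the class body, so a subclass that merely inherits `bar` from an intermediate class would also be rejected.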

In [3]:
help(Base.__init_subclass__)

Help on method __init_subclass__ in module __main__:

__init_subclass__(*a, **kw) method of __main__.BaseMeta instance
    This method is called when a class is subclassed.
    
    The default implementation does nothing. It may be
    overridden to extend subclasses.
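
Since `__init_subclass__` runs whenever a subclass is created, the same constraint can be sketched without any metaclass. This is one possible formulation, not the one from the talk; `HookedBase`, `Ok`, `OkChild` and `Broken` are hypothetical names.

```python
class HookedBase:
    def foo(self):
        return self.bar()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # getattr follows inheritance, so an inherited bar() is accepted too.
        if not callable(getattr(cls, 'bar', None)):
            raise TypeError('{} must define bar()'.format(cls.__name__))


class Ok(HookedBase):
    def bar(self):
        return 'bar'


class OkChild(Ok):  # inherits bar(), so it passes the check
    pass


try:
    class Broken(HookedBase):
        pass
except TypeError as exc:
    message = str(exc)

print(OkChild().foo())
print(message)
```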



## Decorators

In [4]:
# Example:
# @dec
# def f():
#     pass

def add(x, y=10):
    return x + y

print(add(10, 20))
print(add.__name__, add.__module__, add.__defaults__, add.__code__.co_code, add.__code__.co_varnames)

30
add __main__ (10,) b'|\x00|\x01\x17\x00S\x00' ('x', 'y')


In [5]:
from inspect import getsource, getfile

print(getfile(add)) 
print(getsource(add))

<ipython-input-4-801923d31ab8>
def add(x, y=10):
    return x + y



In [6]:
print('add(10)', add(10))
print('add(20, 30)', add(20, 30))
print('add("a", "b")', add("a", "b"))

add(10) 20
add(20, 30) 50
add("a", "b") ab


What if you want to time your code?

In [7]:
def sub(x, y=10):
    return x - y

print('sub(10)', sub(10))
print('sub(20, 30)', sub(20, 30))

sub(10) 0
sub(20, 30) -10


In [8]:
from time import time

def timer(func):
    def f(x, y=10):
        before = time()
        rv = func(x, y)
        after = time()
        print('elapsed', after - before)
        return rv
    return f

In [9]:
add = timer(add)
print('add(10)', add(10))
print('add(20, 30)', add(20, 30))
print('add("a", "b")', add("a", "b"))

elapsed 7.152557373046875e-07
add(10) 20
elapsed 7.152557373046875e-07
add(20, 30) 50
elapsed 7.152557373046875e-07
add("a", "b") ab


With decorator syntax there is no need to write `add = timer(add)` by hand...

In [10]:
@timer
def add_dec(x, y=10):
    return x + y

@timer
def sub_dec(x, y=10):
    return x - y

print('add(10)', add_dec(10))
print('add(20, 30)', add_dec(20, 30))
print('add("a", "b")', add_dec("a", "b"))
print('sub(10)', sub_dec(10))
print('sub(20, 30)', sub_dec(20, 30))

elapsed 7.152557373046875e-07
add(10) 20
elapsed 4.76837158203125e-07
add(20, 30) 50
elapsed 4.76837158203125e-07
add("a", "b") ab
elapsed 4.76837158203125e-07
sub(10) 0
elapsed 4.76837158203125e-07
sub(20, 30) -10


Don't hardcode the wrapped function's parameters in the wrapper; forward `*args` and `**kwargs` instead.

In [11]:
def timer_k(func):
    def f(*args, **kwargs):
        before = time()
        rv = func(*args, **kwargs)
        after = time()
        print('elapsed', after - before)
        return rv
    return f

In [12]:
@timer_k
def add_dec(x, y=10):
    return x + y

@timer_k
def sub_dec(x, y=10):
    return x - y

print('add(10)', add_dec(10))
print('add(20, 30)', add_dec(20, 30))
print('add("a", "b")', add_dec("a", "b"))
print('sub(10)', sub_dec(10))

elapsed 1.430511474609375e-06
add(10) 20
elapsed 7.152557373046875e-07
add(20, 30) 50
elapsed 4.76837158203125e-07
add("a", "b") ab
elapsed 4.76837158203125e-07
sub(10) 0
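
One side effect of wrapping is that the decorated name now points at the inner function, so attributes such as `__name__` describe the wrapper instead of the original. The standard library's `functools.wraps` addresses this. The sketch below contrasts the two; `timer_plain`, `timer_wrapped`, `add_plain` and `add_wrapped` are illustrative names, and `perf_counter` is used here in place of `time` as a higher-resolution clock.

```python
from functools import wraps
from time import perf_counter


def timer_plain(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def timer_wrapped(func):
    @wraps(func)  # copies __name__, __doc__ and friends onto the wrapper
    def wrapper(*args, **kwargs):
        before = perf_counter()
        rv = func(*args, **kwargs)
        print('elapsed', perf_counter() - before)
        return rv
    return wrapper


@timer_plain
def add_plain(x, y=10):
    return x + y


@timer_wrapped
def add_wrapped(x, y=10):
    """Add two values."""
    return x + y


print(add_plain.__name__)    # wrapper
print(add_wrapped.__name__)  # add_wrapped
```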


In [13]:
n = 2

def ntimes(f):
    def wrapper(*args, **kwargs):
        for _ in range(n):
            print('running {.__name__}'.format(f))
            rv = f(*args, **kwargs)
        return rv
    return wrapper
    
        
@ntimes
def add_dec(x, y=10):
    return x + y

@ntimes
def sub_dec(x, y=10):
    return x - y

#### Higher Order Decorators

Closure Object Duality

In [14]:
def ntimes(n):
    def inner(f):
        def wrapper(*args, **kwargs):
            for _ in range(n):
                print('running {.__name__}'.format(f))
                rv = f(*args, **kwargs)
            return rv
        return wrapper
    return inner
    
        
@ntimes(2)
def add_hdec(x, y=10):
    return x + y

@ntimes(4)
def sub_hdec(x, y=10):
    return x - y

In [15]:
print(add_hdec(20))

running add_hdec
running add_hdec
30
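
The "closure object duality" heading can be read as follows: the `n` captured by the nested functions could just as well be stored on an object. A rough object-based equivalent of `ntimes`, with the hypothetical names `NTimes`, `calls` and `add_cls`, might look like this.

```python
calls = []  # records each run so the behaviour is easy to inspect


class NTimes:
    def __init__(self, n):
        # The state a closure would capture lives on the instance instead.
        self.n = n

    def __call__(self, f):
        def wrapper(*args, **kwargs):
            rv = None
            for _ in range(self.n):
                calls.append(f.__name__)
                rv = f(*args, **kwargs)
            return rv
        return wrapper


@NTimes(3)
def add_cls(x, y=10):
    return x + y


print(add_cls(20))
print(calls)
```

`NTimes(3)` builds the object and `__call__` then plays the role of `inner` in the closure version.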


## Generators

In [16]:
# top-level syntax, function -> underscore method
# x()               __call__

def add1(x, y):
    return x + y

class Adder:
    def __call__(self, x, y):
        return x + y
add2 = Adder()

print(add1(10, 20))
print(add2(10, 20))

30
30


In [17]:
def add1(x, y):
    return x + y

class Adder:
    def __init__(self):
        self.z = 0
        
    def __call__(self, x, y):
        self.z += 1
        return x + y + self.z

add2 = Adder()

The callable object can carry state (`self.z`) between calls, but like the plain function it still returns its result eagerly.

An eager function hands back the entire result at once, even if the caller only wants part of the output.

In [18]:
from time import sleep

def compute():
    rv = []
    for i in range(10):
        sleep(.5)
        rv.append(i)
    return rv

In [16]:
compute()

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [19]:
# for x in xs:
#    pass

# xi = iter(xs)    -> __iter__
# while True:
#   x = next(xi)   -> __next__        
        
class Compute:    
    def __iter__(self):
        self.last = 0
        return self
    
    def __next__(self):
        rv = self.last
        self.last += 1
        if self.last > 10:
            raise StopIteration()
        sleep(.5)
        return self.last
        
for val in Compute():
    print(val)

1
2
3
4
5
6
7
8
9
10


The iterator class works but is verbose. A generator expresses the same idea directly: don't compute eagerly, and hand each value to the caller as it is asked for.

In [20]:
def compute():
    for i in range(10):
        sleep(.5)
        yield i
        
for val in compute():
    print(val)

0
1
2
3
4
5
6
7
8
9
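
The benefit of laziness shows up when the caller only wants part of the output. In the sketch below, `compute_lazy` and `produced` are hypothetical names, and appending to a list stands in for the slow `sleep` step so that the work actually performed can be observed.

```python
from itertools import islice

produced = []  # every value the generator actually computed


def compute_lazy():
    for i in range(10):
        produced.append(i)  # stands in for the expensive step
        yield i


# Ask for only the first three values.
first_three = list(islice(compute_lazy(), 3))

print(first_three)
print(produced)
```

Only three of the ten steps run, because the generator is suspended at its `yield` until someone asks for the next value.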


Generator formulation: it yields results and also hands control back to the caller. This helps interleave user and library code and enforces sequencing. Used this way, a generator acts as a coroutine.

In [21]:
class Api:
    def run_this_first(self):
        first()
    def run_this_second(self):
        second()
    def run_this_last(self):
        last()

In [22]:
def api():
    first()
    yield
    second()
    yield
    last()
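
To make the sequencing concrete, the generator can be driven step by step with `next`. The notebook leaves `first`, `second` and `last` undefined, so the versions below are placeholders that only record their names in a hypothetical `log` list.

```python
log = []


def first():
    log.append('first')


def second():
    log.append('second')


def last():
    log.append('last')


def api():
    first()
    yield
    second()
    yield
    last()


steps = api()
next(steps)                # runs first(), then pauses at the first yield
log.append('user code A')
next(steps)                # runs second(), then pauses at the second yield
log.append('user code B')
next(steps, None)          # runs last(); the default absorbs StopIteration

print(log)
```

Unlike the `Api` class, there is no way to call the steps out of order: the only operation available is "advance to the next step".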

## Context Managers

Resource Acquisition Is Initialization (RAII)

In [23]:
# Example:
# with open('ctx.py') as f:
#     pass

In [24]:
from sqlite3 import connect

# with ctx() as x:
#   pass

# mgr = ctx()
# x = mgr.__enter__()
# try:
#   pass
# finally:
#   mgr.__exit__(*exc_info)

with connect('test.db') as conn:
    cur = conn.cursor()
    cur.execute('create table points(x int, y int)')
    cur.execute('insert into points (x, y) values(1, 1)')
    cur.execute('insert into points (x, y) values(1, 2)')
    cur.execute('insert into points (x, y) values(2, 1)')
    for row in cur.execute("select x, y from points"):
        print(row)
    for row in cur.execute('select sum(x*y) from points'):
        print(row)
    cur.execute('drop table points')

(1, 1)
(1, 2)
(2, 1)
(5,)


We should write a context manager that creates and drops the table regardless of errors in the code in between.

`__exit__` can never be called before `__enter__`. That is a sequencing constraint, which is exactly what generators enforce.

In [25]:
class temptable:
    def __init__(self, cur):
        self.cur = cur
    def __enter__(self):
        print('__enter__')
        self.cur.execute('create table points(x int, y int)')
    def __exit__(self, *args):
        print('__exit__')
        self.cur.execute('drop table points')
        
with connect('test.db') as conn:
    cur = conn.cursor()
    with temptable(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)
        for row in cur.execute('select sum(x * y) from points'):
            print(row)        

__enter__
(1, 1)
(1, 2)
(2, 1)
(2, 2)
(9,)
__exit__
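
The point of pairing the two actions is that the teardown still runs when the block fails. A minimal sketch of that behaviour, assuming an in-memory SQLite database so no file is left behind; `temptable_safe`, `caught` and `remaining` are hypothetical names.

```python
from sqlite3 import connect


class temptable_safe:
    def __init__(self, cur):
        self.cur = cur

    def __enter__(self):
        self.cur.execute('create table points(x int, y int)')
        return self.cur

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs whether or not the block raised.
        self.cur.execute('drop table points')
        return False  # do not suppress the exception


conn = connect(':memory:')
cur = conn.cursor()
try:
    with temptable_safe(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        raise RuntimeError('something went wrong mid-block')
except RuntimeError as exc:
    caught = str(exc)

# The table should be gone even though the block failed.
remaining = cur.execute(
    "select name from sqlite_master where type='table' and name='points'"
).fetchall()
print(caught)
print(remaining)
conn.close()
```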


How to make it better? Use generators.

In [26]:
def temptable(cur):
    cur.execute('create table points(x int, y int)')
    print('created table')
    yield
    cur.execute('drop table points')
    print('dropped table')
    
class contextmanager:
    def __init__(self, cur):
        self.cur = cur
    def __enter__(self):
        self.gen = temptable(self.cur)
        next(self.gen)
    def __exit__(self, *args):
        next(self.gen, None)
        
with connect('test.db') as conn:
    cur = conn.cursor()
    with contextmanager(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)
        for row in cur.execute('select sum(x * y) from points'):
            print(row)

created table
(1, 1)
(1, 2)
(2, 1)
(2, 2)
(9,)
dropped table


In [27]:
# Making it more generalised
# __init__ stores the generator function; __call__ captures the arguments to pass to it later
def temptable(cur):
    cur.execute('create table points(x int, y int)')
    print('created table')
    yield
    cur.execute('drop table points')
    print('dropped table')
    
class contextmanager:
    def __init__(self, gen):
        self.gen = gen
    def __call__(self, *args, **kwargs):
        self.args, self.kwargs = args, kwargs
        return self
    def __enter__(self):
        self.gen_inst = self.gen(*self.args, **self.kwargs)
        next(self.gen_inst)
    def __exit__(self, *args):
        next(self.gen_inst, None)
        
with connect('test.db') as conn:
    cur = conn.cursor()
    with contextmanager(temptable)(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)
        for row in cur.execute('select sum(x * y) from points'):
            print(row)        

created table
(1, 1)
(1, 2)
(2, 1)
(2, 2)
(9,)
dropped table


Have we seen this pattern before?

In [28]:
class contextmanager:
    def __init__(self, gen):
        self.gen = gen
    def __call__(self, *args, **kwargs):
        self.args, self.kwargs = args, kwargs
        return self
    def __enter__(self):
        self.gen_inst = self.gen(*self.args, **self.kwargs)
        next(self.gen_inst)
    def __exit__(self, *args):
        next(self.gen_inst, None)

def temptable(cur):
    cur.execute('create table points(x int, y int)')
    print('created table')
    yield
    cur.execute('drop table points')
    print('dropped table')
temptable = contextmanager(temptable)
            
with connect('test.db') as conn:
    cur = conn.cursor()
    with temptable(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)

created table
(1, 1)
(1, 2)
(2, 1)
(2, 2)
dropped table


Now using decorator syntax...

In [29]:
class contextmanager:
    def __init__(self, gen):
        self.gen = gen
    def __call__(self, *args, **kwargs):
        self.args, self.kwargs = args, kwargs
        return self
    def __enter__(self):
        self.gen_inst = self.gen(*self.args, **self.kwargs)
        next(self.gen_inst)
    def __exit__(self, *args):
        next(self.gen_inst, None)
        
@contextmanager
def temptable(cur):
    cur.execute('create table points(x int, y int)')
    print('created table')
    yield
    cur.execute('drop table points')
    print('dropped table')
            
with connect('test.db') as conn:
    cur = conn.cursor()
    with temptable(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)

created table
(1, 1)
(1, 2)
(2, 1)
(2, 2)
dropped table


Wrapping all of the above together, we get...

The context manager wrapper is already provided by the standard library (`contextlib.contextmanager`). It combines generators, decorators and context managers.

A context manager pairs setup and teardown actions, and the teardown only occurs if the setup has taken place. A generator enforces sequencing and interleaving. A context manager requires interleaving because the setup is interleaved with the actual work in the block, and then we do the teardown. There is also sequencing: setup -> teardown.

We also need something to adapt the generator to this data model. We have the dunder methods, and we fit the generator into them. So we take the generator and wrap it up using a decorator.

In [30]:
from sqlite3 import connect
from contextlib import contextmanager
        
@contextmanager
def temptable(cur):
    cur.execute('create table points(x int, y int)')
    print('created table')
    try:
        yield
    finally:
        cur.execute('drop table points')
        print('dropped table')
            
with connect('test.db') as conn:
    cur = conn.cursor()
    with temptable(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)

created table
(1, 1)
(1, 2)
(2, 1)
(2, 2)
dropped table
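
The `try`/`finally` around `yield` matters here because `contextlib.contextmanager` re-raises any exception from the `with` block inside the generator at the `yield`. A small sketch of the difference, using the hypothetical names `without_finally`, `with_finally` and `events`:

```python
from contextlib import contextmanager

events = []


@contextmanager
def without_finally():
    events.append('setup A')
    yield
    events.append('teardown A')  # skipped if the block raises


@contextmanager
def with_finally():
    events.append('setup B')
    try:
        yield
    finally:
        events.append('teardown B')  # runs even if the block raises


for manager in (without_finally, with_finally):
    try:
        with manager():
            raise ValueError('boom')
    except ValueError:
        pass

print(events)
```

Only the second manager performs its teardown, which is why the final `temptable` wraps its `yield` in `try`/`finally`.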


## Summary

<b>Expert-level code</b> is not code that uses every feature, but code that shows clarity about where and when a feature should be used. It doesn't waste the time of the person who writes it: you have patterns, Python provides the mechanisms, and it all works together seamlessly.

It is code that doesn't carry a lot of additional mechanism, and doesn't have people inventing protocols or frameworks when the language already provides the core pieces; you just need to understand them and assemble them.

A core conceptual understanding of what these features (metaclasses, decorators, generators, context managers) mean is what matters most.

<b>Python</b> is a language oriented around protocols: for some behaviour, syntax, bytecode or top-level function, there is a way to tell Python how to implement it on an object using dunder methods.

Python's execution model is simple: code runs from top to bottom, and things that would not be executable code in other languages, such as class or generator definitions, are live code in Python.

We can hook into them, and also define functions within functions based on runtime data.

<b>Metaclasses</b> are a hook into the class construction process.

Classes are constructed at runtime, so you can hook code in there. You can hook into the creation of subclasses and ask questions like "do you have these methods implemented?". The purpose is simple: to enforce constraints on user code from library code.

<b>Decorators</b> are just some syntax to make things nicer, but they hook into the idea in Python that everything gets constructed at run time. When you combine them with the ability to define functions within functions, you can use them to wrap sets of functions with some before and after behavior. Examples: timing, authentication, logging.

<b>Generators</b> are merely a way to take a single computation that would otherwise run eagerly, from the injection of its parameters to its final result, and interleave it with other code, by adding yield points where you can yield intermediate results or small pieces of the computation and also yield control back to the caller.

Think of it as taking one long piece of computation and breaking it up into small parts, so that after each small unit of computation, user code can step in and do whatever is needed, perhaps using partial values or some return values. The result is greater control over how much computation runs and how much memory is used.

<b>Context managers</b> are merely a structure that allows you to tie two actions together, a setup action and a teardown action, and make sure they always happen in concordance with each other, even if some error occurs in the middle.

These features are related but mostly orthogonal. What matters is remembering what each one is about and what it is for.