# Why should I write code when I can write code that writes code?
> The temptation to employ code-generating techniques in Python is strong. Much of what is called "metaprogramming" in Python refers to the various techniques through which we can write higher-order code: code that generates the code that solves our problem. This talk discusses various common approaches for code-generation, their applicability to solving real problems, and the reality of using these techniques in your work.

In [None]:
# Consider a simple function
def f(x, y):
    return x + y
print(f'{f(10, 20) = }')

f(10, 20) = 30


Python takes the source code, compiles it to bytecode, and then executes that bytecode.

In [None]:
from dis import dis
dis(f)

  3           0 LOAD_FAST                0 (x)
              2 LOAD_FAST                1 (y)
              4 BINARY_ADD
              6 RETURN_VALUE


Before it produces bytecode, Python first parses the source code into an abstract syntax tree (AST). The AST looks something like this.

In [None]:
code = '''
def f(x, y):
    return x + y
'''

from ast import parse, dump
parse(code)

<_ast.Module at 0x7f24b81cec10>

In [None]:
code = '''
def f(x, y):
    return x + y
'''

from pprint import pprint
pprint(dump(parse(code)))

("Module(body=[FunctionDef(name='f', args=arguments(posonlyargs=[], "
 "args=[arg(arg='x', annotation=None, type_comment=None), arg(arg='y', "
 'annotation=None, type_comment=None)], vararg=None, kwonlyargs=[], '
 'kw_defaults=[], kwarg=None, defaults=[]), '
 "body=[Return(value=BinOp(left=Name(id='x', ctx=Load()), op=Add(), "
 "right=Name(id='y', ctx=Load())))], decorator_list=[], returns=None, "
 'type_comment=None)], type_ignores=[])')
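
The dump is dense, so it can help to list only the node types. The following is a minimal sketch using `ast.walk` from the standard library; the variable names `source`, `tree`, and `node_types` are illustrative and not from the talk.

```python
from ast import parse, walk

source = '''
def f(x, y):
    return x + y
'''

tree = parse(source)

# walk() yields every node in the tree; the order is not specified,
# so treat the result as a collection of node types, not a layout.
node_types = [type(node).__name__ for node in walk(tree)]
print(node_types)
```

The list contains the same pieces that appear in the dump: the enclosing Module, the FunctionDef, the Return statement, and the BinOp with its Add operator.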


Once the AST exists, Python generates bytecode from it (building a symbol table, folding constants, and applying other optimizations along the way). The end result is a code object that looks like this.

In [None]:
code = '''
def f(x, y):
    return x + y
'''

ast = parse(code)
bytecode = compile(ast, '', mode='exec')
bytecode

<code object <module> at 0x7f249fc463a0, file "", line 2>
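
A code object does nothing until it is executed. As a minimal sketch (the `<example>` filename and the `namespace` dict are illustrative choices, not from the talk), running the module-level code object with `exec` is what actually defines `f`:

```python
from ast import parse

source = '''
def f(x, y):
    return x + y
'''

tree = parse(source)
module_code = compile(tree, '<example>', mode='exec')

# Executing the module's code object runs the `def` statement,
# which binds the name f inside the namespace we pass in.
namespace = {}
exec(module_code, namespace)

result = namespace['f'](10, 20)
print(f'{result = }')
```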

In [None]:
# the bytecode is a sequence of bytes that tells the Python interpreter what to do
bytecode.co_code

b'd\x00d\x01\x84\x00Z\x00d\x02S\x00'

In [None]:
list(bytecode.co_code)

[100, 0, 100, 1, 132, 0, 90, 0, 100, 2, 83, 0]

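Bytecode is a sequence of single-byte numeric values, each telling the Python interpreter which operation to perform, and the best way to read it is to look up what each number means. In the CPython version used here, every instruction occupies two bytes, an opcode followed by a one-byte argument, so when every byte is passed through opname the argument bytes also show up as if they were opcodes (the '<0>', 'POP_TOP', and 'ROT_TWO' entries in the output).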

In [None]:
from dis import opname
pprint([opname[x] for x in bytecode.co_code])

['LOAD_CONST',
 '<0>',
 'LOAD_CONST',
 'POP_TOP',
 'MAKE_FUNCTION',
 '<0>',
 'STORE_NAME',
 '<0>',
 'LOAD_CONST',
 'ROT_TWO',
 'RETURN_VALUE',
 '<0>']
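
To separate opcodes from their arguments, the dis module can do the decoding for us. The following is a small sketch using `dis.get_instructions`; the exact instruction names it prints vary between CPython versions, so no output is shown here.

```python
from dis import get_instructions

source = '''
def f(x, y):
    return x + y
'''

module_code = compile(source, '<example>', mode='exec')

# Each Instruction carries the opcode name and its decoded argument,
# so argument bytes are no longer mistaken for opcodes.
instructions = [(ins.opname, ins.arg) for ins in get_instructions(module_code)]
for name, arg in instructions:
    print(name, arg)
```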


If we want to go deeper than that, we can ask how this works inside Python. The most naive picture is that there is a mechanism that looks at the bytecode instructions and runs them one by one, and this is essentially what CPython does. CPython has an evaluation function (_PyEval_EvalFrameDefault in ceval.c) that loops over the bytecode until the frame is finished. So if I write a function that calls another function, then we can see that evaluation loop being entered more than once.
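
To make the "run them one by one" idea concrete, here is a toy stack machine. It is only a sketch of the concept and not how CPython is implemented: the `run` function and the tuple-based instruction format are invented for illustration, and the instruction names are borrowed from the dis output for `f` shown earlier.

```python
def run(instructions, local_vars):
    """Evaluate a list of (opname, argument) pairs on a value stack."""
    stack = []
    for op, arg in instructions:
        if op == 'LOAD_FAST':
            # push the value of a local variable
            stack.append(local_vars[arg])
        elif op == 'BINARY_ADD':
            # pop two operands, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == 'RETURN_VALUE':
            # the top of the stack is the result
            return stack.pop()
        else:
            raise ValueError(f'unknown instruction {op!r}')

program = [
    ('LOAD_FAST', 'x'),
    ('LOAD_FAST', 'y'),
    ('BINARY_ADD', None),
    ('RETURN_VALUE', None),
]

result = run(program, {'x': 10, 'y': 20})
print(f'{result = }')
```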

What happens when we import a module? Here is a very simple sketch of what happens when you import a module for the first time in Python. Python checks whether it has imported the module before; if it has, it returns the cached entry from sys.modules. If this is the first import, it looks for a .py file, reads it, parses the source, compiles it to bytecode, executes that bytecode, and returns the resulting namespace.

In [None]:
from ast import parse
from sys import modules
from pathlib import Path

def import_(mod):
    if mod not in modules:
        file = Path(mod).with_suffix('.py')
        with open(file) as f:
            source = f.read()
        ast = parse(source)
        code = compile(ast, mod, mode='exec')
        ns = {}
        exec(code, ns)
        modules[mod] = ns
    return modules[mod]

f = import_('testmod')['f']
print(f'{f(10, 20) = }')

f(10, 20) = 30
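
The caching step is easy to observe with the real import system. In this short sketch (using the standard library's json module purely as an example), a second import returns the very same object that is stored in sys.modules:

```python
import sys
import json

# After the first import, the module object is cached under its name.
cached = sys.modules['json']

# A second import statement does not re-execute the module;
# it hands back the cached object.
import json as json_again

same_object = cached is json_again
print(f'{same_object = }')
```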


Now let's look at how a class gets built.

In [None]:
class T:
    def f(self):
        return f'T.f({self!r})'
    
T().f()

'T.f(<__main__.T object at 0x7f93981c59a0>)'

What happens in the background is roughly the following (note that the code below is a simplification, not what really happens).

In [None]:
body = '''
def f(self):
    return f'T.f({self!r})'
'''

def build_class(name, body):
    ns = {} # prepare
    exec(body, ns)
    t = type(name, (), ns) # __init__, __new__
    return t

T = build_class('T', body)
T().f()

'T.f(<__main__.T object at 0x7f937b412670>)'

What actually happens is a lot uglier. Python takes the body of the class, wraps it in a function, and executes that function at runtime inside a fresh namespace to populate the class body.
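
The standard library exposes a close relative of this process as `types.new_class`, which takes a callable that fills in the class namespace. The following is a sketch under that framing; the `class_body` helper is a hypothetical stand-in for the function Python generates from the class body, and the method returns the class name instead of the repr so the result is stable.

```python
from types import new_class

def class_body(ns):
    # Stand-in for the compiled class body: it receives the prepared
    # namespace and populates it with the class attributes.
    def f(self):
        return f'T.f({type(self).__name__})'
    ns['f'] = f

# new_class prepares the namespace, calls class_body on it,
# and then calls the metaclass (type, by default) to build the class.
T = new_class('T', (), {}, class_body)

result = T().f()
print(result)
```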

Say you have two functions f and g.

In [None]:
def f():
    if account in active_account and user in authorized_users:
        do_work()
        
def g():
    if account in active_account and user in authorized_users:
        do_other_work()

Say the underlying code that determines valid users changes. You then have to update both of the functions above, which is tedious and error-prone. So we might pull the check out into a function.

In [None]:
def f():
    check_authorized()
    do_work()
    
def g():
    check_authorized()
    do_other_work()

We can even use decorators to help with the problem.

In [None]:
@check_authorized
def f():
    do_work()
    
@check_authorized
def g():
    do_other_work()
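
The talk does not show `check_authorized` itself, so the following is only one possible sketch. It assumes the check reads module-level `account`, `user`, `active_account`, and `authorized_users` (all given made-up values here) and that an unauthorized call should raise `PermissionError` instead of silently doing nothing.

```python
from functools import wraps

# Hypothetical state, for illustration only.
active_account = {'acct-1'}
authorized_users = {'alice'}
account, user = 'acct-1', 'alice'

def check_authorized(func):
    @wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        # The authorization rule now lives in exactly one place.
        if account not in active_account or user not in authorized_users:
            raise PermissionError('not authorized')
        return func(*args, **kwargs)
    return wrapper

@check_authorized
def f():
    return 'work done'

print(f())
```

If the rule for valid users changes, only the body of `wrapper` needs to be updated, and every decorated function picks up the change.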

So we are dealing with update anomalies here. We want code in different parts of our project to stay up to date without much manual work. Automatic code generation is one way to deal with this problem.

Often people talk about code generation as a metaprogramming approach. Most approaches to metaprogramming in Python fall into the following categories:
* use some built-in functionality
* hook into some built-in functionality
* construct something dynamically
    * (at various layers)

In [None]:
def f(x, y):
    return x + y

def g(x, y):
    return x ** y

A very simple functional programming approach to combining the above functions is to pass the operation in as an argument.

In [None]:
from operator import add, pow
def func(x, y, op):
    return op(x, y)
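
As a quick usage sketch (the argument values are arbitrary), the same `func` covers both of the earlier functions depending on which operator is passed:

```python
from operator import add, pow

def func(x, y, op):
    return op(x, y)

sum_result = func(10, 20, add)    # behaves like the earlier f
power_result = func(2, 10, pow)   # behaves like the earlier g
print(sum_result, power_result)
```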

Now, if you want functions f and g that each perform one of those operations, you can create a function inside a function.

In [None]:
def create_func(op):
    def func(x, y):
        return op(x, y)
    return func
f = create_func(add)
g = create_func(pow)

If you dig deeper, you can see that the way we generate these functions, as we also do in decorators, is very closely tied to the way we create an instance of a class.
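
One way to see the connection is to write the same thing as a class. In this sketch (the `Func` class is a hypothetical name, not from the talk), the instance attribute plays the role of the variable captured by the closure, and `__call__` plays the role of the inner function:

```python
from operator import add, pow

class Func:
    def __init__(self, op):
        # corresponds to the `op` captured by create_func's closure
        self.op = op

    def __call__(self, x, y):
        # corresponds to the inner function's body
        return self.op(x, y)

f = Func(add)
g = Func(pow)
print(f(10, 20), g(2, 10))
```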