## How is Python Implemented?

Up to now, we have learnt about environments, nested environments, and executing code using the bindings defined in these environments. However, to do the actual execution, we have relied on the Python runtime to do our work.

Much earlier in the course, when we introduced the Python stack and the C stack, and showed all the pretty pictures of frames from pythontutor.com, we alluded to the stack-based execution model of Python. Now we'll get into much more detail about this, and see how Python works, and how some of its features are implemented.

### The virtual machine

Python runs on a virtual machine. 

A **virtual machine** is a software implementation of a real machine. As such, it will implement registers and stacks and other such constructs, along with an "assembly" language to program it.

We are interested here in a "process virtual machine".

Wikipedia:
>A process VM, sometimes called an application virtual machine, or Managed Runtime Environment (MRE), runs as a normal application inside a host OS and supports a single process. It is created when that process is started and destroyed when it exits. Its purpose is to provide a platform-independent programming environment that abstracts away details of the underlying hardware or operating system, and allows a program to execute in the same way on any platform.

Perhaps the best known example of a process virtual machine is the JVM. But Python has one too, and indeed, Python is compiled to an assembly-like **bytecode**. This bytecode is then "interpreted" by the machine: you can think of this bytecode as the machine code for the Python Virtual Machine.

The compiler also emits some other fields, which are needed to run the interpreter. The bytecode and this additional information are stored in a `code` object.
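
As a small illustration of this compile-then-interpret split, the built-in `compile` turns source text into a code object, and `exec` hands that object to the virtual machine. The source string and the variable names below are made up for the example.

```python
import types

source = "total = 1 + 2"

# Compile the source text to a code object; nothing is executed yet.
code = compile(source, "<string>", "exec")
print(isinstance(code, types.CodeType))
print(type(code.co_code))   # the raw bytecode is a bytes object

# Hand the code object to the interpreter, with a dict as its namespace.
namespace = {}
exec(code, namespace)
print(namespace["total"])   # 3
```

Only the `exec` call runs anything; until then the code object is just data.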

### The stack and frames

(for how the C stack works, see http://duartes.org/gustavo/blog/post/journey-to-the-stack/ and http://duartes.org/gustavo/blog/post/epilogues-canaries-buffer-overflows/ )

![](http://aosabook.org/en/500L/interpreter-images/interpreter-callstack.png)

There are three stacks alive during the running of a python program. Since we run on a virtual machine, the call stack and stack frames are dependent on the virtual machine, rather than the real machine your code runs on. This is the critical difference between `stupidlang` and what we are doing now.

- the first is the **call stack**. This is the stack of environments you are familiar with. Often it's not explicitly represented as a stack, but as a recursive lookup of environments. Or, as in the C case, as offsets into memory.
- the second is the **data stack or the value stack**. There is one of these per environment frame, and it is used to run code in the context of that environment. This is where data-manipulating opcodes like `BINARY_ADD` run, in conjunction with namespace-related opcodes such as `STORE_FAST` and `LOAD_FAST`, which we will see below.
- there is a third stack to handle compound statements: statements that contain other statements. This stack is known as the **block stack**.

All of these are put together in the frame structure:

```c

typedef struct _frame {
   PyObject_VAR_HEAD
   struct _frame *f_back;   /* previous frame, or NULL */
   PyCodeObject *f_code;    /* code segment */
   PyObject *f_builtins;    /* builtin symbol table */
   PyObject *f_globals;     /* global symbol table */
   PyObject *f_locals;      /* local symbol table */
   PyObject **f_valuestack; /* points after the last local */
   PyObject **f_stacktop;   /* current top of valuestack */
   PyObject *f_trace;       /* trace function */
 
   /* used for swapping generator exceptions */
   PyObject *f_exc_type, *f_exc_value, *f_exc_traceback;
 
   PyThreadState *f_tstate; /* call stack's thread state */
   int f_lasti;             /* last instruction if called */
   int f_lineno;            /* current line # (if tracing) */
   int f_iblock;            /* index in f_blockstack */
 
   /* for try and loop blocks */
   PyTryBlock f_blockstack[CO_MAXBLOCKS];
 
   /* dynamically: locals, free vars, cells and valuestack */
   PyObject *f_localsplus[1]; /* dynamic portion */
} PyFrameObject;
```



### Code Objects

Code objects are created whenever a **block** of python code is compiled. Here a **block** is defined as (from the manual):

>a piece of Python program text that is executed as a unit. The following are blocks: a module, a function body, and a class definition.

Also treated as blocks are:
- every line in a repl
- strings passed to `python -c`

The text is transformed into an AST, and then `PyAST_Compile` is called on it, to produce code objects.

Code objects are immutable.
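
A quick way to convince yourself of this, sketched with a throwaway function (the name `sample` is just for illustration), is to try to assign to one of the attributes:

```python
def sample(x):
    return x + 1

code = sample.__code__

# Attributes of a code object cannot be reassigned.
try:
    code.co_name = "renamed"
    mutated = True
except (AttributeError, TypeError):
    mutated = False

print(mutated)        # False
print(code.co_name)   # still 'sample'
```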

In [1]:
def f(x):
    a=1
    y = a+x
    return y

In [2]:
fcode = f.__code__
type(fcode)

code

In [3]:
def print_code(c):
    for x in dir(c):
        if x.startswith('co'):
            print(x, '=', getattr(c, x))

In [4]:
print_code(fcode)

co_argcount = 1
co_cellvars = ()
co_code = b'd\x01\x00}\x01\x00|\x01\x00|\x00\x00\x17}\x02\x00|\x02\x00S'
co_consts = (None, 1)
co_filename = <ipython-input-1-587cb7262809>
co_firstlineno = 1
co_flags = 67
co_freevars = ()
co_kwonlyargcount = 0
co_lnotab = b'\x00\x01\x06\x01\n\x01'
co_name = f
co_names = ()
co_nlocals = 3
co_stacksize = 2
co_varnames = ('x', 'a', 'y')


In [5]:
list(fcode.co_code)

[100, 1, 0, 125, 1, 0, 124, 1, 0, 124, 0, 0, 23, 125, 2, 0, 124, 2, 0, 83]

#### The `dis` machinery

In [9]:
import dis
dis.opname[100], dis.opname[125], dis.opname[124] #see dis docs for all

('LOAD_CONST', 'STORE_FAST', 'LOAD_FAST')

In [10]:
dis.dis(f)

  2           0 LOAD_CONST               1 (1)
              3 STORE_FAST               1 (a)

  3           6 LOAD_FAST                1 (a)
              9 LOAD_FAST                0 (x)
             12 BINARY_ADD
             13 STORE_FAST               2 (y)

  4          16 LOAD_FAST                2 (y)
             19 RETURN_VALUE


Columns:

1. line number
2. index into the bytecode string
3. instruction
4. argument to the instruction
5. what the argument means
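
To see how these instructions drive the value stack, here is a toy interpreter for just the opcodes used by `f`. This is only a sketch: `run_toy` is a hypothetical helper, the instruction list is transcribed by hand from the disassembly of `f` above, and CPython's real evaluation loop is written in C and does far more.

```python
def run_toy(instructions, consts, varnames, args):
    """Interpret a tiny subset of opcodes using an explicit value stack."""
    fast_locals = dict(zip(varnames, args))  # stands in for the frame's fast locals
    stack = []                               # the value (data) stack
    for opname, arg in instructions:
        if opname == "LOAD_CONST":
            stack.append(consts[arg])
        elif opname == "LOAD_FAST":
            stack.append(fast_locals[varnames[arg]])
        elif opname == "STORE_FAST":
            fast_locals[varnames[arg]] = stack.pop()
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unsupported opcode: " + opname)

program = [
    ("LOAD_CONST", 1), ("STORE_FAST", 1),      # a = 1
    ("LOAD_FAST", 1), ("LOAD_FAST", 0),        # push a, push x
    ("BINARY_ADD", None), ("STORE_FAST", 2),   # y = a + x
    ("LOAD_FAST", 2), ("RETURN_VALUE", None),  # return y
]

result = run_toy(program, consts=(None, 1), varnames=("x", "a", "y"), args=(41,))
print(result)  # 42
```

The arguments index into `consts` and `varnames`, which play the role of `co_consts` and `co_varnames`. The stack never holds more than two values here, which agrees with the `co_stacksize = 2` reported for `f`.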


In [11]:
dis.show_code(f)

Name:              f
Filename:          <ipython-input-1-587cb7262809>
Argument count:    1
Kw-only arguments: 0
Number of locals:  3
Stack size:        2
Flags:             OPTIMIZED, NEWLOCALS, NOFREE
Constants:
   0: None
   1: 1
Variable names:
   0: x
   1: a
   2: y


In [13]:
#from https://bitbucket.org/yaniv_aknin/pynards/src/c4b61c7a1798766affb49bfba86e485012af6d16/common/blog.py?at=default&fileviewer=file-view-default
import dis
import types

def get_code_object(obj, compilation_mode="exec"):
    if isinstance(obj, types.CodeType):
        return obj
    elif isinstance(obj, types.FrameType):
        return obj.f_code
    elif isinstance(obj, types.FunctionType):
        return obj.__code__
    elif isinstance(obj, str):
        try:
            return compile(obj, "<string>", compilation_mode)
        except SyntaxError as error:
            raise ValueError("syntax error in passed string") from error
    else:
        raise TypeError("get_code_object() can not handle '%s' objects" %
                        (type(obj).__name__,))

def diss(obj, mode="exec", recurse=False):
    _visit(obj, dis.dis, mode, recurse)

def ssc(obj, mode="exec", recurse=False):
    _visit(obj, dis.show_code, mode, recurse)

def _visit(obj, visitor, mode="exec", recurse=False):
    obj = get_code_object(obj, mode)
    visitor(obj)
    if recurse:
        for constant in obj.co_consts:
            if type(constant) is type(obj):
                print()
                print('recursing into %r:' % (constant,))
                _visit(constant, visitor, mode, recurse)


In [14]:
diss(f)

  2           0 LOAD_CONST               1 (1)
              3 STORE_FAST               1 (a)

  3           6 LOAD_FAST                1 (a)
              9 LOAD_FAST                0 (x)
             12 BINARY_ADD
             13 STORE_FAST               2 (y)

  4          16 LOAD_FAST                2 (y)
             19 RETURN_VALUE


In [15]:
ssc(f)

Name:              f
Filename:          <ipython-input-1-587cb7262809>
Argument count:    1
Kw-only arguments: 0
Number of locals:  3
Stack size:        2
Flags:             OPTIMIZED, NEWLOCALS, NOFREE
Constants:
   0: None
   1: 1
Variable names:
   0: x
   1: a
   2: y


What happens when the `BINARY_ADD` opcode is run? Look here, in `ceval.c`: https://github.com/python/cpython/blob/master/Python/ceval.c#L1559 .

Weird, huh, we run `PyNumber_Add`. How does operator overloading work then? See later...

Let's see a built-in function call:


In [16]:
def fprint(a):
    b=abs(a)
    return b
diss(fprint)

  2           0 LOAD_GLOBAL              0 (abs)
              3 LOAD_FAST                0 (a)
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 STORE_FAST               1 (b)

  3          12 LOAD_FAST                1 (b)
             15 RETURN_VALUE


and a conditional (there is a problem in this code only to be caught at runtime):

In [18]:
def fcond(a):
    if a < 0:
        b=abs(a)
    return b
diss(fcond)

  2           0 LOAD_FAST                0 (a)
              3 LOAD_CONST               1 (0)
              6 COMPARE_OP               0 (<)
              9 POP_JUMP_IF_FALSE       24

  3          12 LOAD_GLOBAL              0 (abs)
             15 LOAD_FAST                0 (a)
             18 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             21 STORE_FAST               1 (b)

  4     >>   24 LOAD_FAST                1 (b)
             27 RETURN_VALUE


In [20]:
ssc(fcond)

Name:              fcond
Filename:          <ipython-input-18-3259e3577241>
Argument count:    1
Kw-only arguments: 0
Number of locals:  2
Stack size:        2
Flags:             OPTIMIZED, NEWLOCALS, NOFREE
Constants:
   0: None
   1: 0
Names:
   0: abs
Variable names:
   0: a
   1: b


In [23]:
print_code(fcond.__code__)

co_argcount = 1
co_cellvars = ()
co_code = b'|\x00\x00d\x01\x00k\x00\x00r\x18\x00t\x00\x00|\x00\x00\x83\x01\x00}\x01\x00|\x01\x00S'
co_consts = (None, 0)
co_filename = <ipython-input-18-3259e3577241>
co_firstlineno = 1
co_flags = 67
co_freevars = ()
co_kwonlyargcount = 0
co_lnotab = b'\x00\x01\x0c\x01\x0c\x01'
co_name = fcond
co_names = ('abs',)
co_nlocals = 2
co_stacksize = 2
co_varnames = ('a', 'b')


#### Inside a code object

- `co_name` : name of code object
- `co_filename` : the filename from which the code was compiled. If the code did not come from a file, a placeholder name is assigned (for example `<string>`, or the `<ipython-input-...>` names seen above)
- `co_varnames` : names of the local variables (including arguments). Look at `co_argcount`, `co_kwonlyargcount` and `co_flags` as well
- `co_cellvars` : names of local variables stored in cells. We'll see these soon; they are used to implement closures
- `co_freevars` : names of free variables (variables referenced here but defined in an enclosing function's scope).
- `co_names` : other names, including function attributes
- `co_argcount` : number of positional arguments
- `co_kwonlyargcount` : number of keyword-only arguments
- `co_nlocals` : number of local variables used in code object (including arguments).
- `co_firstlineno` : line number at which the code object's source code begins, relative to the module it was defined in, starting from one.
- `co_stacksize` : maximum size required of the value stack when running this object, statically computed by the compiler
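
The counting fields are easiest to tell apart on a function that has keyword-only arguments. The function `k` below is a made-up example, not one used elsewhere in these notes:

```python
def k(a, b, *, c, d=1):
    e = a + b
    return e

code = k.__code__
print(code.co_argcount)        # 2: a and b
print(code.co_kwonlyargcount)  # 2: c and d, which can only be passed by keyword
print(code.co_varnames)        # ('a', 'b', 'c', 'd', 'e'): arguments first, then other locals
print(code.co_nlocals)         # 5
```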

In [12]:
print_code(f.__code__)
diss(f)

co_argcount = 1
co_cellvars = ()
co_code = b'd\x01\x00}\x01\x00|\x01\x00|\x00\x00\x17}\x02\x00|\x02\x00S'
co_consts = (None, 1)
co_filename = <ipython-input-1-587cb7262809>
co_firstlineno = 1
co_flags = 67
co_freevars = ()
co_kwonlyargcount = 0
co_lnotab = b'\x00\x01\x06\x01\n\x01'
co_name = f
co_names = ()
co_nlocals = 3
co_stacksize = 2
co_varnames = ('x', 'a', 'y')
  2           0 LOAD_CONST               1 (1)
              3 STORE_FAST               1 (a)

  3           6 LOAD_FAST                1 (a)
              9 LOAD_FAST                0 (x)
             12 BINARY_ADD
             13 STORE_FAST               2 (y)

  4          16 LOAD_FAST                2 (y)
             19 RETURN_VALUE


- `co_code` : a bytes object representing the sequence of bytecode instructions. It contains a stream of opcodes and their operands (or rather, indexes which are used with other code object fields to represent their operands, as we saw above).
- `co_consts` : literals used by the bytecode. Remember that everything in a code object must be immutable. To dig into this field, it is worth running `diss` and `ssc` on the code snippets `a=(1,2,3)` versus `a=[1,2,3]`, and yet again versus `a=(1,2,3,[4,5,6])`.

In [34]:
ssc('a=(1,2, 3, [4,5,6])')
print('--------------')
ssc('a=(1,2, 3)')


Name:              <module>
Filename:          <string>
Argument count:    0
Kw-only arguments: 0
Number of locals:  0
Stack size:        6
Flags:             NOFREE
Constants:
   0: 1
   1: 2
   2: 3
   3: 4
   4: 5
   5: 6
   6: None
Names:
   0: a
--------------
Name:              <module>
Filename:          <string>
Argument count:    0
Kw-only arguments: 0
Number of locals:  0
Stack size:        3
Flags:             NOFREE
Constants:
   0: 1
   1: 2
   2: 3
   3: None
   4: (1, 2, 3)
Names:
   0: a


- `co_lnotab` :  bytecode offsets to line numbers mapping
- `co_flags` : integer encoding flags that describe how this code object was created
- `co_zombieframe` : an optimization of stack frame allocation: not exposed.

In [15]:
class A():
    
    def __init__(self, a):
        self.a = a

In [16]:
dis.dis(A)

Disassembly of __init__:
  4           0 LOAD_FAST                1 (a)
              3 LOAD_FAST                0 (self)
              6 STORE_ATTR               0 (a)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE



In [17]:
dis.show_code(A.__init__)

Name:              __init__
Filename:          <ipython-input-15-65a79aa58fbc>
Argument count:    2
Kw-only arguments: 0
Number of locals:  2
Stack size:        2
Flags:             OPTIMIZED, NEWLOCALS, NOFREE
Constants:
   0: None
Names:
   0: a
Variable names:
   0: self
   1: a


### Various situations

With this technology in hand, let's see how a whole bunch of situations are implemented.

#### Global variables

In [21]:
def fglb():
    global xxx
    global aaa
    y = 2
    xxx=3
    aaa += 1
diss(fglb)

  4           0 LOAD_CONST               1 (2)
              3 STORE_FAST               0 (y)

  5           6 LOAD_CONST               2 (3)
              9 STORE_GLOBAL             0 (xxx)

  6          12 LOAD_GLOBAL              1 (aaa)
             15 LOAD_CONST               3 (1)
             18 INPLACE_ADD
             19 STORE_GLOBAL             1 (aaa)
             22 LOAD_CONST               0 (None)
             25 RETURN_VALUE


`STORE_GLOBAL` performs a binding or re-binding in the global namespace, while `LOAD_GLOBAL` is generated when the compiler realizes that the variable is referenced in the function's body but never bound there. Here `aaa` may not be defined outside and could lead to a runtime error, but this is perfectly legal code from the perspective of the function.

The `*_FAST` opcodes are used when the compiler can infer that the variables are defined in the local namespace. They are optimized versions of the `*_NAME` opcodes.

In [24]:
diss('cc = dd -1')

  1           0 LOAD_NAME                0 (dd)
              3 LOAD_CONST               0 (1)
              6 BINARY_SUBTRACT
              7 STORE_NAME               1 (cc)
             10 LOAD_CONST               1 (None)
             13 RETURN_VALUE


In [35]:
ssc('cc = dd -1')

Name:              <module>
Filename:          <string>
Argument count:    0
Kw-only arguments: 0
Number of locals:  0
Stack size:        2
Flags:             NOFREE
Constants:
   0: 1
   1: None
Names:
   0: dd
   1: cc
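
The choice between the `*_FAST`, `*_GLOBAL` and `*_NAME` families can also be checked programmatically with `dis.get_instructions`. The helper `opnames` and the two small functions below are made up for this sketch; it only tests for the presence of an opcode, since the exact instruction sequence differs between Python versions.

```python
import dis

def opnames(code):
    """Return the set of opcode names used by a code object."""
    return {ins.opname for ins in dis.get_instructions(code)}

def local_store():
    y = 2          # y is bound in the function, so it is a fast local

def global_store():
    global xxx
    xxx = 3        # declared global, so the store goes to the global namespace

# Module-level code has no fast locals, so names go through *_NAME.
module_code = compile("cc = dd - 1", "<string>", "exec")

print("STORE_FAST" in opnames(local_store.__code__))
print("STORE_GLOBAL" in opnames(global_store.__code__))
print({"LOAD_NAME", "STORE_NAME"} <= opnames(module_code))
```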


### Closures

The compiler will treat variables defined in outer functions that will be used in an inner lexical scope differently:

In [36]:
def g():
    a = 1
    b = 2
    def h():
        b=a#b is a local in this context
        c=5
    return h
diss(g, recurse=True)

  2           0 LOAD_CONST               1 (1)
              3 STORE_DEREF              0 (a)

  3           6 LOAD_CONST               2 (2)
              9 STORE_FAST               0 (b)

  4          12 LOAD_CLOSURE             0 (a)
             15 BUILD_TUPLE              1
             18 LOAD_CONST               3 (<code object h at 0x104d62390, file "<ipython-input-36-5b074cc104ce>", line 4>)
             21 LOAD_CONST               4 ('g.<locals>.h')
             24 MAKE_CLOSURE             0
             27 STORE_FAST               1 (h)

  7          30 LOAD_FAST                1 (h)
             33 RETURN_VALUE

recursing into <code object h at 0x104d62390, file "<ipython-input-36-5b074cc104ce>", line 4>:
  5           0 LOAD_DEREF               0 (a)
              3 STORE_FAST               0 (b)

  6           6 LOAD_CONST               1 (5)
              9 STORE_FAST               1 (c)
             12 LOAD_CONST               0 (None)
             15 RETURN_VALUE


`a`, as opposed to `b`, is accessed through the `*_DEREF` opcodes in both the outer scope (where it is stored) and the inner scope (where it is loaded). The variable is stored using a "cell". And as the contents of the cell are only read when the inner function actually runs, you can even define the inner function before you store `a`. Notice the "cell" in the outer and the "free" in the inner below.

In [37]:
ssc(g, recurse=True)

Name:              g
Filename:          <ipython-input-36-5b074cc104ce>
Argument count:    0
Kw-only arguments: 0
Number of locals:  2
Stack size:        3
Flags:             OPTIMIZED, NEWLOCALS
Constants:
   0: None
   1: 1
   2: 2
   3: <code object h at 0x104d62390, file "<ipython-input-36-5b074cc104ce>", line 4>
   4: 'g.<locals>.h'
Variable names:
   0: b
   1: h
Cell variables:
   0: a

recursing into <code object h at 0x104d62390, file "<ipython-input-36-5b074cc104ce>", line 4>:
Name:              h
Filename:          <ipython-input-36-5b074cc104ce>
Argument count:    0
Kw-only arguments: 0
Number of locals:  2
Stack size:        1
Flags:             OPTIMIZED, NEWLOCALS, NESTED
Constants:
   0: None
   1: 5
Variable names:
   0: b
   1: c
Free variables:
   0: a


What is a cell? It's a special place where the variables accessed by the `*_DEREF` opcodes are stored. When a closure is created, the variables bound in an outer scope are placed in a "cell" area, and when the function is run, the cells are copied into the "fast" area of the function.
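
Cells are visible from Python through the function's `__closure__` attribute. A minimal sketch, using a cut-down version of the `g`/`h` pair (here `h` simply returns `a`):

```python
def g():
    a = 1
    def h():
        return a
    return h

h = g()
print(g.__code__.co_cellvars)          # ('a',): g owns the cell
print(h.__code__.co_freevars)          # ('a',): h refers to it as a free variable
print(h.__closure__)                   # a tuple holding one cell object
print(h.__closure__[0].cell_contents)  # 1
```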

#### nonlocal

Now we can understand why we need `nonlocal` in some situations.

In [38]:
g.__code__.co_cellvars

('a',)

In [39]:
def g2():
    a = 1
    b = 2
    def h():
        c=b#here we refer to b from g2 but h thinks its local
        b=a
    return h
diss(g2, recurse=True)

  2           0 LOAD_CONST               1 (1)
              3 STORE_DEREF              0 (a)

  3           6 LOAD_CONST               2 (2)
              9 STORE_FAST               0 (b)

  4          12 LOAD_CLOSURE             0 (a)
             15 BUILD_TUPLE              1
             18 LOAD_CONST               3 (<code object h at 0x104d625d0, file "<ipython-input-39-87717a4edf23>", line 4>)
             21 LOAD_CONST               4 ('g2.<locals>.h')
             24 MAKE_CLOSURE             0
             27 STORE_FAST               1 (h)

  7          30 LOAD_FAST                1 (h)
             33 RETURN_VALUE

recursing into <code object h at 0x104d625d0, file "<ipython-input-39-87717a4edf23>", line 4>:
  5           0 LOAD_FAST                0 (b)
              3 STORE_FAST               1 (c)

  6           6 LOAD_DEREF               0 (a)
              9 STORE_FAST               0 (b)
             12 LOAD_CONST               0 (None)
             15 RETURN_VALUE


In [40]:
g2sh=g2()
g2sh()

UnboundLocalError: local variable 'b' referenced before assignment

We can fix this by using `nonlocal`, but it's illegal to do this outside an inner lexical scope.

In [41]:
def g2():
    a = 1
    b = 2
    def h():
        nonlocal b
        c=b#here we refer to b from g2 but h thinks its local
        b=a
    return h
diss(g2, recurse=True)

  2           0 LOAD_CONST               1 (1)
              3 STORE_DEREF              0 (a)

  3           6 LOAD_CONST               2 (2)
              9 STORE_DEREF              1 (b)

  4          12 LOAD_CLOSURE             0 (a)
             15 LOAD_CLOSURE             1 (b)
             18 BUILD_TUPLE              2
             21 LOAD_CONST               3 (<code object h at 0x104d62420, file "<ipython-input-41-601bed7b46b6>", line 4>)
             24 LOAD_CONST               4 ('g2.<locals>.h')
             27 MAKE_CLOSURE             0
             30 STORE_FAST               0 (h)

  8          33 LOAD_FAST                0 (h)
             36 RETURN_VALUE

recursing into <code object h at 0x104d62420, file "<ipython-input-41-601bed7b46b6>", line 4>:
  6           0 LOAD_DEREF               1 (b)
              3 STORE_FAST               0 (c)

  7           6 LOAD_DEREF               0 (a)
              9 STORE_DEREF              1 (b)
             12 LOAD_CONST          

In [42]:
ssc(g2, recurse=True)

Name:              g2
Filename:          <ipython-input-41-601bed7b46b6>
Argument count:    0
Kw-only arguments: 0
Number of locals:  1
Stack size:        3
Flags:             OPTIMIZED, NEWLOCALS
Constants:
   0: None
   1: 1
   2: 2
   3: <code object h at 0x104d62420, file "<ipython-input-41-601bed7b46b6>", line 4>
   4: 'g2.<locals>.h'
Variable names:
   0: h
Cell variables:
   0: a
   1: b

recursing into <code object h at 0x104d62420, file "<ipython-input-41-601bed7b46b6>", line 4>:
Name:              h
Filename:          <ipython-input-41-601bed7b46b6>
Argument count:    0
Kw-only arguments: 0
Number of locals:  1
Stack size:        1
Flags:             OPTIMIZED, NEWLOCALS, NESTED
Constants:
   0: None
Variable names:
   0: c
Free variables:
   0: a
   1: b


In [31]:
g2sh=g2()
g2sh()
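
A smaller, hypothetical example makes the effect of `nonlocal` visible: with the declaration, the assignment in the inner function rebinds the cell owned by the outer function instead of creating a new local. The names `make_counter` and `increment` are invented for this sketch.

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count      # rebind the cell owned by make_counter
        count += 1
        return count
    return increment

counter = make_counter()
first = counter()
second = counter()
print(first, second)                         # 1 2
print(counter.__closure__[0].cell_contents)  # 2
```

Without the `nonlocal` line, `count += 1` would make `count` a local of `increment` and the call would fail with `UnboundLocalError`, just like `g2sh()` did earlier.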

Notice we have not said anything about frames yet. That's because, so far, we have only defined functions. Let's see what happens when we execute them.

### Execution and Binding lookup

Frame creation occurs when a code object needs to be evaluated:

- when a function is called
- when a module is imported (top-level code is executed)
- when a class is defined
- every  command in the repl
- when eval or exec are used
- when the -c switch is used 
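
Two of these cases can be observed directly. The sketch below relies on `sys._getframe`, which is a CPython implementation detail, and the helper name `current_code_name` is made up:

```python
import sys

def current_code_name():
    # The frame created for this call points at this function's code object.
    return sys._getframe().f_code.co_name

print(current_code_name())   # 'current_code_name'

# Code run through exec gets its own frame too; its code object is named <module>.
namespace = {}
exec("import sys\nname = sys._getframe().f_code.co_name", namespace)
print(namespace["name"])     # '<module>'
```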

Let's go back to our frame structure in CPython:

```c

typedef struct _frame {
   PyObject_VAR_HEAD
   struct _frame *f_back;   /* previous frame, or NULL */
   PyCodeObject *f_code;    /* code segment */
   PyObject *f_builtins;    /* builtin symbol table */
   PyObject *f_globals;     /* global symbol table */
   PyObject *f_locals;      /* local symbol table */
   PyObject **f_valuestack; /* points after the last local */
   PyObject **f_stacktop;   /* current top of valuestack */
   PyObject *f_trace;       /* trace function */
 
   /* used for swapping generator exceptions */
   PyObject *f_exc_type, *f_exc_value, *f_exc_traceback;
 
   PyThreadState *f_tstate; /* call stack's thread state */
   int f_lasti;             /* last instruction if called */
   int f_lineno;            /* current line # (if tracing) */
   int f_iblock;            /* index in f_blockstack */
 
   /* for try and loop blocks */
   PyTryBlock f_blockstack[CO_MAXBLOCKS];
 
   /* dynamically: locals, free vars, cells and valuestack */
   PyObject *f_localsplus[1]; /* dynamic portion */
} PyFrameObject;
```

- `f_code` points to precisely one code object per frame. So when we have a call stack of frames, this corresponds to call stack of code objects.
- when python code is evaluated, it is done so in 3 namespaces corresponding to three symbol tables: `f_builtins`, `f_globals`, and `f_locals`. A name will first be resolved in the local scope, then in the global scope, and then in the builtin scope. For nested scopes like in closures, we'll first search the local scopes of the outer functions and only then go to the global and the builtin scope. This rule can be thought of as **LEGB**.
- a frame is a variable sized object as seen in `f_localsplus` 
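
Both points can be checked from Python. The sketch below first walks `f_back` to recover the chain of code objects on the call stack, and then shows the lookup order on a name that exists at several levels. All function names are invented for the example, and `sys._getframe` is specific to CPython.

```python
import sys

def innermost():
    frame = sys._getframe()
    chain = []
    while frame is not None and len(chain) < 3:
        chain.append(frame.f_code.co_name)   # one code object per frame
        frame = frame.f_back                 # step to the caller's frame
    return chain

def middle():
    return innermost()

def outermost():
    return middle()

chain = outermost()
print(chain)   # ['innermost', 'middle', 'outermost']

name = "global"

def enclosing():
    name = "enclosing"
    def local():
        name = "local"
        return name            # L: found in the local scope
    def from_enclosing():
        return name            # E: found in the enclosing function
    return local(), from_enclosing()

def from_global():
    return name                # G: found in the module's globals

def from_builtin():
    return len("abc")          # B: `len` is only found in the builtins

print(enclosing(), from_global(), from_builtin())
```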

Remember how we had `co_nlocals`, `co_cellvars`, `co_freevars` and `co_stacksize` when we created code objects? At that point, nothing was run, so these parts of the code object were dead. Now, when we run the function, these parts come alive in the `f_localsplus` space: this is the local variables + cells + free variables + value (data) stack. We can inspect the stacks, but not everything in the C implementation is exposed to introspection from Python.


#### closures at runtime

Let's now see how closures work at runtime.

In [77]:
import inspect

def g():
    a = 1
    b = 2
    outerf = inspect.currentframe()
    print(outerf,outerf.f_locals, "p",outerf.f_back, "pp", outerf.f_back.f_back)
    def m():
        z=3
        print(a)
        print("in m")
        innerf = inspect.currentframe()
        print(innerf,innerf.f_locals)
        print('--------------')
        print("p", innerf.f_back, innerf.f_back.f_locals)
    m()
    def h():
        b=a#b is a local in this context
        c=5
        innerf = inspect.currentframe()
        print("in h")
        print(innerf,innerf.f_locals)
        print('--------------')
        print("p",innerf.f_back)
    return h

In [78]:
gh=g()#notice the parents. i got tripped by this.

<frame object at 0x100986618> {'a': 1, 'outerf': <frame object at 0x100986618>, 'b': 2} p <frame object at 0x1006ad118> pp <frame object at 0x1011b1818>
1
in m
<frame object at 0x1006ae628> {'z': 3, 'a': 1, 'innerf': <frame object at 0x1006ae628>}
--------------
p <frame object at 0x100986618> {'a': 1, 'outerf': <frame object at 0x100986618>, 'm': <function g.<locals>.m at 0x104d67488>, 'b': 2}


In [79]:
g.__closure__, gh.__closure__

(None, (<cell at 0x103b6ac78: int object at 0x10026dfa0>,))

In [80]:
gh()#on running, the cells in closure are copied into localvars

in h
<frame object at 0x103512bb8> {'a': 1, 'innerf': <frame object at 0x103512bb8>, 'b': 1, 'c': 5}
--------------
p <frame object at 0x104d63240>


From http://stupidpythonideas.blogspot.com/2015/12/how-lookup-works.html

>When you construct a nested function at runtime, there's zero or more LOAD_CLOSURE bytecodes to push the current frame's cells onto the stack, then a MAKE_CLOSURE instead of MAKE_FUNCTION, which does the extra step of popping those cells off and stashing them in the function's `__closure__` attribute. And when you call a function, its `__closure__` cells get copied into the `f_localsplus` array.

Why so? `f_localsplus` is an array and access is really fast. This is also where `LOAD_FAST` and `STORE_FAST` work from! See the above link for lots of details.

### The block stack

Compound statements require some internal state to be evaluated. In a loop, there might be a `break` or `continue`, and we then need to know where to go. When we handle exceptions, we need to know what the innermost exception handler is. All of this is done by using the block stack.

In [38]:
def l():
    for i in [1,2,3]:
        print(i)
diss(l)

  2           0 SETUP_LOOP              33 (to 36)
              3 LOAD_CONST               1 (1)
              6 LOAD_CONST               2 (2)
              9 LOAD_CONST               3 (3)
             12 BUILD_LIST               3
             15 GET_ITER
        >>   16 FOR_ITER                16 (to 35)
             19 STORE_FAST               0 (i)

  3          22 LOAD_GLOBAL              0 (print)
             25 LOAD_FAST                0 (i)
             28 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             31 POP_TOP
             32 JUMP_ABSOLUTE           16
        >>   35 POP_BLOCK
        >>   36 LOAD_CONST               0 (None)
             39 RETURN_VALUE


`f_blockstack` in the frame is a fixed-size stack. `f_iblock` is an offset into the stack (we store `PyTryBlock`s in the stack). `SETUP_LOOP` above pushes a new try-block onto the stack (more precisely, it just updates the index in memory). The loop's instructions then run inside that block, which is finally popped by `POP_BLOCK` after the list is exhausted.

A `break` pops the block stack and finds where in the bytecode we should go to resume by looking at the popped `PyTryBlock`. (Here, it is the inner block with the `BREAK_LOOP` opcode.)

In [39]:
def l():
    for i in [1,2,3]:
        for j in range(i, i+2):
            if j==4:
                break
        print(i)
diss(l)

  2           0 SETUP_LOOP              76 (to 79)
              3 LOAD_CONST               1 (1)
              6 LOAD_CONST               2 (2)
              9 LOAD_CONST               3 (3)
             12 BUILD_LIST               3
             15 GET_ITER
        >>   16 FOR_ITER                59 (to 78)
             19 STORE_FAST               0 (i)

  3          22 SETUP_LOOP              40 (to 65)
             25 LOAD_GLOBAL              0 (range)
             28 LOAD_FAST                0 (i)
             31 LOAD_FAST                0 (i)
             34 LOAD_CONST               2 (2)
             37 BINARY_ADD
             38 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             41 GET_ITER
        >>   42 FOR_ITER                19 (to 64)
             45 STORE_FAST               1 (j)

  4          48 LOAD_FAST                1 (j)
             51 LOAD_CONST               4 (4)
             54 COMPARE_OP               2 (==)
             57 POP_JUMP_IF_FAL
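
The bookkeeping described above for `SETUP_LOOP` and `break` can be sketched in a few lines of Python. This is a toy model, not CPython's code: `Block` stands in for `PyTryBlock`, `ToyFrame` and its methods are hypothetical, and the handler offsets 79 and 65 are the jump targets shown in the listing above.

```python
from collections import namedtuple

# A stand-in for PyTryBlock: what kind of block, where to jump, and how deep
# the value stack was when the block was entered.
Block = namedtuple("Block", ["kind", "handler", "level"])

class ToyFrame:
    def __init__(self):
        self.value_stack = []
        self.block_stack = []

    def setup_loop(self, handler):
        """Like SETUP_LOOP: remember where to resume if the loop is left."""
        self.block_stack.append(Block("loop", handler, len(self.value_stack)))

    def break_loop(self):
        """Like BREAK_LOOP: pop the innermost block, unwind, and jump."""
        block = self.block_stack.pop()
        del self.value_stack[block.level:]   # discard what the loop pushed
        return block.handler                 # bytecode offset to resume at

frame = ToyFrame()
frame.setup_loop(handler=79)                 # outer loop
frame.value_stack.append("outer iterator")
frame.setup_loop(handler=65)                 # inner loop
frame.value_stack.append("inner iterator")

resume_at = frame.break_loop()               # `break` inside the inner loop
print(resume_at)                             # 65
print(frame.value_stack)                     # ['outer iterator']
print(len(frame.block_stack))                # 1
```

Only the inner block is popped, so the outer loop's iterator stays on the value stack and the outer loop carries on.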

You can imagine that exception handling is similar.

In [40]:
def divide(a,b):
    try:
        c = a /b
    except ZeroDivisionError:
        return None
    return c
diss(divide)

  2           0 SETUP_EXCEPT            14 (to 17)

  3           3 LOAD_FAST                0 (a)
              6 LOAD_FAST                1 (b)
              9 BINARY_TRUE_DIVIDE
             10 STORE_FAST               2 (c)
             13 POP_BLOCK
             14 JUMP_FORWARD            22 (to 39)

  4     >>   17 DUP_TOP
             18 LOAD_GLOBAL              0 (ZeroDivisionError)
             21 COMPARE_OP              10 (exception match)
             24 POP_JUMP_IF_FALSE       38
             27 POP_TOP
             28 POP_TOP
             29 POP_TOP

  5          30 LOAD_CONST               0 (None)
             33 RETURN_VALUE
             34 POP_EXCEPT
             35 JUMP_FORWARD             1 (to 39)
        >>   38 END_FINALLY

  6     >>   39 LOAD_FAST                2 (c)
             42 RETURN_VALUE


It's beyond our scope to go into details, but please see the implementation in `byterun` if you are interested in understanding how this works.

What about generators? This is a topic for next time.

In [41]:
#debugging? bit more on exceptions? bit more on objects

### Objects and operator overloading

Everything in python is an object. Even stack frames and code objects. The following must be defined for something to be an object:

```c
typedef struct _object {
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;
```

Most objects have at least this, with other fields thrown in depending upon what objects these are. And type and class are one and the same thing.

Now when we invoke `BINARY_ADD`, and thus `PyNumber_Add`, what happens is this:

A type object, either for a number, or something else we define by making a class, has a slot which holds the 'addition' operation for that type. Yes, this is what gets dispatched to when we call `__add__`, and these slots are defined in `C` for the built-in types so that things are fast. We could do this too, if we wanted fast objects: numpy does this.
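
From the Python side, the dispatch through the type can be seen with a small class. `Money` is a made-up example; the last three lines show that the lookup goes through the type rather than the instance.

```python
import operator

class Money:
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Defining __add__ on the class is what makes `+` end up here.
        if not isinstance(other, Money):
            return NotImplemented
        return Money(self.cents + other.cents)

total = Money(150) + Money(275)
print(total.cents)                              # 425
print(operator.add(Money(1), Money(2)).cents)   # 3

# An __add__ stored on the instance is ignored by `+`.
m = Money(5)
m.__add__ = lambda other: Money(0)
print((m + Money(5)).cents)                     # 10
```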

See https://github.com/python/cpython/blob/1364858e6ec7abfe04d92b7796ae8431eda87a7a/Objects/abstract.c#L892 for `PyNumber_Add` and follow the rabbithole down from it: the slot in question is `nb_add`. For user defined classes, see here: https://github.com/python/cpython/blob/16f526e7cb0b5d81de47f513ad112cda61574331/Objects/typeobject.c#L2269 and `fixup_slot_dispatchers` referenced therein.

We won't say much about classes here: to do so would require going into descriptors, and the order of attribute lookup and such, which we haven't done yet (and won't do in this class). If you are interested you should read the meta-programming chapter in Fluent Python and the class lookup section here: http://stupidpythonideas.blogspot.com/2015/12/how-lookup-works.html .

### Putting it together

Let's put it all together now to get a picture of how the python interpreter runs.

#### The startup

1. The interpreter is started with `-c` to execute a string, `-m` to execute a module as a script, a file to execute, or nothing at all to run a REPL.
2. Standard C-library initialization happens for the python executable. Then `Modules/python.c`: `main` runs, which soon calls `Modules/main.c`: `Py_Main`, and then `Python/pythonrun.c`: `Py_Initialize` is called. It creates the *interpreter state* and *thread state*, and sets up `sys` and the builtins.
3. The code is transformed into a CST, from there into an AST by `Python/ast.c`: `PyAST_FromNode`, and finally compiled into a code object. We are ready now to interpret the machine code with the "virtual machine" interpreter.
4. Now `PyEval_EvalCode` is called from `Python/pythonrun.c` with the code object and a namespace (at this point the `__main__` namespace serves as both globals and locals). Each Python thread, here the main one, has a thread state pointing to the call stack of currently executing frames. Our `__main__` frame is now the one frame at the top of this stack, and its bytecode is evaluated one opcode at a time by `PyEval_EvalFrameEx` in `ceval.c`.
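Steps 3 and 4 can be mimicked from Python with the built-ins `ast.parse`, `compile` and `exec`. This is only a sketch of the pipeline, not the C code path itself: the CST stage is internal and not shown, and the source string, the `<sketch>` filename and the variable names are made up for illustration.

```python
import ast

source = "x = 1 + 2\n"

# Source text -> AST.
tree = ast.parse(source, filename="<sketch>", mode="exec")

# AST -> code object holding the bytecode and its supporting fields.
code = compile(tree, filename="<sketch>", mode="exec")

# Evaluate the code object against a namespace, much as the interpreter
# evaluates the top-level code against the __main__ namespace.
namespace = {"__name__": "__main__"}
exec(code, namespace)
result = namespace["x"]
```

After `exec` returns, the namespace dictionary holds the binding for `x`, along with a `__builtins__` entry that `exec` adds so that built-in names can be resolved.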

#### The thread-state

Running python code in a thread means evaluating frames, or more precisely the opcodes in the code object associated with a frame, using bindings from the frame and using the call stack to find enclosed scopes before we look at globals and builtins. There is one call stack per thread, and this bookkeeping of frames is done by `Include/pystate.h`: `PyThreadState`.

```c
typedef struct _ts {
 struct _ts *next;
 PyInterpreterState *interp;
 struct _frame *frame;
 int recursion_depth;
 int tracing;
 int use_tracing;
 Py_tracefunc c_profilefunc;
 Py_tracefunc c_tracefunc;
 PyObject *c_profileobj;
 PyObject *c_traceobj;
 PyObject *curexc_type;
 PyObject *curexc_value;
 PyObject *curexc_traceback;
 PyObject *exc_type;
 PyObject *exc_value;
 PyObject *exc_traceback;
 PyObject *dict;
 int tick_counter;
 int gilstate_counter;
 PyObject *async_exc;
 long thread_id;
} PyThreadState;
```
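The `frame` pointer in this struct is the top of the thread's call stack, and each frame links back to its caller. A rough way to see this from Python, assuming CPython (`sys._getframe` is an implementation detail), is to walk the `f_back` links. The function names `call_chain`, `outer` and `inner` are hypothetical.

```python
import sys
import threading

def call_chain():
    """Return the function names on the current thread's stack, innermost first."""
    names = []
    frame = sys._getframe()          # the frame of call_chain itself
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back         # step to the caller's frame
    return names

def inner():
    return call_chain()

def outer():
    return inner()

# Stack as seen from the current thread.
chain = outer()

# A second thread has its own stack, which never passes through outer().
worker_chain = []
worker = threading.Thread(target=lambda: worker_chain.extend(inner()))
worker.start()
worker.join()
```

The first three entries of `chain` are `call_chain`, `inner` and `outer`. The worker's chain starts with `call_chain`, `inner` and `<lambda>`, followed by the `threading` module's own bootstrap frames.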

#### The interpreter state

This is at `Include/pystate.h`: `PyInterpreterState` and is created on `Py_Initialize` or `Py_NewInterpreter` (for multiple-interpreter processes), along with a corresponding `PyThreadState`.

```c
typedef struct _is {

    struct _is *next;
    struct _ts *tstate_head;

    PyObject *modules;
    PyObject *sysdict;
    PyObject *builtins;
    PyObject *modules_reloading;

    PyObject *codec_search_path;
    PyObject *codec_search_cache;
    PyObject *codec_error_registry;

#ifdef HAVE_DLOPEN
    int dlopenflags;
#endif
#ifdef WITH_TSC
    int tscdump;
#endif

} PyInterpreterState;
```

Thus we now have:
- opcodes belonging to a code object 
- which belongs to a currently evaluating frame (in a stack of frames)
- which belongs to a thread (in a bunch of threads)
- which belongs to an interpreter (possibly in a bunch of interpreters)
- in a process

(from niltowrite)
![](https://niltowrite.files.wordpress.com/2010/05/states4.png)

#### The famous GIL

Notice that although an IS can keep track of several TSs (the `tstate_head` list), only one TS is current at any given time, and a TS can belong to only one IS. The former observation is the observation of the GIL: only one thread can be executing python code at a time. The GIL is a synchronization mechanism, and it is released while a thread is doing IO or running code in an extension module written in C that chooses to release it (numpy for example).

Why the GIL? The short answer is that it makes single-threaded code faster and allows for easy integration with non-thread-safe C libraries. See http://programmers.stackexchange.com/questions/186889/why-was-python-written-with-the-gil for more details.

Thread safety: https://en.wikipedia.org/wiki/Thread_safety (re-entrancy: https://www.quora.com/When-is-a-function-reentrant-How-does-that-relate-to-it-being-thread-safe)

We'll have more to say about concurrency and parallelism in the next few...

#### The evaluation of opcodes

Whenever a thread is running python code it holds the GIL, and it gives the GIL up when it is in a C extension module that releases it or when it is doing IO. So there is a kind of co-operative multitasking going on. If we are completely CPU bound, then a check is made every 100 ticks, where a tick corresponds roughly to one opcode execution, and at that check the running thread offers the GIL to any other thread waiting for it.
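The tick counter described above belongs to older interpreters. In Python 3.2 and later the periodic check is time-based instead, and the interval can be read and changed with `sys.getswitchinterval` and `sys.setswitchinterval`. A minimal sketch, assuming a Python 3 interpreter:

```python
import sys

# Seconds a CPU-bound thread may run before it is asked to give up the GIL.
interval = sys.getswitchinterval()

# The interval is adjustable; a larger value means fewer forced hand-offs.
sys.setswitchinterval(interval * 2)
doubled = sys.getswitchinterval()

# Restore the original setting.
sys.setswitchinterval(interval)
restored = sys.getswitchinterval()
```

The value is a request rather than a guarantee: the actual hand-off also depends on the operating system's scheduler.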
