# Python Bytecode and Disassembly
## Compiling Python Code and Code Objects
The usual distinction between compiled and interpreted languages is that a compiled language is translated from high-level code to CPU instructions ahead of time, whereas an interpreted language is read and executed line by line. Python, however, is neither fully interpreted the way shell scripts are nor compiled in the traditional sense. Instead, Python source code is compiled to a simpler instruction set called bytecode, which is executed by the Python Virtual Machine. This compilation step is usually performed at runtime as needed, which is why you may see `.pyc` files in a `__pycache__` folder after execution. This is where Python caches its compiled bytecode so that it can be reused on the next run if the source is unchanged. It is possible to compile your Python files before execution using `python -m compileall [file]`, and `.pyc` files can be run like normal Python files using `python [pyc_file]`.
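
The caching step can also be triggered from Python itself. The following is a minimal sketch using the standard library's `py_compile` module; the file name `hello.py` and the temporary directory are placeholders chosen for illustration, and the exact `.pyc` file name depends on the interpreter version.

```python
import importlib.util
import pathlib
import py_compile
import tempfile

# Write a throwaway module to a temporary directory (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    source = pathlib.Path(tmp) / "hello.py"
    source.write_text("print('hello world!')\n")

    # Compile ahead of time; py_compile.compile returns the path of the cache file.
    pyc_path = pathlib.Path(py_compile.compile(str(source)))

    # The default location is the one the import system would use for this source file.
    matches_default = str(pyc_path) == importlib.util.cache_from_source(str(source))
    pyc_suffix = pyc_path.suffix
    print(pyc_path.parent.name, pyc_path.name)
```

With default settings the printed directory name should be `__pycache__`, matching the folder described above.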

To compile a string of Python code to a code object we can use the built-in `compile` function. This gives us a code object with attributes such as `co_code`, which contains the bytecode as a bytes object, `co_consts`, which contains all constants, and `co_varnames`, which contains the names of the local variables used in the code. Notice that any functions, classes, or lambda expressions within your code are compiled to separate code objects stored within `co_consts`.

In [1]:
code = "print('hello world!')"
code = compile(code, 'test', 'exec')
code

<code object <module> at 0x7f5ce4343240, file "test", line 1>

In [2]:
code.co_code

b'e\x00d\x00\x83\x01\x01\x00d\x01S\x00'

In [3]:
code.co_consts

('hello world!', None)

In [4]:
eval(code)

hello world!
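
To illustrate the point about nested code objects, here is a small sketch that compiles a module containing one function and pulls the function's code object out of `co_consts`. The function `greet` and its variable names are made up for this example.

```python
import types

source = """
def greet(name):
    message = 'hello ' + name
    return message
"""
module_code = compile(source, "test", "exec")

# Function bodies are compiled separately and stored as constants
# of the enclosing code object.
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]
greet_code = nested[0]

print(greet_code.co_name)      # name of the nested code object
print(greet_code.co_varnames)  # arguments and local variables of greet
print(module_code.co_names)    # names referenced at module level
```

Note that module-level names such as `greet` show up in `co_names`, while `co_varnames` is populated for the function's own locals.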


## Disassembling Code
[`dis`](https://docs.python.org/3/library/dis.html) is a standard library module for disassembling CPython bytecode. The `dis.dis` function prints the disassembly of any code object passed to it.

In [5]:
import dis
dis.dis(code)

  1           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 ('hello world!')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE


This disassembly format may be hard to read at first, but the columns from left to right are the line number in the source code, the byte offset, the instruction name, and the argument (with its resolved value in parentheses). Python bytecode runs on a stack-based virtual machine: instructions push values onto a stack and pop them off again.
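
The same columns are available programmatically through `dis.get_instructions`, which can be handy when you want to filter or post-process a disassembly. This is a minimal sketch; the instruction names and offsets it prints vary between CPython versions, so they may not match the listing above exactly.

```python
import dis

code = compile("print('hello world!')", "test", "exec")

# Each Instruction carries the same fields that dis.dis prints as columns.
rows = [(ins.offset, ins.opname, ins.argrepr) for ins in dis.get_instructions(code)]
for offset, opname, argrepr in rows:
    print(f"{offset:>4} {opname:<20} {argrepr}")
```
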
### Common Bytecode Instructions
Here are some of the common instructions you'll see in Python bytecode.
#### `POP_TOP`
Bytecode runs on a stack, and this instruction removes the item at the top of the stack (usually referred to as top-of-stack or TOS).
#### `LOAD_NAME`
Pushes the value associated with the name onto the stack. Takes a single name argument.
#### `LOAD_CONST`
Pushes the constant argument onto the stack.
#### `LOAD_GLOBAL`
Pushes the value associated with the global name argument onto the stack. Works similarly to `LOAD_NAME`.
#### `STORE_NAME`
Stores the TOS item to the variable name given in the argument. Removes the TOS from the stack.
#### `STORE_FAST`
Stores the TOS item to the local variable given in the argument. Removes the TOS from the stack.
#### `CALL_FUNCTION`
Calls a callable/function item with positional arguments. The argument for this instruction is the number of positional arguments to pass to the function, taken from the top of the stack. The callable item sits below the arguments. This instruction pops all the positional arguments and the callable item off the stack and pushes the return value of the function.
#### `LOAD_METHOD`
Loads the method whose name matches the argument from the TOS item. TOS is popped. If the object has a method with that name, the unbound method and the object are pushed onto the stack. Otherwise, `NULL` and the result of the attribute lookup are pushed.
#### `CALL_METHOD`
Calls a method with positional arguments. The argument for this instruction is the number of positional arguments to pass, taken from the top of the stack. Below the arguments are the two items pushed by `LOAD_METHOD`: the callable method and the object it belongs to. The arguments, method, and object are all popped and the return value is pushed onto the stack. This instruction is designed to be used with the `LOAD_METHOD` instruction.
#### `RETURN_VALUE`
Returns the TOS item to the caller of the function.
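
To make the stack behaviour of these instructions concrete, here is a toy interpreter for a hand-written instruction list. It is only a sketch of the idea, not how CPython is implemented: the `run` function and the `program` list are invented for this example, and arguments are given as plain values rather than as indices into `co_consts` and `co_names` as in real bytecode.

```python
def run(instructions, names):
    """Execute a tiny, hand-written subset of the instructions described above."""
    stack = []
    for opname, arg in instructions:
        if opname == "LOAD_CONST":
            stack.append(arg)                 # push the constant itself
        elif opname == "LOAD_NAME":
            stack.append(names[arg])          # push the value bound to the name
        elif opname == "STORE_NAME":
            names[arg] = stack.pop()          # pop TOS and bind it to the name
        elif opname == "CALL_FUNCTION":
            # The arguments sit above the callable; restore their original order.
            args = [stack.pop() for _ in range(arg)][::-1]
            func = stack.pop()
            stack.append(func(*args))         # push the return value
        elif opname == "POP_TOP":
            stack.pop()                       # discard TOS
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unsupported instruction: {opname}")


# Hand-written program, roughly equivalent to: result = len('hello'); return result
program = [
    ("LOAD_NAME", "len"),
    ("LOAD_CONST", "hello"),
    ("CALL_FUNCTION", 1),
    ("STORE_NAME", "result"),
    ("LOAD_NAME", "result"),
    ("RETURN_VALUE", None),
]
names = {"len": len}
returned = run(program, names)
print(returned, names["result"])
```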

### Math Instructions
#### Binary Operations

In [6]:
code = """
a = 2
b = 3
sum = a + b
difference = a - b
product = a * b
quotient = a / b
floored = a // b
power = a ** b
"""
code = compile(code, 'test', 'exec')
dis.dis(code)

  2           0 LOAD_CONST               0 (2)
              2 STORE_NAME               0 (a)

  3           4 LOAD_CONST               1 (3)
              6 STORE_NAME               1 (b)

  4           8 LOAD_NAME                0 (a)
             10 LOAD_NAME                1 (b)
             12 BINARY_ADD
             14 STORE_NAME               2 (sum)

  5          16 LOAD_NAME                0 (a)
             18 LOAD_NAME                1 (b)
             20 BINARY_SUBTRACT
             22 STORE_NAME               3 (difference)

  6          24 LOAD_NAME                0 (a)
             26 LOAD_NAME                1 (b)
             28 BINARY_MULTIPLY
             30 STORE_NAME               4 (product)

  7          32 LOAD_NAME                0 (a)
             34 LOAD_NAME                1 (b)
             36 BINARY_TRUE_DIVIDE
             38 STORE_NAME               5 (quotient)

  8          40 LOAD_NAME                0 (a)
             42 LOAD_NAME                1 (b)

Binary operation instructions pop the TOS and the second item from the top of the stack (TOS1), perform the operation, and push the result back onto the stack. The left operand is pushed first, so it ends up in TOS1 and the right operand ends up in TOS.

`BINARY_ADD`: `TOS = TOS1 + TOS`

`BINARY_SUBTRACT`: `TOS = TOS1 - TOS`

`BINARY_MULTIPLY`: `TOS = TOS1 * TOS`

`BINARY_TRUE_DIVIDE`: `TOS = TOS1 / TOS`

`BINARY_FLOOR_DIVIDE`: `TOS = TOS1 // TOS`

`BINARY_POWER`: `TOS = TOS1 ** TOS`

`BINARY_MODULO`: `TOS = TOS1 % TOS`

`BINARY_XOR`: `TOS = TOS1 ^ TOS`

`BINARY_AND`: `TOS = TOS1 & TOS`

`BINARY_OR`: `TOS = TOS1 | TOS`
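
Operand order matters for non-commutative operations such as subtraction. Because the left operand is pushed first, it is the lower of the two stack items when the instruction runs. The sketch below mimics that with a plain list as the stack; `binary_op` is a helper written for this example, not part of any library.

```python
import operator

def binary_op(stack, op):
    """Pop TOS and TOS1, apply op(TOS1, TOS), and push the result."""
    tos = stack.pop()    # right-hand operand, pushed last
    tos1 = stack.pop()   # left-hand operand, pushed first
    stack.append(op(tos1, tos))

# Mirrors `a - b` with a = 2 and b = 3: a is pushed first, then b.
stack = [2, 3]
binary_op(stack, operator.sub)
print(stack)
```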

#### Inplace Operations

In [7]:
code = """
a = 13
b = 15
a += b
b -= b
a *= b
a /= b
"""
code = compile(code, 'test', 'exec')
dis.dis(code)

  2           0 LOAD_CONST               0 (13)
              2 STORE_NAME               0 (a)

  3           4 LOAD_CONST               1 (15)
              6 STORE_NAME               1 (b)

  4           8 LOAD_NAME                0 (a)
             10 LOAD_NAME                1 (b)
             12 INPLACE_ADD
             14 STORE_NAME               0 (a)

  5          16 LOAD_NAME                1 (b)
             18 LOAD_NAME                1 (b)
             20 INPLACE_SUBTRACT
             22 STORE_NAME               1 (b)

  6          24 LOAD_NAME                0 (a)
             26 LOAD_NAME                1 (b)
             28 INPLACE_MULTIPLY
             30 STORE_NAME               0 (a)

  7          32 LOAD_NAME                0 (a)
             34 LOAD_NAME                1 (b)
             36 INPLACE_TRUE_DIVIDE
             38 STORE_NAME               0 (a)
             40 LOAD_CONST               2 (None)
             42 RETURN_VALUE


In-place operation instructions are emitted for augmented assignment operators such as `+=` in Python code. Like binary operation instructions, they pop TOS and TOS1, perform the operation with TOS1 as the left operand, and push the result onto the stack; the `STORE_NAME` that follows binds the result back to the variable.

`INPLACE_ADD`: `TOS1 += TOS`

`INPLACE_SUBTRACT`: `TOS1 -= TOS`

`INPLACE_MULTIPLY`: `TOS1 *= TOS`

`INPLACE_POWER`: `TOS1 **= TOS`

`INPLACE_TRUE_DIVIDE`: `TOS1 /= TOS`

`INPLACE_FLOOR_DIVIDE`: `TOS1 //= TOS`

`INPLACE_MODULO`: `TOS1 %= TOS`

`INPLACE_XOR`: `TOS1 ^= TOS`

`INPLACE_AND`: `TOS1 &= TOS`

`INPLACE_OR`: `TOS1 |= TOS`
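
One reason the in-place variants are kept separate from the binary ones is that mutable objects can update themselves instead of creating a new object. The following sketch shows the visible difference at the Python level using lists; the variable names are arbitrary.

```python
# In-place addition on a mutable object modifies it, so aliases see the change.
a = [1, 2]
alias = a
a += [3]          # in-place add: list.__iadd__ extends the existing list
same_object = a is alias

# Plain addition builds a new object and rebinds the name instead.
b = [1, 2]
other = b
b = b + [3]       # binary add: other still refers to the old list
rebound = b is not other

print(alias, other)
```

For immutable types such as `int`, both forms produce a new object, so the difference only shows up in the bytecode.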

#### Unary Operation Instructions
These operation instructions take the TOS, apply an operation, and push the result back onto the stack.

`UNARY_POSITIVE`: `TOS = +TOS`

`UNARY_NEGATIVE`: `TOS = -TOS`

`UNARY_NOT`: `TOS = not TOS`

`UNARY_INVERT`: `TOS = ~TOS`

### Comparison Instructions

In [8]:
code = """
1 == 1
1 != 0
1 <= 0
1 in [1,2,3]
a = 'a'
b = a
b is a
"""
code = compile(code, 'test', 'exec')
dis.dis(code)

  2           0 LOAD_CONST               0 (1)
              2 LOAD_CONST               0 (1)
              4 COMPARE_OP               2 (==)
              6 POP_TOP

  3           8 LOAD_CONST               0 (1)
             10 LOAD_CONST               1 (0)
             12 COMPARE_OP               3 (!=)
             14 POP_TOP

  4          16 LOAD_CONST               0 (1)
             18 LOAD_CONST               1 (0)
             20 COMPARE_OP               1 (<=)
             22 POP_TOP

  5          24 LOAD_CONST               0 (1)
             26 LOAD_CONST               2 ((1, 2, 3))
             28 CONTAINS_OP              0
             30 POP_TOP

  6          32 LOAD_CONST               3 ('a')
             34 STORE_NAME               0 (a)

  7          36 LOAD_NAME                0 (a)
             38 STORE_NAME               1 (b)

  8          40 LOAD_NAME                1 (b)
             42 LOAD_NAME                0 (a)
             44 IS_OP                    0


#### `COMPARE_OP`
Takes a comparison operator as an argument, pops TOS and TOS1, performs the comparison with TOS1 as the left operand, and pushes the `True` or `False` result onto the stack.
#### `IS_OP`
Takes a single invert argument. Pops TOS and TOS1 and performs an `is` comparison on them; if the invert argument is 1, it performs an `is not` comparison instead. Pushes the result onto the stack.
#### `CONTAINS_OP`
Takes a single invert argument. Pops TOS and TOS1 and performs an `in` comparison (`TOS1 in TOS`); if the invert argument is 1, it performs a `not in` comparison instead. Pushes the result onto the stack.
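
The invert argument can be observed by disassembling the negated forms of these operators. This is a small sketch built on `dis.get_instructions`; `ops` is a helper defined here for illustration, and on CPython versions older than 3.9 the output will show `COMPARE_OP` instead of `IS_OP` and `CONTAINS_OP`.

```python
import dis

def ops(expression):
    """Return (opname, arg) pairs for a compiled expression."""
    code = compile(expression, "test", "eval")
    return [(ins.opname, ins.arg) for ins in dis.get_instructions(code)]

for expression in ("a is b", "a is not b", "a in b", "a not in b"):
    # Keep only the comparison-style instructions for a compact printout.
    comparisons = [(name, arg) for name, arg in ops(expression) if name.endswith("_OP")]
    print(f"{expression:<12} {comparisons}")
```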

### Conditionals
Conditionals and loops in Python bytecode are implemented using jump instructions.

In [9]:
code = """
num = 5
if num % 2 == 0:
    print('even')
else:
    print('odd')
"""
code = compile(code, 'test', 'exec')
dis.dis(code)

  2           0 LOAD_CONST               0 (5)
              2 STORE_NAME               0 (num)

  3           4 LOAD_NAME                0 (num)
              6 LOAD_CONST               1 (2)
              8 BINARY_MODULO
             10 LOAD_CONST               2 (0)
             12 COMPARE_OP               2 (==)
             14 POP_JUMP_IF_FALSE       26

  4          16 LOAD_NAME                1 (print)
             18 LOAD_CONST               3 ('even')
             20 CALL_FUNCTION            1
             22 POP_TOP
             24 JUMP_FORWARD             8 (to 34)

  6     >>   26 LOAD_NAME                1 (print)
             28 LOAD_CONST               4 ('odd')
             30 CALL_FUNCTION            1
             32 POP_TOP
        >>   34 LOAD_CONST               5 (None)
             36 RETURN_VALUE


`if`, `elif`, and `else` statements in Python bytecode are implemented with the `POP_JUMP_IF_FALSE` and `POP_JUMP_IF_TRUE` instructions. A conditional block ends with a `JUMP_FORWARD` instruction that skips over the remaining branches to the first instruction after the conditional. Jump targets are easy to find in the disassembler's output as they are marked by a `>>` before the bytecode offset.
#### `POP_JUMP_IF_FALSE`
If TOS is false, jumps to the bytecode instruction at the target argument. TOS is popped either way.
#### `POP_JUMP_IF_TRUE`
If TOS is true, jumps to the bytecode instruction at the target argument. TOS is popped either way.
#### `JUMP_FORWARD`
Increments the bytecode counter by the delta argument.
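
Jump instructions and their targets can also be collected programmatically, which is useful when tracing control flow in a longer disassembly. The sketch below reuses the even/odd example; matching on `"JUMP"` in the instruction name is a shortcut chosen for this example, and the exact jump instructions emitted depend on the CPython version.

```python
import dis

source = """
num = 5
if num % 2 == 0:
    print('even')
else:
    print('odd')
"""
code = compile(source, "test", "exec")
instructions = list(dis.get_instructions(code))

# Offsets that some jump instruction points at (the lines dis.dis marks with >>).
targets = [ins.offset for ins in instructions if ins.is_jump_target]

# Jump instructions, with argval resolved to the destination offset.
jumps = [(ins.offset, ins.opname, ins.argval)
         for ins in instructions if "JUMP" in ins.opname]

print(targets)
print(jumps)
```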