#### Ensuring environment

In [4]:
!python --version
!which python

Python 3.7.17
/home/chairs/workspace/personal-projects/lisp-compiler/quickLi/venv/bin/python


#### Exploring dis module

In [4]:
import sys, struct
import dis 
import opcode
import types

**Notes**

The built-in `compile` takes a string of Python source code and returns a code object, which holds the stack machine instructions (bytecode) together with the constants and names those instructions refer to.
`compile(source, filename, mode, flags, dont_inherit, optimize)`\
`source` - string, bytes object, or AST object; the source code to compile\
`filename` - string; the file the source was read from, if applicable\
`mode` - the kind of code being compiled: `exec` for a sequence of statements, `eval` for a single expression, `single` for a single interactive statement
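
As a small illustration of how `mode` changes what `compile` accepts, the sketch below compiles the same arithmetic once as an expression and once as a statement. The names `expr_code`, `stmt_code`, and `namespace` are made up for this example.

```python
# 'eval' mode: the source must be a single expression; eval() returns its value
expr_code = compile('2 + 3', '<string>', 'eval')
result = eval(expr_code)

# 'exec' mode: the source is a sequence of statements, run for their side effects
stmt_code = compile('y = 2 + 3', '<string>', 'exec')
namespace = {}
exec(stmt_code, namespace)

print(type(expr_code).__name__, result, namespace['y'])  # code 5 5
```

Passing a statement such as `y = 2 + 3` with `mode='eval'` raises a `SyntaxError`, which is why the notebook cell that follows uses `'exec'`.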

In [24]:
source = '''
x = 5
x += 26042424242
print(x)
'''
c = compile(source, '', 'exec')
print(type(c))
exec(c)  # run the code object; exec is the usual call for 'exec'-mode code

<class 'code'>
26042424247


In [27]:
c.co_code

b'd\x00Z\x00e\x00d\x017\x00Z\x00e\x01e\x00\x83\x01\x01\x00d\x02S\x00'

The result is a bytes literal, marked by the `b'` prefix. Its type is `bytes`: an immutable sequence of integers, each between 0 and 255. When displayed, a byte appears either as the ASCII character whose character code equals the byte value, or as `\x` followed by two hex digits giving that value. For example, `d` is the byte 100 and `\x00` is the byte 0.
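
To make the display rules concrete, here is a minimal sketch using the first four bytes of the output above. The variable names are illustrative only.

```python
raw = b'd\x00Z\x00'      # first two instructions of c.co_code shown above

values = list(raw)       # indexing or iterating bytes yields plain integers
print(values)            # [100, 0, 90, 0]
print(chr(values[0]), chr(values[2]))  # d Z

# bytes objects are immutable: item assignment raises TypeError
try:
    raw[0] = 1
    is_immutable = False
except TypeError:
    is_immutable = True
print(is_immutable)      # True
```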

In [28]:
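code_bytes = list(c.co_code)  # renamed so the built-in ascii() is not shadowed

# Get opcode mnemonics using the dis module.
# Since Python 3.6 every instruction is two bytes: opcode, then oparg.
for i in range(0, len(code_bytes), 2):
    print(dis.opname[code_bytes[i]], code_bytes[i + 1])
# simpler alternative: let dis do the decoding
# dis.dis(source)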

LOAD_CONST 0
STORE_NAME 0
LOAD_NAME 0
LOAD_CONST 1
INPLACE_ADD 0
STORE_NAME 0
LOAD_NAME 1
LOAD_NAME 0
CALL_FUNCTION 1
POP_TOP 0
LOAD_CONST 2
RETURN_VALUE 0


An oparg occupies a single byte, so on its own it cannot exceed 255. Larger values are built with the `EXTENDED_ARG` prefix instruction. When the interpreter executes `EXTENDED_ARG`, its oparg (1 in the example here) is left-shifted by eight bits and stored in a temporary variable. Call it `extended_arg` (not to be confused with the opname `EXTENDED_ARG`). The binary value `0b1` becomes `0b100000000`, which is the same as multiplying 1 by 256, so `extended_arg` equals 256 and now spans two bytes. When the interpreter reaches the next instruction, this two-byte value is combined with that instruction's oparg (4 here) using a bitwise OR, giving 260. As a result, the pair\
`EXTENDED_ARG 1`\
`CALL_FUNCTION 4`\
is interpreted as:

`EXTENDED_ARG 1`\
`CALL_FUNCTION 260`

In [30]:
# sometimes we have opargs that do not fit in the default one byte
extended_arg = 1 << 8 # same as 1 * 256
extended_arg = extended_arg | 4
# extended_arg = 256 + 4 = 260
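
The same shift-and-OR step can be folded into a small decoder. This is only a sketch, not how CPython itself is written: `decode` is a hypothetical helper, and the opcode numbers are assumed from CPython 3.7 (131 is the `\x83` seen in `c.co_code` above; check `dis.opmap` on other versions). It drops the `EXTENDED_ARG` prefix and reports only the real instruction with its full oparg.

```python
EXTENDED_ARG = 144   # assumed CPython 3.7 opcode number
CALL_FUNCTION = 131  # assumed CPython 3.7 opcode number (0x83)

def decode(code_bytes):
    """Yield (opcode, full_oparg) from 2-byte instructions, folding EXTENDED_ARG prefixes."""
    extended_arg = 0
    for i in range(0, len(code_bytes), 2):
        op, arg = code_bytes[i], code_bytes[i + 1]
        full_arg = extended_arg | arg
        if op == EXTENDED_ARG:
            # carry the accumulated value into the next instruction
            extended_arg = full_arg << 8
        else:
            extended_arg = 0
            yield op, full_arg

raw = bytes([EXTENDED_ARG, 1, CALL_FUNCTION, 4])
decoded = list(decode(raw))
print(decoded)  # [(131, 260)]
```

Because the accumulated value is shifted again on every prefix, several `EXTENDED_ARG` instructions in a row extend the oparg by one more byte each.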