<center>
    <img src="https://upload.wikimedia.org/wikipedia/commons/a/a8/%D0%9B%D0%9E%D0%93%D0%9E_%D0%A8%D0%90%D0%94.png" width=500px/>
    <font>Python 2025</font><br/>
    <br/>
    <br/>
    <b style="font-size: 2em">CPython Bytecode</b><br/>
    <br/>
    <font>Алексей Стыценко</font><br/>
</center>

First, a bit of news

# Python 3.14 has been released 🥳

https://docs.python.org/3/whatsnew/3.14.html

The course continues on 3.13.7, with no changes

# How the interpreter works

Let's take a simple program

In [7]:
program = """\
x = 1
y = 2
print(3*(x+y))
"""

In [8]:
exec(program)

9


How does the interpreter execute it?

Step 1: Tokenization

In [10]:
import io
import tokenize

list(tokenize.generate_tokens(io.StringIO(program).readline))

[TokenInfo(type=1 (NAME), string='x', start=(1, 0), end=(1, 1), line='x = 1\n'),
 TokenInfo(type=55 (OP), string='=', start=(1, 2), end=(1, 3), line='x = 1\n'),
 TokenInfo(type=2 (NUMBER), string='1', start=(1, 4), end=(1, 5), line='x = 1\n'),
 TokenInfo(type=4 (NEWLINE), string='\n', start=(1, 5), end=(1, 6), line='x = 1\n'),
 TokenInfo(type=1 (NAME), string='y', start=(2, 0), end=(2, 1), line='y = 2\n'),
 TokenInfo(type=55 (OP), string='=', start=(2, 2), end=(2, 3), line='y = 2\n'),
 TokenInfo(type=2 (NUMBER), string='2', start=(2, 4), end=(2, 5), line='y = 2\n'),
 TokenInfo(type=4 (NEWLINE), string='\n', start=(2, 5), end=(2, 6), line='y = 2\n'),
 TokenInfo(type=1 (NAME), string='print', start=(3, 0), end=(3, 5), line='print(3*(x+y))\n'),
 TokenInfo(type=55 (OP), string='(', start=(3, 5), end=(3, 6), line='print(3*(x+y))\n'),
 TokenInfo(type=2 (NUMBER), string='3', start=(3, 6), end=(3, 7), line='print(3*(x+y))\n'),
 TokenInfo(type=55 (OP), string='*', start=(3, 7), end=(3, 8), line
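
The raw `TokenInfo` list is dense, so a more compact view can help. The sketch below prints only the exact token type and the token text for the same program. The helper name `token_names` is hypothetical and used only for illustration.

```python
import io
import tokenize

program = """\
x = 1
y = 2
print(3*(x+y))
"""


def token_names(source):
    """Return (exact token name, token text) pairs for the given source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    # tok.type is generic (every operator is OP);
    # tok.exact_type distinguishes EQUAL, LPAR, STAR, PLUS and so on.
    return [(tokenize.tok_name[tok.exact_type], tok.string) for tok in tokens]


for name, text in token_names(program):
    print(f"{name:<10} {text!r}")
```

The generic `type` field lumps all operators together as `OP`, while `exact_type` tells them apart, which is closer to what the grammar rules match on.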

Step 2: Building the parse tree

Language grammar: https://docs.python.org/3/reference/grammar.html

<div class="alert alert-warning">
In fact, since version 3.9 no intermediate parse tree is built: the parser produces the AST directly.

Up to and including version 3.9, the standard library had a `parser` module that could build a parse tree. It was deprecated and then removed in version 3.10. The next slide shows a parse tree built with version 3.9.
</div>

Step 2: Building the parse tree (Python 3.9)

In [18]:
import symbol
import token
import parser

# helper that prints the parse tree in a readable form
# https://realpython.com/cpython-source-code-guide/#lexing-and-parsing
def lex(expression):
    symbols = {v: k for k, v in symbol.__dict__.items() if isinstance(v, int)}
    tokens = {v: k for k, v in token.__dict__.items() if isinstance(v, int)}
    lexicon = {**symbols, **tokens}
    st = parser.expr(expression)
    st_list = parser.st2list(st)

    def replace(l: list):
        r = []
        for i in l:
            if isinstance(i, list):
                r.append(replace(i))
            else:
                if i in lexicon:
                    r.append(lexicon[i])
                else:
                    r.append(i)
        return r

    return replace(st_list)

Step 2: Building the parse tree (Python 3.9)

In [19]:
lex('3*(x+y)')

['eval_input',
 ['testlist',
  ['test',
   ['or_test',
    ['and_test',
     ['not_test',
      ['comparison',
       ['expr',
        ['xor_expr',
         ['and_expr',
          ['shift_expr',
           ['arith_expr',
            ['term',
             ['factor', ['power', ['atom_expr', ['atom', ['NUMBER', '3']]]]],
             ['STAR', '*'],
             ['factor',
              ['power',
               ['atom_expr',
                ['atom',
                 ['LPAR', '('],
                 ['testlist_comp',
                  ['namedexpr_test',
                   ['test',
                    ['or_test',
                     ['and_test',
                      ['not_test',
                       ['comparison',
                        ['expr',
                         ['xor_expr',
                          ['and_expr',
                           ['shift_expr',
                            ['arith_expr',
                             ['term',
                              ['factor',
     

Step 3: Abstract syntax tree (AST)

In [11]:
import ast
print(ast.dump(ast.parse(program), indent=4))

Module(
    body=[
        Assign(
            targets=[
                Name(id='x', ctx=Store())],
            value=Constant(value=1)),
        Assign(
            targets=[
                Name(id='y', ctx=Store())],
            value=Constant(value=2)),
        Expr(
            value=Call(
                func=Name(id='print', ctx=Load()),
                args=[
                    BinOp(
                        left=Constant(value=3),
                        op=Mult(),
                        right=BinOp(
                            left=Name(id='x', ctx=Load()),
                            op=Add(),
                            right=Name(id='y', ctx=Load())))]))])
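
To make the `ctx=Store()` and `ctx=Load()` markers in the dump more concrete, here is a minimal sketch that walks the same tree with `ast.NodeVisitor` and sorts the names by context. The class name `NameCollector` is hypothetical and not part of the original material.

```python
import ast

program = """\
x = 1
y = 2
print(3*(x+y))
"""


class NameCollector(ast.NodeVisitor):
    """Collect identifiers, split by whether they are written or read."""

    def __init__(self):
        self.stored = []
        self.loaded = []

    def visit_Name(self, node):
        # Name nodes carry a context: Store() for assignment targets,
        # Load() for reads (Del() does not occur in this program).
        if isinstance(node.ctx, ast.Store):
            self.stored.append(node.id)
        else:
            self.loaded.append(node.id)
        self.generic_visit(node)


collector = NameCollector()
collector.visit(ast.parse(program))
print(collector.stored)  # ['x', 'y']
print(collector.loaded)  # ['print', 'x', 'y']
```

The compiler relies on the same distinction when it later chooses between store and load instructions for a name.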


Step 4: Compiling to bytecode

In [26]:
code = compile(program, '<string>', 'exec')

In [27]:
code

<code object <module> at 0x107ca08a0, file "<string>", line 1>
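
Besides the raw bytes, a code object carries the tables that the instructions index into. The sketch below prints a few of these attributes for the same program. The exact contents of `co_consts` depend on the interpreter version, so no expected value is shown for it.

```python
program = """\
x = 1
y = 2
print(3*(x+y))
"""
code = compile(program, "<string>", "exec")

print(code.co_names)        # ('x', 'y', 'print'): names referenced by index
print(code.co_consts)       # constants referenced by index (version-dependent)
print(code.co_filename)     # <string>
print(code.co_firstlineno)  # 1
```

The numeric arguments of instructions such as `STORE_NAME` and `LOAD_CONST` are positions in `co_names` and `co_consts` respectively.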

In [32]:
code.co_code

b'\x95\x00S\x00r\x00S\x01r\x01\\\x02"\x00S\x02\\\x00\\\x01-\x00\x00\x00-\x05\x00\x005\x01\x00\x00\x00\x00\x00\x00 \x00g\x03'

In [33]:
len(code.co_code)

40

In [34]:
print(list(code.co_code))

[149, 0, 83, 0, 114, 0, 83, 1, 114, 1, 92, 2, 34, 0, 83, 2, 92, 0, 92, 1, 45, 0, 0, 0, 45, 5, 0, 0, 53, 1, 0, 0, 0, 0, 0, 0, 32, 0, 103, 3]


In [35]:
import dis
dis.opname[83], dis.opname[114], dis.opname[45], dis.opname[53], dis.opname[0]

('LOAD_CONST', 'STORE_NAME', 'BINARY_OP', 'CALL', 'CACHE')
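
The lookup above can be applied to the whole byte string. The following is a simplified sketch that assumes each instruction occupies two bytes (opcode, then argument), which holds for modern CPython, and that ignores `EXTENDED_ARG`. The helper name `decode` is hypothetical, and the opcode numbers and names it prints differ between Python versions.

```python
import dis

program = """\
x = 1
y = 2
print(3*(x+y))
"""
code = compile(program, "<string>", "exec")


def decode(raw):
    """Split raw bytecode into (opname, arg) pairs, two bytes per instruction."""
    return [(dis.opname[raw[i]], raw[i + 1]) for i in range(0, len(raw), 2)]


for name, arg in decode(code.co_code):
    print(f"{name:<14} {arg}")
```

On versions with inline caches, the zero bytes that follow `BINARY_OP` and `CALL` show up here as separate `CACHE` entries, which `dis.dis` hides by default.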

Bytecode

In [37]:
dis.dis(program)

  0           RESUME                   0

  1           LOAD_CONST               0 (1)
              STORE_NAME               0 (x)

  2           LOAD_CONST               1 (2)
              STORE_NAME               1 (y)

  3           LOAD_NAME                2 (print)
              PUSH_NULL
              LOAD_CONST               2 (3)
              LOAD_NAME                0 (x)
              LOAD_NAME                1 (y)
              BINARY_OP                0 (+)
              BINARY_OP                5 (*)
              CALL                     1
              POP_TOP
              RETURN_CONST             3 (None)


How to read dis output: https://stackoverflow.com/a/47529318

What the instructions do: https://docs.python.org/3.13/library/dis.html#python-bytecode-instructions
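
To see how the listing for the first program drives a value stack, here is a toy interpreter for exactly the instructions shown in that listing (Python 3.13 semantics). It is a deliberately simplified sketch, not how CPython is implemented: the names `run`, `INSTRUCTIONS`, `CONSTS`, `NAMES` and `BINARY_OPS` are hypothetical, the instruction list is copied by hand, and only the opcodes from that listing are handled.

```python
import operator

# (opname, arg) pairs copied by hand from the dis listing of the first program.
INSTRUCTIONS = [
    ("LOAD_CONST", 0), ("STORE_NAME", 0),
    ("LOAD_CONST", 1), ("STORE_NAME", 1),
    ("LOAD_NAME", 2), ("PUSH_NULL", None),
    ("LOAD_CONST", 2), ("LOAD_NAME", 0), ("LOAD_NAME", 1),
    ("BINARY_OP", 0), ("BINARY_OP", 5),
    ("CALL", 1), ("POP_TOP", None),
    ("RETURN_CONST", 3),
]
CONSTS = (1, 2, 3, None)     # plays the role of co_consts
NAMES = ("x", "y", "print")  # plays the role of co_names
BINARY_OPS = {0: operator.add, 5: operator.mul}  # only the two ops used here


def run(instructions, consts, names, namespace):
    """Execute the toy instruction list against a single value stack."""
    stack = []
    for opname, arg in instructions:
        if opname == "LOAD_CONST":
            stack.append(consts[arg])
        elif opname == "STORE_NAME":
            namespace[names[arg]] = stack.pop()
        elif opname == "LOAD_NAME":
            stack.append(namespace[names[arg]])
        elif opname == "PUSH_NULL":
            stack.append(None)  # placeholder slot that CALL expects
        elif opname == "BINARY_OP":
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[arg](left, right))
        elif opname == "CALL":
            args = [stack.pop() for _ in range(arg)][::-1]
            stack.pop()         # discard the placeholder
            func = stack.pop()
            stack.append(func(*args))
        elif opname == "POP_TOP":
            stack.pop()
        elif opname == "RETURN_CONST":
            return consts[arg]
        else:
            raise ValueError(f"unsupported instruction: {opname}")


# The real LOAD_NAME falls back to builtins; here print is supplied explicitly.
namespace = {"print": print}
run(INSTRUCTIONS, CONSTS, NAMES, namespace)  # prints 9
```

Each instruction either pushes values, pops them, or both. Reading the listing top to bottom while tracking the stack is usually enough to follow what a piece of bytecode computes.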

Bytecode

In [40]:
program = '''\
def foo(x, y):
    return 3*(x + y)
foo(1, 2)
'''
dis.dis(program)

  0           RESUME                   0

  1           LOAD_CONST               0 (<code object foo at 0x107c13830, file "<dis>", line 1>)
              MAKE_FUNCTION
              STORE_NAME               0 (foo)

  3           LOAD_NAME                0 (foo)
              PUSH_NULL
              LOAD_CONST               1 (1)
              LOAD_CONST               2 (2)
              CALL                     2
              POP_TOP
              RETURN_CONST             3 (None)

Disassembly of <code object foo at 0x107c13830, file "<dis>", line 1>:
  1           RESUME                   0

  2           LOAD_CONST               1 (3)
              LOAD_FAST_LOAD_FAST      1 (x, y)
              BINARY_OP                0 (+)
              BINARY_OP                5 (*)
              RETURN_VALUE
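
The listing shows that the body of `foo` is a separate code object, loaded as a constant of the module. Here is a small sketch that pulls it out of `co_consts` and wraps it in a function by hand, which is roughly what `MAKE_FUNCTION` does at run time. The variable names are illustrative only.

```python
import types

program = '''\
def foo(x, y):
    return 3*(x + y)
foo(1, 2)
'''
module_code = compile(program, "<string>", "exec")

# The function body is compiled ahead of time and stored among the module constants.
foo_code = next(c for c in module_code.co_consts if isinstance(c, types.CodeType))

print(foo_code.co_name)      # foo
print(foo_code.co_varnames)  # ('x', 'y')
print(foo_code.co_argcount)  # 2

# Build a function object from the code object and an empty globals dict.
foo = types.FunctionType(foo_code, {})
print(foo(1, 2))             # 9
```

Because `x` and `y` are local variables stored in `co_varnames`, the function body uses fast local loads rather than `LOAD_NAME`.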


Bytecode

In [42]:
list(dis.get_instructions(program))

[Instruction(opname='RESUME', opcode=149, arg=0, argval=0, argrepr='', offset=0, start_offset=0, starts_line=True, line_number=0, label=None, positions=Positions(lineno=0, end_lineno=1, col_offset=0, end_col_offset=0), cache_info=None),
 Instruction(opname='LOAD_CONST', opcode=83, arg=0, argval=<code object foo at 0x107c13910, file "<disassembly>", line 1>, argrepr='<code object foo at 0x107c13910, file "<disassembly>", line 1>', offset=2, start_offset=2, starts_line=True, line_number=1, label=None, positions=Positions(lineno=1, end_lineno=2, col_offset=0, end_col_offset=20), cache_info=None),
 Instruction(opname='MAKE_FUNCTION', opcode=26, arg=None, argval=None, argrepr='', offset=4, start_offset=4, starts_line=False, line_number=1, label=None, positions=Positions(lineno=1, end_lineno=2, col_offset=0, end_col_offset=20), cache_info=None),
 Instruction(opname='STORE_NAME', opcode=114, arg=0, argval='foo', argrepr='foo', offset=6, start_offset=6, starts_line=False, line_number=1, label=
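
The full `Instruction` records are verbose, so in practice it is often enough to pick out a few fields. Here is a minimal sketch using only `offset`, `opname` and `argrepr`. The variable name `rows` is hypothetical, and the instruction names that come out depend on the Python version.

```python
import dis

program = '''\
def foo(x, y):
    return 3*(x + y)
foo(1, 2)
'''

# Keep only the fields needed for a compact table.
rows = [(instr.offset, instr.opname, instr.argrepr)
        for instr in dis.get_instructions(program)]

for offset, opname, argrepr in rows:
    print(f"{offset:>3} {opname:<16} {argrepr}")
```

Note that `dis.get_instructions` on a source string yields only the module-level instructions. The body of `foo` has to be disassembled from its own code object.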

References

- https://leanpub.com/insidethepythonvirtualmachine/read
- https://realpython.com/cpython-source-code-guide/