gh-102676: Add more convenience properties to dis.Instruction #103969

Merged
merged 13 commits into from
Jun 11, 2023
38 changes: 38 additions & 0 deletions Doc/library/dis.rst
@@ -43,12 +43,12 @@
adaptive bytecode can be shown by passing ``adaptive=True``.


Example: Given the function :func:`myfunc`::

   def myfunc(alist):
       return len(alist)

the following command can be used to display the disassembly of
:func:`myfunc`:

.. doctest::
@@ -342,10 +342,23 @@
human readable name for operation


.. data:: baseopcode

numeric code for the base operation if operation is specialized. Otherwise equal to :data:`opcode`


.. data:: baseopname

human readable name for the base operation if operation is specialized. Otherwise equal to :data:`opname`
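
A minimal sketch of how the base name relates to the displayed name. `base_name` is a hypothetical helper, not part of `dis`: it reads `baseopname` where the interpreter provides it and otherwise falls back to `opname`, which is the value the field has for unspecialized instructions.

```python
import dis


def base_name(instr):
    # Hypothetical helper: prefer the new field, fall back to opname on
    # interpreters that predate this change.
    return getattr(instr, "baseopname", instr.opname)


def myfunc(alist):
    return len(alist)


# Without adaptive=True the disassembly contains only unspecialized
# instructions, so the base name and the displayed name coincide.
names = [(instr.opname, base_name(instr)) for instr in dis.get_instructions(myfunc)]
```

The two names are only expected to differ when the disassembly is requested with `adaptive=True` and the function has been specialized at run time.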


.. data:: arg

numeric argument to operation (if any), otherwise ``None``

.. data:: oparg

alias for :data:`arg`

.. data:: argval

@@ -363,6 +376,22 @@
start index of operation within bytecode sequence


.. data:: start_offset

start index of operation within bytecode sequence including prefixed ``EXTENDED_ARG`` operations if present.
Otherwise equal to :data:`offset`
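
The relationship between `offset` and `start_offset` can be sketched without relying on the new field. The helper `start_offsets` below is hypothetical and only mirrors the counting done in `_unpack_opargs`; it assumes 2-byte code units and that `EXTENDED_ARG` prefixes sit directly in front of the instruction they extend.

```python
import dis


def start_offsets(code):
    """Map each instruction offset to the offset of its first EXTENDED_ARG prefix."""
    result = {}
    prefix_count = 0  # EXTENDED_ARG instructions seen since the last real instruction
    for instr in dis.get_instructions(code):
        if instr.opname == "EXTENDED_ARG":
            prefix_count += 1
            result[instr.offset] = instr.offset
        else:
            result[instr.offset] = instr.offset - 2 * prefix_count
            prefix_count = 0
    return result


# 300 distinct names force name indices above 255, which need EXTENDED_ARG.
source = "\n".join(f"v{i} = {i}" for i in range(300))
code = compile(source, "<example>", "exec")
offsets = start_offsets(code)
shifted = {off: start for off, start in offsets.items() if off != start}
```

Every entry in `shifted` is an instruction whose argument did not fit in one byte; for all other instructions the two offsets are equal.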


.. data:: cache_offset

start index of the cache entries following the operation


.. data:: end_offset

end index of the cache entries following the operation
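
As a rough illustration of the arithmetic behind these two fields, here is a standalone sketch. `cache_span` and its parameters are hypothetical names; the formula follows the properties added to `Lib/dis.py` in this PR and assumes 2-byte code units.

```python
def cache_span(offset, n_cache_entries):
    """Return (cache_offset, end_offset) for an instruction at `offset`."""
    cache_offset = offset + 2                        # caches start right after the instruction
    end_offset = cache_offset + 2 * n_cache_entries  # each cache entry is one code unit
    return cache_offset, end_offset
```

For an instruction without cache entries both values are equal and point at the next instruction.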


.. data:: starts_line

line started by this opcode (if any), otherwise ``None``
@@ -373,6 +402,11 @@
``True`` if other code jumps to here, otherwise ``False``


.. data:: jump_target

bytecode index of the jump target if this is a jump operation, otherwise ``None``
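
The target computation can be sketched as a pure function. The name `jump_target_offset` and its keyword parameters are assumptions made for illustration; the real helper in this PR, `_get_jump_target`, derives the same information from the opcode tables instead of taking flags.

```python
def jump_target_offset(offset, arg, *, relative, backward=False, n_cache_entries=0):
    """Compute a jump target in bytes, assuming 2-byte code units."""
    if not relative:
        # Absolute jumps store the target as a code-unit index.
        return arg * 2
    if backward:
        arg = -arg
    # Relative jumps are measured from the end of the instruction,
    # including any inline cache entries that follow it.
    return offset + 2 + arg * 2 + 2 * n_cache_entries
```

For example, a forward relative jump at offset 10 with argument 3 and no caches lands at offset 18.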


.. data:: positions

:class:`dis.Positions` object holding the
@@ -384,6 +418,10 @@

Field ``positions`` is added.

.. versionchanged:: 3.12

Fields ``start_offset``, ``cache_offset``, ``end_offset``, ``baseopname``, ``baseopcode``, ``jump_target`` and ``oparg`` are added.


.. class:: Positions

@@ -793,7 +831,7 @@

.. opcode:: LOAD_BUILD_CLASS

Pushes :func:`builtins.__build_class__` onto the stack. It is later called
to construct a class.


@@ -850,14 +888,14 @@

.. opcode:: STORE_NAME (namei)

Implements ``name = STACK.pop()``. *namei* is the index of *name* in the attribute
:attr:`co_names` of the code object. The compiler tries to use
:opcode:`STORE_FAST` or :opcode:`STORE_GLOBAL` if possible.


.. opcode:: DELETE_NAME (namei)

Implements ``del name``, where *namei* is the index into :attr:`co_names`
attribute of the code object.


@@ -896,7 +934,7 @@
value = STACK.pop()
obj.name = value

where *namei* is the index of name in :attr:`co_names`.

.. opcode:: DELETE_ATTR (namei)

@@ -905,7 +943,7 @@
obj = STACK.pop()
del obj.name

where *namei* is the index of name into :attr:`co_names`.


.. opcode:: STORE_GLOBAL (namei)
102 changes: 84 additions & 18 deletions Lib/dis.py
@@ -262,6 +262,7 @@ def show_code(co, *, file=None):
'argval',
'argrepr',
'offset',
'start_offset',
'starts_line',
'is_jump_target',
'positions'
@@ -275,6 +276,8 @@ def show_code(co, *, file=None):
_Instruction.argval.__doc__ = "Resolved arg value (if known), otherwise same as arg"
_Instruction.argrepr.__doc__ = "Human readable description of operation argument"
_Instruction.offset.__doc__ = "Start index of operation within bytecode sequence"
_Instruction.start_offset.__doc__ = "Start index of operation within bytecode sequence including extended args if present. " \
"Otherwise equal to Instruction.offset"
_Instruction.starts_line.__doc__ = "Line started by this opcode (if any), otherwise None"
_Instruction.is_jump_target.__doc__ = "True if other code jumps to here, otherwise False"
_Instruction.positions.__doc__ = "dis.Positions object holding the span of source code covered by this instruction"
@@ -285,6 +288,23 @@ def show_code(co, *, file=None):
_OPNAME_WIDTH = 20
_OPARG_WIDTH = 5

def _get_jump_target(op, arg, offset):
    """Gets the bytecode offset of the jump target if this is a jump instruction,
    otherwise returns None
    """
    deop = _deoptop(op)
    caches = _inline_cache_entries[deop]
    if deop in hasjrel:
        if _is_backward_jump(deop):
            arg = -arg
        target = offset + 2 + arg*2
        target += 2 * caches
    elif deop in hasjabs:
        target = arg*2
    else:
        target = None
    return target

class Instruction(_Instruction):
"""Details for a bytecode operation

@@ -295,12 +315,48 @@ class Instruction(_Instruction):
argval - resolved arg value (if known), otherwise same as arg
argrepr - human readable description of operation argument
offset - start index of operation within bytecode sequence
start_offset - start index of operation within bytecode sequence including extended args if present.
Otherwise equal to Instruction.offset
starts_line - line started by this opcode (if any), otherwise None
is_jump_target - True if other code jumps to here, otherwise False
positions - Optional dis.Positions object holding the span of source code
covered by this instruction
"""

    @property
    def oparg(self):
        """Alias for Instruction.arg"""
        return self.arg

    @property
    def baseopcode(self):
        """numeric code for the base operation if operation is specialized.
        Otherwise equal to Instruction.opcode
        """
        return _deoptop(self.opcode)

    @property
    def baseopname(self):
        """human readable name for the base operation if operation is specialized.
        Otherwise equal to Instruction.opname
        """
        return opname[self.baseopcode]

    @property
    def cache_offset(self):
        """start index of the cache entries following the operation"""
        return self.offset + 2

    @property
    def end_offset(self):
        """end index of the cache entries following the operation"""
        return self.cache_offset + _inline_cache_entries[self.opcode]*2

    @property
    def jump_target(self):
        """bytecode index of the jump target if this is a jump operation, otherwise None"""
        return _get_jump_target(self.opcode, self.arg, self.offset)

def _disassemble(self, lineno_width=3, mark_as_current=False, offset_width=4):
"""Format instruction details for inclusion in disassembly output

@@ -332,12 +388,23 @@ def _disassemble(self, lineno_width=3, mark_as_current=False, offset_width=4):
fields.append(self.opname.ljust(_OPNAME_WIDTH))
# Column: Opcode argument
if self.arg is not None:
fields.append(repr(self.arg).rjust(_OPARG_WIDTH))
arg = repr(self.arg)
# If opname is longer than _OPNAME_WIDTH, but the total length together with
# oparg is less than _OPNAME_WIDTH + _OPARG_WIDTH (with at least one space in between),
# we allow opname to overflow into the space reserved for oparg.
# This results in fewer misaligned opargs in the disassembly output
opname_excess = max(0, len(self.opname) - _OPNAME_WIDTH)
if opname_excess + len(arg) < _OPARG_WIDTH:
fields.append(arg.rjust(_OPARG_WIDTH - opname_excess))
else:
fields.append(arg.rjust(_OPARG_WIDTH))
# Column: Opcode argument details
if self.argrepr:
fields.append('(' + self.argrepr + ')')
return ' '.join(fields).rstrip()

def __str__(self):
return self._disassemble()

def get_instructions(x, *, first_line=None, show_caches=False, adaptive=False):
"""Iterator for the opcodes in methods, functions or code
@@ -451,7 +518,7 @@ def _get_instructions_bytes(code, varname_from_oparg=None,
for i in range(start, end):
labels.add(target)
starts_line = None
for offset, op, arg in _unpack_opargs(code):
for offset, start_offset, op, arg in _unpack_opargs(code):
if linestarts is not None:
starts_line = linestarts.get(offset, None)
if starts_line is not None:
@@ -516,7 +583,7 @@ def _get_instructions_bytes(code, varname_from_oparg=None,
argrepr = _intrinsic_2_descs[arg]
yield Instruction(_all_opname[op], op,
arg, argval, argrepr,
offset, starts_line, is_jump_target, positions)
offset, start_offset, starts_line, is_jump_target, positions)
caches = _inline_cache_entries[deop]
if not caches:
continue
@@ -536,7 +603,7 @@ def _get_instructions_bytes(code, varname_from_oparg=None,
else:
argrepr = ""
yield Instruction(
"CACHE", CACHE, 0, None, argrepr, offset, None, False,
"CACHE", CACHE, 0, None, argrepr, offset, offset, None, False,
Positions(*next(co_positions, ()))
)

@@ -622,6 +689,7 @@ def _disassemble_str(source, **kwargs):

def _unpack_opargs(code):
extended_arg = 0
extended_args_offset = 0 # Number of EXTENDED_ARG instructions preceding the current instruction
caches = 0
for i in range(0, len(code), 2):
# Skip inline CACHE entries:
@@ -642,7 +710,13 @@ def _unpack_opargs(code):
else:
arg = None
extended_arg = 0
yield (i, op, arg)
if deop == EXTENDED_ARG:
extended_args_offset += 1
yield (i, i, op, arg)
else:
start_offset = i - extended_args_offset*2
yield (i, start_offset, op, arg)
extended_args_offset = 0

def findlabels(code):
"""Detect all offsets in a byte code which are jump targets.
@@ -651,18 +725,10 @@ def findlabels(code):

"""
labels = []
for offset, op, arg in _unpack_opargs(code):
for offset, _, op, arg in _unpack_opargs(code):
if arg is not None:
deop = _deoptop(op)
caches = _inline_cache_entries[deop]
if deop in hasjrel:
if _is_backward_jump(deop):
arg = -arg
label = offset + 2 + arg*2
label += 2 * caches
elif deop in hasjabs:
label = arg*2
else:
label = _get_jump_target(op, arg, offset)
if label is None:
continue
if label not in labels:
labels.append(label)
@@ -691,7 +757,7 @@ def _find_imports(co):

consts = co.co_consts
names = co.co_names
opargs = [(op, arg) for _, op, arg in _unpack_opargs(co.co_code)
opargs = [(op, arg) for _, _, op, arg in _unpack_opargs(co.co_code)
if op != EXTENDED_ARG]
for i, (op, oparg) in enumerate(opargs):
if op == IMPORT_NAME and i >= 2:
@@ -713,7 +779,7 @@ def _find_store_names(co):
}

names = co.co_names
for _, op, arg in _unpack_opargs(co.co_code):
for _, _, op, arg in _unpack_opargs(co.co_code):
if op in STORE_OPS:
yield names[arg]
