gmodena/pycdump

pycdump

Unmarshal, disasm and pretty-print Python 3.7 bytecode.

Intro

This repository documents some experiments in CPython bytecode analysis (pyc files).

The work is inspired by Ned Batchelder's blog post http://nedbatchelder.com/blog/200804/the_structure_of_pyc_files.html. CPython has changed quite a bit since that post was written.

Some of the changes from Python 2 to 3 are documented in PEP 3147 (https://www.python.org/dev/peps/pep-3147/), which is now a bit out of date. In recent versions of Python, for instance, imp (referenced in the PEP) has been deprecated in favour of importlib, and importlib has no direct equivalent of imp.get_tag(). On 3.7.3:

>>> import importlib 
>>> hasattr(importlib, 'get_tag')
False
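For reference, the tag that imp.get_tag() used to return is exposed as sys.implementation.cache_tag, and importlib.util.cache_from_source builds the cached file name from it. A minimal sketch, assuming CPython 3.4 or later; example.py is only a placeholder name and does not need to exist:

```python
import importlib.util
import sys

# The interpreter tag that ends up in the pyc file name, e.g. 'cpython-37'.
tag = sys.implementation.cache_tag

# Path where the import system would cache the bytecode for example.py.
pyc_path = importlib.util.cache_from_source("example.py")

print(tag)
print(pyc_path)
```

On other Python implementations cache_tag may be None, in which case cache_from_source raises NotImplementedError.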

Things are bound to change further. PEP 552 (deterministic pycs), for instance, already changed the header layout in 3.7: https://www.python.org/dev/peps/pep-0552/

File structure

According to the documentation, a pyc file is composed of a 16-byte header (four 32-bit words) and a variable-size payload.
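As a rough illustration of the header, the sketch below compiles a throwaway module and decodes the four words. It assumes the Python 3.7+ layout described in PEP 552: magic number, bit flags, then either source mtime and source size (timestamp-based pyc) or an 8-byte source hash (hash-based pyc). The function name read_header and the file names are hypothetical and are not taken from dump.py.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile


def read_header(path):
    """Decode the 16-byte header of a Python 3.7+ pyc file."""
    with open(path, "rb") as f:
        raw = f.read(16)
    magic = raw[:4]
    (flags,) = struct.unpack("<I", raw[4:8])
    if flags & 0x1:
        # Hash-based pyc: the last 8 bytes are a hash of the source file.
        return {"magic": magic, "flags": flags, "hash": raw[8:16]}
    # Timestamp-based pyc: source mtime and source size, both 32-bit.
    mtime, source_size = struct.unpack("<II", raw[8:16])
    return {"magic": magic, "flags": flags,
            "mtime": mtime, "source_size": source_size}


source = b"x = 1\n"
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.py")
    with open(src, "wb") as f:
        f.write(source)
    # Force a timestamp-based pyc so the result does not depend on
    # the SOURCE_DATE_EPOCH environment variable.
    pyc = py_compile.compile(
        src,
        cfile=os.path.join(tmp, "example.pyc"),
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    header = read_header(pyc)

print(header)
```

Comparing header["magic"] with importlib.util.MAGIC_NUMBER is a quick way to check that a pyc was produced by the running interpreter version.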

From byte 16 onwards the payload stores a marshalled code object (https://docs.python.org/3/c-api/code.html).
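Reading the payload can be sketched with the standard marshal module: skip the header, then unmarshal what follows. This assumes a 16-byte header (Python 3.7 or later) and a pyc written by the same interpreter version that reads it. The file names are placeholders.

```python
import marshal
import os
import py_compile
import tempfile
import types

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.py")
    with open(src, "wb") as f:
        f.write(b"x = 1\n")
    pyc = py_compile.compile(
        src, cfile=os.path.join(tmp, "example.pyc"), doraise=True
    )
    with open(pyc, "rb") as f:
        f.seek(16)              # skip the header
        code = marshal.load(f)  # the module-level code object

print(type(code), code.co_name, code.co_names)
```

The marshal format is not stable across Python versions, so loading a pyc produced by a different interpreter version may fail or give wrong results.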

Code objects provide these attributes (and a couple more):

  • co_argcount number of arguments (not including * or ** args)
  • co_code bytes object containing the raw compiled bytecode
  • co_consts tuple of constants used in the bytecode
  • co_filename name of file in which this code object was created
  • co_firstlineno number of first line in Python source code
  • co_flags bitmap: 1=optimized | 2=newlocals | 4=*arg | 8=**arg
  • co_lnotab encoded mapping of line numbers to bytecode indices
  • co_name name with which this code object was defined
  • co_names tuple of names other than arguments and function locals (globals, attributes, imported names)
  • co_nlocals number of local variables
  • co_stacksize virtual machine stack space required
  • co_varnames tuple of names of arguments and local variables
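Some of these attributes are easy to inspect on a live function through its __code__ attribute. The function add below is a made-up example, not part of this repository:

```python
def add(a, b, *rest, **opts):
    total = a + b
    return abs(total)


code = add.__code__

print(code.co_argcount)  # positional arguments only: a, b
print(code.co_varnames)  # arguments first, then locals
print(code.co_names)     # non-local names referenced in the body
print(code.co_nlocals)   # len(co_varnames)

# Decode the two flag bits that describe the signature.
has_varargs = bool(code.co_flags & 0x04)  # *rest
has_varkw = bool(code.co_flags & 0x08)    # **opts
print(has_varargs, has_varkw)
```

Note that abs shows up in co_names while total shows up in co_varnames.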

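co_consts is a nested data structure: besides plain constants it can contain further code objects (one for each function, class body, lambda or comprehension defined in the enclosing code), so a full dump has to walk it recursively.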
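One possible way to walk that nesting is a small recursive generator. This is only a sketch of the idea and not necessarily how dump.py does it; iter_code_objects is a hypothetical helper name.

```python
import dis
import types


def iter_code_objects(code):
    """Yield a code object and, depth first, every code object nested in it."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


source = "def f():\n    def g():\n        return 1\n    return g\n"
module_code = compile(source, "<example>", "exec")

names = [c.co_name for c in iter_code_objects(module_code)]
print(names)

# Disassemble each code object separately.
listings = {c.co_name: dis.Bytecode(c).dis()
            for c in iter_code_objects(module_code)}
print(listings["g"])
```

The exact opcodes in each listing depend on the Python version.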

Compile a py to pyc

python -m compileall example.py 

Disasm & dump

python dump.py __pycache__/example.cpython-37.pyc
