A Python cross-version decompiler


uncompyle6

A native Python cross-version decompiler and fragment decompiler. The successor to decompyle, uncompyle, and uncompyle2.

Introduction

uncompyle6 translates Python bytecode back into equivalent Python source code. It accepts bytecodes from Python version 1.3 to version 3.7, spanning over 22 years of Python releases. It also handles Dropbox's Python 2.5 bytecode and some PyPy bytecode.

Why this?

Ok, I'll say it: this software is amazing. It is more than your normal hacky decompiler. Using compiler technology, the program creates a parse tree of the program from the instructions; the nodes at the upper levels look a little like what might come from a Python AST. So we can really classify and understand what's going on in sections of Python bytecode.
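To make the two ends of that process concrete, here is a small standard-library sketch. It does not use uncompyle6 itself, and the statement `x = a + b` is only an illustrative choice. It prints the AST node the Python compiler starts from and the flat instruction list a decompiler has to work back from.

```python
import ast
import dis

source = "x = a + b\n"

# What the compiler sees: an AST whose top-level node is an assignment.
tree = ast.parse(source)
print(ast.dump(tree.body[0]))

# What a decompiler sees: a flat sequence of instructions.
# The exact opcode names vary between Python versions.
code = compile(source, "<example>", "exec")
opnames = [instr.opname for instr in dis.get_instructions(code)]
print(opnames)
```

A decompiler's parse tree is built from the second form, with the aim of ending up with something shaped like the first.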

Building on this, another thing that makes this different from other CPython bytecode decompilers is the ability to deparse just fragments of source code and give source-code information around a given bytecode offset.

I use the tree fragments to deparse fragments of code at run time inside my trepan debuggers. For that, bytecode offsets are recorded and associated with fragments of the source code. This purpose, although compatible with the original intention, is a little different. See this for more information.

Python fragment deparsing given an instruction offset is useful in showing stack traces and can be incorporated into any program that wants to show a location in more detail than just a line number at runtime. This code can also be used when source-code information does not exist and there is just bytecode. Again, my debuggers make use of this.
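As a rough illustration of why an instruction offset says more than a line number, the following sketch uses only the standard library, not uncompyle6. The helper names `ratio_sum` and `failing_location` are made up for this example. Two different divisions on the same source line fail at different bytecode offsets, which is the extra detail a fragment deparser can turn back into a source fragment.

```python
import dis

def ratio_sum(a, b, c, d):
    return a / b + c / d  # two divisions on one source line

def failing_location(func, *args):
    """Return (line number, bytecode offset, opname) where func(*args) raised."""
    try:
        func(*args)
    except ZeroDivisionError as exc:
        tb = exc.__traceback__
        while tb.tb_next is not None:  # walk to the innermost frame
            tb = tb.tb_next
        # Take the last instruction that starts at or before tb_lasti.
        candidates = [
            instr
            for instr in dis.get_instructions(tb.tb_frame.f_code)
            if instr.offset <= tb.tb_lasti
        ]
        instr = max(candidates, key=lambda i: i.offset)
        return tb.tb_lineno, instr.offset, instr.opname
    return None

first = failing_location(ratio_sum, 1, 0, 1, 1)   # a / b fails
second = failing_location(ratio_sum, 1, 1, 1, 0)  # c / d fails
print(first)
print(second)
```

Both results report the same line number but different offsets; the concrete offsets and opcode names depend on the Python version.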

There were (and still are) a number of decompyle, uncompyle, uncompyle2, uncompyle3 forks around. Almost all of them come basically from the same code base, and (almost?) all of them are no longer actively maintained. One was really good at decompiling Python 1.5-2.3 or so, another really good at Python 2.7, but that only. Another handles Python 3.2 only; another patched that and handled only 3.3. You get the idea. This code pulls all of these forks together and moves forward. There is some serious refactoring and cleanup in this code base over those old forks.

This demonstrably does the best in decompiling Python across all Python versions. And even when there is another project that only provides decompilation for a subset of Python versions, we generally do demonstrably better for those as well.

How can we tell? By taking Python bytecode that comes distributed with that version of Python and decompiling these. Among those that successfully decompile, we can then make sure the resulting programs are syntactically correct by running the Python interpreter for that bytecode version. Finally, in cases where the program has a test for itself, we can run the check on the decompiled code.

We are serious about testing, and use automated processes to find bugs. In the issue trackers for other decompilers, you will find a number of bugs we've found along the way. Very few to none of them are fixed in the other decompilers.

Requirements

The code here can be run on Python versions 2.6 or later, PyPy 3-2.4, or PyPy-5.0.1. Python versions 2.4-2.7 are supported in the python-2.4 branch. The bytecode files it can read have been tested on Python bytecodes from versions 1.4, 2.1-2.7, and 3.0-3.6 and the above-mentioned PyPy versions.

Installation

This uses setup.py, so it follows the standard Python routine:

pip install -e .  # set up to run from source tree
                  # Or if you want to install instead
python setup.py install # may need sudo

A GNU makefile is also provided so make install (possibly as root or sudo) will do the steps above.

Testing

make check

A GNU makefile has been added to smooth over setting up and running the right command, and to run tests from fastest to slowest.

If you have remake installed, you can see the list of all tasks, including tests, via remake --tasks.

Usage

Run

$ uncompyle6 *compiled-python-file-pyc-or-pyo*

For usage help:

$ uncompyle6 -h

If you want strong verification of the correctness of the decompilation process, add the --verify option. But there are situations where this will indicate a failure, although the generated program is semantically equivalent. Using option --weak-verify will tell you if there is something definitely wrong. Generally, large swaths of code are decompiled correctly, if not the entire program.

You can also cross compare the results with pycdc. Since they work differently, bugs here often aren't in that, and vice versa.

Known Bugs/Restrictions

The biggest known and possibly fixable (but hard) problem has to do with handling control flow. (Python has probably the most diverse and screwy set of compound statements I've ever seen; there are "else" clauses on loops and try blocks that I suspect many programmers don't know about.)
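For readers who have not met them, here is a short sketch of the "else" clauses mentioned above. The function names are invented for this example; the semantics are standard Python. Each "else" body runs only when the loop ends without `break`, or when the `try` body raises nothing.

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only if the loop finished without hitting `break`.
        result = None
    return result

def countdown(start):
    n = start
    while n > 0:
        if n == 13:
            break
        n -= 1
    else:
        # Runs only if the condition became false without a `break`.
        return "finished"
    return "interrupted"

def parse_int(text):
    try:
        value = int(text)
    except ValueError:
        return "invalid"
    else:
        # Runs only if the try block raised no exception.
        return value

print(find_first_negative([3, -2, 5]), find_first_negative([1, 2]))
print(countdown(3), countdown(13))
print(parse_int("7"), parse_int("seven"))
```

In bytecode, each of these compiles to jumps that look much like those of an ordinary `if`, which is part of what makes recovering the original statement hard.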

All of the Python decompilers that I have looked at have problems decompiling Python's control flow. In some cases we can detect an erroneous decompilation and report that.

In older versions of Python it was possible to verify bytecode by decompiling bytecode, and then compiling using the Python interpreter for that bytecode version. Having done this, the bytecode produced could be compared with the original bytecode. However, as Python's code generation got better, this is no longer feasible.

There is verification that we use that doesn't check bytecode for equivalence but does check to see if the resulting decompiled source is a valid Python program by running the Python interpreter. Because the Python language has changed so much, for best results you should use the same Python version in checking as was used in creating the bytecode.
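A minimal sketch of this kind of validity check, using the built-in `compile()`, follows. The helper name `is_valid_python` is an assumption made for this example and is not the project's actual verification code.

```python
def is_valid_python(source, filename="<decompiled>"):
    """Return True if `source` compiles under the running interpreter."""
    try:
        compile(source, filename, "exec")
    except SyntaxError:
        return False
    return True

print(is_valid_python("x = 1\nprint(x)\n"))
# A Python 2 print statement is rejected when checked with Python 3,
# which is why the checking interpreter should match the bytecode version.
print(is_valid_python("print 'hello'\n"))
```

This only shows that the output parses; it says nothing about whether the decompiled program behaves like the original.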

There is, however, an interesting class of programs that is readily available and gives stronger verification: programs that, when run, check some computation or, even better, check themselves.

And already Python has a set of programs like this: the test suite for the standard library that comes with Python. We have some code in test/stdlib to facilitate this kind of checking.
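The idea can be sketched in a few lines. This is an illustration only, not the code in test/stdlib; the helper name `passes_own_checks` and the two sample programs are invented here. A program that asserts its own results is run in a fresh namespace, and any exception counts as a failed check.

```python
def passes_own_checks(source, filename="<decompiled>"):
    """Run `source` in a fresh namespace; True if it finishes without raising."""
    namespace = {"__name__": "__main__"}
    try:
        exec(compile(source, filename, "exec"), namespace)
    except Exception:
        # Covers SyntaxError and AssertionError; SystemExit is not caught here.
        return False
    return True

# Stand-ins for a correct and an incorrect decompilation of the same program.
good = "def double(n):\n    return n * 2\n\nassert double(21) == 42\n"
bad = "def double(n):\n    return n + 2\n\nassert double(21) == 42\n"

print(passes_own_checks(good), passes_own_checks(bad))
```

Only run code this way if you trust it: executing decompiled source executes whatever the original bytecode did.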

Python support is strongest in Python 2 for 2.7 and drops off as you get further away from that. Support is also probably pretty good for Python 2.3-2.4, since a lot of the goodness of the early version of the decompiler from that era has been preserved (and Python compilation in that era was minimal).

There is some work to do on the lower-end Python versions, which are more difficult for us to handle since we don't have a Python interpreter for versions 1.6 and 2.0.

In the Python 3 series, Python support is strongest around 3.4 or 3.3 and drops off as you move further away from those versions. Python 3.0 is weird in that it in some ways resembles 2.6 more than it does 3.1 or 2.7. Python 3.6 changes things drastically by using word codes rather than byte codes. As a result, the jump offset field in a jump instruction argument has been reduced. This makes EXTENDED_ARG instructions more prevalent in jump instructions; previously they had been rare. Perhaps to compensate for the additional EXTENDED_ARG instructions, additional jump optimization has been added. So, in sum, handling control flow by ad hoc means, as is currently done, works less well.
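The effect of the smaller argument field can be shown with a little arithmetic. The sketch below is an illustration of the 3.6-style encoding, not code from uncompyle6, and `encode_wordcode` is a hypothetical helper. Each instruction is one opcode byte plus one argument byte, so any argument above 255 needs EXTENDED_ARG prefixes carrying the higher bytes. Before 3.6, an instruction with an argument had a two-byte field and needed no prefix up to 65535.

```python
import dis

EXTENDED_ARG = dis.opmap["EXTENDED_ARG"]
JUMP_FORWARD = dis.opmap["JUMP_FORWARD"]

def encode_wordcode(opcode, arg):
    """Encode one instruction as a list of (opcode, one-byte argument) units."""
    if not 0 <= arg <= 0xFFFFFFFF:
        raise ValueError("argument out of range")
    parts = []
    while True:
        parts.append(arg & 0xFF)  # collect bytes, least significant first
        arg >>= 8
        if not arg:
            break
    parts.reverse()  # most significant byte first
    # Every byte except the last rides on an EXTENDED_ARG prefix.
    units = [(EXTENDED_ARG, byte) for byte in parts[:-1]]
    units.append((opcode, parts[-1]))
    return units

short_jump = encode_wordcode(JUMP_FORWARD, 10)
long_jump = encode_wordcode(JUMP_FORWARD, 300)  # 300 == 0x012C
print(len(short_jump), len(long_jump))
```

A jump of 300 therefore takes two units where a jump of 10 takes one, so a decompiler has to recognize the prefix and reassemble the argument before it can reason about the jump target.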

Between Python 3.5, 3.6 and 3.7 there have been major changes to the MAKE_FUNCTION and CALL_FUNCTION instructions.

Currently not all Python magic numbers are supported. Specifically, in some versions of Python, notably Python 3.6, the magic number has changed several times within a version. We support only the released magic. There are also customized Python interpreters, notably Dropbox's, which use their own magic and encrypt bytecode. With the exception of Dropbox's old Python 2.5 interpreter, this kind of thing is not handled.
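To see what a magic number looks like, the following standard-library sketch compiles a throwaway module and reads the first four bytes of the resulting bytecode file. The helper `read_magic` and the file names are made up for this example. In CPython the magic is a two-byte little-endian number followed by `\r\n`.

```python
import importlib.util
import os
import py_compile
import tempfile

def read_magic(pyc_path):
    """Return the 4-byte magic at the start of a compiled Python file."""
    with open(pyc_path, "rb") as f:
        return f.read(4)

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.py")
    pyc = os.path.join(tmp, "example.pyc")
    with open(src, "w") as f:
        f.write("x = 1\n")
    py_compile.compile(src, cfile=pyc, doraise=True)
    magic = read_magic(pyc)

number = int.from_bytes(magic[:2], "little")
print(number, magic == importlib.util.MAGIC_NUMBER)
```

A tool that reads bytecode files uses this value to decide which Python version's instruction set to apply, which is why an unreleased or customized magic cannot be handled without extra knowledge.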

We also don't handle PJOrion obfuscated code. For that try: PJOrion Deobfuscator to unscramble the bytecode to get valid bytecode before trying this tool. This program can't decompile Microsoft Windows EXE files created by Py2EXE, although we can probably decompile the code after you extract the bytecode properly. For situations like this, you might want to consider a decompilation service like Crazy Compilers. Handling pathologically long lists of expressions or statements is slow.

There is lots to do, so please dig in and help.

See Also