A simple Java inspection and disassembly library for Python.

Solum

Solum is intended to be a very simple library for inspecting and disassembling JVM class files.

Note: When poking around, keep in mind that you can print just about everything returned by methods of ClassFile. This can give you quick insight into what's available.

Playing with JAR files

First and foremost, remember that JARs are just .zip files with a different extension. The JarFile class exists to make a few Java-specific tasks easier, and the underlying Python ZipFile instance is available as the zp property of a JarFile().
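To make the "JARs are just .zip files" point concrete, here is a minimal sketch that uses only Python's standard zipfile module, not Solum. It builds a tiny archive in memory so it runs without a real JAR on disk; the helper names (build_demo_jar, list_entries, read_entry) and the file contents are made up for illustration.

```python
import io
import zipfile


def build_demo_jar():
    """Build a tiny JAR-like archive in memory (contents are illustrative)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        zf.writestr("title/splashes.txt", "Hello, World\n")
    return buf


def list_entries(jar_file):
    """Return the name of every entry in the archive."""
    with zipfile.ZipFile(jar_file) as zf:
        return zf.namelist()


def read_entry(jar_file, name):
    """Return the raw bytes of a single entry."""
    with zipfile.ZipFile(jar_file) as zf:
        return zf.read(name)


demo = build_demo_jar()
print(list_entries(demo))
print(read_entry(demo, "title/splashes.txt"))
```

The same two helpers should work on a path to a real .jar file, since zipfile.ZipFile accepts either a filename or a file-like object.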

Opening a Jar

import sys
from solum import JarFile

if __name__ == "__main__":
    jar = JarFile(sys.argv[1])

Getting the contents of a single file

import sys
from solum import JarFile

if __name__ == "__main__":
    jar = JarFile(sys.argv[1])
    print jar.read("title/splashes.txt")

Getting a class

import sys
from solum import JarFile

if __name__ == "__main__":
    jar = JarFile(sys.argv[1])
    cf = jar.open_class("HelloWorld")  # The .class is optional

Playing with .class files

Every class in Java is represented by a .class file. The entire contents of a class are available within the context of one of these files.
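For a feel of what such a file looks like on disk, the sketch below reads the fixed 8-byte header that every class file starts with: the magic number 0xCAFEBABE followed by the minor and major version. It uses only the standard struct module rather than Solum, and read_class_header is a hypothetical helper name. The sample bytes are hand-written and correspond to major version 52 (Java 8).

```python
import struct


def read_class_header(data):
    """Parse the first 8 bytes of a class file: magic, minor, major."""
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a JVM class file")
    return {"minor": minor, "major": major}


# Hand-written header bytes: magic, minor version 0, major version 52.
sample = bytes.fromhex("cafebabe00000034")
header = read_class_header(sample)
print(header)
```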

Opening a .class

Opening these directly using Solum is as painless as opening a Jar.

from solum import ClassFile

if __name__ == "__main__":
    cf = ClassFile("HelloWorld.class")
    print cf.this # Prints "HelloWorld"
    print cf.superclass # Prints "java/lang/Object"

Finding Constants, Fields, and Methods

Every reference to a class, field, string, integer, and so on is stored as a constant in the class file's constant pool. We can easily search for everything or narrow it down by various criteria. Searching without any criteria returns all of the constants.

from solum import ClassFile

if __name__ == "__main__":
    cf = ClassFile("HelloWorld.class")
    # Return all of the constants in the file
    print cf.constants.find()

This usually isn't very useful by itself, so we want to narrow it down. Let's say we want all constants that represent strings in the program.

from solum import ClassFile, ConstantType

if __name__ == "__main__":
    cf = ClassFile("HelloWorld.class")
    # Return all of the strings in the class
    print cf.constants.find(ConstantType.STRING)

In our HelloWorld.class example, this predictably has just one result:

[{'tag': 8, 'string': {'tag': 1, 'value': 'Hello, World'}, 'string_index': 18}]
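The result above shows a String constant (tag 8) pointing at a UTF8 constant (tag 1) through string_index. The sketch below shows how that indirection looks in the raw bytes. It is not Solum's parser: it handles only these two tags, the pool bytes are hand-built, and parse_pool and resolve_string are hypothetical names. Class files store text in a modified UTF-8 encoding; plain ASCII, as used here, decodes the same way.

```python
import struct

TAG_UTF8 = 1
TAG_STRING = 8


def parse_pool(data):
    """Parse a constant pool that contains only Utf8 and String entries."""
    (count,) = struct.unpack_from(">H", data, 0)
    offset = 2
    pool = {}
    # Constant pool indexes start at 1 and run to count - 1.
    for index in range(1, count):
        tag = data[offset]
        offset += 1
        if tag == TAG_UTF8:
            (length,) = struct.unpack_from(">H", data, offset)
            offset += 2
            value = data[offset:offset + length].decode("utf-8")
            offset += length
            pool[index] = {"tag": tag, "value": value}
        elif tag == TAG_STRING:
            (string_index,) = struct.unpack_from(">H", data, offset)
            offset += 2
            pool[index] = {"tag": tag, "string_index": string_index}
        else:
            raise NotImplementedError("tag %d is not handled in this sketch" % tag)
    return pool


def resolve_string(pool, index):
    """Follow a String constant to the text of its Utf8 constant."""
    return pool[pool[index]["string_index"]]["value"]


# Hand-built pool: entry 1 is a String that refers to entry 2, a Utf8.
text = b"Hello, World"
raw = (struct.pack(">H", 3)
       + struct.pack(">BH", TAG_STRING, 2)
       + struct.pack(">BH", TAG_UTF8, len(text)) + text)
pool = parse_pool(raw)
print(resolve_string(pool, 1))
```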

Let's try this another way. Instead of searching just for strings put in there by the programmer, we'll search all of the text in the class, keeping only entries shorter than 6 characters.

from solum import ClassFile, ConstantType

def test(constant):
    return len(constant["value"]) < 6

if __name__ == "__main__":
    cf = ClassFile("HelloWorld.class")
    print cf.constants.find(ConstantType.UTF8, f=test)

This example will call the function given in f for each constant of type UTF8, and only return those for which test() returns True. This gives us a lot of flexibility in getting only what we really want.

We can also make use of find_one(), which returns the first match, or None if nothing matches.

from solum import ClassFile, ConstantType

def test(constant):
    return "Hello" in constant["value"]

if __name__ == "__main__":
    cf = ClassFile("HelloWorld.class")
    print cf.constants.find_one(ConstantType.UTF8, f=test)
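To illustrate the filtering behaviour described above, here is one way a find()/find_one() pair could be written over a plain list of dictionaries. This is a sketch of the idea only and makes no claim about how Solum implements it; the tag parameter name and the sample pool are assumptions.

```python
def find(constants, tag=None, f=None):
    """Return constants matching an optional tag and an optional predicate."""
    results = []
    for constant in constants:
        if tag is not None and constant["tag"] != tag:
            continue
        if f is not None and not f(constant):
            continue
        results.append(constant)
    return results


def find_one(constants, tag=None, f=None):
    """Return the first match, or None if nothing matches."""
    matches = find(constants, tag, f)
    return matches[0] if matches else None


UTF8 = 1  # constant pool tag for UTF8 entries

# Illustrative pool: two UTF8 entries and one String entry.
pool = [
    {"tag": 1, "value": "Code"},
    {"tag": 1, "value": "Hello, World"},
    {"tag": 8, "string_index": 2},
]

short = find(pool, UTF8, f=lambda c: len(c["value"]) < 6)
hello = find_one(pool, UTF8, f=lambda c: "Hello" in c["value"])
print(short)
print(hello)
```

Checking the tag before calling the predicate means f only ever sees constants of the requested type, so it can safely read keys such as "value" that other constant types lack.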

Fields and methods have similar interfaces. Want to find all methods that return void?

from solum import ClassFile

if __name__ == "__main__":
    cf = ClassFile("HelloWorld.class")
    print cf.methods.find(returns="void")

How about only those named "max" that take two integers?

from solum import ClassFile

if __name__ == "__main__":
    cf = ClassFile("HelloWorld.class")
    print cf.methods.find(name="max", args=("integer", "integer"))
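Inside the class file, a method's argument and return types are stored as a descriptor string such as (II)I. The sketch below decodes a descriptor into readable names, which is the kind of information a search by returns or args has to compare against. It is an independent illustration, not Solum's code: parse_descriptor is a hypothetical helper, and it emits Java keyword names ("int"), which may differ from the names Solum expects ("integer" in the example above).

```python
# Single-letter base types used in JVM descriptors.
BASE_TYPES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _read_type(desc, pos):
    """Read one type starting at pos; return (name, next position)."""
    dims = 0
    while desc[pos] == "[":  # each leading '[' adds one array dimension
        dims += 1
        pos += 1
    if desc[pos] == "L":  # object type: L<class name>;
        end = desc.index(";", pos)
        name = desc[pos + 1:end]
        pos = end + 1
    else:
        name = BASE_TYPES[desc[pos]]
        pos += 1
    return name + "[]" * dims, pos


def parse_descriptor(desc):
    """Split a method descriptor into (argument types, return type)."""
    if not desc.startswith("("):
        raise ValueError("method descriptors start with '('")
    pos = 1
    args = []
    while desc[pos] != ")":
        name, pos = _read_type(desc, pos)
        args.append(name)
    returns, _ = _read_type(desc, pos + 1)
    return tuple(args), returns


print(parse_descriptor("(II)I"))
print(parse_descriptor("([Ljava/lang/String;)V"))
```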

Method Disassembly

Disassembly is just as easy. Simply find the method(s) you want to disassemble, and iterate over their instructions property. To see what other methods are available on the instructions property, take a look at the Disassembler() class in bytecode.py.

from solum import ClassFile

if __name__ == "__main__":
    cf = ClassFile("HelloWorld.class")
    main = cf.methods.find_one(name="main")

    for ins in main.instructions:
        print ins
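To show roughly what a disassembler does with a method's bytecode, here is a minimal sketch that decodes the four opcodes a typical HelloWorld main method uses. It is not Solum's Disassembler() class: the opcode table is deliberately tiny, the bytecode is hand-written, and the constant pool indexes in it (2, 3, 4) are illustrative.

```python
import struct

# opcode -> (mnemonic, struct format of its operands)
OPCODES = {
    0x12: ("ldc", ">B"),            # one-byte constant pool index
    0xB1: ("return", ""),           # no operands
    0xB2: ("getstatic", ">H"),      # two-byte constant pool index
    0xB6: ("invokevirtual", ">H"),  # two-byte constant pool index
}


def disassemble(code):
    """Yield (offset, mnemonic, operands) for each instruction in code."""
    pos = 0
    while pos < len(code):
        opcode = code[pos]
        if opcode not in OPCODES:
            raise NotImplementedError("opcode 0x%02x is not in this sketch" % opcode)
        mnemonic, fmt = OPCODES[opcode]
        operands = struct.unpack_from(fmt, code, pos + 1) if fmt else ()
        yield pos, mnemonic, operands
        pos += 1 + (struct.calcsize(fmt) if fmt else 0)


# getstatic #2, ldc #3, invokevirtual #4, return
code = bytes.fromhex("b200021203b60004b1")
instructions = list(disassemble(code))
for ins in instructions:
    print(ins)
```

A full disassembler also has to handle variable-length instructions such as tableswitch, lookupswitch, and the wide prefix, which a fixed operand table like this one cannot express.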