New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

failure on abstract class and implementer emulation #50

Closed
teDDyGH opened this Issue Jul 12, 2016 · 10 comments

Comments

Projects
None yet
2 participants
@teDDyGH

teDDyGH commented Jul 12, 2016

Hello, I discovered a problem in smalivm emulating code relating to abstract class or casting.
I'm testing a static function passing all parameters but a warning is logged and result is not retrieved; this is due to an Array containing a set of real class and casted to relative abstract class; in real execution "invoke-virtual" is invoked on a real class implementing the abstract class and retrieved by the array, but emulator is non able to manage it.

[main] WARN InvokeOp - Attempting to execute local method without implementation: Lffffff/fnnfnf;->bК041A041AКК041A041A(Ljava/lang/String;C)Ljava/lang/String;. Assuming maxiumum ambiguity.

Code is like this:

...
    invoke-virtual {v0, p2}, Ljava/util/ArrayList;->get(I)Ljava/lang/Object;
    move-result-object v0
    check-cast v0, Lffffff/fnnfnf;
    new-instance v1, Ljava/lang/Character;
    invoke-direct {v1, p1}, Ljava/lang/Character;-><init>(C)V
    invoke-virtual {v1}, Ljava/lang/Character;->charValue()C
    move-result v1
    invoke-virtual {v0, p0, v1}, Lffffff/fnnfnf;->bК041A041AКК041A041A(Ljava/lang/String;C)Ljava/lang/String;
...

...
.class public abstract Lffffff/fnnfnf;
.super Ljava/lang/Object;
...

...
.class public Lffffff/nfnfff;
.super Lffffff/fnnfnf;
...
@CalebFenton

This comment has been minimized.

Show comment
Hide comment
@CalebFenton

CalebFenton Jul 12, 2016

Owner

Thanks for the error report. This looks like it could be an edge case. I'll see if I can reproduce with my own code, but in the mean time could you tell me the sha1 or md5 of the apk this is from? Also, what method were you executing?

Owner

CalebFenton commented Jul 12, 2016

Thanks for the error report. This looks like it could be an edge case. I'll see if I can reproduce with my own code, but in the mean time could you tell me the sha1 or md5 of the apk this is from? Also, what method were you executing?

@teDDyGH

This comment has been minimized.

Show comment
Hide comment
@teDDyGH

teDDyGH Jul 13, 2016

Apk is not public available and has big size dimension (>50MB).
Analyzed function is a string decoder function; I attach a subset of smali file usable for this test and java code used to debug it.
Final result value is unknown instead of expected decoded string.

smali.zip
Program.java.txt

teDDyGH commented Jul 13, 2016

Apk is not public available and has big size dimension (>50MB).
Analyzed function is a string decoder function; I attach a subset of smali file usable for this test and java code used to debug it.
Final result value is unknown instead of expected decoded string.

smali.zip
Program.java.txt

@CalebFenton

This comment has been minimized.

Show comment
Hide comment
@CalebFenton

CalebFenton Jul 13, 2016

Owner

Thanks for the smali. Can you post the hash?

On Wed, Jul 13, 2016, 2:31 AM teDDyGH notifications@github.com wrote:

Apk is not public available and has big size dimension (>50MB).
Analyzed function is a string decoder function; I attach a subset of smali
file usable for this test and java code used to debug it.
Final result value is unknown instead of expected decoded string.

smali.zip https://github.com/CalebFenton/simplify/files/361137/smali.zip
Program.java.txt
https://github.com/CalebFenton/simplify/files/361138/Program.java.txt


You are receiving this because you commented.

Reply to this email directly, view it on GitHub
#50 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ABSzcj9vf3CB23HgLHX7SFItOhZkfO0xks5qVLB6gaJpZM4JKd0L
.

Owner

CalebFenton commented Jul 13, 2016

Thanks for the smali. Can you post the hash?

On Wed, Jul 13, 2016, 2:31 AM teDDyGH notifications@github.com wrote:

Apk is not public available and has big size dimension (>50MB).
Analyzed function is a string decoder function; I attach a subset of smali
file usable for this test and java code used to debug it.
Final result value is unknown instead of expected decoded string.

smali.zip https://github.com/CalebFenton/simplify/files/361137/smali.zip
Program.java.txt
https://github.com/CalebFenton/simplify/files/361138/Program.java.txt


You are receiving this because you commented.

Reply to this email directly, view it on GitHub
#50 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ABSzcj9vf3CB23HgLHX7SFItOhZkfO0xks5qVLB6gaJpZM4JKd0L
.

@teDDyGH

This comment has been minimized.

Show comment
Hide comment
@teDDyGH

teDDyGH Jul 13, 2016

Of course.
SHA1: 048D7B1F62285A7A1B43509231CD1B0444A711A3
MD5: A6493057B0B757B0AB4D117B0986F257

teDDyGH commented Jul 13, 2016

Of course.
SHA1: 048D7B1F62285A7A1B43509231CD1B0444A711A3
MD5: A6493057B0B757B0AB4D117B0986F257

@CalebFenton

This comment has been minimized.

Show comment
Hide comment
@CalebFenton

CalebFenton Jul 14, 2016

Owner

Quick update, I found a few bugs and have fixed most of them. Really interesting bugs, too. Easy to understand because the specific example you gave doesn't have a lot of code or complexity. Will hopefully have it fixed and written up by the end of the weekend. Kinda busy with the day job. :D

Owner

CalebFenton commented Jul 14, 2016

Quick update, I found a few bugs and have fixed most of them. Really interesting bugs, too. Easy to understand because the specific example you gave doesn't have a lot of code or complexity. Will hopefully have it fixed and written up by the end of the weekend. Kinda busy with the day job. :D

@teDDyGH

This comment has been minimized.

Show comment
Hide comment
@teDDyGH

teDDyGH Jul 15, 2016

Great! Thank you

teDDyGH commented Jul 15, 2016

Great! Thank you

@CalebFenton

This comment has been minimized.

Show comment
Hide comment
@CalebFenton

CalebFenton Jul 16, 2016

Owner

This may take me a big longer.

The first bug was that smalivm didn't consider literal values for fields. It's legal in smali to declare a field equal to a value at the beginning of a class, and not have any code in <clinit>. This was an easy fix.

The second was that field inheritance was kind of hacked and didn't work properly. Instead of realistically modeling field definitions, I would just shove all inherited fields into the same class state. This basically worked, but it was wasteful, dumb, and hard to reason about. So I rewrote a bit here and there to make it more like the JVM.

Then I found a third "problem" which is that Dalvik allows for some interesting types of field access. Static fields are not supposed to be inherited; they're properties of the class that defines them. However, at the bytecode level, dalvikvm doesn't care if you reference a static field via the class that defines it or by a child class. If you have a field .field public static Lparent_class;->myInt:I it can be referenced from within Lchild_class; with Lchild_class;->myInt:I. Fixing this properly exposed a fundamental design flaw: everything is passed around as strings -- class names, method names, field descriptors, etc.

Now I'm doing a major rewrite to migrate from using strings internally to passing around what amounts to convenience wrappers for dexlib objects. This is awesome because:

  • Less string manipulation - profiling shows a shitload of work is done splitting strings (turning a method signature into a class, method name, and parameters) but now most of this can be obviated completely or done up front, once
  • Code becomes more organized, less complex - instead of having a monolithic ClassManager that mediates everything, some of the complexity gets moved around to new, more logical locations. E.g. certain methods can be taken out and moved to new objects where they make more sense

Unfortunately, a rewrite like this will take some time. But this seems well worth it and labor we delight in physics pain.

Owner

CalebFenton commented Jul 16, 2016

This may take me a big longer.

The first bug was that smalivm didn't consider literal values for fields. It's legal in smali to declare a field equal to a value at the beginning of a class, and not have any code in <clinit>. This was an easy fix.

The second was that field inheritance was kind of hacked and didn't work properly. Instead of realistically modeling field definitions, I would just shove all inherited fields into the same class state. This basically worked, but it was wasteful, dumb, and hard to reason about. So I rewrote a bit here and there to make it more like the JVM.

Then I found a third "problem" which is that Dalvik allows for some interesting types of field access. Static fields are not supposed to be inherited; they're properties of the class that defines them. However, at the bytecode level, dalvikvm doesn't care if you reference a static field via the class that defines it or by a child class. If you have a field .field public static Lparent_class;->myInt:I it can be referenced from within Lchild_class; with Lchild_class;->myInt:I. Fixing this properly exposed a fundamental design flaw: everything is passed around as strings -- class names, method names, field descriptors, etc.

Now I'm doing a major rewrite to migrate from using strings internally to passing around what amounts to convenience wrappers for dexlib objects. This is awesome because:

  • Less string manipulation - profiling shows a shitload of work is done splitting strings (turning a method signature into a class, method name, and parameters) but now most of this can be obviated completely or done up front, once
  • Code becomes more organized, less complex - instead of having a monolithic ClassManager that mediates everything, some of the complexity gets moved around to new, more logical locations. E.g. certain methods can be taken out and moved to new objects where they make more sense

Unfortunately, a rewrite like this will take some time. But this seems well worth it and labor we delight in physics pain.

@teDDyGH

This comment has been minimized.

Show comment
Hide comment
@teDDyGH

teDDyGH Jul 18, 2016

Regarding static fields, Java code has similar behavior;
This is a working Java code:

public class Program {

    public static void main(String[] args) {
        RealClass rc = new RealClass();
        System.out.println(rc.staticField);
        // "abstract" will be printed
    }

}

public abstract class AbsClass {

    public static String staticField = "abstract";

}


public class RealClass extends AbsClass {

}

teDDyGH commented Jul 18, 2016

Regarding static fields, Java code has similar behavior;
This is a working Java code:

public class Program {

    public static void main(String[] args) {
        RealClass rc = new RealClass();
        System.out.println(rc.staticField);
        // "abstract" will be printed
    }

}

public abstract class AbsClass {

    public static String staticField = "abstract";

}


public class RealClass extends AbsClass {

}
@CalebFenton

This comment has been minimized.

Show comment
Hide comment
@CalebFenton

CalebFenton Jul 30, 2016

Owner

This has been a useful bug because the execution was simple and easy to understand yet revealed several problems. With the most recent commit I'm able to decrypt the string. Here's the code I used:

    public static void main(String[] args) throws Exception {
        VirtualMachineFactory vmFactory = new VirtualMachineFactory();
        VirtualMachine vm = vmFactory.build("teddygh-smali/ffffff");

        String methodSignature = "Lffffff/uuuaaa;->bТТ0422Т0422ТТ(Ljava/lang/String;CC)Ljava/lang/String;";
        VirtualMethod method = vm.getClassManager().getMethod(methodSignature);
        ExecutionContext context = vm.spawnRootContext(method);
        MethodState mState = context.getMethodState();
        mState.assignParameter(mState.getParameterStart(),
                "\u0339\u0344\u02fe\u0343\u033b\u0349\u02fe\u0331\u033e\u0349\u0347\u0338\u0335\u0342\u0335",
                "Ljava/lang/String;");
        mState.assignParameter(mState.getParameterStart() + 1, 0xf0, "C");
        mState.assignParameter(mState.getParameterStart() + 2, 0x2, "C");

        ExecutionGraph graph = vm.execute(methodSignature, context);
        HeapItem item = graph.getTerminatingRegisterConsensus(MethodState.ReturnRegister);
        System.out.println("With context, returns: " + item);
    }

And here's the output:

With context, returns: type=Ljava/lang/String;, value=it.sky.anywhere
Owner

CalebFenton commented Jul 30, 2016

This has been a useful bug because the execution was simple and easy to understand yet revealed several problems. With the most recent commit I'm able to decrypt the string. Here's the code I used:

    public static void main(String[] args) throws Exception {
        VirtualMachineFactory vmFactory = new VirtualMachineFactory();
        VirtualMachine vm = vmFactory.build("teddygh-smali/ffffff");

        String methodSignature = "Lffffff/uuuaaa;->bТТ0422Т0422ТТ(Ljava/lang/String;CC)Ljava/lang/String;";
        VirtualMethod method = vm.getClassManager().getMethod(methodSignature);
        ExecutionContext context = vm.spawnRootContext(method);
        MethodState mState = context.getMethodState();
        mState.assignParameter(mState.getParameterStart(),
                "\u0339\u0344\u02fe\u0343\u033b\u0349\u02fe\u0331\u033e\u0349\u0347\u0338\u0335\u0342\u0335",
                "Ljava/lang/String;");
        mState.assignParameter(mState.getParameterStart() + 1, 0xf0, "C");
        mState.assignParameter(mState.getParameterStart() + 2, 0x2, "C");

        ExecutionGraph graph = vm.execute(methodSignature, context);
        HeapItem item = graph.getTerminatingRegisterConsensus(MethodState.ReturnRegister);
        System.out.println("With context, returns: " + item);
    }

And here's the output:

With context, returns: type=Ljava/lang/String;, value=it.sky.anywhere
@teDDyGH

This comment has been minimized.

Show comment
Hide comment
@teDDyGH

teDDyGH Aug 1, 2016

Great news! Thank you

teDDyGH commented Aug 1, 2016

Great news! Thank you

@teDDyGH teDDyGH closed this Aug 1, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment