Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Compile + optimize instead of interpret #32

Closed
int3 opened this Issue Apr 4, 2012 · 14 comments

Comments

Projects
None yet
4 participants
Contributor

int3 commented Apr 4, 2012

General idea:

  • Replace opcodes with expression 'templates'. So iadd will be defined like this: in: [1,1], out: [1], exp: '$0 + $1', which means that it takes two 1-byte values from the stack and pushes a one-byte value. We can re-generate our current set of execute functions from these templates, but we could also string multiple opcodes together into one larger function.
  • When compiling, we can evaluate the stack symbolically. So when iadd is called we might have symbols (strings) 'a' and 'b' on the stack, and when we are done we have the string 'c' on the stack. We can then produce code like c = a + b. Not absolutely clear on how to analyze jumps and exception handling at the moment. See Harissa for more details.
  • Reduce virtual method lookup overhead by having compiled classes contain references to inherited functions.
  • Reconstruct 'normal' control flow code via decompilation. See Harissa, or Emscripten.
Contributor

perimosocordiae commented Apr 4, 2012

We could also mock out java.lang.String. We'll avoid a whole bunch of expensive conversion shenanigans, and the String class isn't too complex.

We can probably cache inherited functions to speed up virtual method dispatch fairly easily.

Contributor

int3 commented Apr 9, 2012

Hmm, I just tried method caching and it didn't seem to help much.

Contributor

int3 commented Oct 9, 2012

Just made some stabs at a basic compiler. It succeeds in transforming the code below

class Simple {
  public static int foo() {
    int a = 2;
    int b = 3;
    int c = 5;
    return a + b * c;
  }
  public static int bar(int a) {
    if (a < 0) return -1;
    return 1;
  }
}

to this:

var Simple = {
Simple: function() {
    return;
},
foo: function() {
    return wrap_int(gLong.fromInt(5).multiply(gLong.fromInt(3)).toInt()+2);
},
bar: function(a) {
    if (a >= 0) {
        return 1;
    } else {
        return -1;
    }
},
};

The next thing I'm looking at is making it handle more complex control structures.

Contributor

int3 commented Oct 9, 2012

On second thought, it's probably easier to start off with the simple switch-statement-in-while-loop and iteratively improve on it. That alone should give us a pretty big speed boost.

Member

jvilk commented Oct 9, 2012

You mean making a switch statement in a while loop for each function, where each case in the switch is a basic block? I was talking with CJ about that the other day.

EDIT: Also, is the plan to eval compiled functions? Are JS engines smart enough to JIT it if we eval the same string multiple times?

Owner

emeryberger commented Oct 9, 2012

Am I correct in assuming that the java code you are showing is just meant
to be representative of the byte code sequences you are looking at
compiling, or are you thinking of compiling from source?

Contributor

int3 commented Oct 9, 2012

Yeah, that's what I meant. I was hoping to implement Emscripten's Relooper algorithm, but I can't quite figure out what the paper is describing...

Contributor

int3 commented Oct 9, 2012

@emeryberger: Yeah, I compiled the Java code above to bytecode, then ran the doppio compiler to get the resulting Javascript. The generated if statement was kind of a special case though... it won't work on more complicated control flows, and I'm not sure how to generalize it.

Owner

emeryberger commented Oct 9, 2012

The right way to compile code really involves transforming it into an
appropriate intermediate representation -- I would caution venturing too
far down this path without first reading up on compilers. Maybe Andrew
Appel's text on java would be a good start.

Contributor

int3 commented Oct 9, 2012

Yeah, I've taken a class based on Appel's book before. Compiling to a high-level language is kind of different from compiling to assembly though...

Owner

emeryberger commented Oct 9, 2012

Decidedly.

On Oct 9, 2012, at 3:50 PM, Jez Ng notifications@github.com wrote:

Yeah, I've taken a class based on Appel's book before. Compiling to a
high-level language is kind of different from compiling to assembly
though...


Reply to this email directly or view it on
GitHubhttps://github.com/int3/doppio/issues/32#issuecomment-9276412.

Member

jvilk commented Jan 23, 2013

I had a sudden realization regarding a JITing Doppio that may seem obvious, but I'm going to note it here anyway for my future benefit.

One challenge to compiling now is that we have a number of operations that are necessarily asynchronous, since we are using WebWorkers now. So we will have to support "pausing" and "resuming" compiled methods for asynchronous operations.

But we have one huge item on our side: we already have a virtual representation of the stack with local variables and such. We can use this to back out of methods and jump back into them if need be. We also have to keep track of something so we can determine the Class that each method call belongs to even when we compile them, thanks to sun.reflect.Reflection.getCallerClass(), so we can just keep track of the actual method called so we can reconstruct control flow. Which means that JITing in the presence of async operations is more than possible for Doppio.

Of course, we are going to want to optimize away as much of the stack action as possible. In some cases, we may be able to guarantee that the function called will never trigger an asynchronous operation, although it's likely those types of methods will end up being inlined anyway. There's a lot of fun and interesting optimization problems.

Member

jvilk commented May 19, 2016

With #443 merged, we now have a basic JIT!

@jvilk jvilk closed this May 19, 2016

Contributor

perimosocordiae commented May 20, 2016

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment