Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TObjectDecl #1

Closed
Simn opened this issue Mar 19, 2019 · 36 comments
Closed

TObjectDecl #1

Simn opened this issue Mar 19, 2019 · 36 comments

Comments

@Simn
Copy link
Owner

Simn commented Mar 19, 2019

We need a good way to generate anonymous objects.

@Simn
Copy link
Owner Author

Simn commented Mar 20, 2019

One thing to consider here is Reflect.deleteField. 💩

@nadako
Copy link
Collaborator

nadako commented Mar 20, 2019

I once wrote this https://gist.github.com/nadako/f01e6837d5508e922f0ef7287f0520bf. Not sure how well it applies to JVM, but I assume nicolas does something like this in HL too?

@Simn
Copy link
Owner Author

Simn commented Mar 20, 2019

The problem with all these optimization ideas is that this is specified to work:

class Main {
	static public function main() {
		var td = {
			a: null
		}
		trace(Reflect.deleteField(td, "a")); // true
		trace(Reflect.deleteField(td, "a")); // false
		trace(Reflect.hasField(td, "a")); // false
		td.a = null;
		trace(Reflect.hasField(td, "a")); // true
		trace(Reflect.deleteField(td, "a")); // true
		trace(Reflect.deleteField(td, "a")); // false
	}
}

Note in particular how we go from hasField = false to hasField = true after setting the value again.

@nadako
Copy link
Collaborator

nadako commented Mar 20, 2019

I think it makes sense to unspecify deleteField for non-optional fields, but yeah we need to keep some "deleted" flag anyway...

@Simn
Copy link
Owner Author

Simn commented Mar 20, 2019

To clarify: We'll have to do some bookkeeping anyway in order to support Reflect.fields, so you might get the idea that Reflect.deleteField could simply remove the field entry from whatever structure we maintain for that. That's correct, but the question then becomes who sets the entry back if a new value is assigned? This could only be done on every write-access, which is extremely stupid and subverts the entire optimization.

@nadako
Copy link
Collaborator

nadako commented Mar 20, 2019

subverts the entire optimization.

That's debatable. In current C#/Java targets, anon objects are made of arrays with binary-search lookup every time. I'm pretty sure that having a proper structure with a setter that sets some bool/bit flag will be faster.

@Simn
Copy link
Owner Author

Simn commented Mar 20, 2019

I think this is also specified:

class Main {
	static public function main() {
		var td:Dynamic = {};
		td.a = null;
		trace(Reflect.hasField(td, "a")); // true
	}
}

So it's necessary to have a storage for additional fields on these objects anyway. I think the best we can hope for is a lookup-based general implementation with a fast-path optimization for statically known fields. But even with these we have to be careful because of this deleteField crap...

@nadako
Copy link
Collaborator

nadako commented Mar 20, 2019

Yep, this is actually very similar to implements Dynamic and how it was done in gencommon (the dynamic "trait" was added to normal class fields).

@Simn
Copy link
Owner Author

Simn commented Mar 20, 2019

Here's a Haxe implementation of what I think could work: https://gist.github.com/Simn/f93d4945bcf7991bada5b85b54250565

  • the only overhead for known fields here is that if (_hxDeletedAField) check in the setter
  • the reflection map (_hx_fields) is only created if we need it, i.e. if we call of the _hx_ functions on the object.
  • unknown fields are looked up in a string map
  • we can omit the entire _hx_deletedAField part (and thus also the setters) if Reflect.deleteField is not part of the compilation

What do you think?

@nadako
Copy link
Collaborator

nadako commented Mar 20, 2019

This looks a lot like modern JS engines with their hidden classes that switch to the dictionary mode when delete o[key] is used :)

I think it would be nice to avoid string map for getting/setting known fields. Even if there are ways to avoid it, I'm pretty sure Reflect.field is super-common (I think 99% of templating and scripting engines use that).

We could generate a switch over known field names for _hx_getField/_hx_setField and only fall back to _hx_field if the field was added at run-time. Although I'm not really sure that switch will be faster than a map lookup :)

@nadako
Copy link
Collaborator

nadako commented Mar 20, 2019

Also, Here you meant "knownField" => this.knownField, right?

	override function _hx_getKnownFields() {
		return ["knownField" => null];
	}

also why we need to do a copy in _hx_initReflection?

@Simn
Copy link
Owner Author

Simn commented Mar 20, 2019

Also, Here you meant "knownField" => this.knownField, right?

Yes

also why we need to do a copy in _hx_initReflection

We don't. I originally had the map as a static var and forgot to remove the copy().

@Simn
Copy link
Owner Author

Simn commented Mar 20, 2019

I wonder about this from your design:

Second, for every unification of a class with a structure, we make that class implement the corresponding interface generated for that structure.

That's probably not gonna be enough because we could assign the class to the interface with something inbetween, like Dynamic. ((classInstance : Dynamic) : Interface) would not be detected correctly.

Another thing to consider are type parameter constraints.

@nadako
Copy link
Collaborator

nadako commented Mar 20, 2019

((classInstance : Dynamic) : Interface)

I guess this will require some DynamicImplForInterface that goes fully-dynamic (basically implementing get/set for Interface by using _hx_getField). The only thing that we can maybe do here is to cache that wrapper, although I'm not sure it's even worth it.

@Simn
Copy link
Owner Author

Simn commented Mar 20, 2019

I'm not worried about field access because we're gonna have a slow route for that anyway. My concern is the run-time assignment because we would have to make sure that classInstance can be assigned to Interface (bad name, I meant one of these anon interfaces).

@nadako
Copy link
Collaborator

nadako commented Mar 20, 2019

Yeah, well one option is to generate new DynamicInterface(classInstance), where:

class DynamicInterface implements Interface { // actually it has to also implement HxObject itself
  public var knownField(get,set):Int;
   
  final __obj:HxObject;

  public function new(obj:HxObject) {
     this.__obj = obj;
  }

  function get_knownField():Int return __obj._hx_getField("knownField");
  function set_knownField(v:Int):Int return __obj._hx_setField("knownField", v);
}

Of course it's technically an extra allocation, but a) if you do that you deserve it and b) it will probably be inlined on stack by Java/JVM anyway.

Simn added a commit that referenced this issue Mar 20, 2019
No optimizations for known fields yet.

see #1
@Simn
Copy link
Owner Author

Simn commented Mar 20, 2019

I've added the dynamic version for now. Initialization is somewhat verbose at instruction-level:

		var obj = {
			a: null,
			b: 1,
			c: "foo"
		}
  public void testObjectDecl();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=7, locals=8, args_size=1
         0: new           #571                // class haxe/jvm/DynamicObject
         3: dup
         4: invokespecial #572                // Method haxe/jvm/DynamicObject."<init>":()V
         7: dup
         8: dup
         9: ldc_w         #576                // String a
        12: aconst_null
        13: invokevirtual #575                // Method haxe/jvm/DynamicObject._hx_setField:(Ljava/lang/String;Ljava/lang/Object;)V
        16: dup
        17: dup
        18: ldc_w         #577                // String b
        21: iconst_1
        22: invokestatic  #53                 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
        25: invokevirtual #575                // Method haxe/jvm/DynamicObject._hx_setField:(Ljava/lang/String;Ljava/lang/Object;)V
        28: dup
        29: dup
        30: ldc_w         #578                // String c
        33: ldc_w         #554                // String foo
        36: invokevirtual #575                // Method haxe/jvm/DynamicObject._hx_setField:(Ljava/lang/String;Ljava/lang/Object;)V

This is gonna be quite efficient though.

@Simn
Copy link
Owner Author

Simn commented Mar 20, 2019

The alternative would be allocating two native arrays, one for names and one for values. That way we would only have 1 set call instead of N, but we would allocate two native arrays and have N store instructions. Plus the set function would then have to iterate on these arrays, which isn't free either. Not sure if that's worth it or not.

@Simn
Copy link
Owner Author

Simn commented Mar 29, 2019

We can improve this in 3 stages:

Stage 1: Generate TObjectDecl as class instance

This should be uncontroversial: Instead of creating a DynamicObject and calling a bunch of _hx_setField on it, we create an (anonymous) data class and call its constructor. Given that dynamic field access already works on class instances, it's still going to work after this change.

We can try to re-use existing data classes for this, but we have to consider evaluation order: The order of the TObjectDecl fields has to be respected wenn calling the constructor.

Stage 2: Optimize FAnon access with a checkcast

If we have TField(e1,FAnon cf), we can generate the equivalent of (e1 instanceof AnonClassForE1 ? (e1 : AnonClassForE1).theField : (e1 : Dynamic).theField. This replaces the fully dynamic lookup with an instanceof + checkcast branch, which should be several orders of magnitude faster.

Stage 3: nadako-interfaces

I have to think more about this idea first. I'm a bit concerned about polluting class declarations with lots of possible interface relations to anonymous types. I would like to try + benchmark this at some point though.


I wonder if we should consider typedefs at all to identify anons types. Nothing really prevents us from generating a class for haxe.macro.Expr with the expr and pos fields. We can then find that type when we have a TAnon. It's always a bit annoying because we have to sort the fields and consider both names and types, but it can certainly be done.

This would mean that some truly anonymous types might get misidentified, but I don't think this is an issue as long as we don't assign any semantics to these classes. And even then I don't see what could go wrong with that.

Simn added a commit that referenced this issue Mar 29, 2019
Simn added a commit that referenced this issue Mar 29, 2019
@Simn
Copy link
Owner Author

Simn commented Mar 29, 2019

class Main {
	static public function main() {
		var obj = getObj();
		var stamp = haxe.Timer.stamp();
		var target = stamp + 2.0;
		var num = 0;
		while (haxe.Timer.stamp() < target) {
			++num;
			call(obj.obj.obj.obj);
		}
		trace(num);
	}

	static function call(d:Dynamic) { }

	static function getObj() {
		return { obj: { obj: { obj: { obj: 12 }}}};
	}
}

Merge_2019-03-29_22-30-08

Note that there aren't actually any temp vars, that's just how the decompiler displays this.

before:     6855521
after   : 287782212

@nadako
Copy link
Collaborator

nadako commented Mar 30, 2019

I wonder if we should consider typedefs at all to identify anons types.

That would be very nice for readability and native interop :)

@nadako
Copy link
Collaborator

nadako commented Mar 30, 2019

Oh, regarding

polluting class declarations with lots of possible interface relations to anonymous types

My idea was to detect unifications and only add implements when needed.

@Simn
Copy link
Owner Author

Simn commented Mar 30, 2019

My idea was to detect unifications and only add implements when needed.

But what about ((classInstance : Dynamic) : Interface)?

@nadako
Copy link
Collaborator

nadako commented Mar 30, 2019

I guess it would be fair to have a fully-dynamic wrapper implementation for this, like we talked before: #1 (comment)

@nadako
Copy link
Collaborator

nadako commented Mar 31, 2019

Actually, regarding the example code. I think we should have 2 anon classes generated for that nested obj structures: one with generic Object obj, and one with int obj. This shouldn't be too bad in regards of generated code, if we reduce all types to the JVM ones (so basically object and numbers, iirc), but should be nicer wrt boxing.

@Simn
Copy link
Owner Author

Simn commented Mar 31, 2019

Oh it does that, I just didn't .obj enough so it never reached the integer field:

    public static void main(String[] args) {
        Object obj = getObj();
        Object var10000 = obj instanceof Anon1 ? ((Anon1)obj).obj : Jvm.readField(obj, "obj");
        var10000 = var10000 instanceof Anon1 ? ((Anon1)var10000).obj : Jvm.readField(var10000, "obj");
        var10000 = var10000 instanceof Anon1 ? ((Anon1)var10000).obj : Jvm.readField(var10000, "obj");
        call(var10000 instanceof Anon2 ? ((Anon2)var10000).obj : Jvm.toInt(Jvm.readField(var10000, "obj")));
    }

Interestingly, that decompilation is missing the boxing on for the call call where the argument is Dynamic. It's in the bytecode though:

        69: instanceof    #25                 // class haxe/generated/Anon2
        72: ifeq          84
        75: checkcast     #25                 // class haxe/generated/Anon2
        78: getfield      #28                 // Field haxe/generated/Anon2.obj:I
        81: goto          92
        84: ldc           #17                 // String obj
        86: invokestatic  #23                 // Method haxe/jvm/Jvm.readField:(Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object;
        89: invokestatic  #32                 // Method haxe/jvm/Jvm.toInt:(Ljava/lang/Object;)I
        92: invokestatic  #38                 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
        95: invokestatic  #42                 // Method call:(Ljava/lang/Object;)V

It goes 72 -> 75 -> 78 -> 81 -> 92 -> 95, and 92 has the boxing.

This also makes me realize that the other path does unbox + box (89 + 92). Not sure if there's a good way to fix that though because we need a consistent stack map at the branch join (92 which is reached from 81 (where we have an int) and 89 (where we would have an Integer without the unboxing).

@Simn Simn mentioned this issue Mar 31, 2019
@Simn
Copy link
Owner Author

Simn commented Mar 31, 2019

Addressed the casting in #25:

        69: instanceof    #25                 // class haxe/generated/Anon2
        72: ifeq          87
        75: checkcast     #25                 // class haxe/generated/Anon2
        78: getfield      #28                 // Field haxe/generated/Anon2.obj:I
        81: invokestatic  #34                 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
        84: goto          92
        87: ldc           #17                 // String obj
        89: invokestatic  #23                 // Method haxe/jvm/Jvm.readField:(Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object;
        92: invokestatic  #38                 // Method call:(Ljava/lang/Object;)V

It now has the Integer.valueOf within the branch (81) and doesn't cast at all on the other path.

@Simn
Copy link
Owner Author

Simn commented Mar 31, 2019

And for completeness, this is the code when we expect Int instead of Dynamic on the call argument:

        69: instanceof    #25                 // class haxe/generated/Anon2
        72: ifeq          84
        75: checkcast     #25                 // class haxe/generated/Anon2
        78: getfield      #28                 // Field haxe/generated/Anon2.obj:I
        81: goto          92
        84: ldc           #17                 // String obj
        86: invokestatic  #23                 // Method haxe/jvm/Jvm.readField:(Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object;
        89: invokestatic  #32                 // Method haxe/jvm/Jvm.toInt:(Ljava/lang/Object;)I
        92: invokestatic  #36                 // Method call:(I)V

@nadako
Copy link
Collaborator

nadako commented Mar 31, 2019

awesome! so clean

@Simn
Copy link
Owner Author

Simn commented Mar 31, 2019

I'm still a bit concerned that the decompilers don't show that cast. IntelliJ just omits it whereas JD simply says /* Error */ on the entirety of main. x)

@Simn
Copy link
Owner Author

Simn commented Mar 31, 2019

I went ahead and did the typedef identification thing so we now get nice names:

public class Expr extends DynamicObject {
    public ExprDef expr;
    public Object pos;

    protected StringMap _hx_getKnownFields() {
        StringMap tmp = new StringMap();
        tmp.set("expr", this.expr);
        tmp.set("pos", this.pos);
        return tmp;
    }

    public Expr(ExprDef expr, Object pos) {
        this.expr = expr;
        this.pos = pos;
        super();
    }
}

As I said before, this would also pick up NotExpr if it has the same structure, but I think that's fair.

@nadako
Copy link
Collaborator

nadako commented Mar 31, 2019

nice, why this before super tho? doesn't that supposed to not work?

@Simn
Copy link
Owner Author

Simn commented Mar 31, 2019

Uhm, interesting, I didn't really notice that... Looks like the JVM is fine with this as long as we don't actually call something on this.

@nadako
Copy link
Collaborator

nadako commented Mar 31, 2019

I think we should not generate an empty Anon class for {} and just use DynamicObject directly.

Simn added a commit that referenced this issue Apr 1, 2019
@Simn
Copy link
Owner Author

Simn commented Apr 1, 2019

Indeed

@Simn
Copy link
Owner Author

Simn commented Mar 7, 2020

We now generate proper interfaces, so this is resolved.

@Simn Simn closed this as completed Mar 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants