TObjectDecl #1

Simn · 2019-03-19T20:49:08Z

We need a good way to generate anonymous objects.

Simn · 2019-03-20T08:10:13Z

One thing to consider here is Reflect.deleteField. 💩

nadako · 2019-03-20T08:53:52Z

I once wrote this https://gist.github.com/nadako/f01e6837d5508e922f0ef7287f0520bf. Not sure how well it applies to JVM, but I assume nicolas does something like this in HL too?

Simn · 2019-03-20T09:27:50Z

The problem with all these optimization ideas is that this is specified to work:

class Main {
	static public function main() {
		var td = {
			a: null
		}
		trace(Reflect.deleteField(td, "a")); // true
		trace(Reflect.deleteField(td, "a")); // false
		trace(Reflect.hasField(td, "a")); // false
		td.a = null;
		trace(Reflect.hasField(td, "a")); // true
		trace(Reflect.deleteField(td, "a")); // true
		trace(Reflect.deleteField(td, "a")); // false
	}
}

Note in particular how we go from hasField = false to hasField = true after setting the value again.
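For reference, the behaviour specified above is what a purely map-backed object gives for free. The following is a minimal Java sketch under that assumption; `MapBackedObject` and its method names are hypothetical and are not the actual runtime classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a fully dynamic anonymous object.
final class MapBackedObject {
    private final Map<String, Object> fields = new LinkedHashMap<>();

    // A write always (re)creates the entry, even when the value is null.
    void setField(String name, Object value) { fields.put(name, value); }

    Object getField(String name) { return fields.get(name); }

    // Presence is tracked by key, so a field holding null still exists.
    boolean hasField(String name) { return fields.containsKey(name); }

    // Returns true only if the field existed before the call.
    boolean deleteField(String name) {
        if (!fields.containsKey(name)) return false;
        fields.remove(name);
        return true;
    }

    public static void main(String[] args) {
        MapBackedObject td = new MapBackedObject();
        td.setField("a", null);
        System.out.println(td.deleteField("a")); // true
        System.out.println(td.deleteField("a")); // false
        System.out.println(td.hasField("a"));    // false
        td.setField("a", null);
        System.out.println(td.hasField("a"));    // true
    }
}
```

The difficulty discussed in this thread only appears once `a` becomes a real JVM field, because a plain field write no longer passes through anything that could restore the entry.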

nadako · 2019-03-20T09:31:53Z

I think it makes sense to unspecify deleteField for non-optional fields, but yeah we need to keep some "deleted" flag anyway...

Simn · 2019-03-20T09:32:42Z

To clarify: We'll have to do some bookkeeping anyway in order to support Reflect.fields, so you might get the idea that Reflect.deleteField could simply remove the field entry from whatever structure we maintain for that. That's correct, but the question then becomes who sets the entry back if a new value is assigned? This could only be done on every write-access, which is extremely stupid and subverts the entire optimization.

nadako · 2019-03-20T09:40:58Z

subverts the entire optimization.

That's debatable. In current C#/Java targets, anon objects are made of arrays with binary-search lookup every time. I'm pretty sure that having a proper structure with a setter that sets some bool/bit flag will be faster.

Simn · 2019-03-20T09:46:18Z

I think this is also specified:

class Main {
	static public function main() {
		var td:Dynamic = {};
		td.a = null;
		trace(Reflect.hasField(td, "a")); // true
	}
}

So it's necessary to have a storage for additional fields on these objects anyway. I think the best we can hope for is a lookup-based general implementation with a fast-path optimization for statically known fields. But even with these we have to be careful because of this deleteField crap...

nadako · 2019-03-20T09:47:53Z

Yep, this is actually very similar to implements Dynamic and how it was done in gencommon (the dynamic "trait" was added to normal class fields).

Simn · 2019-03-20T10:37:21Z

Here's a Haxe implementation of what I think could work: https://gist.github.com/Simn/f93d4945bcf7991bada5b85b54250565

the only overhead for known fields here is that if (_hx_deletedAField) check in the setter
the reflection map (_hx_fields) is only created if we need it, i.e. if we call one of the _hx_ functions on the object.
unknown fields are looked up in a string map
we can omit the entire _hx_deletedAField part (and thus also the setters) if Reflect.deleteField is not part of the compilation

What do you think?
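The points listed above can be sketched roughly as follows. This is a simplified Java illustration, not the code from the gist: the class name, the `deletedKnown` set and the `extra` map are assumptions made for the example, and the lazily created structures stand in for the reflection map.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names are illustrative, not the generated runtime API.
final class KnownFieldObject {
    private Object knownField;         // statically known field, stored as a real field
    private boolean deletedAField;     // becomes true once a known field has been deleted
    private Set<String> deletedKnown;  // created lazily, only when a known field is deleted
    private Map<String, Object> extra; // created lazily, only when an unknown field is added

    KnownFieldObject(Object knownField) { this.knownField = knownField; }

    Object getKnownField() { return knownField; }

    // Typed write path: the flag check is the only overhead until something was deleted.
    void setKnownField(Object value) {
        knownField = value;
        if (deletedAField) deletedKnown.remove("knownField");
    }

    boolean hasField(String name) {
        if (name.equals("knownField")) return !(deletedAField && deletedKnown.contains(name));
        return extra != null && extra.containsKey(name);
    }

    void setField(String name, Object value) {
        if (name.equals("knownField")) { setKnownField(value); return; }
        if (extra == null) extra = new HashMap<>();
        extra.put(name, value);
    }

    boolean deleteField(String name) {
        if (name.equals("knownField")) {
            if (deletedAField && deletedKnown.contains(name)) return false;
            if (deletedKnown == null) deletedKnown = new HashSet<>();
            deletedKnown.add(name);
            deletedAField = true;
            knownField = null;
            return true;
        }
        if (extra == null || !extra.containsKey(name)) return false;
        extra.remove(name);
        return true;
    }
}
```

If `deleteField` is never compiled in, the flag, the set and the check in `setKnownField` could all be dropped, which is the last point in the list.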

nadako · 2019-03-20T11:28:18Z

This looks a lot like modern JS engines with their hidden classes that switch to the dictionary mode when delete o[key] is used :)

I think it would be nice to avoid string map for getting/setting known fields. Even if there are ways to avoid it, I'm pretty sure Reflect.field is super-common (I think 99% of templating and scripting engines use that).

We could generate a switch over known field names for _hx_getField/_hx_setField and only fall back to _hx_fields if the field was added at run-time. Although I'm not really sure that switch will be faster than a map lookup :)
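A rough Java sketch of that switch idea, assuming two known fields `x` and `label` and a hypothetical `extra` map for fields added at run-time (none of these names come from the gist):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical generated class with two statically known fields.
final class SwitchLookupObject {
    int x;
    String label;
    private Map<String, Object> extra; // only allocated for fields added at run-time

    SwitchLookupObject(int x, String label) {
        this.x = x;
        this.label = label;
    }

    Object getField(String name) {
        switch (name) {
            case "x": return x;          // boxed to Integer on the way out
            case "label": return label;
            default: return extra == null ? null : extra.get(name);
        }
    }

    void setField(String name, Object value) {
        switch (name) {
            case "x": x = (Integer) value; break;
            case "label": label = (String) value; break;
            default:
                if (extra == null) extra = new HashMap<>();
                extra.put(name, value);
        }
    }
}
```

Note that javac compiles a string switch into a hashCode-based jump followed by equals checks, so whether this beats a map lookup is something that would have to be benchmarked.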

nadako · 2019-03-20T11:30:51Z

Also, here you meant "knownField" => this.knownField, right?

	override function _hx_getKnownFields() {
		return ["knownField" => null];
	}

Also, why do we need to do a copy in _hx_initReflection?

Simn · 2019-03-20T11:33:02Z

Also, here you meant "knownField" => this.knownField, right?

Yes

Also, why do we need to do a copy in _hx_initReflection?

We don't. I originally had the map as a static var and forgot to remove the copy().

Simn · 2019-03-20T12:38:23Z

I wonder about this from your design:

Second, for every unification of a class with a structure, we make that class implement the corresponding interface generated for that structure.

That's probably not gonna be enough because we could assign the class to the interface with something in between, like Dynamic. ((classInstance : Dynamic) : Interface) would not be detected correctly.

Another thing to consider is type parameter constraints.

nadako · 2019-03-20T12:48:14Z

((classInstance : Dynamic) : Interface)

I guess this will require some DynamicImplForInterface that goes fully-dynamic (basically implementing get/set for Interface by using _hx_getField). The only thing that we can maybe do here is to cache that wrapper, although I'm not sure it's even worth it.

Simn · 2019-03-20T12:50:05Z

I'm not worried about field access because we're gonna have a slow route for that anyway. My concern is the run-time assignment because we would have to make sure that classInstance can be assigned to Interface (bad name, I meant one of these anon interfaces).

nadako · 2019-03-20T13:02:31Z

Yeah, well one option is to generate new DynamicInterface(classInstance), where:

class DynamicInterface implements Interface { // actually it has to also implement HxObject itself
	public var knownField(get, set):Int;

	final __obj:HxObject;

	public function new(obj:HxObject) {
		this.__obj = obj;
	}

	function get_knownField():Int return __obj._hx_getField("knownField");

	function set_knownField(v:Int):Int {
		// Return the assigned value explicitly so this does not depend on what _hx_setField returns.
		__obj._hx_setField("knownField", v);
		return v;
	}
}

Of course it's technically an extra allocation, but a) if you do that you deserve it and b) it will probably be inlined on stack by Java/JVM anyway.

Simn · 2019-03-20T14:01:39Z

I've added the dynamic version for now. Initialization is somewhat verbose at instruction-level:

		var obj = {
			a: null,
			b: 1,
			c: "foo"
		}

  public void testObjectDecl();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=7, locals=8, args_size=1
         0: new           #571                // class haxe/jvm/DynamicObject
         3: dup
         4: invokespecial #572                // Method haxe/jvm/DynamicObject."<init>":()V
         7: dup
         8: dup
         9: ldc_w         #576                // String a
        12: aconst_null
        13: invokevirtual #575                // Method haxe/jvm/DynamicObject._hx_setField:(Ljava/lang/String;Ljava/lang/Object;)V
        16: dup
        17: dup
        18: ldc_w         #577                // String b
        21: iconst_1
        22: invokestatic  #53                 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
        25: invokevirtual #575                // Method haxe/jvm/DynamicObject._hx_setField:(Ljava/lang/String;Ljava/lang/Object;)V
        28: dup
        29: dup
        30: ldc_w         #578                // String c
        33: ldc_w         #554                // String foo
        36: invokevirtual #575                // Method haxe/jvm/DynamicObject._hx_setField:(Ljava/lang/String;Ljava/lang/Object;)V

This is gonna be quite efficient though.

Simn · 2019-03-20T14:12:44Z

The alternative would be allocating two native arrays, one for names and one for values. That way we would only have 1 set call instead of N, but we would allocate two native arrays and have N store instructions. Plus the set function would then have to iterate on these arrays, which isn't free either. Not sure if that's worth it or not.
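To make the trade-off concrete, here is a small Java sketch of both initialization styles. `BulkInitObject` and `setFields` are hypothetical names used only for this comparison.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical dynamic object offering both a per-field and a bulk setter.
final class BulkInitObject {
    private final Map<String, Object> fields = new LinkedHashMap<>();

    void setField(String name, Object value) { fields.put(name, value); }

    Object getField(String name) { return fields.get(name); }

    // One call, but the callee has to walk both arrays.
    void setFields(String[] names, Object[] values) {
        for (int i = 0; i < names.length; i++) fields.put(names[i], values[i]);
    }

    // Current approach: N separate calls, no array allocation.
    static BulkInitObject perField() {
        BulkInitObject o = new BulkInitObject();
        o.setField("a", null);
        o.setField("b", 1);
        o.setField("c", "foo");
        return o;
    }

    // Alternative: two array allocations plus N stores, then a single call.
    static BulkInitObject bulk() {
        BulkInitObject o = new BulkInitObject();
        o.setFields(new String[] {"a", "b", "c"}, new Object[] {null, 1, "foo"});
        return o;
    }
}
```

Both variants end up with the same contents; the difference is only in how many instructions and allocations the call site needs.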

Simn · 2019-03-29T14:47:15Z

We can improve this in 3 stages:

Stage 1: Generate TObjectDecl as class instance

This should be uncontroversial: Instead of creating a DynamicObject and calling a bunch of _hx_setField on it, we create an (anonymous) data class and call its constructor. Given that dynamic field access already works on class instances, it's still going to work after this change.

We can try to re-use existing data classes for this, but we have to consider evaluation order: The order of the TObjectDecl fields has to be respected when calling the constructor.
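The evaluation-order concern can be illustrated with a small Java sketch. It assumes a re-used data class `AnonAB` whose constructor takes `(a, b)` while the source declares `{ b: ..., a: ... }`; all names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical re-used data class; its constructor order is (a, b).
final class AnonAB {
    final int a;
    final int b;

    AnonAB(int a, int b) {
        this.a = a;
        this.b = b;
    }
}

final class EvalOrderDemo {
    static final List<String> log = new ArrayList<>();

    // Stands in for a field initializer with a side effect.
    static int eval(String name, int value) {
        log.add(name);
        return value;
    }

    // Source order is { b: eval("b", 2), a: eval("a", 1) }.
    // The initializers run in source order into temporaries,
    // and only then are passed in constructor order.
    static AnonAB create() {
        int tmpB = eval("b", 2);
        int tmpA = eval("a", 1);
        return new AnonAB(tmpA, tmpB);
    }
}
```

Writing `new AnonAB(eval("a", 1), eval("b", 2))` directly would evaluate `a` first and change observable behaviour, which is why re-using a class with a different field order needs the temporaries.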

Stage 2: Optimize FAnon access with a checkcast

If we have TField(e1,FAnon cf), we can generate the equivalent of (e1 instanceof AnonClassForE1 ? (e1 : AnonClassForE1).theField : (e1 : Dynamic).theField). This replaces the fully dynamic lookup with an instanceof + checkcast branch, which should be several orders of magnitude faster.

Stage 3: nadako-interfaces

I have to think more about this idea first. I'm a bit concerned about polluting class declarations with lots of possible interface relations to anonymous types. I would like to try + benchmark this at some point though.

I wonder if we should consider typedefs at all to identify anon types. Nothing really prevents us from generating a class for haxe.macro.Expr with the expr and pos fields. We can then find that type when we have a TAnon. It's always a bit annoying because we have to sort the fields and consider both names and types, but it can certainly be done.

This would mean that some truly anonymous types might get misidentified, but I don't think this is an issue as long as we don't assign any semantics to these classes. And even then I don't see what could go wrong with that.
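A minimal Java sketch of the "sort the fields and consider both names and types" lookup, assuming fields are given as a name-to-type-name map and that a simple string key is good enough (both are assumptions for illustration, not how the compiler represents TAnon):

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper that derives a structural key for an anonymous type.
final class AnonShape {
    // Sorting by field name makes the key independent of declaration order.
    static String key(Map<String, String> fieldTypes) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(fieldTypes).entrySet()) {
            sb.append(e.getKey()).append(':').append(e.getValue()).append(';');
        }
        return sb.toString();
    }
}
```

Two structures with the same field names and types produce the same key regardless of which typedef (if any) they came from, which is exactly the misidentification mentioned above.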

Simn · 2019-03-29T21:35:41Z

class Main {
	static public function main() {
		var obj = getObj();
		var stamp = haxe.Timer.stamp();
		var target = stamp + 2.0;
		var num = 0;
		while (haxe.Timer.stamp() < target) {
			++num;
			call(obj.obj.obj.obj);
		}
		trace(num);
	}

	static function call(d:Dynamic) { }

	static function getObj() {
		return { obj: { obj: { obj: { obj: 12 }}}};
	}
}

Note that there aren't actually any temp vars, that's just how the decompiler displays this.

before:   6855521
after:  287782212

nadako · 2019-03-30T07:03:41Z

I wonder if we should consider typedefs at all to identify anon types.

That would be very nice for readability and native interop :)

nadako · 2019-03-30T07:08:15Z

Oh, regarding

polluting class declarations with lots of possible interface relations to anonymous types

My idea was to detect unifications and only add implements when needed.

Simn · 2019-03-30T07:26:37Z

My idea was to detect unifications and only add implements when needed.

But what about ((classInstance : Dynamic) : Interface)?

nadako · 2019-03-30T07:45:38Z

I guess it would be fair to have a fully-dynamic wrapper implementation for this, like we talked about before: #1 (comment)

nadako · 2019-03-31T06:50:15Z

Actually, regarding the example code: I think we should have 2 anon classes generated for those nested obj structures, one with generic Object obj and one with int obj. This shouldn't be too bad in terms of generated code if we reduce all types to the JVM ones (so basically object and numbers, iirc), but should be nicer wrt boxing.

Simn · 2019-03-31T07:06:38Z

Oh it does that, I just didn't .obj enough so it never reached the integer field:

    public static void main(String[] args) {
        Object obj = getObj();
        Object var10000 = obj instanceof Anon1 ? ((Anon1)obj).obj : Jvm.readField(obj, "obj");
        var10000 = var10000 instanceof Anon1 ? ((Anon1)var10000).obj : Jvm.readField(var10000, "obj");
        var10000 = var10000 instanceof Anon1 ? ((Anon1)var10000).obj : Jvm.readField(var10000, "obj");
        call(var10000 instanceof Anon2 ? ((Anon2)var10000).obj : Jvm.toInt(Jvm.readField(var10000, "obj")));
    }

Interestingly, that decompilation is missing the boxing for the call where the argument is Dynamic. It's in the bytecode though:

        69: instanceof    #25                 // class haxe/generated/Anon2
        72: ifeq          84
        75: checkcast     #25                 // class haxe/generated/Anon2
        78: getfield      #28                 // Field haxe/generated/Anon2.obj:I
        81: goto          92
        84: ldc           #17                 // String obj
        86: invokestatic  #23                 // Method haxe/jvm/Jvm.readField:(Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object;
        89: invokestatic  #32                 // Method haxe/jvm/Jvm.toInt:(Ljava/lang/Object;)I
        92: invokestatic  #38                 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
        95: invokestatic  #42                 // Method call:(Ljava/lang/Object;)V

It goes 72 -> 75 -> 78 -> 81 -> 92 -> 95, and 92 has the boxing.

This also makes me realize that the other path does unbox + box (89 + 92). Not sure if there's a good way to fix that though because we need a consistent stack map at the branch join: 92 is reached from 81 (where we have an int) and from 89 (where we would have an Integer without the unboxing).

Simn · 2019-03-31T07:35:08Z

Addressed the casting in #25:

        69: instanceof    #25                 // class haxe/generated/Anon2
        72: ifeq          87
        75: checkcast     #25                 // class haxe/generated/Anon2
        78: getfield      #28                 // Field haxe/generated/Anon2.obj:I
        81: invokestatic  #34                 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
        84: goto          92
        87: ldc           #17                 // String obj
        89: invokestatic  #23                 // Method haxe/jvm/Jvm.readField:(Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object;
        92: invokestatic  #38                 // Method call:(Ljava/lang/Object;)V

It now has the Integer.valueOf within the branch (81) and doesn't cast at all on the other path.

Simn · 2019-03-31T07:38:35Z

And for completeness, this is the code when we expect Int instead of Dynamic on the call argument:

        69: instanceof    #25                 // class haxe/generated/Anon2
        72: ifeq          84
        75: checkcast     #25                 // class haxe/generated/Anon2
        78: getfield      #28                 // Field haxe/generated/Anon2.obj:I
        81: goto          92
        84: ldc           #17                 // String obj
        86: invokestatic  #23                 // Method haxe/jvm/Jvm.readField:(Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object;
        89: invokestatic  #32                 // Method haxe/jvm/Jvm.toInt:(Ljava/lang/Object;)I
        92: invokestatic  #36                 // Method call:(I)V

nadako · 2019-03-31T07:39:40Z

awesome! so clean

Simn · 2019-03-31T07:46:31Z

I'm still a bit concerned that the decompilers don't show that cast. IntelliJ just omits it whereas JD simply says /* Error */ on the entirety of main. x)

Simn · 2019-03-31T08:18:04Z

I went ahead and did the typedef identification thing so we now get nice names:

public class Expr extends DynamicObject {
    public ExprDef expr;
    public Object pos;

    protected StringMap _hx_getKnownFields() {
        StringMap tmp = new StringMap();
        tmp.set("expr", this.expr);
        tmp.set("pos", this.pos);
        return tmp;
    }

    public Expr(ExprDef expr, Object pos) {
        this.expr = expr;
        this.pos = pos;
        super();
    }
}

As I said before, this would also pick up NotExpr if it has the same structure, but I think that's fair.

nadako · 2019-03-31T10:09:13Z

nice, why this before super tho? isn't that supposed to not work?

Simn · 2019-03-31T10:27:51Z

Uhm, interesting, I didn't really notice that... Looks like the JVM is fine with this as long as we don't actually call something on this.

nadako · 2019-03-31T22:48:40Z

I think we should not generate an empty Anon class for {} and just use DynamicObject directly.

Simn · 2019-04-01T06:21:52Z

Indeed

Simn · 2020-03-07T08:56:58Z

We now generate proper interfaces, so this is resolved.

Simn added a commit that referenced this issue Mar 20, 2019

add initial support for TObjectDecl

84d1376

No optimizations for known fields yet. see #1

Simn added a commit that referenced this issue Mar 29, 2019

implement stage 1 of the object revolution

08ded6c

see #1

Simn added a commit that referenced this issue Mar 29, 2019

stage 2 is actually super easy

c59fea5

see #1

Simn mentioned this issue Mar 31, 2019

Top-down-casting #25

Closed

Simn added a commit that referenced this issue Mar 31, 2019

call super() before initializing fields on anon classes

cd4f8f6

see #1

Simn added a commit that referenced this issue Apr 1, 2019

don't generate Anon class for empty fields

b134caa

see #1

Simn closed this as completed Mar 7, 2020
