New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dart2JS, DDC not consistent on canonicalizing types #35098

Closed
matanlurey opened this Issue Nov 8, 2018 · 18 comments

Comments

Projects
None yet
5 participants
@matanlurey
Contributor

matanlurey commented Nov 8, 2018

See the following code sample:

void main() {
  Type a = HasGenericT;
  Type b = typeOf<HasGenericT>();
  print('identical($a, $b) == ${identical(a, b)}');
  
  Type c = HasNoGenerics;
  Type d = typeOf<HasNoGenerics>();
  print('identical($c, $d) == ${identical(c, d)}');
}

class HasNoGenerics {}

class HasGenericT<T> {}

Type typeOf<T>() => T;

This prints true, false in DDC, and false, false in Dart2JS.

The AngularDart team is attempting to be able to change:

abstract class Injector {
  dynamic get(Type token);
}

... to ...

abstract class Injector {
  T provide<T>();
}

... but this is a blocker, because it does not work in Dart2JS at all, and part of the time in DDC.

Please advise - what should we do here?

/cc @vsmenon @jmesserly @sigmundch

@jmesserly

This comment has been minimized.

Member

jmesserly commented Nov 8, 2018

Hmm, I expected dartdevc to print true, true. We must have a bug there. (it seems to work, see below)

In general: if we canonicalize all generic types (including those instantiated at runtime), the type canonicalization table will grow without limit (i.e. leak memory). This is currently the case in DDC for generic types.

Since your example only requires canonicalizing HasGenericT<dynamic>, that could presumably be done at compile time in dart2js.

@jmesserly

This comment has been minimized.

Member

jmesserly commented Nov 8, 2018

CC @sigmundch @rakudrama

BTW, the reason Angular is using identical here is that == does not perform well enough for change detection. Wanted to mention that, because I asked that question in a previous bug about function types.

@jmesserly

This comment has been minimized.

Member

jmesserly commented Nov 8, 2018

@matanlurey -- I am getting true, true with DDC.

@matanlurey

This comment has been minimized.

Contributor

matanlurey commented Nov 8, 2018

Weird @jmesserly! What about with a:

class HasGenericTWithBounds<T extends num> {}

... if that works too, I might have been mistaken and its only impacting Dart2JS.

@jmesserly

This comment has been minimized.

Member

jmesserly commented Nov 8, 2018

Yup that works too. I don't think DDC records class type bounds (not needed at runtime). DDC depends pretty heavily on generic canonicalization currently (for better or worse). And we cache the Type wrapper class too (so that's why non-generics work).

@matanlurey

This comment has been minimized.

Contributor

matanlurey commented Nov 8, 2018

Then this must be limited to Dart2JS. Removing DDC.

(Though I'm guessing both of you have to agree on a solution?)

@jmesserly

This comment has been minimized.

Member

jmesserly commented Nov 8, 2018

Switching DDC to false wouldn't help you, presumably? You need dart2js to implement compile time canonicalization.

@matanlurey

This comment has been minimized.

Contributor

matanlurey commented Nov 8, 2018

Agreed

@lrhn

This comment has been minimized.

Member

lrhn commented Nov 8, 2018

The only Type objects that the specification promises to canonicalize are the values of constant type literals. There is no promise, and deliberately so, to canonicalize any other type object, including type objects obtained from evaluating type variables as expressions. Not even if the object containing the type is constant.
The reason to not promise that is exactly that it would require a canonicalization table that can grow unboundedly.

We could change that, and require that constant types used as type arguments get reified as canonicalized Type objects. It would still be limited by the actual source code positions of such arguments. Any type that contains a type variable may introduce a new type per evaluation, but types that are constant (no type variable) only ever give rise to one type, which we can canonicalize in some way.

Example:

T type<T>() => T;
main() {
  print(identical(int, type<int>()));  // true
  print(identical(type<List<int>>(), type<List<int>>()));  // true
  withTypeVar<X>() {
    print(identical(int, X));  // true
    print(identical(type<List<int>>(), type<List<X>>()));  // false (potentially)
  }<int>();  
}

In the last example, the type argument to type is List<X> which is not constant.
The List<X> type may be created anew for each evaluation (we would not prohibit canonicalization, because inlining may make it possible, and we wouldn't want to force implementation to allocate more objects than necessary).

It would mean for the spec to link type arguments to Type objects much more strongly than it does now (type arguments are bound to types, not reified Type objects, and we would have to somehow say that the Type object created from a type variable must retain some knowledge of its originating type expression).
(If we ever get a way to deconstruct types, then extracting the T from a List<T> should not have to keep that information).

As the spec is currently written, and intended, the current behavior of dartjs is exactly as intended.

@eernstg

This comment has been minimized.

Member

eernstg commented Nov 8, 2018

tl;dr Don't trust identical for types rd;lt

The only Type objects that the specification promises to canonicalize
are the values of constant type literals
[...]
the current behavior of dartjs is exactly as intended.

This is a little bit tricky. In section 'Const' it is specified that constant object expressions (e.g., const C(42)) must evaluate to the same instance when they have identical values for every instance variable. We have separate verbiage for list literals, map literals, symbols, and strings; and identical for booleans and numeric types is specified in section 'Object Identity'; and that's basically what the spec has to say about canonicalization.

However, we have no spec'ed guarantees about where instances of Type come from, when we obtain them as the result of an evaluation of expressions like int or (in the body of type) X, in a situation where the corresponding actual type argument was int, even though we (surely) all agree that they should just be "the int object".

Nevertheless, I can't see any support for claiming that an implementation is incorrect if it makes the choice to use a private platform specific class _Type implementing Type with a constructor taking some additional arguments. So when we evaluate int, we might get the result of something like const _Type(#core, #int, 42) or const _Type(#core, #int, 389), depending on which syntactic occurrence of int we are talking about. That extra number is not useful, but nobody says that an implementation can't put those extra numbers into the Type instances, just so they won't be field-by-field identical, and hence they won't need to be canonicalized across syntactic occurrences.

In that case we would evaluate identical(int, int) and get false.

We obviously don't want that, but we'd need to specify identical for Type instances separately, in order to be able to claim that it's a bug for identical to return false in that situation, or we'd need to say more about how such instances are obtained when we evaluate a type literal.

Assuming that we establish the foundation for claiming that identical(int, int) must evaluate to true, we still have some tricky cases:

typedef F1 = void Function(List<int>);
typedef F2 = void Function(List<int>);

main() => print(identical(F1, F2));

We do specify (section 'Dynamic Type System') that operator == on Type must return true when comparing two instances obtained from evaluation of expressions denoting types, iff they denote the same type. So int == int must evaluate to true, and so must X == Y in class C<X, Y> {} in an instance of C<List<int>, List<int>>, and int == String must evaluate to false; moreover, for the example above, F1 == F2 must evaluate to true.

But do we want to insist that identical(F1, F2) also evaluates to true, just because F1 and F2 are constant expressions? How do we know that the two occurrences of List<int> are canonicalized, and that they are the value of some fields in the Type for F1 resp. F2?

I suspect that we wouldn't have to force this kind of canonicalization, we might as well make sure that all comparisons of types use ==, and then it's OK for an implementation to canonicalize Type instances wherever that can be done conveniently (and it would even be OK for a compiler to replace T1 == T2 by identical(T1, T2) whenever that's guaranteed to give the same result), but developers would get used to use == everywhere, which is good because it will work as intended also in the case where there is no way we could guarantee canonicalization.

If we take this approach then we have the following situation:

  • For instances of Type, comparisons using identical yield a safe "yes", but when identical(T1, T2) is false it may still be true that T1 == T2, and they may still be the Type instances for the same type (so we never get a safe "no" from identical).
  • There are no exceptions; for instance, not even identical(int, int) is guaranteed to yield true.
  • In terms of this issue, the behaviors seen with the tools are then all correct, because there is no guarantee about canonicalization of Type instances.

If we want stronger guarantees then we'll need to consider how to constrain implementations a bit more in the language specification, but as you can see above it won't extend to all Type instances denoting any particular type, because the Type for List<int> could be obtained in a situation where the actual type argument is obtained in a non-trivial computation (such that we go outside the domain of decidable problems if we try to find the value of that type argument at compile-time).

@eernstg

This comment has been minimized.

Member

eernstg commented Nov 8, 2018

OK, I argued that we don't have much support for requiring identical(T1, T2) when T1 and T2 are Type instances for the same type, and it's not trivial to decide exactly which Type instances can be expected to be identical if we decide that we do want to specify that (and we will necessarily stop at some point, so there will be cases where identical(T1, T2) is false even though T1 and T2 are Type instances for the same type, unless we special case identical or canonicalize at run time).

But in the concrete setting, for the AngularDart team, would it work to use ==?

@matanlurey

This comment has been minimized.

Contributor

matanlurey commented Nov 8, 2018

A few notes:

  • We use identical because == is too slow on the critical path.
  • All of the types (T) provided to typeOf<T> are statically known.
@rakudrama

This comment has been minimized.

Member

rakudrama commented Nov 8, 2018

"Please advise - what should we do here?"

Keep using your current dynamic code.

Full and efficient canonicalization of reified types is a huge project. If we decide to take dart2js in that direction, it won't land for many months, even if high priority.

  • We could make the HasNoGenerics case work for types that are used as constant literals with some cost in startup and a small cost converting from type parameter to Type object (reification).

  • We could make the HasGenericT case work for types that are used as constant literals with slightly higher cost in both startup and reification. It would work for typeOf<List>() but not typeOf<List<int>>(), but only if List is used as a value, e.g. register(List, ...) and not register<List>(...).

The problem with the above two cases is that you will be tempted to move all types to type parameters, and then it won't work because the types that are used as constant literals is what constrains the problem.

  • Anything outside the above is considerably more expensive. Handling types that are not used as type literals requires a bigger startup hit and some code bloat to register/describe the type. typeOf becomes more expensive too, in effect pushing the cost of == to creating the Type object in order to avoid it later.

  • We can avoid the canonicalization during reification if the internal representation is canonicalized. This is what will take months to implement, since it changes the type representation throughout the runtime and compiler.

@matanlurey

This comment has been minimized.

Contributor

matanlurey commented Nov 8, 2018

Thanks @rakudrama.

We are likely to instead change our API to:

T provide<T extends Object /* Avoids instantiate to bounds of dynamic */ >(Type token) {
  // ...
}

Does that sound reasonable?

@eernstg

This comment has been minimized.

Member

eernstg commented Nov 9, 2018

We use identical because == is too slow on the critical path.

@lrhn, can't we make changes like this one in a couple of core libraries? Change

  // In class TypeImpl in js_rti.dart.
  bool operator ==(other) {
    return (other is TypeImpl) && _typeName == other._typeName;
  }

to

  bool operator ==(other) {
    return identical(this, other) ||
        ((other is TypeImpl) && _typeName == other._typeName);
  }

and, @matanlurey, do you think that would still be too slow?

@rakudrama

This comment has been minimized.

Member

rakudrama commented Nov 10, 2018

@matanlurey Provided token always comes from a type literal (i.e. did not pass through a type parameter) it should work with today's compiler and runtime.

@eernstg Simply guarding with identical would still be too slow because some tests in normal execution return false. I think we might be able fast-exit on some false cases without generating the structural description of the type in ._typeName (it is not a field). This would still be much slower than identical.

@matanlurey

This comment has been minimized.

Contributor

matanlurey commented Nov 11, 2018

Thanks! Given #35098 (comment), I think we can close this issue as WAI. While I'm a little worried about DDC and Dart2JS having different implementations of types <T> re: Identity, I don't know how much it will affect real code (most users use ==, and it's OK for frameworks to understand the difference).

@eernstg

This comment has been minimized.

Member

eernstg commented Nov 12, 2018

In response to the discussion above, I created the issues dart-lang/language#92 and dart-lang/language#95 to describe the current situation and a possible language design that would help giving developers a more precise answer as to when a given instance of Type can be expected to be canonicalized.

With that we would no longer rely on pure luck when identical(t1, t2) is used in an attempt to determine whether the reified types t1 and t2 are "the same type". ;-)

I can see that DDC and dart2js and the virtual machine do not agree on when to canonicalize reified types, so the fact that this issue has been closed as 'working as intended' must mean that we now recognize that identical(t1, t2) can not be trusted to have the same value as t1 == t2, no matter whether t1 and t2 are "constant types".

@matanlurey, was that also the conclusion that you wanted to state when you closed this issue?

Otherwise, dart-lang/language#92 and dart-lang/language#95 would be a starting point for establishing a certain amount of support for using identical(t1, t2) in some special cases, knowing that it actually answers the question "is t1 and t2 the same type?".

alorenzen added a commit to dart-lang/angular that referenced this issue Nov 15, 2018

refactor(Core): Rewrite ".provide<T>" as ".provideType<T>(Type)".
See dart-lang/sdk#35098 for background.

PiperOrigin-RevId: 221141562

alorenzen added a commit to dart-lang/angular that referenced this issue Nov 15, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment