Allow late final fields on const classes #2225

Open
gaaclarke opened this issue May 4, 2022 · 14 comments
Labels
feature Proposed language feature that solves one or more problems

Comments

@gaaclarke

gaaclarke commented May 4, 2022

I don't believe there is a practical reason to disallow late final fields on const classes. It was probably disallowed to make the implementation easier (possibly to avoid thread synchronization?). I ran into this limitation when trying to memoize an expensive hashCode on a const class. I think the only alternative I have for memoizing the hashCode is to calculate it up front which unfortunately will have a different performance characteristic than the original code.

code

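class Foo {
  const Foo(this.x, this.y);
  // Fields of a class with a const constructor must be final.
  final int x;
  final int y;
  int get xPrime => expensiveFunction(x);
  @override
  int get hashCode => _hashCode;
  // This is the line that is rejected today: a class with a const
  // constructor cannot declare a late final field.
  late final int _hashCode = hashValues(xPrime, y);
}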

related issue: dart-lang/sdk#48948

cc @lrhn

@dnfield

dnfield commented May 4, 2022

How would a compiler encode a constant object with an uninitialized field?

late final is really pretty different from final. The runtime would have to visit const objects and ... fill in fields later?

@gaaclarke
Author

late final is really pretty different from final. The runtime would have to visit const objects and ... fill in fields later?

That's correct. It's still const from the user's perspective: it will never change. In terms of VM implementation details, it will change once, when the late value is calculated.

@lrhn
Member

lrhn commented May 5, 2022

The way Dart constants are currently defined, constant evaluation doesn't have to happen at compile-time.
The semantics say that a const Foo(...) is evaluated normally, and then, if another Foo exists which was also created by a constant expression, and which has the exact same state, then the later evaluation canonicalizes to the first value instead.

Constant expressions have three properties:

  • Compile-time evaluation allowed
  • Equal objects canonicalized
  • Deep immutability

The specified semantics are such that you can't tell whether an object was created at compile-time, or created at run-time and then canonicalized by the runtime. Early versions of the Dart dev-compiler did just that, and it wasn't wrong.
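
As a small illustration of the canonicalization property described above, here is a minimal sketch. The Point class is a hypothetical example, not something from this thread; it only shows that two constant expressions with the same state yield the same object, while a non-const invocation of the same constructor does not.

```dart
// Hypothetical class used only to illustrate canonicalization.
class Point {
  const Point(this.x, this.y);
  final int x;
  final int y;
}

void main() {
  const a = Point(1, 2);
  const b = Point(1, 2);
  // Both constant expressions canonicalize to one object.
  print(identical(a, b)); // true

  // A non-const invocation always allocates a fresh object.
  final c = Point(1, 2);
  print(identical(a, c)); // false
}
```

The program cannot observe whether `a` and `b` were merged by the compiler or by the runtime, which is the point being made.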

(Well, not entirely true, since we require throwing operations to be compile-time errors, and we no longer support compile-time errors at runtime, like we did in Dart 1. The specification doesn't preclude it, though.)

We don't have a notion of "potentially constant expression" which we can apply to the initializer expression, so either it must be just plain constant (in which case there is no need for late), or it can be any expression, which includes referring to non-constant variables and having side-effects. That's what the expensiveFunction here suggests.
That's slightly weird. Not impossible, since it's not a constant expression to read the getter, so those side effects cannot be triggered at compile-time anyway.

But having a late final field breaks deep immutability.
We can't allocate the object in read-only isolate-shared memory, like we can other constant objects. We can't allocate any constant object referencing that non-deeply-immutable object there either. Being not-deeply-immutable is contagious.

It also complicates the combination of the other two properties: If const objects are canonicalized at run-time, will the state of the late final field be part of the equality computation used to decide what to canonicalize with? Probably not, which is also weird.

When we send constant objects to another isolate, they are deserialized into the corresponding constant on the other side.
(And as an optimization, the isolates can share the same object from a shared heap containing all the compile-time created constants, woohoo!)

That means that an initialized late final field is either not passed to the other isolate, since it already has the corresponding object, which may not have initialized the field, or if the isolates share the object, then initialization is a cross-isolate event. (That's not going to happen, we'd have to have cross-isolate mutexes to avoid seeing intermediate values, being immutable is what makes sharing easy. So, no cross-isolate sharing of those non-deeply immutable objects.)

All in all, it's not clear what a late final field on a const object actually means according to the currently specified constant semantics.
We can define it, but I honestly don't think it's worth it.

If you want the same effect, just use an expando. Instead of late final Foo foo = initializer;, do:

  static final Expando<Foo> _foo = Expando();
  Foo get foo => _foo[this] ??= initializer;

Then it's absolutely clear that the mutable property is not actually part of the immutable object.
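
Applied to the hashCode case from the original post, the Expando pattern might look like the sketch below. The body of expensiveFunction and the expensiveCalls counter are assumptions added only to make the caching visible, and Object.hash stands in for hashValues so that the example needs nothing beyond dart:core.

```dart
// Hypothetical stand-in for the expensive computation in the original post.
int expensiveCalls = 0;
int expensiveFunction(int x) {
  expensiveCalls++;
  return x * 31 + 7;
}

class Foo {
  const Foo(this.x, this.y);
  final int x;
  final int y;

  // The cache lives outside the object, so Foo itself stays immutable.
  static final Expando<int> _hashCodes = Expando<int>('Foo.hashCode');

  int get xPrime => expensiveFunction(x);

  // Computed on first access, then looked up in the Expando.
  @override
  int get hashCode => _hashCodes[this] ??= Object.hash(xPrime, y);
}

void main() {
  const foo = Foo(1, 2);
  final first = foo.hashCode;
  final second = foo.hashCode;
  print(first == second); // true
  print(expensiveCalls); // 1
}
```

Whether this is faster than recomputing depends on how costly the initializer is compared to one Expando lookup, so it is worth measuring in the real use case.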

@lrhn transferred this issue from dart-lang/sdk May 5, 2022
@lrhn added the feature (Proposed language feature that solves one or more problems) label May 5, 2022
@mraleph
Member

mraleph commented May 5, 2022

I think you can make it work with isolates if you specify that

a) late final initialiser has to produce a const-instance otherwise it is an exception
b) late final initialiser has to be stable (e.g. if two isolates race to initialise the same field then both initialisers have to produce the same value - otherwise it is an exception).

(a) is mandatory for sharing, (b) is optional - you can implement reasonable semantics without it.

Unfortunately (a) also means that you are relatively limited in what you can do with it because you can't necessarily produce const instances at runtime except for certain classes. So you are restricted to using known constants or something like integers, strings and arrays of canonicalizable objects.

@gaaclarke
Author

Yea, it might be a bit of work but if we had this feature we could create the memoization of hashCode for a const class mentioned in the other bug. It might be hard because Dart doesn't have the concept of pure functions but the canonical form could just store the value of the late final as (pure function + input) instead of (output). The const object can just have an extra layer of indirection that points to a thread_local store of the calculated value. You can just recalculate it once per isolate to avoid any synchronization. I think @mraleph's suggestion is basically showing how we could assert the initializer is a pure function at runtime.

(Related but meandering observation: I couldn't share logic between const constructor initializers yesterday either because once you pull an expression into a function Dart lost the idea that it was a pure function that could be evaluated at compile time).

If you want the same effect, just use an expando.

Thanks, I'll look into that. I'm not sure what its performance is since the point of the exercise is to make things faster. I'll give it a shot.

Thanks for looking into this. I don't have evidence this requires any large engineering effort. I want to put it on your radar as a limitation I ran into while trying to optimize Flutter.

@gaaclarke
Author

gaaclarke commented May 5, 2022

Using Expando to memoize hashCode gave me a 70% speedup in Locale.hashCode, so in my case it was fast enough to make an improvement. Naively, the initializer performs 2 map lookups, so the single Expando lookup should be faster, which makes sense. flutter/engine#33140

@leafpetersen
Member

The way Dart constants are currently defined, constant evaluation doesn't have to happen at compile-time.

I don't think this is true. There are a number of specified compile time errors that rely on constant identity being available at compile time (e.g. having the same keys in a const map, or in a switch).
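
As a rough illustration of the kind of error meant here, consider the sketch below. The Key class is hypothetical and not from this thread. The rejected line is left commented out so that the rest of the example still compiles.

```dart
// Hypothetical key class; it inherits == and hashCode from Object.
class Key {
  const Key(this.id);
  final int id;
}

void main() {
  // Distinct constant keys are accepted.
  const ok = {Key(1): 'a', Key(2): 'b'};
  print(ok.length); // 2

  // Uncommenting the next line is expected to fail compilation, because
  // both keys canonicalize to the same constant object:
  // const bad = {Key(1): 'a', Key(1): 'b'};
}
```

To report that error, the compiler has to know that the two Key(1) expressions denote the same constant, which requires knowing their identity before the program runs.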

@lrhn
Member

lrhn commented May 5, 2022

@leafpetersen Well, there is that. The specification is rather vague, though.

We do not provide compile-time semantics for constant expressions. They are evaluated using the same dynamic/runtime semantics that is used to evaluate expressions at runtime, since it's the only evaluation semantics we have.
The runtime semantics of the expression var y = 1 + 2, x = const Object(); evaluate 1 + 2 to 3 and assign 3 to y, then create an object by calling the Object constructor, then (because of the const) try to canonicalize it with another instance of Object that has the same state and was also created by a constant object creation expression. Finally, that object is assigned to the variable x.

The language semantics explicitly say that constant expressions are evaluated at runtime!

They then also say that it's a compile-time error if some part of that evaluation would have caused an error.
Neat hack, not very precise for a specification though.

All compile-time "constant evaluation" has to do is detect whether the run-time evaluation would throw, or a potentially constant expression would fail to be an actual constant expression, or - as you mention - a constant map having the same object as keys, because those things must be reported at compile-time.
It doesn't actually have to evaluate anything that can't cause a failure at compile-time. (If it can tell without evaluating, obviously, and the semantics are not optimized for that).

We have restrictions on the constant expressions such that you cannot tell whether we pre-computed the result at compile-time, or re-computed the result at runtime and then canonicalized at runtime, as long as the constant expression does not cause a compile-time error. Because that's what the semantics actually say that the behavior should be equivalent to, since that's the specified semantics.

Evaluation doesn't have to happen at compile-time. Some "will this constant expression cause a compile-time error" analysis is needed, and in many cases it's probably indistinguishable from actual evaluation, but the actual evaluation which creates the runtime result can happen at run-time.

@leafpetersen
Member

Evaluation doesn't have to happen at compile-time. Some "will this constant expression cause a compile-time error" analysis is needed, and in many cases it's probably indistinguishable from actual evaluation, but the actual evaluation which creates the runtime result can happen at run-time.

I have to admit, you lost me somewhere in your response. I don't understand what you're trying to say, or if I do, it feels like a distinction without a difference. It's true that I can choose to evaluate constants at compile time, throw away the result, and then recreate the object at runtime but... so what? The point is, I do in fact have to evaluate the object at compile time.

I think we agree that the compiler must evaluate at least some of the constants in the program in order to produce the specified errors, right? I don't really understand where you're going with the rest. Given the original definition of Foo in this issue, is {const Foo(1, 1) : 0, const Foo(1, 1) : 0} an error or not? In order to tell, you have to try to evaluate arbitrary user code at compile time (that is, you have to evaluate the late field). You can say that the result is an error, or you can say that you go ahead and evaluate them, but either way, you are evaluating the objects at compile time, and it is observable to the user that you are doing so.

@eernstg
Member

eernstg commented May 6, 2022

@leafpetersen wrote:

I think we agree that the compiler must evaluate at least
some of the constants in the program in order to produce
the specified errors, right?

There is one property which hasn't been mentioned so far, which is crucial in order to determine which ones. The required compile-time errors arise with expressions that are 'required to be constant':

It is a compile-time error if an expression is required to be
a constant expression,
but its evaluation would throw an exception.

It might be possible to perform a static analysis on constant expressions such that these situations can be precisely predicted without actually evaluating the expressions. For instance, it might be possible to find a useful set of expressions where it would be guaranteed via static analysis that an evaluation at run time will not throw. But it doesn't seem worthwhile to do that, in particular because those expressions are probably the simplest ones. So, essentially, we're requiring compile-time evaluation of constant expressions in the following situations. The given expression is:

  • the default value of an optional parameter.
  • the initializing expression of an instance variable in a class that has a constant generative constructor.
  • an element in a constant list or set literal.
  • a key or a value in a constant map literal.
  • an actual argument in a constant object expression (const SomeConstructor<Constant, Types>(some, actual, arguments)).
  • an actual argument in a metadata construct (@C<T>(arg)).
  • a switch case expression.
  • (plus, I probably missed something).

We have a "recursive" case, too, where the required evaluation propagates to other locations (and even to expressions that aren't present in the program):

  • A <constObjectExpression> is being evaluated, and substitution of the actual arguments for the formal parameters yields an initializing expression in the initializer list of the given constructor which is not a constant expression.

This means that the set of constant expressions that must be evaluated at compile-time is quite substantial.
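
To make several of the listed situations concrete, here is a minimal sketch with one example per context. All names (Tag, Box, base, scale, describe) are hypothetical and were chosen only for illustration.

```dart
// Hypothetical annotation class.
class Tag {
  const Tag(this.name);
  final String name;
}

const base = 10;

// Default value of an optional parameter.
int scale(int v, [int factor = base * 2]) => v * factor;

class Box {
  const Box(this.size);
  final int size;
  // Initializing expression of an instance variable in a class
  // that has a constant generative constructor.
  final int padding = base ~/ 5;
}

@Tag('demo') // Actual argument in a metadata construct.
String describe(int n) {
  switch (n) {
    case base: // Switch case expression.
      return 'ten';
    default:
      return 'other';
  }
}

void main() {
  const list = [base, base + 1]; // Elements of a constant list literal.
  const map = {'key': Box(base - 7)}; // Map value and constructor argument.
  print(scale(2)); // 40
  print(describe(10)); // ten
  print(list); // [10, 11]
  print(map['key']!.size); // 3
  print(const Box(1).padding); // 2
}
```

In each marked position, an expression whose evaluation would throw (for example an integer division by zero) has to be rejected at compile time.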

Of course, we still have a lot of constant expressions that aren't required to be constant (such as 'a' + 'b' in var s = 'a' + 'b';), and they don't have to be evaluated before run time. The behavior of the implementations shows that we actually don't evaluate them, at least in many cases.

void main() {
  print(identical('ab', 'a' + 'b')); // 'false' on the vm, 'true' on the web.
}

So we might wish to be stricter on requiring all constant expressions to be evaluated at compile-time. Note also that the particular case with canonicalization of strings is the topic of #985.

@eernstg
Member

eernstg commented May 6, 2022

I believe it's fair to say that the main topic here is caching getters on objects that are obtained as the value of a constant expression (let's just call them 'constant objects'): A late final instance variable with an initializer is essentially a cached getter.

@lrhn already mentioned several reasons why we can't allow arbitrary computations at run time to provide the value of an instance variable in a constant object, even though the object will still "look constant" because the value is filled in at the first usage and then never changed.

However, we can still consider some mechanisms that are a bit less permissive. For instance, we could consider introducing constant instance getters.

An instance getter marked const would have an expression body (not a block), and that expression would have to be potentially constant, with the adjustment that references to instance variables of the enclosing object would be considered potentially constant.

A constant instance getter is a compile-time error unless the enclosing class has a constant constructor. It is a compile-time error to override a constant instance getter by an instance getter which is not constant.

During creation of an instance of a class C with a constant instance getter g whose implementation is const T get g => e, the expression e is evaluated and the result is stored in an implicitly induced final instance variable with the fresh name _g, and every invocation of g will return the value stored in _g.

During constant expression evaluation at compile time, this implies that e is evaluated as a constant expression, substituting the values of any instance variables of the enclosing object which are used in e, subject to normal rules about constant expression evaluation (so, e.g., there could be a compile-time error because of an integer division by zero).

We would need to consider non-termination (which is currently not possible during constant expression evaluation); it could be an error during constant expression evaluation if it ever evaluates an expression obtained by substitution into the expression e mentioned above during the evaluation of any such expression obtained by substituting into the same e (so we'd outlaw recursion), or we might put a limit on the number of expressions that are subjected to substitution (so we'd allow some not-so-well-defined number of recursive invocations). In any case, termination is always an issue when constant expressions are generalized.

Note that this idea is quite similar to 'user-defined constant functions', but not quite as powerful because the extraction of the value of a constant getter isn't (currently) a constant expression itself.

@lrhn
Member

lrhn commented May 6, 2022

I'm not seeing what "constant instance getters" can do that you can't do by just having a final field and putting the same expression into the initializer list. The only values the field expression can access are instance variables of the same current class (I'm not allowing super-class declared instance variables, that will break encapsulation and field/getter symmetry). The values actually stored into instance variables are all available in the constructor (not necessarily easily accessible, but simply allowing access to instance variables prior in the initializer as final local variables would fix that, and be generally useful, otherwise we just need one constructor level of indirection.)
Also, we'll then have to worry about mutually referencing constant instance getters :)
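
A minimal sketch of that alternative, using a hypothetical Rect class: the derived value is a plain final field computed in the initializer list of the const constructor, which is already allowed as long as the expression is potentially constant.

```dart
// Hypothetical class; area plays the role of the "cached getter".
class Rect {
  // The derived value is computed in the initializer list, so it is
  // part of the constant object's state from the start.
  const Rect(this.width, this.height) : area = width * height;
  final int width;
  final int height;
  final int area;
}

void main() {
  const r = Rect(3, 4);
  print(r.area); // 12
  print(identical(r, const Rect(3, 4))); // true
}
```

The limitation is the one this thread keeps returning to: this only works when the expression is (potentially) constant, so a call to an arbitrary function cannot appear there.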

@rrousselGit

I don't think we can expect the expression to be a constant expression though

A cached hashCode would typically call Object.hash(a, b, c) and maybe const DeepCollectionEquality().hash(collection)

@eernstg
Member

eernstg commented May 6, 2022

I'm not seeing what "constant instance getters" can do that you
can't do by just having a final field and putting the same expression
into the initializer list.

You wouldn't be able to customize an expression in an initializer list of a constant constructor (or leave parts of it abstract) by overriding a declaration in a subclass.
