possible to copy already-computed memoized fields in build()? #774

dave-doty · 2019-12-28T20:47:11Z

While profiling my application, I noticed that a supposedly memoized field was being recomputed over and over, even though the object's fields didn't change. However, the object itself changed through a call to replace (with identical field values as before).

The source of it seems to be that build does not copy the memoized fields.

I understand the danger here. Suppose we have

abstract class Value implements Built<Value, ValueBuilder> {
  int get field;
  
  @memoized
  int get derived => field*2;
  
  // boilerplate
}
var value = Value(field: 3);
print(value.derived); // prints 6
var builder = value.toBuilder();
builder.field = 30;
value = builder.build();
print(value.derived); // should print 60

If build naively copied the field derived, then it would still be the value 6 at the last print statement, even though we want it to be recomputed to be 60 since the non-derived field has changed.

Nonetheless, it would be great if the ValueBuilder class could detect whether any of its fields had changed since it was created, and if not, then any derived fields that have already been computed could be copied in build to avoid needing to re-compute them.

One way to do this would be to have build copy all of the derived fields by default, but for the writable field setters, on the first write to any of them, to set all the underlying derived fields to null. (in the example above, this would be the field named __derived in the generated .g.dart file). Or to be a bit more efficient, add to each Builder class a single _dirty field that's initially false, and is set to true when any setter is called, and build() only copies memoized fields into the new Built object if _dirty == false.

The text was updated successfully, but these errors were encountered:

dave-doty · 2019-12-28T21:14:29Z

Even better, one could imagine an implementation where this:

var val = Value(field: 3);
var val2 = val.rebuild((b) => b..field=3);

results in identical(val, val2) being true. More generally, the following code would also result in only a single Value object being created:

var val = Value(field: 3);
var builder = val.toBuilder();
builder.field = 3;
var val2 = builder.build();

The idea is the same: toBuilder() marks the field Builder._dirty as false, any setter on the Builder sets it to true if the new value is different, and build() returns the original underlying Built object (the one referenced by _$v in the implemented Builder class) if and only if _dirty is false.

There is one question of efficiency: in each of the Builder's setters, would we want to pay the cost of doing a full == comparison to determine if the new value is different? This could be expensive. Whether to do this full equality comparison could be an option specified in toBuilder and the auto-generated Builder constructor (stored in a Builder field bool _compareOnSet). But it would cost almost nothing to at least default to doing an identical comparison, i.e.,

set field(int value) {
  if (identical(_$this._field, value) || (_compareOnSet && _$this._field == value)) {
    return;
  }
  _dirty = true;
  _$this._field = value;
}

Even if _compareOnSet defaults to false, in many circumstances having a setter implemented as above would avoid duplicating objects that are structurally == to old objects, so identical(_$this._field, value) would end up being true , which would recursively imply that this object itself won't be duplicated either.

This would essentially get all the benefits of intern()'ing (#750) in the common special case where a duplicate object is created directly from the old one that it is structurally == to. (It would not, however, avoid duplication when going through a sequence of different values, e.g., val.rebuild((b) => b..field=3).rebuild((b) => b..field=4).rebuild((b) => b..field=3))

Looking at the generated file .g.dart file, my understanding is that currently, the only optimization is that if I create a Builder from a Built object via var builder = val.toBuilder(); and then don't access any of the fields of builder (not even to read them with a getter), then the same Built object val will be returned from builder.build(). Any access of the fields accesses the _$this getter, which in turn nulls the field _$v$, which causes a new Built object to be created when calling build().

davidmorgan · 2019-12-30T14:50:53Z

Thanks.

It might make sense to have the builder keep the reference to the old value, then compare after building whether the results are the same.

But, we have to be careful. For example, it's possible to have fields that are not included in the comparison. With this change, it would no longer be possible to update these fields unless you update some other field at the same time.

I'll have a think about this next time I look at implementing a batch of features.

dave-doty · 2019-12-30T18:30:28Z

For example, it's possible to have fields that are not included in the comparison. With this change, it would no longer be possible to update these fields unless you update some other field at the same time.

Ah, I see the problem.

Nevertheless, the identical test should be safe in this regard

set field(int value) {
  if (identical(_$this._field, value)) {
    return;
  }
  _dirty = true;
  _$this._field = value;
}

And then use the value of _dirty (rather than the null-ness of _$v) as your check in build():

  _$Value build() {
    _$Value _$result;
    try {
      _$result = _dirty
        ? new _$Value._(
              field: field,
              //...
          )
        : _$v;
    } catch (_) {
      String _$failedField;
      // ...
      } catch (e) {
        throw new BuiltValueNestedFieldError(
            'Value', _$failedField, e.toString());
      }
      rethrow;
    }
    replace(_$result);
    return _$result;
  }
}

In other words, create a new Value object if and only if some field was set to a value/object not identical to its previous value. (including even fields that are not included in an == comparison between two Value instances)

davidmorgan · 2020-01-02T10:16:20Z

Nested builders will be more complicated; those are always new, so identical won't help. I think we have to keep the old instance around and check when building, either with identical or with ==, to improve things there.

dave-doty · 2020-01-03T03:51:23Z

@davidmorgan For what it's worth, I just encountered another reason not to use the reading of _$this as a "dirty bit". I was trying to debug some code today and had a Heisenbug where printing the values in some builders caused an error to show up. It was because the print statement accessed a Builder field, triggering the _$this getter to change the Builder's internal state. It's always surprising when simply printing a value causes the state of the program to change.

davidmorgan · 2020-01-03T09:34:51Z

Yes, nested builders are great for usability but awkward for correctness :)

dave-doty · 2020-01-03T16:46:04Z

Is there a reason to null out the internal _$v reference whenever a getter is accessed? It seems that one could allow _$v to reference the original Built object assigned in replace until the first setter is called, i.e., implement getters and setters like this:

int _field;
int get field =>_$v == null ? _field : _$v.field;
set field(int newField) {
  if (_$v == null) {
    _field = newField;
  } else {
    _field = new_field;
    _field2 = _$v.field2;
    _field3 = _$v.field3;
    _fieldNestedBuilder = _$v?._fieldNestedBuilder.toBuilder();
    // etc...
    _$v = null;
  }
}

Update:
Ah, I think I see the problem. If the field accessed through a getter is itself a nested Builder, then since Builder objects are mutable, it is possible to change its value through the getter (e.g., myValueBuilder.nestedBuilder.field = 7;, which as far as myValueBuilder is concerned just looks like a getter access). So that's why you declare the top-level builder "dirty" (by setting _$v to null) on any getter access.

Still, it feels tantalizing close to there being an efficient way to avoid creating new Built objects unless they represent actual new values. This would be great for things like Redux, where you have very frequent updates to a large state tree, most of which leave most of the tree alone in terms of what value it represents, but as currently implemented in built_value, causes brand-new objects to be generated on each update.

dave-doty · 2020-01-05T16:17:53Z

I think the following logic might make sense to allow more eager caching of Built objects, but I'm sure I could be missing something.

Each Builder maintains an internal dirty bit _dirty, which depends only on the non-Builder fields in the way I suggested above (doesn't change on getter access to those fields, and changes on setter access only when the new value is not identical to the old, with a user-specifiable option to allow this to change to an == test, which is more time-expensive but more likely to successfully cache).

The Builder has a public method dirty(), which is defined as this._dirty || nestedBuilder1.dirty() || nestedBuilder2.dirty() || .... The value of this.dirty() during build() determines whether to create a new instance of _$Value or just return the original Value instance referenced by _$v.

One way this breaks is that built_value allows non-immutable objects to be contained in Built values, e.g., it is permitted to store a mutable List<int> as a field, instead of enforcing BuiltList<int> as the type. Such objects can be mutated, but there's no dirty() method to alert the parent Builder to this, so it would have to resort to a more expensive == comparison to determine the value of dirty(). I'd also think it would be great to allow a user-specifiable option to disallow this, which enforces that the only field types allowed are known immutable types such as int or String, as well as other Built instances and instances from built_collection.

I feel like I'm possibly re-inventing the wheel here; I wonder if there is a language that supports true language-level optional immutability, and if so, how it's implemented. There's functional languages like Haskell that force everything to be immutable (no optional mutability), and many languages have something like final to declare a reference cannot be changed, but do nothing to enforce immutability of the object being referenced (not true immutability). In some sense built_value is an implementation of the latter, since it allows fields to reference mutable types such as List.

davidmorgan · 2020-01-06T09:30:16Z

Allowing List is a bug; it used to be disallowed but an analyzer change broke that check and unfortunately it was a while before I noticed. I plan to re-enable the check again but I haven't had time to do it--it's hard because it's a breaking change.

Other than that though I've taken the approach that it's more useful to allow types we're not sure about, on the understanding that people might supply their own immutable types, and should be aware that things will break if they are mutable :)

Anyway, I hope we can improve things here--not sure when I'll get to it, though.

dave-doty changed the title ~~possible to copy already-computed memoized fields for replace/build?~~ possible to copy already-computed memoized fields in build? Dec 29, 2019

dave-doty changed the title ~~possible to copy already-computed memoized fields in build?~~ possible to copy already-computed memoized fields in build()? Dec 29, 2019

davidmorgan self-assigned this Dec 30, 2019

davidmorgan added the p2 / feature request label Dec 30, 2019

dave-doty mentioned this issue Feb 13, 2020

profile to see performance bottleneck in rendering UC-Davis-molecular-computing/scadnano#191

Open

dave-doty mentioned this issue Jul 18, 2020

connect creates new Props map object on every Action dispatch, which must be deeply compared Workiva/over_react#434

Open

UnHumbleBen mentioned this issue Jan 4, 2022

Work in Progress: Fixes #650; move svg position of helices out of design and into app state UC-Davis-molecular-computing/scadnano#703

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

possible to copy already-computed memoized fields in build()? #774

possible to copy already-computed memoized fields in build()? #774

dave-doty commented Dec 28, 2019 •

edited

Loading

dave-doty commented Dec 28, 2019 •

edited

Loading

davidmorgan commented Dec 30, 2019

dave-doty commented Dec 30, 2019

davidmorgan commented Jan 2, 2020

dave-doty commented Jan 3, 2020

davidmorgan commented Jan 3, 2020

dave-doty commented Jan 3, 2020 •

edited

Loading

dave-doty commented Jan 5, 2020

davidmorgan commented Jan 6, 2020

possible to copy already-computed memoized fields in build()? #774

possible to copy already-computed memoized fields in build()? #774

Comments

dave-doty commented Dec 28, 2019 • edited Loading

dave-doty commented Dec 28, 2019 • edited Loading

davidmorgan commented Dec 30, 2019

dave-doty commented Dec 30, 2019

davidmorgan commented Jan 2, 2020

dave-doty commented Jan 3, 2020

davidmorgan commented Jan 3, 2020

dave-doty commented Jan 3, 2020 • edited Loading

dave-doty commented Jan 5, 2020

davidmorgan commented Jan 6, 2020

dave-doty commented Dec 28, 2019 •

edited

Loading

dave-doty commented Dec 28, 2019 •

edited

Loading

dave-doty commented Jan 3, 2020 •

edited

Loading