Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move semantics for SendPort #49587

Open
gaaclarke opened this issue Aug 3, 2022 · 6 comments
Open

Move semantics for SendPort #49587

gaaclarke opened this issue Aug 3, 2022 · 6 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, FFI, and the AOT and JIT backends. type-enhancement A request for a change that isn't a bug

Comments

@gaaclarke
Copy link
Contributor

gaaclarke commented Aug 3, 2022

This tracker is for issues related to:

  • Dart VM

Description

We should be able to transfer any object over SendPort in constant time if we know:

  1. Isolates share a heap
  2. There is one reference to an object
  3. The object is never used again after it is sent to a SendPort

To make this possible I propose we add a new class called Unique that when constructed sent over a SendPort asserts that there is one reference to each object in the object graph and nulls out its reference. That way we can have constant-time communication of any object over SendPorts. It isn't asserted at compile time, but this might be the best we can do with Dart.

The printed timestamps should almost be the same, today the difference is 12s locally:

// @dart=2.12
import 'dart:isolate';

class Unique<T> {
  T? get value => _value;
  T? _value;
  Unique(this._value) {
    // Crawl through the object tree and ensure there is only one reference to
    // each object, otherwise throw exception.
    // assert(isUnique(_value));
  }
}

Map<String, Object> makeLargeMap(int n) {
  Map<String, Object> result = {};
  for (int i = 0; i < n; ++i) {
    result[i.toString()] = i;
  }
  return result;
}

void isolateMain(SendPort sendPort) {
  ReceivePort receivePort = ReceivePort();
  sendPort.send(receivePort.sendPort);
  receivePort.listen((message) {
    var foo = Unique<Map<String, Object>>(makeLargeMap(message));
    print(DateTime.now().millisecondsSinceEpoch);
    sendPort.send(foo);
    // assert(foo.value == null);
  });  
}

void main() async {
  ReceivePort receivePort = ReceivePort();
  Isolate.spawn(isolateMain, receivePort.sendPort);
  SendPort sendPort = await receivePort.first;
  sendPort.send(10000000);
  print(await receivePort.first);
  print(DateTime.now().millisecondsSinceEpoch);
}

Related language request for move semantics: dart-lang/language#2355 (this has compile time checks for uniqueness)
cc @mkustermann

Alternatives considered

@lrhn
Copy link
Member

lrhn commented Aug 3, 2022

Not sure how you would efficiently assert that there is only one reference to each object in the object graph. That sounds like something which needs a full GC-equivalent memory scan to ensure. (Mark the transitive closure of the unique object in one color, then mark from the roots in another color, stopping at the unique object, and see that you don't re-mark any object in both colors).

That's already better than "only referenced once" because it allows the unique object graph to refer to the same object more than once, as long as there are no references from outside that graph.

Also, it probably still won't work, since it means the graph cannot safely contain null, booleans, integers or strings, because you don't know whether those are shared. Same with constants.
I guess some objects (immutable ones mainly) can be freely shared, it's only references to mutable objects which matter.

I think simply making "immutable objects" a concept (anything created using const constructors with immutable arguments, even if the call is not const itself), and simply allowing those to be shared cheaply.

Trying to contain a Dart object reference, while providing access to the object, is something I very much doubt will ever work.

@gaaclarke
Copy link
Contributor Author

That sounds like something which needs a full GC-equivalent memory scan to ensure.

I guess it depends on how the garbage collector was implemented. I was assuming it would be cheap to crawl the subsection of memory that is referenced from an object recursively and count those references per object. I never looked at the Dart GC implementation, but it's probably not the case now that I consider it. You'd probably need a doubly linked graph of objects that's readily available at any time, which we'd never have.

I guess some objects (immutable ones mainly) can be freely shared, it's only references to mutable objects which matter.

Yea, Martin has already talked about making these cheap for sending over SendPort, so they there is already an idea for detecting them.

Trying to contain a Dart object reference, while providing access to the object, is something I very much doubt will ever work.

I don't follow this statement.

Maybe my proposal for compile-time checks for uniqueness was a better way to go. I already closed it thinking this would be easier to implement.

@mkustermann
Copy link
Member

As @lrhn says it is a very hard problem to maintain the there is only one reference guarantee.

Rephrasing the requirement of there is only one reference to an object is basically saying, there's a unique owner. So we'd need to implement ownership tracking - either statically or dynamically - which is hard to do.

On top of that, sending across SendPort would then become limited to sending object trees, because DAGs or even graphs with cycles have more than one reference to an object.

So we're down to sending mutable trees at this point. Do they have to be mutable?

@mraleph
Copy link
Member

mraleph commented Aug 4, 2022

I was assuming it would be cheap to crawl the subsection of memory that is referenced from an object recursively and count those references per object.

This would not work, because you are looking for incoming references, rather than outgoing references.

You will need to crawl the whole heap to find all incoming references to objects within the subgraph referenced by the given root. It's going to be expensive.

That being said, I think there is a trick that might work here. If you know that you are creating objects for transfer, you could tell VM to allocate them all in a separate part of the heap and you use write-barrier to track when references to that part of the heap escape, i.e.

// o.f = v
def WriteBarrier(object, field, value):
  if pageOf(object).id != pageOf(value).id
    pageOf(value).selfContained = false
    if !value.isCanonical:
      pageOf(object).selfContained = false

Now when you are transfering an object to another isolate you could do the following:

def transfer(port, object):
  # Check if there is potential escape through thread stack or globals
  # Caveat: ignores current frame (object variable).
  iterateRoots(lambda o: pageOf(o).selfContained = false)
  if not pageOf(object).selfContained:
    raise 'Can't transfer object which potentially escaped'
  # Now we know that the object graph is self-contained within 
  # a number of heap pages and there are no references (except object)
  # to these pages. This means we can transfer the object. 

With this approach you would need to write the code like this:

Map<String, Object> makeLargeMap(int n) {
  // |allocateForTransfer| makes VM allocate all new objects in a separate page.
  return allocateForTransfer(() {
    Map<String, Object> result = {};
    for (int i = 0; i < n; ++i) {
      result[i.toString()] = i;
    }
    return result;
  });
}

Maybe something like this can work.

@mit-mit mit-mit added the area-vm Use area-vm for VM related issues, including code coverage, FFI, and the AOT and JIT backends. label Aug 4, 2022
@lrhn
Copy link
Member

lrhn commented Aug 5, 2022

Using allocateForTransfer would probably be zone-based, otherwise it won't work with async functions.

Using a separate page with a write barrier needs to check both directions: When references to the presumed nonshared objects escape, and when an already shared (other-page) reference is written into that page. Either write loses the ability to share the page. Presumably permanently, because if you clear either of the cross-page references, you may need to do an expensive test to check that the object graph is closed again.
You'll have references into the page while working with the objects, often on the stack, but also in the heap if using async functions. Those will die automatically (stack popped, continuations GC'ed after use). Cross-page reference counting won't work with GC.

It still means needing to share common objects, like numbers, strings, booleans and null and enums. All immutable, so that's fine. Sharing immutable objects in general would be a good start.

Factory constructors and object caches become a problem, when you can't predict whether a call actually performs the allocation in the current zone.

Seems potentially very complicated, if at all viable. (Maybe just make allocateForTransfer run in a new isolate, that ensures the graph is closed).

@fzyzcjy
Copy link
Contributor

fzyzcjy commented Oct 24, 2022

Another probably silly idea (just for brainstorm): Can we convert objects between mutable and immutable, and ensure mutable XOR shared-across-isolate semantics?

A typical usage:

  1. construct (mutable) object graph, and modify it freely
  2. lock it (make it immutable), then share (indeed not move) across isolate
  3. the second isolate can now read it, but no isolate can modify anything
  4. after second isolate exits (btw exit itself can transfer some data to main isolate in O(1) if wanted), the graph can be mutable again

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, FFI, and the AOT and JIT backends. type-enhancement A request for a change that isn't a bug
Projects
None yet
Development

No branches or pull requests

7 participants