Provide a way to pass an existing Uint8List to FFI functions without copying #51632

Open
jamesderlin opened this issue Mar 4, 2023 · 4 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-ffi

Comments

@jamesderlin
Contributor

AFAIK, there is no good way to pass an existing Dart Uint8List to an FFI function without allocating memory and copying the data. The typical recommendation to avoid copying is to start with a Pointer<Uint8> and use .asTypedList when passing it to Dart code, but that's awkward:

  • Memory needs to be manually freed (or some system needs to be implemented to register finalizers).
  • Lengths need to be passed around, which is error-prone, or additional abstractions must be built on top to encapsulate the Pointer and length.
  • It doesn't help when dealing with Dart code that returns Uint8Lists. The caller cannot control how that memory was allocated.
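
For reference, the copying workaround that this issue would like to avoid can be sketched as follows. This is only an illustration, assuming package:ffi is available; `withNativeCopy` is a hypothetical helper name, and the closure in `main` stands in for a real FFI call.

```dart
import 'dart:ffi';
import 'dart:typed_data';

import 'package:ffi/ffi.dart';

/// Hypothetical helper: copies [bytes] into freshly allocated native memory,
/// runs [body] with the pointer and length, and always frees the allocation.
R withNativeCopy<R>(Uint8List bytes, R Function(Pointer<Uint8>, int) body) {
  // Allocate at least one byte so an empty list does not request a
  // zero-sized allocation.
  final Pointer<Uint8> ptr = malloc<Uint8>(bytes.isEmpty ? 1 : bytes.length);
  try {
    // This is the copy: Dart-heap bytes are written into native memory.
    ptr.asTypedList(bytes.length).setAll(0, bytes);
    return body(ptr, bytes.length);
  } finally {
    malloc.free(ptr);
  }
}

void main() {
  final data = Uint8List.fromList([1, 2, 3, 4]);
  final sum = withNativeCopy(data, (ptr, length) {
    // Stand-in for a native function that reads the buffer.
    var total = 0;
    for (var i = 0; i < length; i++) {
      total += ptr[i];
    }
    return total;
  });
  print(sum); // 10
}
```

The helper keeps the pointer, the length, and the free call in one place, but it still pays for one allocation and one copy per call.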

My understanding is that the underlying memory buffer for a Uint8List can't be passed because it might be moved or reclaimed by the garbage collector. But if so, then would it be infeasible to add some mechanism to Uint8List (and possibly to Uint16List, etc.) to lock and unlock its memory buffer and prevent the GC from touching it?

@lrhn added the area-vm and library-ffi labels Mar 4, 2023
@mraleph
Member

mraleph commented Mar 5, 2023

> But if so, then would it be infeasible to add some mechanism to Uint8List (and possibly to Uint16List, etc.) to lock and unlock its memory buffer and prevent the GC from touching it?

No, in general that is not feasible, at least not with our current GC, which assumes that everything can move. It's even worse in the young generation, which is managed by a copying collector: all young objects move when a GC happens. We are evaluating some alternative GC architectures which don't have similar requirements/behavior, but the shift to such an architecture is not going to happen any time soon (or at all).

I think some variation of #50507 is the best we can hope for in the short term.

One possibility we could consider is to allocate all buffers larger than a certain size in external memory, rather than in GC-managed memory. That would open up the possibility of passing such buffers to native code without copying.

@ds84182
Contributor

ds84182 commented Mar 6, 2023

This may not be possible, but could TypedData be upgraded in place to ExternalTypedData?

The following extensions would be exposed from dart:ffi:

```dart
extension ExternalTypedData on TypedData {
  /// Pins the backing data store of this TypedData to the external heap, if the data currently exists in
  /// Dart's GC managed memory.
  ///
  /// If the backing data store already is in the external heap, this does nothing.
  ///
  /// This may be expensive for large allocations.
  external void pin();

  /// Whether the typed data is pinned to the external heap.
  ///
  /// Attempts to get the pointer to the backing data store will fail if not pinned.
  external bool get isPinned;

  /// Returns a raw pointer to the data stored on the external heap, or nullptr if backed
  /// by Dart's GC managed memory.
  external Pointer<Void> get rawPointer;
}

extension ExternalUint8List on Uint8List {
  /// Returns a Uint8List that is guaranteed to be pinned to the external heap.
  ///
  /// This is faster than allocating a Uint8List then pinning it to external heap.
  external static Uint8List alloc(int length);

  /// Returns a copy of the given [list] that is guaranteed to be pinned to the external heap.
  external static Uint8List clone(Uint8List list);

  /// Returns a pointer to the data stored on the external heap, or nullptr if backed
  /// by Dart's GC managed memory.
  Pointer<Uint8> get pointer => rawPointer.cast();
}
// ...
```

For VM semantics, this would copy the contents to the external heap via malloc + memcpy, then turn the TypedData into ExternalTypedData in-place. From what I understand about how TypedData is represented in the heap, the pointer field exists for both TypedData and ExternalTypedData so it can treat them the same at runtime.

The only weirdness here is that the size of the heap object technically shrinks. It may have to be reallocated from new-space and then have a forwarding pointer installed on the original object (if I understand the GC correctly).

All other properties of the originating TypedData are maintained. If the TypedData is unmodifiable and happens to be GC managed, this will still work while remaining unmodifiable.

@mraleph
Member

mraleph commented Mar 6, 2023

> This may not be possible, but could TypedData be upgraded in place to ExternalTypedData?

Yes, it is possible. In fact we used to have a similar API for strings, which we happily deleted. The main issue with this API is that it violates a very important invariant: objects never change their class-id / layout.

This API only helps when the same object crosses the boundary repeatedly: if you cross the boundary a single time (e.g. to pass some array to an IO syscall or some decoding routine), then there is no real benefit over just copying it explicitly.

@dcharkes
Contributor

> external static Uint8List alloc(int length);

Why not use calloc() from package:ffi and call asTypedList? That would achieve the same thing, correct?

If your APIs can take an external typed data to populate, rather than ending up with an internal one produced by the Dart code, that would avoid the copying.
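
A minimal sketch of that pattern, assuming package:ffi: the buffer is allocated in native memory first, and the Dart side writes into a typed-list view of it instead of returning a new Uint8List. `fillPattern` is a hypothetical producer used only for illustration.

```dart
import 'dart:ffi';
import 'dart:typed_data';

import 'package:ffi/ffi.dart';

/// Hypothetical producer that fills a caller-supplied buffer instead of
/// allocating and returning its own Uint8List.
void fillPattern(Uint8List out) {
  for (var i = 0; i < out.length; i++) {
    out[i] = i & 0xFF;
  }
}

void main() {
  const length = 8;
  // Zero-initialized native memory, outside the Dart heap.
  final Pointer<Uint8> ptr = calloc<Uint8>(length);
  try {
    // A view over the native memory; writes through it need no extra copy.
    final Uint8List view = ptr.asTypedList(length);
    fillPattern(view);
    // `ptr` could now be handed to a native function as-is.
    print(ptr[3]); // 3
  } finally {
    // The view must not be used after the memory is freed.
    calloc.free(ptr);
  }
}
```

This only works when the producing code accepts an output buffer. It does not help with APIs that return their own Uint8List, which is the third point raised in the original report.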

> external static Uint8List clone(Uint8List list);

If you have to copy, we should give you an efficient copy:

This would be able to copy efficiently between Pointers, internal TypedData and external TypedData.

Please upvote that issue and post your use case there.

(As Slava points out, pinning in the GC will create all kinds of trouble.)

If you are doing leaf-calls, we could possibly pass in the internal Uint8List itself without copying:

However, the callee should not hold on to it, and the callee should not take very long because it blocks the Dart isolate from running Dart code and GCing. So that might not be the best solution.

Feel free to post your use case and upvote that issue as well.
