[embind] Add return value policy option for function bindings. #21692

Merged: 7 commits, May 13, 2024
2 changes: 2 additions & 0 deletions ChangeLog.md
@@ -30,6 +30,8 @@ See docs/process.md for more on how version tagging works.
https://github.com/llvm/llvm-project/pull/90792), multivalue feature is now
enabled by default in Emscripten. This only enables the language features and
does not turn on the multivalue ABI.
- Embind now supports return value policies to better define object lifetimes.
See https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#object-ownership for more information.

3.1.59 - 04/30/24
-----------------
100 changes: 85 additions & 15 deletions site/source/docs/porting/connecting_cpp_and_javascript/embind.rst
@@ -203,15 +203,6 @@ to enable the closure compiler.
Memory management
=================

-JavaScript only gained support for `finalizers`_ in ECMAScript 2021, or ECMA-262
-Edition 12. The new API is called `FinalizationRegistry`_ and it still does not
-offer any guarantees that the provided finalization callback will be called.
-Embind uses this for cleanup if available, but only for smart pointers,
-and only as a last resort.
-
-.. warning:: It is strongly recommended that JavaScript code explicitly deletes
-    any C++ object handles it has received.
-
The :js:func:`delete()` JavaScript method is provided to manually signal that
a C++ object is no longer needed and can be deleted:

@@ -226,7 +217,8 @@ a C++ object is no longer needed and can be deleted:
y.delete();

.. note:: Both C++ objects constructed from the JavaScript side as well as
-    those returned from C++ methods must be explicitly deleted.
+    those returned from C++ methods must be explicitly deleted, unless a
+    ``reference`` return value policy is used (see below).


.. tip:: The ``try`` … ``finally`` JavaScript construct can be used to guarantee
@@ -248,6 +240,19 @@ a C++ object is no longer needed and can be deleted:
}
}

Automatic memory management
---------------------------

JavaScript only gained support for `finalizers`_ in ECMAScript 2021, or ECMA-262
Edition 12. The new API is called `FinalizationRegistry`_ and it still does not
offer any guarantees that the provided finalization callback will be called.
Embind uses this for cleanup if available, but only for smart pointers,
and only as a last resort.

.. warning:: It is strongly recommended that JavaScript code explicitly deletes
    any C++ object handles it has received.


Cloning and Reference Counting
------------------------------

@@ -344,31 +349,96 @@ The JavaScript code does not need to worry about lifetime management.
Advanced class concepts
=======================

.. _embind-object-ownership:

Object Ownership
----------------

JavaScript and C++ have very different memory models, which can make it unclear
which language owns an object, and is responsible for deleting it, when the
object moves between languages. To make object ownership more explicit, *embind*
supports smart pointers and return value policies. Return value
policies dictate what happens to a C++ object when it is returned to JavaScript.

To use a return value policy, pass the desired policy into function or method
bindings. For example:

.. code:: cpp

    EMSCRIPTEN_BINDINGS(module) {
        function("createData", &createData, return_value_policy::take_ownership());
    }

Embind supports three return value policies that behave differently depending
on the return type of the function. The policies work as follows:

* *default (no argument)* - For return by value and return by reference, a new object is
  allocated using the object's copy constructor. JS then owns the object and is responsible
  for deleting it. Returning a pointer is not allowed by default (use an explicit policy below).
Member

Hmm, even with the new text I'm confused here. This is probably my own fault 😄 But say we have a function that returns by value,

Object getObject() { .. }

What is, effectively, the bindings code for that? The text here says the copy constructor is called, so I am imagining

void* bind_getObject() {
  Object copy = getObject(); // copy constructor occurs here during the `=` operation
  ...
}

And somehow in the ... we transfer ownership to JS. But how does copy not end up collected by C++ when that scope ends?

To put my question another way, how can you call the copy constructor without opting into automatic memory management?

Collaborator Author

The copy constructor is called explicitly with `new`. For example, the code for the above would be:

void* bind_getObject() {
  return new Object(getObject());
}

Or in terms of the embind, that happens in the toWireType function here.

Member

Ah, thanks, so it's a raw new that uses the copy constructor. Makes sense to me now.

* :cpp:type:`return_value_policy::take_ownership` - Ownership is transferred to JS.
* :cpp:type:`return_value_policy::reference` - Reference an existing object but do not take
  ownership. Care must be taken to not delete the object while it is still in use in JS.

More details below:

+--------------------+-------------+---------------------------------------------------------------+
| Return Type        | Constructor | Cleanup                                                       |
+====================+=============+===============================================================+
| **default**                                                                                      |
+--------------------+-------------+---------------------------------------------------------------+
| Value (``T``)      | copy        | JS must delete the copied object.                             |
+--------------------+-------------+---------------------------------------------------------------+
| Reference (``T&``) | copy        | JS must delete the copied object.                             |
+--------------------+-------------+---------------------------------------------------------------+
| Pointer (``T*``)   | n/a         | Pointers must explicitly use a return policy.                 |
+--------------------+-------------+---------------------------------------------------------------+
| **take_ownership**                                                                               |
+--------------------+-------------+---------------------------------------------------------------+
| Value (``T``)      | move        | JS must delete the moved object.                              |
+--------------------+-------------+---------------------------------------------------------------+
| Reference (``T&``) | move        | JS must delete the moved object.                              |
+--------------------+-------------+---------------------------------------------------------------+
| Pointer (``T*``)   | none        | JS must delete the object.                                    |
+--------------------+-------------+---------------------------------------------------------------+
| **reference**                                                                                    |
+--------------------+-------------+---------------------------------------------------------------+
| Value (``T``)      | n/a         | Reference to a value is not allowed.                          |
+--------------------+-------------+---------------------------------------------------------------+
| Reference (``T&``) | none        | C++ must delete the object.                                   |
+--------------------+-------------+---------------------------------------------------------------+
| Pointer (``T*``)   | none        | C++ must delete the object.                                   |
+--------------------+-------------+---------------------------------------------------------------+
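
The "Constructor" column can be illustrated in plain C++, without Emscripten.
The sketch below is only a model of the behaviour the table describes, not
embind's actual internals: ``Tracked``, the ``wrap*`` helpers, and the
``measure*`` functions are all hypothetical names introduced for illustration.
Each helper stands in for what a binding layer might do with a returned object
under one policy, and the counters record which constructor ran.

```cpp
#include <cassert>
#include <utility>

// Global counters so the sketch can report which constructor ran.
int g_copies = 0;
int g_moves = 0;

// Hypothetical bound class: any copyable and movable C++ type would do.
struct Tracked {
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++g_copies; }
    Tracked(Tracked&& other) noexcept : value(other.value) { ++g_moves; }
};

// default policy (value or reference): heap-allocate a copy for the caller.
Tracked* wrapDefault(const Tracked& result) { return new Tracked(result); }

// take_ownership (value or reference): heap-allocate by moving instead.
Tracked* wrapTakeOwnership(Tracked&& result) { return new Tracked(std::move(result)); }

// reference policy: hand out the existing object; no constructor runs.
Tracked* wrapReference(Tracked& result) { return &result; }

struct Counts { int copies; int moves; };

Counts measureDefault() {
    g_copies = g_moves = 0;
    Tracked source(1);
    Tracked* handle = wrapDefault(source);
    Counts counts{g_copies, g_moves};
    delete handle;  // models the JS side calling delete() on its copy
    return counts;
}

Counts measureTakeOwnership() {
    g_copies = g_moves = 0;
    Tracked source(2);
    Tracked* handle = wrapTakeOwnership(std::move(source));
    Counts counts{g_copies, g_moves};
    delete handle;  // models the JS side calling delete() on the moved object
    return counts;
}

Counts measureReference() {
    g_copies = g_moves = 0;
    Tracked source(3);
    Tracked* handle = wrapReference(source);
    assert(handle == &source);  // same object; C++ (here, the stack) cleans it up
    return Counts{g_copies, g_moves};
}
```

Calling ``measureDefault()`` reports one copy, ``measureTakeOwnership()`` one
move, and ``measureReference()`` neither, matching the copy, move, and none
entries in the table.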

.. _embind-raw-pointers:

Raw pointers
------------

Because raw pointers have unclear lifetime semantics, *embind* requires
-their use to be marked with :cpp:type:`allow_raw_pointers`.
+their use to be marked with either :cpp:type:`allow_raw_pointers` or with a
+:cpp:type:`return_value_policy`. If the function returns a pointer it is
+recommended to use a :cpp:type:`return_value_policy` instead of the general
+:cpp:type:`allow_raw_pointers`.

For example:

.. code:: cpp

    class C {};
    C* passThrough(C* ptr) { return ptr; }
    C* createC() { return new C(); }
    EMSCRIPTEN_BINDINGS(raw_pointers) {
        class_<C>("C");
        function("passThrough", &passThrough, allow_raw_pointers());
        function("createC", &createC, return_value_policy::take_ownership());
    }
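
To see why the two functions deserve different markup, it may help to look at
them in plain C++ with no bindings involved. The sketch below reuses the
``passThrough`` and ``createC`` shapes from the example; the live-object counter
and ``liveAfterCleanup`` are hypothetical additions for illustration. Both
functions return ``C*``, yet only one of them hands over a new object that the
caller must delete, which is exactly the information a raw pointer signature
cannot express.

```cpp
#include <cassert>

// Counts C instances that are currently alive, to make leaks visible.
int g_liveCount = 0;

class C {
public:
    C() { ++g_liveCount; }
    ~C() { --g_liveCount; }
};

// Same shapes as the bound functions in the example.
C* passThrough(C* ptr) { return ptr; }  // returns an alias; creates nothing
C* createC() { return new C(); }        // returns a fresh heap object

// Returns the number of live objects once the caller has cleaned up.
int liveAfterCleanup() {
    C* created = createC();           // the caller now owns this object
    C* alias = passThrough(created);  // same object, no new ownership
    assert(alias == created);
    assert(g_liveCount == 1);
    delete created;  // exactly one delete; deleting alias too would be a double free
    return g_liveCount;
}
```

``liveAfterCleanup()`` returns 0 only because the caller knew to delete the
result of ``createC`` once and to leave the alias alone. A return value policy
records that knowledge in the binding itself.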

.. note::

-Currently the markup serves only to allow raw pointer use, and
-show that you've thought about the use of the raw pointers. Eventually
-we hope to implement `Boost.Python-like raw pointer policies`_ for
-managing object ownership.
+Currently allow_raw_pointers for pointer arguments only serves to allow raw
+pointer use, and show that you've thought about the use of the raw pointers.
+Eventually we hope to implement `Boost.Python-like raw pointer policies`_ for
+managing object ownership of arguments as well.

.. _embind-external-constructors:
