Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend Node-API to libnode #43542

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open

Conversation

mmomtchev
Copy link
Contributor

This PR adds support for instantiation and further interaction with an embedded version of Node.js (libnode) entirely through Node-API.

Refs: #43516

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/gyp
  • @nodejs/node-api

@nodejs-github-bot nodejs-github-bot added lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Jun 22, 2022
doc/api/n-api.md Outdated
@@ -515,6 +515,8 @@ the currently running Agent which was set by means of a previous call to
`napi_set_instance_data()` will be overwritten. If a `finalize_cb` was provided
by the previous call, it will not be called.

Not compatible with `libnode`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would this mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That API call is not to be used when using libnode

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because you don't need it - in this case it is the C++ software that will create environments, so it doesn't need the ability to carry its own context in this structure - and because the instance_data pointer is used to carry the libnode context.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but that doesn’t mean that this call is not compatible with libnode or with embedded mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does because, before these new methods, there was no way to use Node-API with libnode at all.

doc/api/n-api.md Outdated
@@ -6267,6 +6269,95 @@ idempotent.

This API may only be called from the main thread.

## Using Node.js as a shared library (`libnode`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has nothing to do with shared library mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API allows using libnode through Node-API and it has no other uses.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s … wrong? Embedding and usage as a shared library are conceptually related but technically orthogonal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's call it Using an embedded version of Node.js

<!--introduced_in=REPLACEME-->

As an alternative, an embedded Node.js can also be fully controlled through
Node-API. This API supports both C and C++ through [node-addon-api][].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do this, we should probably start adding converters between some Node-API values and V8/Node.js C++ API values (napi_valueLocal<Value>, napi_envEnvironment*/Isolate*).

The reason nobody has implemented a Node-API embedder API yet is that as an embedder, you often need fine-grained control over a lot of V8 options, and it’s not very practical to mirror the entire V8/Node.js embedder API to C (and that the Node.js C++ embedder API itself still has some rough edges that should probably be ironed out first).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need to use raw V8 primitives then you can't use Node-API - this applies to both Node.js -> C++ (binary node addons) and to C++ -> Node.js (libnode). Still this API covers lots of use cases - mostly C++ software that needs to accept JS plugins. It does not cover Electron which has very particular needs.
Allowing for more fine-grained control over the initialisation is of course something that should be considered.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need to use raw V8 primitives then you can't use Node-API

Right, that’s why I’m bringing this up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you be able to mix V8 primitives and Node-API is another topic. If you take the ABI compatibility out of Node-API, there remains little use for it. However node-addon-api is a truly higher level API and it is much more practical to use from C++ then raw V8. Maybe there is a use case for software which has 95% node-addon-api and 5% custom V8 code. But this is something that goes beyond this PR - and it probably can be used both by embedders and addon authors.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe Cloudflare (service) Workers will need converters. Is this a Private API thing?

@addaleax addaleax added node-api Issues and PRs related to the Node-API. embedding Issues and PRs related to embedding Node.js in another project. labels Jun 23, 2022
Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would strongly encourage you to go through the current embedder API and understand how embedders use it. I would love to see a C embedding API, but since the goal of Node-API-style APIs is to provide long-term stability, it’s going to need to be an API that is designed to account for multiple usage patterns (e.g.: integration with libuv, usage with a node-addon-api-style C++ wrapper) and probably some kind of API versioning in advance (since Node.js and V8 embedding changes their API definitions much more frequently than JS changes as a language).

Comment on lines 905 to 909
{
int r = node::SpinEventLoop(node_env->node_env()).FromMaybe(1);
if (exit_code != nullptr) *exit_code = r;
node::Stop(node_env->node_env());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s going to be heavily counterintuitive that this is part of a destruction operation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you have drained the event loop, you have no other course of action then to destroy the environment - this is the reason I did it this way. There is no point in having a separate API call, if you will always have to call both of them one after the other. The real question is can we (and should we) keep an environment when its event loops is empty? This would allow to call async functions in a persistent environment but it will introduce a significant difference between the stand-alone Node.js and a embedded Node.js.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could want to take an action in response to the event loop being drained, though?

The real question is can we (and should we) keep an environment when its event loops is empty?

I think the answer to both questions here is yes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but this would require splitting node::SpinEventLoop into iteration and cleanup and it is something that is not possible with the current embedders' API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… but it is? The docs for SpinEventLoop say which steps it takes, and all of those are public API. SpinEventLoop() is a helper, not a building block, and exists only for convenience and stronger API stability for those for whom it is sufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I added napi_run_environment that spins up the event loop without destroying the environment when it is emptied and a unit test with an async callback.

@KevinEady
Copy link
Contributor

We discussed this PR in the 24 June 2022 Node API meeting. Our main concerns are:

  1. As @addaleax mentioned, people who are embedding Node into their application may require specific, fine-grained startup and configuration of the engine. We fear that creating any Node-API wrapper around engine initialization would bypass the ability to modify the engine startup properties as necessary. Furthermore, if there were engine-specific configurations, these concepts may not extend to other engines.
  2. This PR creates V8-specific objects like the platform. Other engines, eg. Hermes, does not have these concepts and it may be incorrect to label these types as such. Additions to Node-API must be platform-agnostic. We have some baseline requirements in Contributing a new API to Node-API.

As of now, Node does not provide any APIs that are "Node-engine specific", because our Node-API is meant to be engine-agnostic. The existing Node-specific APIs, eg. node::InitializeNodeWithArgs are available within the C++ language but not in an ABI-stable C language. It may be beneficial to create some APIs that are outside of the Node-API scope, but are still C-based and provide ABI stability in order to perform this embedding scenario, where we can ignore the platform-agnostic requirement, ie. inside a different header file.

@mmomtchev
Copy link
Contributor Author

@KevinEady

  1. I agree that this API would not cover all use cases - in particular Electron has very specific requirements - but if you look at these exchanges, you will see that even their team was interested in some form of ABI or at least ABI stability. Still, such an API will be sufficient for a very significant portion of the uses cases. What needs discussing is:
  • How can greater customisability of the engine startup be supported and where should the line be drawn
  • Should we support mixed use with conversions between v8::Local and napi
  1. I don't think that napi_platform is V8-specific - every engine would have a common main initialisation procedure and will support a single or many independent environments.

@KevinEady
Copy link
Contributor

We discussed in the 1 July 2022 Node API meeting, and we did want to bring up again the comment we had before:

It may be beneficial to create some APIs that are outside of the Node-API scope, but are still C-based and provide ABI stability in order to perform this embedding scenario, where we can ignore the platform-agnostic requirement, ie. inside a different header file.

There is a logical separation between "module extensions API" and "node hosting API" so it may be beneficial to separate this embedding API into a separate header.

Regarding:

Should we support mixed use with conversions between v8::Local and napi

Can you clarify this a bit?

@mmomtchev
Copy link
Contributor Author

Should we support mixed use with conversions between v8::Local and napi

Can you clarify this a bit?

Public API for retrieving the V8 reference from a napi* handle and a constructor for creating a napi* handle from a V8 reference. It goes against the Node-API logic, but it would allow extending Node-API without modifying it.

@mmomtchev
Copy link
Contributor Author

@KevinEady @addaleax @rvagg My goal is to be able to support installation of this (libnode + this API) along a NodeSource Node.js package

Basically, I have two choices:

  • Either libnode & libnode-dev are completely independent of nodejs and install headers to /usr/include/libnode
  • Either I install this single new header file /usr/include/node - and then I require that nodejs is present and it is a specific version

Whatever I do at this point will likely stay. Do you have any opinion?

@addaleax
Copy link
Member

addaleax commented Jul 5, 2022

@mmomtchev Two things:

  • I would not split header files from the existing header bundle, since the default for embedders will still be using the regular C++ APIs that are also used by addons
  • I would consider shipping the shared library variant of Node.js completely separate and independent from any embedding API additions or changes

@mmomtchev
Copy link
Contributor Author

Ok, this is what I currently have - libnode93 and libnode108 which can be installed independently of nodejs - all three can be installed at the same time - and libnode-dev which is mutually exclusive with nodejs - because both install /usr/include/node

If/when this PR goes through, libnode-dev won't be necessary anymore

@mmomtchev
Copy link
Contributor Author

@addaleax And do you have an idea for an elegant solution for having top-level functions appear in the global object the way they do in the REPL, besides vm.runInThisContext()? Ideally I would like to hide this from the end user.

@addaleax
Copy link
Member

addaleax commented Jul 6, 2022

@mmomtchev If somebody wants that, they can just do the same thing the REPL does to make that happen, right?

@mmomtchev
Copy link
Contributor Author

mmomtchev commented Aug 7, 2022

@addaleax I have a problem with restarting the event loop since NodePlatform::DrainTasks :

void NodePlatform::DrainTasks(Isolate* isolate) {

can be run only once per isolate. If I don't drain the tasks, some pending Promises (the dynamic import for example) do not get resolved. If I drain them, they do not get rescheduled. This is valid even when using the existing embedders' API. Do you have any opinion about it?

UPDATE: It is because node::PerIsolatePlatformData::FlushTasks is not called when using the embedders' API
UPDATE2: I found my problem (a resolved Promise does not keep the event loop from exiting) - but from having spent so much time in this part of the code, I think that all that FlushTasks/DrainTasks mess is completely useless. It is there for only one reason - because the flush_tasks_ async handle is unrefed to not prevent the event loop from exiting. This mechanism is called only a handful of times and if it was calling async_init / async_close every time, all those special edge cases wouldn't have been needed.

@mhdawson
Copy link
Member

mhdawson commented Sep 20, 2022

I was experimenting with using Node.js to support WASM containers. For that I needed to use libnode with crun. Unfortunately crun is current a C project and integrating the C++ API from libnode was going to be difficult (if crun would even consider moving to C++ versus C).

The model is to dlsym the required methods as crun support multiple engines and only wants to load the shared libraries needed.

With this PR is was relatively straight forward to get it working by using the new methods to start/get an env and then existing node-api methods to create and start a script to run.

At this for the simple run a script case it seemed to be enough. was experimenting with using Node.js to support WASM containers. For that I needed to use libnode with crun. Unfortunately crun is current a C project and integrating the C++ API from libnode was going to be difficult (if crun would even consider moving to C++ versus C).

The model is to dlsym the required methods as crun support multiple engines and only wants to load the shared libraries needed.

At this for the simple run a script case it seemed to be enough.

You can see what it used here: https://github.com/mhdawson/crun/blob/node-wasm-experiment/src/libcrun/handlers/wasmtime.c

@mmomtchev mmomtchev force-pushed the napi-libnode branch 5 times, most recently from f0b96b4 to 33b170d Compare October 10, 2022 20:14
@@ -102,6 +102,34 @@ node_api_symbol_for(napi_env env,
size_t length,
napi_value* result);
#endif // NAPI_EXPERIMENTAL

#ifdef NAPI_EMBEDDING
NAPI_EXTERN napi_status NAPI_CDECL napi_create_platform(int argc,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the Node-API is split up between two files: js_native_api.h and node_api.h.
The difference is that the js_native_api.h contains only cross-engine JS-specific APIs and node_api.h has Node.JS-specific APIs.
We use js_native_api.h in different scenarios where we want to interact with different JS engine using ABI-safe Node-API functions. For example, see the Babylon Native, Hermes Node-API or V8-JSI. In all these scenarios we implement js_native_api.h APIs, and we do not use Node.JS or Node-specific node_api.h. Even the js_native_api_v8.h and js_native_api_v8.cc must be free from any Node.JS specific code. Any interaction with the Node-JS specific code must be done either through the functions declared in the js_native_api_v8_internals.h or overriding virtual napi_env__ in the derived node_napi_env__ class in the node_api_internals.h.
The node_api.h has the Node.js specific functions and mostly contains code concerning Node-JS modules and asynchronous code.
From this perspective, the Node.JS embedding API must not be part of the js_native_api.h header file. It could be in the node_api.h, but since this file is to be used by modules, the best option is to put these new APIs in their own node_hosting_api.h or node_embedding_api.h file. Other applications, such as Deno, could implement the full support for node_api.h and js_native_api.h to reuse the Node-API modules, but they would never implement the same Node.JS embedding APIs.
These embedding APIs are Node.JS specific and cannot be re-used for other engines or applications. E.g. other engines do not create a thread pool on their own, or cannot handle Node.JS CLI arguments,
It could be either pa

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vmoroz I moved everything to a new header file, however I would just like to point out that if all those other engines do use N-API, then they would eventually support embedding, and this could become an universal API.
Currently options are passed as a C array of strings, which is an universal method - even if the options will probably be different - and the only one which is Node.js-centric is the thread_pool_size which probably should get transformed to something else before the API is set in stone.

napi_destroy_platform(napi_platform platform);

NAPI_EXTERN napi_status NAPI_CDECL
napi_create_environment(napi_platform platform,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that the term Environment is somehow overused in Node.JS.
The napi_env is different from the Node.JS Environment.
We create one instance of napi_env per each native module. It allows using the napi_env instance settings and other flags per module. Each Node.JS environment may have zero or more napi_env instances depending on how many Node-API modules it loads.
It is important to clarify what type of environment the napi_create_environment function creates. I bet it is not the napi_env, but rather a moniker to the Node.JS Envrironment, which I guess must have a different type such as napi_node_instance, napi_node_env, or something else along those lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

napi_create_environment creates a napi_env instance linked to a Node Environment containing an additional context pointer (set_embedded/get_embedded). In theory (one must take care to register the cleanup only once) there can also be multiple napi_env instances for a single embedded Environment - but I don't see any reason why.

The whole point of the NAPI embedding is that it allows reusing of the NAPI functions for the embedded environment. So whatever type I return, that type must remain compatible with all the NAPI functions. In the C++ API, I return a different type (inherited from the basic Node::Env), but I don't see any practical way to do this in C. If I return a different type, you will have to cast it every time you invoke a NAPI function.

I don't understand the problem with the current situation - every napi_env is linked to a single Environment. When embedding, it is the Environment that is slightly different, not the napi_env.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I name this function napi_create_instance but still return a napi_env, does this address your concern?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or there is the quite cumbersome option:

napi_embedded_instance inst;
napi_create_instance(platform, NULL, NULL, &inst);
napi_value global;
napi_get_global(napi_instance_env(inst), &global);

It is ugly in C, but it will be covered up in C++

@mmomtchev
Copy link
Contributor Author

@vmoroz @gabrielschulhof @addaleax
I see that now there is a node::InitializeOncePerProcess that does not return an array of errors - something that didn't really work before anyways - and takes only a single list of CLI arguments instead of the argv/exec_argv separation. This means that napi_create_platform_version must change accordingly.

@mmomtchev
Copy link
Contributor Author

I adapted this to the latest changes in the embedding initialization which impacted my API as well - napi_create_platform and napi_create_environment . I opened #49512 and #49513 for the two unit test failures which are not related to this PR.

@samardzicneo
Copy link

May one ask what the holdup is on this PR? Has OP lost interest? I'm part of a project that relies on this API and we're not too keen on using an unofficial fork and having to build Nodejs manually for all the platforms we support.

@mhdawson
Copy link
Member

@samardzicneo do you have cycles to help get it across the line? I think @vmoroz also showed interest in moving this forward. @mmomtchev are you still going to have cycles or would it be good to have either @samardzicneo, @vmoroz or both try to move it forward?

@samardzicneo
Copy link

@samardzicneo do you have cycles to help get it across the line? I think @vmoroz also showed interest in moving this forward. @mmomtchev are you still going to have cycles or would it be good to have either @samardzicneo, @vmoroz or both try to move it forward?

Really, I'd love to do so, especially since I need this for a project of mine, but honestly, I don't trust my C/C++ skills enough to be sending PRs to Node 😅

Believe me, you don't want my messy code anywhere near your runtime.

@vmoroz
Copy link
Member

vmoroz commented Mar 8, 2024

@mmomtchev , I have moved the embedded code out of js_native_api_v8.cc to the new file node_api_embedding.cc. The main reason is that we support a scenario where the js_native_api_v8.cc can be used outside of Node.js code on its own.
(In node_api_embedding.cc line 81 is changed to have an explicit static_cast to suppress MSVC warning for int64_t to int32_t conversion).
Please review the change.

// and the v8::locker
delete emb_env;

cppgc::ShutdownProcess();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call to ShutdownProcess() prevents another environment instance from being created in the same process. Specifically, a second call to napi_create_environment() crashes the process:

# Check failed: (g_page_allocator) != nullptr.

I found this because the node-api-dotnet project runs multiple embedding test cases in a process, and each test case uses a separate environment instance. With this line removed, the latest PR revision works well.

Was there a reason to add this call to ShutdownProcess()?

@vmoroz
Copy link
Member

vmoroz commented Mar 8, 2024

@mmomtchev , the code is rebased on the latest main code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
embedding Issues and PRs related to embedding Node.js in another project. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. node-api Issues and PRs related to the Node-API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet