Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

worker: improve integration with native addons (take II) #26175

Closed
wants to merge 11 commits into
base: master
from

Conversation

Projects
None yet
7 participants
@addaleax
Copy link
Member

addaleax commented Feb 17, 2019

Allow loading add-ons from multiple Node.js instances if they are
declared context-aware; in particular, this applies to N-API addons.

Also, plug a memory leak that occurred when registering N-API addons.

Refs: #23319
Fixes: #21481
Fixes: #21783
Fixes: #25662
Fixes: #20239

/cc @nodejs/workers @nodejs/n-api

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Show resolved Hide resolved doc/api/worker_threads.md Outdated
@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 18, 2019

@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 19, 2019

Should have fixed the linter and Windows issues. New CI: https://ci.nodejs.org/job/node-test-pull-request/20895/

This might require some debugging on AIX.

@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 19, 2019

I'll try to look into AIX, but I might have to request ssh access for that.

@gireeshpunathil Do you know anything off the top of your head that would lead to AIX-specific here? Maybe dlopen() returns different pointers when loading the same shared object twice, unlike other systems?

@gireeshpunathil

This comment has been minimized.

Copy link
Member

gireeshpunathil commented Feb 20, 2019

@addaleax: yes, AIX returns a unique value each time dlopen is called on the same object.

excerpt from the manual:

       The dlopen subroutine loads the module specified by FilePath into the executing process's
       address space. Dependents of the module are automatically loaded as well. If the module is
       already loaded, it is not loaded again, but a new, unique value will be returned by the dlopen
       subroutine.

cat dl.cc

#include <dlfcn.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
  for(int i = 0; i < 10; i++) {
    void* addr = dlopen(argv[1], RTLD_NOW | RTLD_GLOBAL);
    if (addr == NULL) {
      fprintf(stderr, "error loading the lib\n");
      return -1;
    }
    fprintf(stderr, "address %d: %p\n", i, addr);
 }
}

Linux:

./a.out foo.so

address 0: 0x1ede040
address 1: 0x1ede040
address 2: 0x1ede040
address 3: 0x1ede040
address 4: 0x1ede040
address 5: 0x1ede040
address 6: 0x1ede040
address 7: 0x1ede040
address 8: 0x1ede040
address 9: 0x1ede040

AIX:
./a.out foo.so

address 0: 3
address 1: 4
address 2: 5
address 3: 6
address 4: 7
address 5: 8
address 6: 9
address 7: a
address 8: b
address 9: c

Let me know if you want any other specific info, or assistance.

@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 20, 2019

@gireeshpunathil Thanks, that is indeed very helpful. :)

I think what I’ll attempt to do for AIX is to implement wrappers for dlopen() that provides the same return value when re-loading an add-on, based on running stat() and checking the dev + ino values. It’s not perfect, but I think it should make this work.

If you have other ideas, I’d be happy to hear them.

@gireeshpunathil

This comment has been minimized.

Copy link
Member

gireeshpunathil commented Feb 20, 2019

@addaleax - I think that is a good approach. Other way is to use loadquery API to get the (long) list of loaded modules, search through that... I guess your approach is more simple and elegant.

@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 20, 2019

@gireeshpunathil I’ve implemented that … it still does look a bit hacky (although probably less than a loadquery approach). I’m wondering if it makes sense to put the extra code into a separate file?

Either way, if you have the bandwidth I would definitely appreciate a review from you on this PR :)

CI: https://ci.nodejs.org/job/node-test-pull-request/20902/

@addaleax addaleax force-pushed the addaleax:worker-addons-ii branch from 7dae4e8 to 2a474aa Feb 20, 2019

@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 20, 2019

CI on musl should be fixed now, too. 🤞

CI: https://ci.nodejs.org/job/node-test-pull-request/20908/

@gireeshpunathil

This comment has been minimized.

Copy link
Member

gireeshpunathil commented Feb 20, 2019

@addaleax - sure, will do. Need some time to go through the linked issues and previous PRs.

addaleax added some commits Feb 17, 2019

worker: improve integration with native addons
Allow loading add-ons from multiple Node.js instances if they are
declared context-aware; in particular, this applies to N-API addons.

Also, plug a memory leak that occurred when registering N-API addons.

Refs: #23319

@addaleax addaleax force-pushed the addaleax:worker-addons-ii branch from d1c6258 to 9c0f9d3 Feb 20, 2019

@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 20, 2019

@gireeshpunathil

This comment has been minimized.

Copy link
Member

gireeshpunathil commented Feb 21, 2019

@addaleax - I am unable to connect some links, so thought I will ask:

  • Error: Module did not self-register: is the base problem right? how (where) are we addressing it (line numbers in the patch will help)

We are now managing the referencing count of loaded libraries.

Are these maintained across:

  • main thread (worker-less) loading the same library multiple times
    • repeated invokion of process.dlopen ends up only reference count increments
    • repeated unloads are ignored but reference adjusted. When reach 0, really unload
    • further requests are not honored
  • same scenarios, but in worker's case
  • worker loading libraries that were loaded by main
    • maintain the count within worker's scope, reset to the baseline upon exit
  • same scenarios, but between 2 or more workers without main's involvement
    ?
@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 21, 2019

@gireeshpunathil Thanks for taking a look!

@addaleax - I am unable to connect some links, so thought I will ask:

* `Error: Module did not self-register`: is the base problem right? how (where) are we addressing it (line numbers in the patch will help)

Yes, that’s the error message that people currently get when they try to load a Node.js addon that uses the __attribute__((constructor)) approach for loading twice inside a process.

We are now managing the referencing count of loaded libraries.

For AIX, yes. Other operating systems/standard libraries already implement dlopen() using reference counting, so we don’t have to take care of that. (musl is the other outlier here, with not actually providing a functional dlclose() at all.) So, the whole #ifdef _AIX block is only there to align behaviour with other platforms, it’s not introducing new functionality.

Are these maintained across:

  • main thread (worker-less) loading the same library multiple times

    • repeated invokion of process.dlopen ends up only reference count increments

That’s correct. (As mentioned above, that’s already the case on almost all platforms). The core of this patch is the fact that, based on the handle returned from dlopen(), we store the node_module* pointer, and put a reference count on that as well.

  • repeated unloads are ignored but reference adjusted. When reach 0, really unload

That’s true, although we can’t realistically unload addons that were loaded on the main thread for backwards compatibility (that’s what #25577 fixed).

  • further requests are not honored

We should not be attempting to unload addons more times than we’ve loaded them, so I’m not sure if I’m understanding this correctly?

  • same scenarios, but in worker's case

  • worker loading libraries that were loaded by main

    • maintain the count within worker's scope, reset to the baseline upon exit

The reference counts, both for the shared object handle (whether we maintain that ourselves on AIX, or libc does that for us) and for the node_module* pointer, are both global/per-process, and not scoped to Workers or the main thread.

  • same scenarios, but between 2 or more workers without main's involvement

Hm – I’m not sure I understand what you’re asking here?

@gireeshpunathil

This comment has been minimized.

Copy link
Member

gireeshpunathil commented Feb 21, 2019

same scenarios, but between 2 or more workers without main's involvement

I mean to say, a dll was loaded by a worker, but a second worker that comes up starts loading the same. I guess I got the answer as, it is nothing different than a main - worker scenario. Am I right?

re: AIX, does this fact make anything better for you? i.e, AIX knows (internally) how many times the library load was attempted, and the unload unrolls that count. I closed more than I opened just to show that it actually remembers what it opened.

$ cat main.cc

#include <dlfcn.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
  int* cache[3];
  for(int i = 0; i < 3; i++) {
    cache[i] = (int*) dlopen(argv[1], RTLD_NOW | RTLD_GLOBAL);
    if (cache[i]== NULL) {
      fprintf(stderr, "error loading the lib\n");
      return -1;
    }
    fprintf(stderr, "index: %d, handle: %p\n", i, (void*) cache[i]);
 }
  for(int i = 0; i < 5; i++) {
    fprintf(stderr, "%d\n", dlclose((void*) cache[i % 10]));
  }
}

$ ./a.out ./libfoo.so

index: 0, handle: 3
index: 1, handle: 4
index: 2, handle: 5
0
0
0
22
22
@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 21, 2019

same scenarios, but between 2 or more workers without main's involvement

I mean to say, a dll was loaded by a worker, but a second worker that comes up starts loading the same. I guess I got the answer as, it is nothing different than a main - worker scenario. Am I right?

Yeah, the only difference that Worker threads vs the main thread makes is that the main thread doesn’t unload addons.

So that’s why in the test here, we’re seeing 2 full load + unload cycles when using 2 Workers, but only 1 call to static constructors/destructors when using the main thread and a Worker thread.

re: AIX, does this fact make anything better for you? i.e, AIX knows (internally) how many times the library load was attempted, and the unload unrolls that count. I closed more than I opened just to show that it actually remembers what it opened.

I’m not sure… what we need is a fixed key for caching the node_module* for a given shared object. Would *(int*)handle be that? I don’t want to run into problems relying on AIX’s internals, given how much trouble we’ve been having with people hooking into our internals 😄

@gireeshpunathil

This comment has been minimized.

Copy link
Member

gireeshpunathil commented Feb 21, 2019

@addaleax - agreed. we can't rely on undocumented opaque variables. And those handles are not even native addresses, even to attempt one.

@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 21, 2019

@gireeshpunathil Thank you!

@jasnell Does your approval still stand? The PR has changed a bit since you approved, but otherwise it should be ready. :)

@richardlau

This comment has been minimized.

Copy link
Member

richardlau commented Feb 21, 2019

In case anyone wants to reference the AIX documentation for dlopen() quoted by @gireeshpunathil, it can be found: https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.basetrf1/dlopen.htm

@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Feb 22, 2019

Landed in 8ebd339 🎉

@addaleax addaleax closed this Feb 22, 2019

@addaleax addaleax deleted the addaleax:worker-addons-ii branch Feb 22, 2019

addaleax added a commit that referenced this pull request Feb 22, 2019

worker: improve integration with native addons
Allow loading add-ons from multiple Node.js instances if they are
declared context-aware; in particular, this applies to N-API addons.

Also, plug a memory leak that occurred when registering N-API addons.

Refs: #23319

PR-URL: #26175
Fixes: #21481
Fixes: #21783
Fixes: #25662
Fixes: #20239
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>

addaleax added a commit that referenced this pull request Feb 25, 2019

worker: improve integration with native addons
Allow loading add-ons from multiple Node.js instances if they are
declared context-aware; in particular, this applies to N-API addons.

Also, plug a memory leak that occurred when registering N-API addons.

Refs: #23319

PR-URL: #26175
Fixes: #21481
Fixes: #21783
Fixes: #25662
Fixes: #20239
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>

BridgeAR added a commit that referenced this pull request Feb 26, 2019

2019-02-26, Version 11.11.0 (Current)
Notable changes:

* worker:
  * Improve integration with native addons
    (#26175)

@BridgeAR BridgeAR referenced this pull request Feb 26, 2019

Merged

v11.11.0 proposal #26322

BridgeAR added a commit that referenced this pull request Feb 26, 2019

2019-02-26, Version 11.11.0 (Current)
Notable changes:

* worker:
  * Improve integration with native addons
    (#26175)

rvagg added a commit that referenced this pull request Feb 28, 2019

worker: improve integration with native addons
Allow loading add-ons from multiple Node.js instances if they are
declared context-aware; in particular, this applies to N-API addons.

Also, plug a memory leak that occurred when registering N-API addons.

Refs: #23319

PR-URL: #26175
Fixes: #21481
Fixes: #21783
Fixes: #25662
Fixes: #20239
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>

BridgeAR added a commit that referenced this pull request Mar 5, 2019

2019-03-05, Version 11.11.0 (Current)
Notable Changes

* n-api:
  * Implement date object (Jarrod Connolly)
    (#25917)
* util:
  * Group array elements together (Ruben Bridgewater)
    (#26269)
  * Add compact depth mode (Ruben Bridgewater)
    (#26269)
* worker:
  * Improve integration with native addons (Anna Henningsen)
    (#26175)

BridgeAR added a commit that referenced this pull request Mar 5, 2019

2019-03-05, Version 11.11.0 (Current)
Notable Changes

* n-api:
  * Implement date object (Jarrod Connolly)
    (#25917)
* util:
  * Group array elements together (Ruben Bridgewater)
    (#26269)
  * Add compact depth mode (Ruben Bridgewater)
    (#26269)
* worker:
  * Improve integration with native addons (Anna Henningsen)
    (#26175)

BridgeAR added a commit that referenced this pull request Mar 5, 2019

2019-03-06, Version 11.11.0 (Current)
Notable Changes

* n-api:
  * Implement date object (Jarrod Connolly)
    (#25917)
* util:
  * Group array elements together (Ruben Bridgewater)
    (#26269)
  * Add compact depth mode (Ruben Bridgewater)
    (#26269)
* worker:
  * Improve integration with native addons (Anna Henningsen)
    (#26175)

BridgeAR added a commit that referenced this pull request Mar 5, 2019

2019-03-06, Version 11.11.0 (Current)
Notable Changes

* n-api:
  * Implement date object (Jarrod Connolly)
    #25917
* util:
  * Add compact depth mode for `util.inspect()` (Ruben Bridgewater)
    #26269
* worker:
  * Improve integration with native addons (Anna Henningsen)
    #26175
  * Use fake MessageEvent for port.onmessage (Anna Henningsen)
    #26082

BridgeAR added a commit that referenced this pull request Mar 5, 2019

2019-03-06, Version 11.11.0 (Current)
Notable Changes

* n-api:
  * Implement date object (Jarrod Connolly)
    #25917
* util:
  * Add compact depth mode for `util.inspect()` (Ruben Bridgewater)
    #26269
* worker:
  * Improve integration with native addons (Anna Henningsen)
    #26175
  * MessagePort.prototype.onmessage takes arguments closer to the Web
    specification now (Anna Henningsen)
    #26082

Trott pushed a commit to Trott/io.js that referenced this pull request Mar 6, 2019

2019-03-06, Version 11.11.0 (Current)
Notable Changes

* n-api:
  * Implement date object (Jarrod Connolly)
    nodejs#25917
* util:
  * Add compact depth mode for `util.inspect()` (Ruben Bridgewater)
    nodejs#26269
* worker:
  * Improve integration with native addons (Anna Henningsen)
    nodejs#26175
  * MessagePort.prototype.onmessage takes arguments closer to the Web
    specification now (Anna Henningsen)
    nodejs#26082
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.