New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Asynchronous execution for the Node-API generator #2654
base: master
Are you sure you want to change the base?
Conversation
Not that what I say would actually matter, but I'd like to argue that it would be appropriate to leave synchronous method names as they are and add a suffix to their asynchronous counterparts. As opposed to suffixing both names that is. I mean given a |
I don't quite follow. Is it correct understanding that you suggest per-object locks? And the concern is that programmers would pass the same object to asynchronous methods that would execute in parallel. Well, in such a case I'd say it should be on a programmer. It's their direct responsibility to know what is appropriate and I for one see no point in "helping" them in the such manner. Because such "help" would simply serialize execution of the asynchronous method, as if it's synchronous. In other words it would simply mask the problem from the user and create a false illusion that they know what they're doing. The keyword is "false illusion," which is a bad thing, at least in my "book." If anything, it might be more helpful to detect the race and terminate (or throw an exception). Well, maybe I've misread, not to mention that it might be just me:-) Just in case, it should be noted that detecting a race is not universally appropriate, in the sense that it might be safe to share an object among multiple threads, programmer should know. |
By programmer, do you mean the end user of the module or the interface/wrapper author? A JS programmer does not expect random crashes and memory corruption if he doesn't implement locking himself - in fact, there isn't even a standard way to implement locks in JS, since all functions are atomic (interrupted only by eventual The goal of the locking mechanism is to allow concurrent execution when possible, but to serialize it when needed. Normally, this decision should be made by the wrapper author, who should be aware if the wrapped C++ library supports concurrent calls on the same object or function. I think that as mostly every author will have to implement some kind of locking mechanism, it is best if SWIG offered some basic feature to facilitate, if not completely automate, this task. |
This should be feature-complete. Finally, I went for an opt-in approach to locking. Either through a Next step would be to test this implementation in |
Both. Script developer has to know something. Module provider obviously has to know everything.
Right. And earlier I tried to say that the author is [arguably] entitled to express the choice in the method name, the I feel some dissonance. It's likely to be rooted in the fact that I'm not thinking about Javascript in the "right" way, but nevertheless. Taking your Pi example. It's perfectly "reentrant" and requires no locking whatsoever. (I'm writing "reentrant" in quotes, because normally the term applies to functions/methods, not objects.) So we have to imagine a "non-reentrant" object. The said property would necessarily mean that it's mutating in the process of the method execution, right? Now, if serializing the execution is meaningful, does it mean that the said mutations take the object to its original state? I mean the next method invocation surely expects the object to be in a specific state. In which case wouldn't it be more appropriate to make the script developer create multiple objects? As opposed to the behind-curtains serialization that is. Or are we talking about a larger scratch area that is not practical to allocate on a per-object basis? Either way, my point is that it's too application-specific to paint with the same brush. And I'd argue the module provider should have the tools to express all the necessary nuances... |
Even if the underlying C++ method is not reentrant at all - because it uses static data - there still would be a significant benefit to having an async method - which is not blocking the event loop. The main use of Node.js is as an application server coupled with Express.js (or other similar Express-inspired middleware - there are so many of them now). An async method will allow Node.js to serve other clients while some kind of processing, be it I/O or CPU, is running. Then, there are the methods which are not reentrant only when called on a single object. |
One thing that is a problem with the current implementation are the generic swig/Lib/typemaps/swigtype.swg Line 515 in 0d203fc
These cannot construct the value that is going to be thrown (they have to throw a
PS. Found a very elegant solution. |
Finally, I was able to implement using https://gist.github.com/mmomtchev/0544f11b1ff104c7b5498e6aed8cbf9b What does not work is catching a generic exception and retrieving its error message - the error message is lost when copying a generic |
@dot-asm @ojwb @wsfulton I just got a very good taste of the problems with automatic copying of variables in lambda functions on Windows and I am reconsidering the advanced asynchronous throwing. The problem with asynchronous throwing is that something must cross the thread boundary. It can't be a V8 object. And it seems that it is not a very good idea to do it with the exception object either. At least in my case, the MSVC compiler/linker/runtime do not properly allocate/destroy a captured local exception depending on whether the constructor lives across a library boundary or not. As you may know, on Windows, shared libraries can have a separate heap and it seems that MSVC mishandles the destruction in some cases such as this one: This particular example works for |
Two problems. Without |
The standalone example is missing I will try to check what is happening, maybe there is something more than just |
@dot-asm yes, indeed I think that ImageMagick is built without it and this is the reason why moving the class out of it fixes the problem. I will look into it, thanks. |
Alas this fixes the isolated reproduction, but not the bindings themselves - they already use I managed to get ASAN running with MSVC in a Node addon and I found the culprit - when creating the lambda, the copy constructor of the captured object writes after the end of the allocated area. I am baffled how this is possible, so I am still convinced that this is indeed a compiler problem. |
Ok, this has nothing to do with SWIG, |
since we are telling the users to define it themselves
alternative solution to swig#2675 (since Node-API require C++11 anyway)
Ok, clang on macOS has a somewhat similar alignment problem. Because of the custom alignment that ImageMagick types use, It is an edge case but now it is obvious that this will cause even more trouble in the future. I will try to preserve the feature but remove the lambda. |
* No more throwing of capturing lambdas * Uses the right C++ tool for the job * emit_action overhaul commit 0302ddc Author: Momtchil Momtchev <momtchil@momtchev.com> Date: Sat Sep 16 17:39:06 2023 +0200 deleting a hash deletes its members too commit b4adbbe Author: Momtchil Momtchev <momtchil@momtchev.com> Date: Sat Sep 16 17:12:12 2023 +0200 fix double dealloc of trethrow commit 6b88e0c Author: Momtchil Momtchev <momtchil@momtchev.com> Date: Sat Sep 16 17:12:00 2023 +0200 tidy up commit af16ad4 Author: Momtchil Momtchev <momtchil@momtchev.com> Date: Sat Sep 16 17:07:17 2023 +0200 free all the allocations commit 36a0951 Author: Momtchil Momtchev <momtchil@momtchev.com> Date: Sat Sep 16 15:55:40 2023 +0200 fix the special_variables test commit 6f51103 Author: Momtchil Momtchev <momtchil@momtchev.com> Date: Sat Sep 16 15:48:46 2023 +0200 fix variable expansion commit 15ea779 Author: Momtchil Momtchev <momtchil@momtchev.com> Date: Sat Sep 16 13:54:55 2023 +0200 fix %exception and restore swig_exception.i to its former glory commit 2c4db7c Author: Momtchil Momtchev <momtchil@momtchev.com> Date: Sat Sep 16 12:30:35 2023 +0200 fix compilation with exceptions disabled commit 25f3091 Author: Momtchil Momtchev <momtchil@momtchev.com> Date: Sat Sep 16 12:05:41 2023 +0200 second attempt This time the code uses the right tools for the job Exceptions are saved in an exception_ptr The exception_ptr crosses the thread boundary The user code handler is placed in a separate function The exception_ptr is rethrown in the V8 thread Donwnsides: * Overhaul of emit_action() in emit.cxx * Bypassing the NAPI error handling commit d75d839 Author: Momtchil Momtchev <momtchil@momtchev.com> Date: Fri Sep 15 17:16:08 2023 +0200 async exceptions w/o lambdas
I refactored the asynchronous exceptions around This is a much better solution, the only downside is that I had to refactor |
There are still some unit tests that haven't been updated, but this is advanced enough to start discussing it.
This implementation is quite different from the first one which relied on lambdas - this one is C++03 compatible and has a every elegant solution to the problem of copying the local variables. All local variables in the wrapper methods are now members of a class that inherits from
Napi::AsyncWorker
. This allows reusing of the existing code. The three phases of an async method - sync input arguments parsing on the main thread, async execution in a worker thread and sync generation of the resulting JS objects back on the main thread - are now implemented as class methods. The class deletes itself once the operation is complete. This way, all existing typemaps can be used in async mode without any modification. Async mode can be enabled either by using a%feature
or it can simply replace sync mode when using the-async
CLI option. In this case, all unit tests (so far) work without changing anything besides addingawait
and eventually fixing the JS to be strict mode compliant, as top-levelawait
requires it.Points that should be discussed:
INPUT
typemaps that are used with%apply
. My experience is that most C++ libraries expect that their major objects' methods are not reentrant - like GDAL (Dataset
), ImageMagick (Image
) and Eigen (Matrix
) - so locking will have to be implemented most of the timeAsync
/Sync
interact with renaming? Currently, renaming is applied before the suffix@dot-asm
AFAIK this is the first SWIG PR to be developed entirely on solar power