-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crash in v8::ArrayBuffer::GetBackingStore #54
Comments
I have a more elaborate stack trace, if that helps - that shows
Could this be due to gc (incorrectly) collecting the Buffer object as it is not referenced by anyone in the JS space? |
chatting with @mhdawson , it was revealed that an entry for the backing store was found in the global backing store registery, where it was not expecting one, is how the crash is materializing. I am not good in interpreting the trace log of
and searching backwards, I found a matching registration for a huge chunk:
|
After incrementally adding a bunch of instrumentation it looked be related to WritePointer being called as that seemed to be what was always on the stack when the crash occurred. (EDIT I see Gireesh pointed that out above, but I'd started down the instrumentation path before then). In writePointer exports.writePointer = function writePointer (buf, offset, ptr) {
debug('writing pointer to buffer', buf, offset, ptr);
exports._writePointer(buf, offset, ptr);
exports._attach(buf, ptr);
}; _attach tries to avoid the buffer to which the pointer refers being freed before the buffer to which the pointer was written. However I seem to remember @addaleax mentioning that it was difficult to ensure that the cleanup for the buffer to which the pointer was written was complete before the reference buffer was freed. Currently it only ensures that both buffers become eligible to be collected and their finalizes run at the same time but does not ensure which order that will happen. This patch will ensure that the buffer to which the pointer refers will only become eligible to be collected AFTER the finalizers for the buffer to which the pointer was written have been run. It seems to avoid the crash: diff --git a/src/binding.cc b/src/binding.cc
index d1fefbd..2cbc5f7 100644
--- a/src/binding.cc
+++ b/src/binding.cc
@@ -355,6 +355,17 @@ void WritePointer(const CallbackInfo& args) {
if (input.IsNull()) {
*reinterpret_cast<char**>(ptr) = nullptr;
} else {
+ // create a napi reference and finalizer to ensure that
+ // the buffer whoes pointer is written can only be
+ // collected after the finalizers for the buffer
+ // to which the pointer was written have already run
+ napi_ref* _ref = new napi_ref;
+ napi_create_reference(env, args[2], 1, _ref);
+ args[0].As<Object>().AddFinalizer([](Env env, napi_ref* ref) {
+ napi_delete_reference(env, *ref);
+ delete ref;
+ }, _ref);
+
char* input_ptr = GetBufferData(input);
*reinterpret_cast<char**>(ptr) = input_ptr;
} There are other places that use _attach so this might not be the full fix, but I thought I'd post share what I've figured out so far. |
Ok the same thing but using node-addon-api versus dropping down into node-api c calls: diff --git a/src/binding.cc b/src/binding.cc
index d1fefbd..7d8906e 100644
--- a/src/binding.cc
+++ b/src/binding.cc
@@ -355,6 +355,16 @@ void WritePointer(const CallbackInfo& args) {
if (input.IsNull()) {
*reinterpret_cast<char**>(ptr) = nullptr;
} else {
+ // create a node-api reference and finalizer to ensure that
+ // the buffer whoes pointer is written can only be
+ // collected after the finalizers for the buffer
+ // to which the pointer was written have already run
+ Reference<Value>* ref = new Reference<Value>;
+ *ref = Persistent(args[2]);
+ args[0].As<Object>().AddFinalizer([](Env env, Reference<Value>* ref) {
+ delete ref;
+ }, ref);
+
char* input_ptr = GetBufferData(input);
*reinterpret_cast<char**>(ptr) = input_ptr;
} I'm going to open a PR and discussion on whether it is the right fix can continue there. |
Keep a buffer written by WritePointer alive until the finalizers for the buffer to which the pointer has been run have been executed: Refs: node-ffi-napi#54 Signed-off-by: Michael Dawson <mdawson@devrus.com>
Created PR: #55 |
is my issue any relatable to this one @gireeshpunathil @mhdawson here is the issue |
@prabhatmishra33 I see this on the stack in the issue you posted Since it's in production it might be hard to try the change in #55, but that would be the best way to confirm one way or the other. |
* src: keep reference buffer alive longer Keep a buffer written by WritePointer alive until the finalizers for the buffer to which the pointer has been run have been executed: Refs: #54 Signed-off-by: Michael Dawson <mdawson@devrus.com> * limit change to writePointer Update to avoid changing behaviour for _writePointer and limit change to writePointer where objects were already being associated with the buffer. Signed-off-by: Michael Dawson <mdawson@devrus.com> * remove _attach call in writePointer remove _attach call in writePointer as it is no longer needed since a stronger version is done in the native code when true is passed as the fourth parameter
Hi we still see the crash with ref-napi 3.0.2 The code we use looks like this, it essentially creates an object and releases it afterwards:
I'll try to simplify code probably. The crash is not stable it occurs at different iterations, so likely a threading issue. The backtrace looks like this:
The backingstore trace is attached too. Still looking on this issue, but maybe you have some ideas. I don't think the patch above solved the problem. |
Another backtrace with WritePointer instead of FFICall
|
It was confirmed to have solved the test cast that @gireeshpunathil had provided and the larger real-world case that it was derived from. You might have a different case that triggers a similar issue. |
@mhdawson Looking on original problem I see it is about the fact that you try to create the buffer for the same address twice. The patch deals with garbage collection but from what I see the problem is not related to garbage collection at all. What happens is that during the program execution we allocate a chunk of memory and store it as a buffer. Than we free this chunk and allocate again. If by accident new chunk has the same address as before and we have the collision of addresses. Old buffer is not garbage collected and not removed from memory. From what I see there is no method in ref-napi to release an pointer and free corresponding buffer. If introduced I could use that method to release Buffer object when I free memory chunk. Another solution I see is not to use NAPI buffers to store pointers at all, feels like they are totally broken. |
Also take a note from backtrace that second chunk which conflicts with the first one is allocated as an argument for function calls, not as an object. Maybe additional check is missing there. |
Here is more detailed backtrace with more intermediate calls between. See that
|
After investigation looks like GetBufferData should use pointer_to_orig_buffer map instead of creating arraybuffer directly on backingstore. |
Simplified version that crashes quickly:
|
Essentially the same discussion as in nodejs/node#33321 |
Also #47 ? |
I could have used weak-napi
but this also crashes:
|
@addaleax any comment please? |
@nshmyrev I’ll get to this when I can. If you have a good understanding of what’s happening, you can also open a PR, which would speed things up a bit. |
Great thank you. For now my proposal is to use separate class for wrapping pointers, not Buffer. Something like PointerBuffer with separate memory management model avoiding all BackingStore issues. We won't need centralized map then too and the logic will be much more straightforward. |
@nshmyrev That would have been ideal a long time ago, but major backwards-compatibility breakage isn’t really an option for this package, I’m afraid. |
Fyi, my branches have the full implementation with separate structures. All tests pass, though it might not be 100% compatible: https://github.com/alphacep/ref-napi please let me know if you are interested to merge or we can just have a new packages inside this project. |
I'm also looking for a fix for this problem.
Forgive my ignorance, but is there a reason that breaking changes can't be made along with a major semver bump? |
Since my maintenance of this package is mostly unpaid, and this would unfortunately be a larger rewrite, it’s not at the top of my agenda. @nshmyrev’s proposed solution conceptually works, and if there is a PR at some point, I’d be happy to take a look. |
Thanks! I hear you on the developer priorities thing - funnily enough, I'm after this fix in order to do unpaid maintenance elsewhere 😅 @nshmyrev is there anything that you need in order to put together a PR? Can I help out in some way? |
Thank you, Tim. I could take a look on PR a bit later but if you could help and pull the changes it would be much faster. |
I'm also having this exact issue. I haven't seen a node ffi implementation working well since Node 12.x with the lxe pull request, tonight I thought I can finally switch to a higher NodeJS version by using ffi-napi instead of node-ffi but no luck...hopefully the proposed pull request above is accepted. |
Using Node v17.0.1, the issue no longer reproduces for me. Could somebody point me to the change in Node that fixed this? I am guessing it is nodejs/node#40178 but I am not sure. |
Fixed indeed but now I'm experiencing a significant slowdown. Maybe related to nodejs/node#32226? |
I see v17.4.0 solved the issue I mentioned in #50. Wish the solution can be backported to 14/16 soon. |
steps to reprouce:
gireesh@p1:~/mq/rec1$ cat rec.js
gireesh@p1:~/mq/rec1$ node rec
The text was updated successfully, but these errors were encountered: