Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node.js addons crash during garbage collection #317

Closed
stepugnetti opened this issue Jan 31, 2015 · 7 comments
Closed

node.js addons crash during garbage collection #317

stepugnetti opened this issue Jan 31, 2015 · 7 comments

Comments

@stepugnetti
Copy link

Creation of a massive number of javascript objects from a swig-built node.js addon results in segmentation fault when node's garbage collector starts. The same does not happen with an equivalent hand-written javascript module.

In this repository:
https://github.com/stepugnetti/swig-node-gc-issue
I've a dummy node.js addon to be built with swig by running make in project's root. The addon just expones a constructor for an empty Object. In the safe folder you can find a Javascript version of it. Now, running

node no-gc-issue.js

starts instantiating infinite many copies of the dummy object from the 'safe' module. No problem here (kill with ctrl-c). However, running

node gc-issue.js

objects are instantiated with the constructor of the swig-made addon. After few-hundred instantiations the process crashes with glibc complaining about an invalid pointer being freed.

Is this a bug?

@stepugnetti
Copy link
Author

This happens on my Ubuntu 12.04 , but also on my Raspberry Pi.

SWIG versions: 3.0.3, 3.0.4.

@stepugnetti
Copy link
Author

Still happening with SWIG 3.0.5.

@wsfulton
Copy link
Member

wsfulton commented Feb 2, 2015

It might be a bug in your code or a bug in what SWIG generates. You need C++ debugging skills to use SWIG, so when you debug your code, you should be able to give some details that others may find time to look at if the fix isn't clear to you once you've debugged it and posted an analysis.

@Zion-ICS
Copy link

Zion-ICS commented Feb 5, 2015

We're using the JS SWIG to interface C++ drivers in nodeJS.
We seem to run into memory cleanup issues.
First of all, the destructor isn't called when we kill the nodeJS process.
Second of all: there doesn't seem to be an interface to call the destructor manually.
In Python, SWIG creates new_ and delete_ functions for everything, whether classes, SWIG array_class declarations, or SWIG pointer_function declarations.
V8 seems to do this only for pointer_functions.
For creating/registering a new C++ class, SWIG publishes a SWIGV8_SYMBOL_NEW so you can call it with "new" in nodeJS.
I haven't found a corresponding "delete" for a class.

If we could auto-generate something like the following, I'd be really excstatic:
SWIGV8_AddMemberFunction(_exports_ourClass_class, "delete", _wrap_ourClass_delete);

Note that neither deleting the object with "delete" nor setting it to null run the destructor.

The SWIG-generated C++ file shows that much of the plumbing for deleting a class exists.
_wrap_delete_ourclass exists in the SWIG-generated cxx file.
There just doesn't appear to be any way to access these functions without hand-editing the SWIG-generated C file.
That isn't an option for our project.
(Aside: array_class has no interface either for delete).
Thanks for your help!!

stepugnetti added a commit to stepugnetti/swig-node-gc-issue that referenced this issue Feb 7, 2015
@stepugnetti
Copy link
Author

@wsfulton
Though not a C++ debugging guru, I've already tried something.

Is the bug in my code?

First of all, my code itself is very simple; this is the interface file (dummy-module.i):

%module dummy_module
%{
class DummyObject {};
%}
class DummyObject {};

I wrapped it with:

swig -c++ -javascript -node dummy-module.i

The node-gyp binding file I use is also quite simple:

{
  'targets': [
  {
    'target_name': 'dummy-module',
    'sources': ['dummy-module_wrap.cxx'],
    'cflags' : ['-ggdb3']
  }]
}

I built it with:

node-gyp configure build --debug

And then I installed it putting the following package.j

son file in the build directory:

{
  "name" : "dummy-module",
  "version" : "0.1.0",
  "description" : "A dummy package to illustrate a problem with swig-compiled modules and v8 garbage collector.",
  "main" : "dummy-module.node"
}

and running

npm install build/Debug

This is the javascript code that reproduces the error (gc-issue.js):

p=require('dummy-module');
var i=0;
setInterval(function(){console.log(++i + new p.DummyObject());return;},10);

(Everything is in the repository I linked in my first post)

If the bug is in my code, it should be easily spotted.

Is the bug in Node?

Since I couldn't tell what's wrong with my code, I started looking for problems with Node.JS. I tried running the same js code with a 'safe-dummy-module' written in Javascript (safe-module.js):

exports.DummyObject = function () {};

This is the adapted version of the Javascript code (no-gc-issue.js):

p=require('safe-module');
var i=0;
setInterval(function(){console.log(++i + new p.DummyObject());return;},10);

Everything works fine, no segmentation fault; so I would say it's not Node's fault, this time.

Valgrind debugging

I recompiled the latest Node.JS version (0.10.36) with debugging symbols, I installed libc6-dbg package (Kubuntu 12.04) to get debugging symbols of core C libraries as well, and tried to look at the backtrace. First I built my project as described above (make clean install DEBUG=true, with the Makefile provided), then I ran

valgrind --leak-check=full --leak-resolution=high --track-origins=yes --num-callers=40 --freelist-vol=100000000 --log-file=valgrind_output.txt node --debug gc-issue.js

(The output file is here)

There are a lot of memory leaks, most of them involving a call to _wrap_new_DummyObject(v8::Arguments const&) (dummy-module_wrap.cxx:1509) (dummy-module_wrap.cxx is now in my repository as well, though it is automatically generated when makeing my project.)

But what's more interesting is the full backtrace at the end of the run:

==3669== Process terminating with default action of signal 11 (SIGSEGV)
==3669==  Access not within mapped region at address 0x2011
==3669==    at 0x74C31D: v8::internal::GlobalHandles::Destroy(v8::internal::Object**) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x743602B: v8::Persistent<v8::Object>::Dispose() (v8.h:4256)
==3669==    by 0x7435D4B: SWIGV8_Proxy::~SWIGV8_Proxy() (dummy-module_wrap.cxx:915)
==3669==    by 0x7434F87: _wrap_delete_DummyObject(v8::Persistent<v8::Value>, void*) (dummy-module_wrap.cxx:1536)
==3669==    by 0x74C86D: v8::internal::GlobalHandles::PostGarbageCollectionProcessing(v8::internal::GarbageCollector) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x761CFD: v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GCTracer*) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x763552: v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollector, char const*, char const*) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x8858A6: v8::internal::Runtime::PerformGC(v8::internal::Object*) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0xF7D04106213: ???
==3669==    by 0xF7D04184D4F: ???
==3669==    by 0xF7D0419A453: ???
==3669==    by 0xF7D0410A96D: ???
==3669==    by 0xF7D04198CE3: ???
==3669==    by 0xF7D0410A96D: ???
==3669==    by 0xF7D04197B23: ???
==3669==    by 0xF7D0410A96D: ???
==3669==    by 0xF7D041245E0: ???
==3669==    by 0xF7D041118B6: ???
==3669==    by 0x72A1E9: v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*
, bool*) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x72BACA: v8::internal::Execution::Call(v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*,
 bool*, bool) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x86E9CC: v8::internal::Runtime_Apply(v8::internal::Arguments, v8::internal::Isolate*) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0xF7D0410618D: ???
==3669==    by 0xF7D04183175: ???
==3669==    by 0xF7D0410A96D: ???
==3669==    by 0xF7D041963A9: ???
==3669==    by 0xF7D041284FD: ???
==3669==    by 0xF7D04195E7C: ???
==3669==    by 0xF7D041704ED: ???
==3669==    by 0xF7D0410A96D: ???
==3669==    by 0xF7D041245E0: ???
==3669==    by 0xF7D041118B6: ???
==3669==    by 0x72A1E9: v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*
, bool*) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x72BACA: v8::internal::Execution::Call(v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*,
 bool*, bool) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x6D7E40: v8::Function::Call(v8::Handle<v8::Object>, int, v8::Handle<v8::Value>*) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x95218C: node::MakeCallback(v8::Handle<v8::Object>, v8::Handle<v8::Function>, int, v8::Handle<v8::Value>*) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x9522F1: node::MakeCallback(v8::Handle<v8::Object>, v8::Handle<v8::String>, int, v8::Handle<v8::Value>*) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x971157: node::TimerWrap::OnTimeout(uv_timer_s*, int) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==    by 0x9B0BE6: uv__run_timers (timer.c:146)
==3669==    by 0x9A6774: uv_run (core.c:308)
==3669==    by 0x95413F: node::Start(int, char**) (in /home/stefano/.nvm/v0.10.36/bin/node)
==3669==  If you believe this happened as a result of a stack
==3669==  overflow in your program's main thread (unlikely but
==3669==  possible), you can try to increase the size of the
==3669==  main thread stack using the --main-stacksize= flag.
==3669==  The main thread stack size used in this run was 8388608.

To me it looks like the pointer to an instance of DummyObject that is about to be garbage-collected, as recorded in SWIG's proxy class, is invalid. Maybe one of all those "Invalid write"s reported by Valgrind has overwritten the value of this pointer?

Conclusions

Despite my attemps (all this was done before I wrote my first post), understanding what's the point and fix the problem is beyond my possibilities; I'm afraid it requires some good knowledge of Swig's internal achitecture (at least of the part that supports Javascript). But I thought that Swig developers might be interested in this report: that's why I wrote it.

@Zion-ICS
Copy link

Thank you very, very much for your post!
We ran into a bunch of garbage collection issues with JS and SWIG also.
Destructors aren't being called either.
#329

@ojwb ojwb added the Javascript label Sep 9, 2016
@ojwb
Copy link
Member

ojwb commented Mar 4, 2022

With SWIG 4.0.2 and node v12.22.9 NODE_PATH=build/Release/ node gc-issue.js just keeps going at least into the thousands, and top doesn't show the memory used by the node process as growing, so it seems this was fixed between SWIG v3.0.5 and v4.0.2 (though I don't see an obvious fix for this in the commit log) or by changes in node.js, so closing as fixed.

@ojwb ojwb closed this as completed Mar 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants