pargs code 1: /bin/sh: pargs: not found #1

Open
julien51 opened this Issue · 8 comments

4 participants

@julien51

Hello
I'm trying to run ncore on a Linux (Debian) box, and it seems to require "pargs", which I have no idea where to find:

# ncore 16092
pargs code 1: /bin/sh: pargs: not found

Any clue?

@rmustacc
Owner

It looks like there is no implementation of pargs for GNU/Linux. You might be able to get an equivalent by running cat /proc/<pid>/cmdline. That bit of ncore will need some platform-specific pieces.

@konobi

tr '\0' '\n' < /proc/<pid>/cmdline

seems to be the way to get it into shape.
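For reference, the same idea can be done from inside Node instead of shelling out. This is only a sketch, assuming a Linux /proc filesystem; parseCmdline and readArgs are hypothetical names and are not part of ncore:

```javascript
var fs = require('fs');

// Split the raw NUL-separated contents of /proc/<pid>/cmdline into an
// array of arguments. Accepts a Buffer or a string.
function parseCmdline(raw) {
    var args = raw.toString('utf8').split('\0');
    // Each argument is NUL-terminated, so the split leaves an empty tail.
    if (args.length > 0 && args[args.length - 1] === '')
        args.pop();
    return args;
}

// Hypothetical helper: read the argument vector of a running process.
// Throws if the pid does not exist or /proc is unavailable (non-Linux).
function readArgs(pid) {
    return parseCmdline(fs.readFileSync('/proc/' + pid + '/cmdline'));
}
```

A sanity check equivalent to the pargs one could then look at readArgs(pid)[0] and confirm the target looks like a node process. Keep in mind that a process can overwrite its own argument area, so this is a hint rather than proof.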

@julien51
@davepacheco
Owner

Sorry about this. At Joyent, we've switched to using OS core files and MDB for post-mortem debugging (see http://dtrace.org/blogs/dap/2012/01/13/playing-with-nodev8-postmortem-debugging/). As a result, this one's fallen into disrepair. It looks like Node's internal debugging API changed some time before 0.8.0, and this module was never updated.

I was able to run the "ncore" example on OS X with Node 0.8.22 after applying this delta:

diff --git a/cmd/ncore.js b/cmd/ncore.js
index e90bfb7..61b3ddd 100644
--- a/cmd/ncore.js
+++ b/cmd/ncore.js
@@ -26,7 +26,7 @@ var cacUsage = mod_subr.caSprintf([
 ].join('\n'), process.argv[0], process.argv[1]);

 cacStages.push(cacCheckArgs);
-cacStages.push(cacCheckTarget);
+// cacStages.push(cacCheckTarget);
 cacStages.push(cacCheckPort);
 cacStages.push(cacDebugEnable);
 cacStages.push(cacDebugConnect);
@@ -123,12 +123,13 @@ function cacDebugConnect(unused, next)

 function cacCheckPid(unused, next)
 {
-   cacClient.reqEval('process.pid', function (res) {
-       if (!res.success || res.body.type != 'number')
-           die('failed to get target pid: %j', res);
+   cacClient.reqEval('process.pid', function (err, res) {
+       console.error(res);
+       if (err || res.type != 'number')
+           die('failed to get target pid: %j %j', err, res);

-       if (res.body.value != cacPid)
-           die('connected to wrong pid: %j', res.body.value);
+       if (res.value != cacPid)
+           die('connected to wrong pid: %j', res.value);

        next();
    });
@@ -137,9 +138,9 @@ function cacCheckPid(unused, next)
 function cacSendPanic(unused, next)
 {
    cacClient.reqEval('caPanic("core dump initiated at user request")',
-       function (res) {
-       if (!res.success)
-           die('core dump FAILED: %j', res);
+       function (err) {
+       if (err)
+           die('core dump FAILED: %j', err);
        die('core dumped');
    });
 }

As you already found, I had to comment out the "pargs" check (which is just a sanity check anyway). That should either be replaced with something that works across platforms, or replaced with a platform-specific check, or made overridable (e.g., with a "-f" flag).

The other change is that reqEval now emits an (err, result) tuple, where the result looks like:

{ handle: 27, type: 'number', value: 26396, text: '26396' }

That change is needed in both places where we call "reqEval" in order for "ncore" to work on Node 0.8 and later.
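One possible way to apply this in both call sites without duplicating the checks is a small adapter around the callback. The sketch below is an assumption, not code from this module: normalizeEval is a hypothetical name, and the "old" shape ({ success, body }) is inferred only from the lines removed in the diff above.

```javascript
// Hypothetical adapter: wraps an (err, result) style callback so that it
// also accepts the older single-argument response object.
function normalizeEval(callback) {
    return function (a, b) {
        var oldStyle = arguments.length === 1 && a !== null &&
            typeof a === 'object' && 'success' in a;

        if (oldStyle) {
            // Older shape (assumed): { success: Boolean, body: {...} }
            if (!a.success)
                return callback(new Error('eval request failed'), null);
            return callback(null, a.body);
        }

        // Newer shape: (err, result) passed through unchanged.
        return callback(a || null, b);
    };
}
```

With this in place, a caller would write cacClient.reqEval('process.pid', normalizeEval(function (err, res) { ... })) and check err, res.type and res.value in one style.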

I'd happily take pull requests to apply these changes properly. I'd like to keep this module working for platforms that can't make use of native OS core files to debug Node programs, but unfortunately I don't have much time to spend on it.

@julien51

@davepacheco I'll try that. In your view, can node-panic be used to debug infinite-loop problems? I'm almost sure that is what's happening to our program in production. If it relies on some kind of REPL, then I'm afraid we won't be able to connect to the process, because we'll never reach the "connected" callback. Am I understanding this correctly?

Thanks,

@davepacheco
Owner

That's not a problem. We modified Node back in the 0.4 days to support opening the debugger port even if the Node program itself is stuck in an infinite loop, and V8 is able to evaluate expressions in this context. The README in this repo contains an example of doing exactly that, though it has to be modified as I described above because of changes to the debugger API. It does work, though -- I tested it on OS X yesterday, and did it again just now (again, with the above changes):

$ node examples/example-loop.js &
[1] 3269
$ starting infinite loop; use "ncore" tool to generate core

$ node cmd/ncore.js 3269
Hit SIGUSR1 - starting debugger agent.
debugger listening on port 5858
attempting to attach to process 3269 ...  ok.
{ handle: 42, type: 'number', value: 3269, text: '3269' }
[2013-07-18 22:19:48.231 UTC] CRIT   PANIC: explicit panic: EXCEPTION: Error: Error: core dump initiated at user request
    at caPanic (/Users/dap/work/node-panic/lib/panic.js:78:9)
    at eval (eval at <anonymous> (/Users/dap/work/node-panic/lib/panic.js:198:39))
    at ExecutionState.evaluateGlobal (native)
    at DebugCommandProcessor.evaluateRequest_ (native)
    at DebugCommandProcessor.processDebugJSONRequest (native)
    at DebugCommandProcessor.processDebugRequest (native)
    at caDebugState.set (/Users/dap/work/node-panic/lib/panic.js)
    at func (/Users/dap/work/node-panic/examples/example-loop.js:10:12)
    at Object.<anonymous> (/Users/dap/work/node-panic/examples/example-loop.js:14:1)
    at Module._compile (module.js:449:26)
[2013-07-18 22:19:48.231 UTC] CRIT   writing core dump to /Users/dap/work/node-panic/ncore.3269
[2013-07-18 22:19:48.232 UTC] CRIT   finished writing core dump

Notice that the stack trace includes my function "func", that function's call to caDebugState.set, followed by the debugger frames. So if I didn't already know where this program was looping, I'd now see it was in func(). This worked even though the Node program was stuck in a loop.

@julien51

I'm afraid the patch does not work great on Debian.

Here is what I did:

# node examples/example-loop.js &
[1] 15739
# starting infinite loop; use "ncore" tool to generate core

In another window:

# node cmd/ncore.js 15739
attempting to attach to process 15739 ... .............................FAILED
exceeded retry limit with error ECONNREFUSED
WARNING: SIGUSR1 sent to pid 15739, but debug attach failed.

Not sure what I'm doing wrong, but I'm completely unable to connect to the process with an infinite loop. We're using node -v v0.10.2.

@davepacheco
Owner

Is there a way for you to tell whether the Node program opened the debug port? Is there a way to see the native stack trace of the program? (I'm not sure what the GNU/Linux equivalent is, but on SmartOS you'd use "pfiles" and "pstack" for this, respectively.)
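On Linux, one rough way to answer the first question is to look for a listening socket in /proc/net/tcp. The sketch below assumes the debugger uses port 5858 (as in the OS X output above) and an IPv4 socket; isListening is a hypothetical helper. The file lists sockets for the whole network namespace, so a match shows that something is listening on the port, not which process owns it.

```javascript
var fs = require('fs');

// Returns true if the given /proc/net/tcp text shows a socket in the
// LISTEN state on the given local port.
function isListening(procNetTcp, port) {
    var lines = procNetTcp.split('\n').slice(1);    // skip the header row
    return lines.some(function (line) {
        var fields = line.trim().split(/\s+/);
        if (fields.length < 4)
            return false;
        // fields[1] is local_address as HEXIP:HEXPORT, fields[3] is state.
        var localPort = parseInt(fields[1].split(':')[1], 16);
        return localPort === port && fields[3] === '0A';  // 0A = LISTEN
    });
}
```

Usage on the Debian box, after sending SIGUSR1 to the target: isListening(fs.readFileSync('/proc/net/tcp', 'utf8'), 5858). IPv6 sockets are listed separately in /proc/net/tcp6.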
