Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions doc/api/cli.md
Original file line number Diff line number Diff line change
Expand Up @@ -183,6 +183,21 @@ added: v0.10
Aborting instead of exiting causes a core file to be generated for post-mortem
analysis using a debugger (such as `lldb`, `gdb`, and `mdb`).

### `--abort-on-unhandled-rejection`
<!-- YAML
added: REPLACEME
-->

Aborting instead of exiting causes a core file to be generated for post-mortem
analysis using a debugger (such as `lldb`, `gdb`, and `mdb`).

This option only affects `Promise` rejections. It is only available on POSIX
systems. It is implemented by keeping `fork()`ed copies of the process alive
until it is known whether the `Promise` is handled within the same event loop
iteration or is left unhandled.
This makes unhandled rejections a significantly more complex operation,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As scary as this already sounds, I think it remains a massive understatement. Fork, particularly at high frequency, has some pretty significant performance impact:

  • fork itself can take a bunch of time
  • after the fork completes, performance is impacted in both parent and child processes as subsequent writes require additional page faults, even if the memory is already resident
  • on systems that avoid overcommitting swap (including illumos and Linux, when configured to do so), the memory used by the child process is counted against the swap cap. In cloud environments with tuned caps, this can quickly exhaust swap, causing lots of other operations to fail.
  • If the child ends up writing to the shared pages, then the total memory used by the parent and child goes up, which can result in paging (again, particularly in environments with memory caps configured) that can dramatically reduce performance.

I think it's good that the documentation mentions the use of fork, and I think it would benefit from a firmer message about the possible impact.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fork, particularly at high frequency, has some pretty significant performance impact:

Yes, I am aware that that makes this option unsuitable depending on your application type. I can try to incorporate some of your points into the doc.

If the child ends up writing to the shared pages, then the total memory used by the parent and child goes up, which can result in paging (again, particularly in environments with memory caps configured) that can dramatically reduce performance.

Why would this happen (for more than one or two pages of the stack)? The child process really isn’t doing much here.

Copy link
Copy Markdown
Member

@bnoordhuis bnoordhuis Sep 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any pages marked copy-on-write by fork() get cloned when the parent mutates them. If the parent does (for example) a compacting GC, that's a significant number of pages.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way we can limit GC or at least discourage it while the forked process lives?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be obvious but I wonder if reminding people they should ensure ulimit is set to allow core dumps would be helpful.

which may have performance implications.

### `--trace-warnings`
<!-- YAML
added: v6.0.0
Expand Down Expand Up @@ -436,6 +451,7 @@ if they had been specified on the command line before the actual command line
not allowed in the environment is used, such as `-p` or a script file.

Node options that are allowed are:
- `--abort-on-unhandled-rejection`
- `--enable-fips`
- `--force-fips`
- `--icu-data-dir`
Expand Down
11 changes: 11 additions & 0 deletions doc/node.1
Original file line number Diff line number Diff line change
Expand Up @@ -139,6 +139,17 @@ Enable loading native modules compiled with the ABI-stable Node.js API (N-API)
.BR \-\-abort\-on\-uncaught\-exception
Aborting instead of exiting causes a core file to be generated for analysis.

.TP
.BR \-\-abort\-on\-unhandled\-rejection
Aborting instead of exiting causes a core file to be generated for analysis.

This option only affects \fIPromise\fR rejections. It is only available on POSIX
systems. It is implemented by keeping \fBfork()\fRed copies of the process alive
until it is known whether the \fIPromise\fR is handled within the same event
loop iteration or is left unhandled.
This makes unhandled rejections a significantly more complex operation,
which may have performance implications.

.TP
.BR \-\-trace\-warnings
Print stack traces for process warnings (including deprecations).
Expand Down
17 changes: 17 additions & 0 deletions lib/internal/process/promises.js
Original file line number Diff line number Diff line change
@@ -1,11 +1,13 @@
'use strict';

const { safeToString } = process.binding('util');
const abortRegex = /^--abort[_-]on[_-]unhandled[_-]rejection$/;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to support the _ variant? We do for --abort-on-uncaught-exception, but that's because it's a V8 option.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s here for parity with --abort_on_uncaught_exception but yes, we can choose to not support it if we like.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't cost much to keep parity; I would suggest keeping it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm +0 either way but if we do support it then we should probably have a test variant for it.


const promiseRejectEvent = process._promiseRejectEvent;
const hasBeenNotifiedProperty = new WeakMap();
const promiseToGuidProperty = new WeakMap();
const pendingUnhandledRejections = [];
const wantAbort = process.execArgv.some((e) => abortRegex.test(e));
let lastPromiseId = 1;

exports.setup = setupPromises;
Expand All @@ -17,6 +19,8 @@ function getAsynchronousRejectionWarningObject(uid) {

function setupPromises(scheduleMicrotasks) {
let deprecationWarned = false;
const closeCoreDump = process._closeCoreDump;
delete process._closeCoreDump;

process._setupPromises(function(event, promise, reason) {
if (event === promiseRejectEvent.unhandled)
Expand All @@ -34,6 +38,11 @@ function setupPromises(scheduleMicrotasks) {
}

function rejectionHandled(promise) {
if (wantAbort) {
// Defer core dump closing until the next tick since another Promise
// might become an unhandled rejection by adopting this promise's state.
process.nextTick(closeCoreDump, promise, false);
}
const hasBeenNotified = hasBeenNotifiedProperty.get(promise);
if (hasBeenNotified !== undefined) {
hasBeenNotifiedProperty.delete(promise);
Expand Down Expand Up @@ -92,10 +101,18 @@ function setupPromises(scheduleMicrotasks) {
hasBeenNotifiedProperty.set(promise, true);
const uid = promiseToGuidProperty.get(promise);
if (!process.emit('unhandledRejection', reason, promise)) {
if (wantAbort) {
const msg = (reason && reason.stack) || reason;
require('fs').writeSync(2, 'Unhandled promise rejection\n' +
safeToString(msg) + '\n');
closeCoreDump(promise, true); // This won't return.
process.abort(); // This won't be reached, hopefully.
}
emitWarning(uid, reason);
} else {
hadListeners = true;
}
closeCoreDump(promise, false);
}
}
return hadListeners;
Expand Down
2 changes: 2 additions & 0 deletions src/env.h
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,7 @@ class ModuleWrap;
V(decorated_private_symbol, "node:decorated") \
V(npn_buffer_private_symbol, "node:npnBuffer") \
V(processed_private_symbol, "node:processed") \
V(promise_abort_info_symbol, "node:promiseAbortInfo") \
V(selected_npn_buffer_private_symbol, "node:selectedNpnBuffer") \

// Strings are per-isolate primitives but Environment proxies them
Expand Down Expand Up @@ -313,6 +314,7 @@ class ModuleWrap;
V(performance_entry_template, v8::Function) \
V(process_object, v8::Object) \
V(promise_reject_function, v8::Function) \
V(promise_reject_reason_map, v8::NativeWeakMap) \
V(promise_wrap_template, v8::ObjectTemplate) \
V(push_values_to_array_function, v8::Function) \
V(randombytes_constructor_template, v8::ObjectTemplate) \
Expand Down
155 changes: 155 additions & 0 deletions src/node.cc
Original file line number Diff line number Diff line change
Expand Up @@ -118,6 +118,8 @@ typedef int mode_t;

#if defined(__POSIX__)
#include <dlfcn.h>
#include <sys/wait.h>
#include <errno.h>
#endif

#ifdef __APPLE__
Expand Down Expand Up @@ -148,6 +150,7 @@ using v8::MaybeLocal;
using v8::Message;
using v8::Name;
using v8::NamedPropertyHandlerConfiguration;
using v8::NativeWeakMap;
using v8::Null;
using v8::Number;
using v8::Object;
Expand All @@ -161,6 +164,7 @@ using v8::SealHandleScope;
using v8::String;
using v8::TryCatch;
using v8::Uint32Array;
using v8::Uint8Array;
using v8::Undefined;
using v8::V8;
using v8::Value;
Expand Down Expand Up @@ -188,6 +192,7 @@ static node_module* modlist_addon;
static bool trace_enabled = false;
static std::string trace_enabled_categories; // NOLINT(runtime/string)
static bool abort_on_uncaught_exception = false;
static bool abort_on_unhandled_rejection = false;

// Bit flag used to track security reverts (see node_revert.h)
unsigned int reverted = 0;
Expand Down Expand Up @@ -1296,6 +1301,141 @@ void SetupNextTick(const FunctionCallbackInfo<Value>& args) {
args.GetReturnValue().Set(Uint32Array::New(array_buffer, 0, fields_count));
}

// Data struct that is attached to promises as raw memory.
struct promise_abort_info {
#ifdef __POSIX__
int fd;
pid_t pid;
int refcount;
#endif
};

void OpenCoreDump(Environment* env, Local<Promise> promise) {
if (!abort_on_unhandled_rejection) return;
#ifdef __POSIX__
struct rlimit lim;
CHECK_EQ(getrlimit(RLIMIT_CORE, &lim), 0);
if (lim.rlim_cur == 0) {
// Don't bother creating processes if we cannot create core dumps anyways.
return;
}
// Look up whether there is already a core dump holder process for this
// rejection reason, to better capture failure propagation along Promise
// chains.
Local<NativeWeakMap> reason_map = env->promise_reject_reason_map();
if (reason_map.IsEmpty()) {
reason_map = NativeWeakMap::New(env->isolate());
env->set_promise_reject_reason_map(reason_map);
}
CHECK_EQ(promise->State(), Promise::kRejected);
Local<Value> other = reason_map->Get(promise->Result());
if (other->IsPromise()) {
Local<Value> uint8array =
other.As<Promise>()->GetPrivate(env->context(),
env->promise_abort_info_symbol())
.ToLocalChecked();
CHECK(uint8array->IsUint8Array());
promise_abort_info* info =
reinterpret_cast<promise_abort_info*>(Buffer::Data(uint8array));
info->refcount++;
promise->SetPrivate(env->context(),
env->promise_abort_info_symbol(),
uint8array).FromJust();
return;
}
reason_map->Set(promise->Result(), promise);
int pipes[2] = { -1, -1 };
pid_t pid;

if (pipe(pipes) == -1) return;
pid = fork();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we avoid the problem where another thread is holding a lock that will remain held in the child process with no thread to unlock it?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't matter, does it? The child does nothing but read() and _exit(), it doesn't take out any locks.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fork() only clones the calling thread into the child process, if that answers your question. Plus, what Ben said.

Copy link
Copy Markdown
Contributor

@davepacheco davepacheco Sep 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, although it relies on nobody adding anything more sophisticated to the child process in the future.

That also raises something I missed the first time: this only captures stack information about the forking thread. State about thread pool threads or other threads created by libraries or other add-ons will be missing from the core file. That might be another caveat worth documenting.

(edit: Sorry for the redundant note. I hadn't caught up to Julien's comment.)

if (pid == -1) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I reading this correctly that if we fail to fork (or do any of these operations), we just won't get a core file for that, and the user will never know? I think the documentation should probably reflect that this is a best-effort approach. It would also be great if there were some way for a user to know that a dump has been skipped because of a failure like this. I'm afraid that when combined with some of the problems I mentioned earlier (e.g., swap exhaustion, or exhaustion of some other cap -- like number of processes), users won't get a core file, and they won't know that they won't get one until it's too late.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, the original process itself will abort() once it finds that there is no promise_abort_info struct attached to the promise when it would otherwise signal the child process to abort:

https://github.com/addaleax/node/blob/29fcd36f789a2f9913661ba7f6b964362517c9b6/src/node.cc#L1369-L1373

It’s certainly not ideal, but what’s the alternative?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that aborting the original process if we can't be sure is the reasonable approach.

close(pipes[0]);
close(pipes[1]);
return;
}
if (pid == 0) {
char do_abort;
int rc;
do {
rc = read(pipes[0], &do_abort, sizeof(do_abort));
} while (rc == -1 && errno == EINTR);
if (rc > 0 && !do_abort)
_exit(0);
else
abort();
}
close(pipes[0]);

Local<ArrayBuffer> ab =
ArrayBuffer::New(env->isolate(), sizeof(promise_abort_info));
Local<Value> uint8array =
Uint8Array::New(ab, 0, sizeof(promise_abort_info));
promise_abort_info* info =
reinterpret_cast<promise_abort_info*>(Buffer::Data(uint8array));
promise->SetPrivate(env->context(),
env->promise_abort_info_symbol(),
uint8array).FromJust();
info->fd = pipes[1];
info->pid = pid;
info->refcount = 1;
#endif // __POSIX__
}

void CloseCoreDump(Environment* env, Local<Promise> promise, char do_abort) {
if (!abort_on_unhandled_rejection) return;
#ifdef __POSIX__
CHECK_EQ(promise->State(), Promise::kRejected);
Local<Value> uint8array =
promise->GetPrivate(env->context(),
env->promise_abort_info_symbol()).ToLocalChecked();
if (!uint8array->IsUint8Array()) {
if (do_abort) {
// This may happen when e.g. fork()ing itself failed due to resource
// constraints.
ABORT();
}
return;
}
promise_abort_info* info =
reinterpret_cast<promise_abort_info*>(Buffer::Data(uint8array));
if (!do_abort) {
info->refcount--;
if (info->refcount > 0) return;
}
env->promise_reject_reason_map()->Delete(promise->Result());
int rc;
do {
rc = write(info->fd, &do_abort, sizeof(do_abort));
} while (rc == -1 && errno == EINTR);
CHECK_GT(rc, 0);
close(info->fd);
int status = 0;
do {
rc = waitpid(info->pid, &status, 0);
} while (rc == -1 && errno == EINTR);
CHECK_NE(rc, -1);
if (do_abort || !WIFEXITED(status)) {
if (!WIFEXITED(status)) {
// Disable a core dump for this process assuming the fork()ed process
// already wrote one.
struct rlimit limit_zero;
limit_zero.rlim_cur = 0;
limit_zero.rlim_max = 0;
setrlimit(RLIMIT_CORE, &limit_zero);
}
abort();
}
#endif // __POSIX__
}

void CloseCoreDump(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
CHECK(args[0]->IsPromise());
CHECK(args[1]->IsBoolean());
CloseCoreDump(env, args[0].As<Promise>(), args[1].As<v8::Boolean>()->Value());
}

void PromiseRejectCallback(PromiseRejectMessage message) {
Local<Promise> promise = message.GetPromise();
Isolate* isolate = promise->GetIsolate();
Expand All @@ -1305,6 +1445,10 @@ void PromiseRejectCallback(PromiseRejectMessage message) {
Environment* env = Environment::GetCurrent(isolate);
Local<Function> callback = env->promise_reject_function();

if (message.GetEvent() == v8::kPromiseRejectWithNoHandler) {
OpenCoreDump(env, promise);
}

if (value.IsEmpty())
value = Undefined(isolate);

Expand Down Expand Up @@ -3644,6 +3788,7 @@ void SetupProcessObject(Environment* env,
env->SetMethod(process, "binding", Binding);
env->SetMethod(process, "_linkedBinding", LinkedBinding);

env->SetMethod(process, "_closeCoreDump", CloseCoreDump);
env->SetMethod(process, "_setupProcessObject", SetupProcessObject);
env->SetMethod(process, "_setupNextTick", SetupNextTick);
env->SetMethod(process, "_setupPromises", SetupPromises);
Expand Down Expand Up @@ -3798,6 +3943,10 @@ static void PrintHelp() {
" --abort-on-uncaught-exception\n"
" aborting instead of exiting causes a\n"
" core file to be generated for analysis\n"
" --abort-on-unhandled-rejection\n"
" aborting instead of emitting a warning\n"
" causes a core file to be generated for\n"
" analysis\n"
" --trace-warnings show stack traces on process warnings\n"
" --redirect-warnings=file\n"
" write warnings to file instead of\n"
Expand Down Expand Up @@ -3939,6 +4088,7 @@ static void CheckIfAllowedInEnv(const char* exe, bool is_env,
"--force-fips",
"--openssl-config",
"--icu-data-dir",
"--abort_on_unhandled_rejection",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add this to the documented list of options that are allowed in NODE_OPTIONS in doc/api/cli.md.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richardlau done!


// V8 options (define with '_', which allows '-' or '_')
"--abort_on_uncaught_exception",
Expand Down Expand Up @@ -4131,6 +4281,11 @@ static void ParseArgs(int* argc,
} else if (strcmp(arg, "--") == 0) {
index += 1;
break;
#ifdef __POSIX__
} else if (strcmp(arg, "--abort-on-unhandled-rejection") == 0 ||
strcmp(arg, "--abort_on_unhandled_rejection") == 0) {
abort_on_unhandled_rejection = true;
#endif
} else if (strcmp(arg, "--abort-on-uncaught-exception") == 0 ||
strcmp(arg, "--abort_on_uncaught_exception") == 0) {
abort_on_uncaught_exception = true;
Expand Down
30 changes: 30 additions & 0 deletions test/abort/test-abort-unhandled-rejection.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
'use strict';

const common = require('../common');
const assert = require('assert');
const spawn = require('child_process').spawn;
const node = process.execPath;

if (process.argv[2] === 'child') {
Promise.reject(new Error('child error'));
} else {
run('', null);
if (!common.isWindows) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we common.skip() on Windows, or perhaps assert that the flag is rejected?

run('--abort-on-unhandled-rejection', ['SIGABRT', 'SIGTRAP', 'SIGILL']);
run('--abort_on_unhandled_rejection', ['SIGABRT', 'SIGTRAP', 'SIGILL']);
}
}

function run(flags, signals) {
const args = [__filename, 'child'];
if (flags)
args.unshift(flags);

const child = spawn(node, args);
child.on('exit', common.mustCall(function(code, sig) {
if (signals)
assert(signals.includes(sig), `Unexpected signal ${sig}`);
else
assert.strictEqual(sig, null);
}));
}