Implicit asynchronous context #3733

Open
eladb opened this Issue July 18, 2012 · 16 comments
Elad Ben-Israel
eladb commented July 18, 2012

Continuing the discussion from https://groups.google.com/d/topic/nodejs-dev/gBpJeQr0fWM/discussion

Most environments provide some way to push implicit information into the execution context. In multithreaded environments, this is usually some form of storage that is associated with the current thread and can be accessed by code anywhere in the program. See Thread Local Storage for details.

This mechanism enables use cases where it is impossible to pass along data across a call chain from one layer to another in the execution flow (consider the use of many libraries throughout the chain). Common examples are instrumentation and debugging tools like logging and performance counters that need to be associated with the current flow but are usually implemented in a central location within the codebase. For example, one might want to automatically associate a request ID with each emitted log line. The request ID will be added to the execution context when the request comes in and the logging library will extract it and emit it with the log line.

I think that node should enable those scenarios by providing a mechanism to attach data implicitly to the current execution context and extract it along the way. Because of the asynchronous nature of node, this data should traverse async hops in a similar fashion to Domains.
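
A minimal sketch of the request ID scenario described above, using the existing domain module as a stand-in for the proposed mechanism. The names `log`, `handleRequest`, and the `requestId` property are hypothetical and only illustrate the idea; they are not part of any node API.

```javascript
var domain = require('domain');

// Hypothetical logger: reads the request ID from the active domain, if any.
function log(message) {
  var d = process.domain;
  var requestId = (d && d.requestId) || '-';
  var line = '[' + requestId + '] ' + message;
  console.log(line);
  return line;
}

// Hypothetical entry point: runs `handler` with `requestId` attached
// to a fresh domain.
function handleRequest(requestId, handler) {
  var d = domain.create();
  d.requestId = requestId; // ad hoc property, not part of the domain API
  d.run(handler);
}

var lines = [];
handleRequest('req-42', function() {
  lines.push(log('request received'));
  // The timer callback is an async hop; the logger is not passed the ID.
  setTimeout(function() {
    lines.push(log('after async hop'));
  }, 0);
});
```

The logger never receives the request ID as an argument; it looks it up from whatever context is active when it is called.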

Isaac Z. Schlueter
Collaborator
isaacs commented July 18, 2012

One way to address this would be to bless the use of process.domain as the active domain object, and maybe give it a "data" object member that is a general-purpose bag-o-stuff for you to use.

Scott Sanders

A data object member bag would be the most flexible and usable option, especially if users of the data member used their own name as a top-level key for their own data bag. For example: process.domain.data.express, process.domain.data.haraka, or process.domain.data.passport
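
A rough sketch of how such a namespaced bag might look. It assumes the proposed `data` member, which domains do not provide on their own, so the helper creates it on demand. The helper name `namespaceFor` is hypothetical.

```javascript
var domain = require('domain');

// Returns (creating it if needed) the bag reserved for `name`
// on the active domain, or null when no domain is active.
function namespaceFor(name) {
  var d = process.domain;
  if (!d) return null;
  d.data = d.data || {};             // `data` is the proposed member, not built in
  d.data[name] = d.data[name] || {}; // one bag per library name
  return d.data[name];
}

var d = domain.create();
var seen = {};
d.run(function() {
  namespaceFor('express').route = '/users';
  namespaceFor('passport').user = 'alice';
  seen.route = process.domain.data.express.route;
  seen.passportHasRoute = 'route' in process.domain.data.passport;
});
```

Each library writes only under its own key, so two libraries can use the same property name without colliding.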

Elad Ben-Israel
eladb commented July 19, 2012

A caveat to consider is that domains can be arbitrarily nested. process.domain points to the current domain, which is not necessarily the domain that contains the context that you are looking for:

var domain = require('domain');

var d1 = domain.create();
d1.c1 = 'hello';

d1.run(function() {
  var d2 = domain.create();
  d2.c2 = 'world';
  d2.run(function() {
    console.log('c1:', process.domain.c1);
    console.log('c2:', process.domain.c2);
  });
});

Output:

c1: undefined
c2: world

A possible solution could be to automatically link data to the parent domain's data upon creation:

var domain = require('domain');

var _create = domain.create;

domain.create = function() {
  var d = _create.apply(domain, arguments);
  var parent_data = (process.domain && process.domain.data) || {};
  d.data = Object.create(parent_data);
  return d;
};

var d1 = domain.create();
d1.data.c1 = 'hello';

d1.run(function() {
  var d2 = domain.create();
  d2.data.c2 = 'world';

  d2.run(function() {
    console.log('c1:', process.domain.data.c1);
    console.log('c2:', process.domain.data.c2);
  });
});

Output:

c1: hello
c2: world

An added benefit of this approach is that overriding a data attribute on a child domain does not affect code that runs within the parent domain. Love JavaScript!
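
The shadowing behavior can be seen with plain objects, without domains at all. This is a small sketch with made-up property names, using the same Object.create linkage as the patched domain.create above.

```javascript
var parentData = { user: 'alice', level: 'info', tags: [] };
var childData = Object.create(parentData); // child inherits from parent

// Assigning on the child creates an own property that shadows the
// inherited one; the parent keeps its value.
childData.level = 'debug';

// Caveat: mutating an inherited object (rather than assigning a new one)
// goes through to the parent, because both see the same array.
childData.tags.push('slow');
```

After this runs, `childData.level` is 'debug' while `parentData.level` is still 'info', but `parentData.tags` contains 'slow'. Only direct assignment on the child is isolated from the parent.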

Elad Ben-Israel

@isaacs What do you think about the above suggestion?

Alex Kocharin

So, is it guaranteed somehow that using process.domain.data in userland won't conflict with future node.js versions?

By the way, what is the preferred way to access the currently active domain? Is it process.domain or require('domain').active? Node.js core uses both: process.domain in timers.js and domain.active in events.js. The documentation doesn't mention either one.
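
A quick way to compare the two accessors is to check them from inside a running domain. This is only an observation sketch: it reports what a given node version does and says nothing about which accessor is preferred or guaranteed.

```javascript
var domain = require('domain');

var d = domain.create();
var observed = {};
d.run(function() {
  // Both accessors are read while `d` is the active domain.
  observed.sameObject = (process.domain === domain.active);
  observed.isD = (process.domain === d);
});
```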

Pavel Lang

So, is it guaranteed somehow that using process.domain.data in userland won't conflict with future node.js versions?

I think not, at the present time: Stability: 1 - Experimental

By the way, what is the preferred way to access the currently active domain? Is it process.domain or require('domain').active? Node.js core uses both: process.domain in timers.js and domain.active in events.js. The documentation doesn't mention either one.

Both are fundamental. This is a question for Isaacs or @bnoordhuis, but:

What about prototype chaining? Can it be used efficiently here (for nesting execution contexts)?

chuggins

+1... Support for this would be a huge win for any large-scale Node application. A few features we desperately need on my team but can't provide:

  1. A logging feature that includes a unique request ID for all log statements, for easy debugging
  2. Instrumentation--for instance, for a given request how many SQL statements were executed? How long did it spend waiting on the database? On Facebook?

Without having some sort of request-scoped context, you end up having to pass the request or some sort of context object around to every single piece of code you write in order to support something like this.
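
A minimal sketch of the second feature (counting SQL statements per request) without passing a context object around, again using domains as the carrier. The names `query`, `withRequestStats`, and the `stats` property are hypothetical, and the database call itself is left out.

```javascript
var domain = require('domain');

// Hypothetical database wrapper: bumps a per-request counter when a
// request context is active. A real version would execute `sql` here.
function query(sql) {
  var d = process.domain;
  if (d && d.stats) d.stats.sqlCount += 1;
}

// Runs `fn` inside a fresh domain carrying its own counters and
// returns those counters afterwards.
function withRequestStats(fn) {
  var d = domain.create();
  d.stats = { sqlCount: 0 }; // ad hoc property, not part of the domain API
  d.run(fn);
  return d.stats;
}

var statsA = withRequestStats(function() {
  query('SELECT 1');
  query('SELECT 2');
});
var statsB = withRequestStats(function() {
  query('SELECT 3');
});
```

Each request gets its own counter, and `query` does not need a request argument to find it.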

Forrest L Norvell

:+1:

For obvious reasons, this would be hugely useful for New Relic as well. FWIW, my work with domains leads me to believe that they're an excellent means for handling errors in end-user applications, but a poor means for creating context (and not that hot for weaving error tracing into module code, either).

In particular, the requirements that error handling imposes on domains make them inappropriate as a means of propagating transactional state. It's very difficult to write instrumentation that uses domains without monkeypatching the domains module and some of its clients if you want user code to run the same whether or not it's using the instrumentation. This can lead to lots of deoptimization and / or slowing down some of the hottest code paths in Node. Something that either generalized domains or provided an alternate path through Node core for transactional state would be a huge win.

evadnoob
Andreas Ländle

Another +1. A "context" is definitely something that is missing. Even if it doesn't fit 100% into the node.js concept, the real world is dirty, and not all 3rd-party code allows us to retrieve something like a context object in the callbacks. Also, I'm not sure if domain is really the right place to implement this feature...

gsilk commented July 08, 2013

@eladb In a multi-threaded program, thread-local storage provides a mechanism for storing context per thread. How would you define an implicit execution context in a single-threaded node app, without support from v8 ...?

Elad Ben-Israel
eladb commented July 08, 2013

@gsilk You need some low-level support in order to implement something like that. Node already supports domains, which are a form of implicit context, and this is an example of using them to implement context for async chains. Read above: there is an interesting discussion with several pull requests on the topic.

manuelsantillan

Absolute +1 for this. Real-world deployments need it: logging, transactions, instrumentation, security audits, ... It's not about deployment size; any non-trivial customer-facing app would benefit from it.

Vladimir Kurchatkin

@manuelsantillan it's already possible with async listeners. Check https://github.com/othiym23/node-continuation-local-storage out

manuelsantillan
Trevor Norris
Collaborator