Support for domains. #185

mikeal · 2012-10-04T07:30:36Z

This patch adds support for domains to async. The test addition is purely illustrative, i have no idea how to write or run tests the way you've got em and running npm test appears to do nothing.

The style of support:

if (process.domain) callback = process.domain.bind(callback)

Is something we discussed and established at NodeConf SummerCamp. The basic premise is that whoever owns an upper level transaction (like a web framework handling an HTTP request) will create a domain and wrap the handlers it fires for that transaction. The developer writing handlers has no idea there is a domain, and it's up to API authors (async, request, node-redis) to make sure they attach to the domain when it is active. In the case of request, it's all in an EventEmitter so it's automatically added, but for node-redis and async and other pure callback based APIs there will need to be code like this that binds the incoming callback to the domain so that any errors thrown in the handler get picked up by the transaction holder's domain.

I hope that makes sense :)

mikeal · 2012-10-04T07:32:29Z

Adding @isaacs @mranney @domenic . All were present during the discussion about this and I want to make sure I got the explanation right.

The good news about this change is that all users of async will auto-magically get proper 500s out of web frameworks that use domains without doing any work :)

domenic · 2012-10-04T07:45:29Z

Cool, looking forward to hearing from @isaacs about whether this lines up with his thoughts.

othiym23 · 2012-10-04T08:39:09Z

This is a great idea. The more utility libraries like this one deal with the domains corner cases, the more powerful domains become.

FWIW, this looks remarkably similar to the domain-like wrapping I'm doing in my own projects (which need to work with versions of Node older than 0.8).

domenic · 2012-10-04T08:42:27Z

You know what would be really useful? A simple 30-line web server, with a "route" that you can plug async code into, such that if your async code is not domain-aware the server crashes, but if the domains are working correctly you get a 500.

I imagine that for people familiar with domains, the test code @mikeal has in this patch would likely be sufficient illustration, but for those of us still struggling to fit it all together in our heads, something more concrete would be super-nice.

mikeal · 2012-10-04T14:11:28Z

@domenic you want like an example server or a real module?

This is off the top of my header, but is what you're talking about using a plain handler instead of routes.

var http = require('http')
  , domain = require('domain')
  ;

module.exports = function (handler) {
  var server = http.createServer(function (req, resp) {
    var d = domain.create()
    d.run(function () {
      handler(req, resp)
    })
    d.on('error', function (err) {
      console.error(err)
      resp.statusCode = 500
      resp.end()
    })
  })
  return server
}

ritch · 2012-10-04T14:53:30Z

@domenic @mikeal - I'd like to flesh out that example a bit and make sure my assumptions are accurate. To keep this pull request on topic, you can check it out in this gist.

Raynos · 2012-10-04T16:30:14Z

@mikeal this is awesome. My main problem with domains was wrapping callbacks in them is a pain. But if all the async libraries do the wrapping themself then it becomes a non issue.

isaacs · 2012-10-04T17:39:37Z

@mikeal In that example, you're not going to capture error events from the req and res. You're running the handler synchronously in the domain, but since the req and res might emit error events much later, those event handlers will escape.

Add this to your example:

d.add(req);
d.add(res);

and then all will be good.

isaacs · 2012-10-04T17:41:33Z

Regarding the actual patch, lgtm. If a user is using domains, then it'd be great to Just Work when they use async.

caolan · 2012-10-04T17:51:32Z

Domains seem brittle in their current state imho, but if that's the way node's moving (and it appears to perform similar bindings with built-in libs) then I'm fairly happy accepting the patch. Ideally, I shouldn't have to know about domains at all when handling callbacks. I also worry a little that some things will 'just work' with domains and some things won't. You basically have to read the source to find out... I've not used domains for much so these concerns may not be a issue in practice.

mikeal · 2012-10-04T22:23:56Z

I'd like to know what you mean by "brittle." Domains certainly have their shortcomings but it's a curious choice of words so I was hoping you could elaborate.

caolan · 2012-10-08T15:52:41Z

@mikeal inflexible, liable to break. Personally, I'd use more processes in most circumstances where I'd worry about this stuff.

isaacs · 2012-10-08T16:27:07Z

@coalan Yes, you should still use separate processes. However, if you have a cluster of 4 servers, let's say, and one of them throws an error, it's nice to be able to take the server out of rotation, send a 500 response to the user who's request crashed, log it, and then wait for the others to finish before shutting down the process. Domains make it possible to know who should get an error page.

Domains are a very low-level concept. If your functions are being called by event emitters, timeouts, handles, and req wraps (ie, anything in Node), then you get extremely good coverage by using Domains, in a way that you can't really get without hooks into the underlying event system. In fact, this patch is probably unnecessary, now that I think about it, since the active domain will already be attached to any kind of libuv req or handle, and there shouldn't really be any other way to do anything async anyway. (This is really only an issue if you're implementing some kind of connection pooling lib, or doing lowlevel stuff.)

And of course, you can build more interesting APIs on top of Domains in userland.

caolan · 2012-10-08T16:34:43Z

@isaacs the fact that I may not need to accept this patch due to any libuv req or handle already being bound to a domain makes me feel much happier with the concept. Basically, I'm saying "If I need to accept this for domains to work, something is wrong".

Raynos · 2012-10-08T16:41:41Z

@caolan you only need to accept this to allow users to throw errors in async handlers and have them go to a domain rather then uncaughtException.

Domains allow you to actually throw exceptions even though almost no-one does that.

isaacs · 2012-10-08T16:41:44Z

@caolan Well, I don't know much about the internals of async. If you're creating event emitters and re-using them, then yes, explicitly binding the function to the domain is necessary. (EventEmitters are bound to the active domain at creation time, and all their events fire in the context of that domain.)

isaacs · 2012-10-08T16:55:55Z

@Raynos But if the users are using domains already (and async isn't doing much magic), they it won't matter. In order for it to be the active process.domain, you must have already entered it. Any async operations that start from there will take that context with them, so the cb will end up being called either now (in which case, it's the active domain, obviously), or later (meaning that it was attached to a setTimeout initiated while it was the active domain, which will call its cb in the same domain it was initiated in.)

You can "escape" by pooling event emitters or timers. For example:

var cbs = [];
function onTick(cb) {
  cbs.push(cb);
}
setInterval(function() {
  var x = cbs.slice(0);
  cbs.length = 0;
  x.forEach(function(cb) {
    cb();
  });
}, 500);

// then later..

d.enter();
onTick(function() {
  // not in the domain!
  throw new Error('pwn');
});
d.exit();

Because the setInterval was established before entering the domain, it won't be the context when the function actually is called, and the throw will crash the server. Thus, this pattern is required in that case:

function onTick(cb) {
  if (process.domain) cb = process.domain.bind(cb);
  cbs.push(cb);
}

Similarly, consider a db library where you keep one or a few connection objects around at all times.

function get(query, cb) {
  connection.addQuery(query, cb)
}
// ...
Connection.prototype.addQuery = function(query, cb) {
  // some eventy stuff..
  this.once('drain', function() {
    this.once('result', cb);
    this.write(query);
  });
  if (this.queue.length === 0)
    this.emit('drain');
};

Now, even though we're presenting a fn(args, cb) style interface, the callback is actually being attached to a pre-existing connection object. In that case, the connection object will not be attached to the domain at the time of calling the API function, and will thus not be implicitly bound. Explicitly binding to the active domain at the API surface level is also a good idea in situations like this.

These two edge cases are very hard to do anything about in any sort of consistent way, and they're pretty rare. We're somewhat limited by the fact that we can't get any kind of hook into the moment a function is created, otherwise these cases could be done in a little bit better way, but even then, since you often re-use named functions, pass them around, etc, that might not be ideal either.

It's a tricky problem.

Raynos · 2012-10-08T17:29:12Z

@isaacs is domain.enter part of the api?

isaacs · 2012-10-09T15:38:03Z

@Raynos Yes. d.run(fn) ===> d.enter(); fn(); d.exit();

mikeal · 2012-10-10T09:34:18Z

@isaacs @caolan @Raynos i think you're missing the point of the patch, this code is all for cases where you aren't using an event emitter.

the intention is not to get people to start throwing in their callbacks either, it's to handle uncaught exceptions in their callbacks, like property access errors and other runtime exceptions that are not user written throw statements.

async doesn't know if the underlying callback API it's abstracting is using event emitters or pooling or anything, so it's a good practice to bind the callback to the active domain it entered with and preserve the domains ownership across the callback boundary in case the API being used isn't automatically bound to the domain by node.

othiym23 · 2012-10-10T17:38:25Z

Async is used in more places than just Node; shouldn't the PR either add a stub process object or add a test for process to each of the bindings?

Having thought about it a bit more, I see @mikeal's use case, but I think this points to a weakness in the design of domains. I agree that capturing exceptions and putting them on the domain is important, but this doesn't feel like something that should be required of a flow control library. At the very least, isn't the appropriate place to bind the handler in the module user's code?

isaacs · 2012-10-10T17:57:37Z

@mikeal I'm not getting it.

The domain is already active when calling the async method, or else it wouldn't be on process.domain.
The callback either runs immediately (in the current domain), or later via an event emitter created now (in the current domain), or later via an event emitter created earlier (see above), or later via a setTimeout/nextTick/setInterval/setImmediate (set in the current domain, so it'll enter that domain when it fires).

Can you provide a use case where your patch makes a difference, other than the EE pooling case? I'm not seeing it.

Raynos · 2012-10-10T18:00:41Z

@isaacs

what if the callback is function (foo) { return foo.bar } and foo === undefined. That thrown exception will not go to any domain because functions are not implicitly bound (you can't hook into function creation)

isaacs · 2012-10-10T18:05:05Z

The thrown exception will go to the domain that is active at the time the function is called. There's a limited number of things that can happen here:

If the function is called now, it'll get caught by process.domain. Explicit binding is unnecessary.
If the function is called later via a setTimeout, then the timer will be bound to the domain, and the domain will be entered before calling the function. Explicit binding is unnecessary.
If the function is called later via a setInterval, then the timer will be bound to the domain, and the domain will be entered before calling the function. Explicit binding is unnecessary.
If the function is called later via a setImmediate, then the timer will be bound to the domain, and the domain will be entered before calling the function. Explicit binding is unnecessary.
If the function is called later via a process.nextTick, then the tick object will be bound to the domain, and the domain will be entered before calling the function. Explicit binding is unnecessary.
If the function is called later via a new EventEmitter, then the emitter will be bound to the domain, and the domain will be entered before calling the function. Explicit binding is unnecessary.
If the function is called later via an EventEmitter that was created in the context of another domain (ie, pooling use-case). Explicit binding is necessary. Note that this case is very rare.

mikeal · 2012-10-13T20:44:39Z

@isaacs here

var client = redis.client(host, port)

jaws().route('/path/blah', function (req, res){
  async.each(['key1', 'key2'], client.get.bind(client), function (e, results) {
    results[0].property.that.does.not.exist
  })
}).listen(8080)

redis pools it's socket and will not create any event emitters that get bound to that callback.

jaws creates a domain around the handler and needs that exception to get caught in it.

isaacs · 2012-10-14T02:48:24Z

Ok, so, this really belongs in node_redis, rather than in async. However, async is easier to fix, and will probably provide coverage for more libs than going around fixing all the libs, anyway, so sure. Whatever. +1, I guess :)

But still, for those of us that use redis, and not async, it'd be good to fix it in node_redis. And for those of us that use async, but don't use redis (or other client-pooling libs), this is a big no-op.

othiym23 · 2012-10-14T02:55:33Z

I've talked to @mranney about adding support for this to node_redis, and he said a pull request was fine – as long as it didn't have a significant effect on performance. I haven't had time to do it yet, but I've been planning to do some work with node_redis anyway and can try to put together a PR this next week.

mikeal · 2012-10-14T08:54:50Z

I attempted to write it, didn't get it working.

mikeal/node_redis@50e2a61

I still think that this pull request should get merged. I think there will be some libraries that slip through the cracks and async is a good place to pick them up.

caolan · 2014-03-28T16:28:31Z

Not planning on adding domains into the lib itself, that should be handled by libraries / user code

Adding domain support.

fc62a36

jeffbski mentioned this pull request Oct 4, 2012

add domains support jeffbski/autoflow#11

Open

caolan closed this Mar 28, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for domains. #185

Support for domains. #185

mikeal commented Oct 4, 2012

mikeal commented Oct 4, 2012

domenic commented Oct 4, 2012

othiym23 commented Oct 4, 2012

domenic commented Oct 4, 2012

mikeal commented Oct 4, 2012

ritch commented Oct 4, 2012

Raynos commented Oct 4, 2012

isaacs commented Oct 4, 2012

isaacs commented Oct 4, 2012

caolan commented Oct 4, 2012

mikeal commented Oct 4, 2012

caolan commented Oct 8, 2012

isaacs commented Oct 8, 2012

caolan commented Oct 8, 2012

Raynos commented Oct 8, 2012

isaacs commented Oct 8, 2012

isaacs commented Oct 8, 2012

Raynos commented Oct 8, 2012

isaacs commented Oct 9, 2012

mikeal commented Oct 10, 2012

othiym23 commented Oct 10, 2012

isaacs commented Oct 10, 2012

Raynos commented Oct 10, 2012

isaacs commented Oct 10, 2012

mikeal commented Oct 13, 2012

isaacs commented Oct 14, 2012

othiym23 commented Oct 14, 2012

mikeal commented Oct 14, 2012

caolan commented Mar 28, 2014

Support for domains. #185

Support for domains. #185

Conversation

mikeal commented Oct 4, 2012

mikeal commented Oct 4, 2012

domenic commented Oct 4, 2012

othiym23 commented Oct 4, 2012

domenic commented Oct 4, 2012

mikeal commented Oct 4, 2012

ritch commented Oct 4, 2012

Raynos commented Oct 4, 2012

isaacs commented Oct 4, 2012

isaacs commented Oct 4, 2012

caolan commented Oct 4, 2012

mikeal commented Oct 4, 2012

caolan commented Oct 8, 2012

isaacs commented Oct 8, 2012

caolan commented Oct 8, 2012

Raynos commented Oct 8, 2012

isaacs commented Oct 8, 2012

isaacs commented Oct 8, 2012

Raynos commented Oct 8, 2012

isaacs commented Oct 9, 2012

mikeal commented Oct 10, 2012

othiym23 commented Oct 10, 2012

isaacs commented Oct 10, 2012

Raynos commented Oct 10, 2012

isaacs commented Oct 10, 2012

mikeal commented Oct 13, 2012

isaacs commented Oct 14, 2012

othiym23 commented Oct 14, 2012

mikeal commented Oct 14, 2012

caolan commented Mar 28, 2014