Domains dispose problem #3559

felixge · 2012-06-27T14:50:25Z

Problem: When creating streams inside of a domain, Domain#dispose() will
not clean them up.

This patch demonstrates this problem using a test case, and provides a
possible fix by also adding implicit domain members to the domain.

Considering that I don't understand the design decisions that went into
only tracking explicit members at this point, this patch should be
considered as a request for comment on the current behavior / to start a
discussion on improving it.

Problem: When creating streams inside of a domain, Domain#dispose() will not clean them up. This patch demonstrates this problem using a test case, and provides a possible fix by also adding implicit domain members to the domain. Considering that I don't understand the design decisions that went into only tracking explicit members at this point, this patch should be considered as a request for comment on the current behavior / to start a discussion on improving it.

isaacs · 2012-06-27T23:00:23Z

In developing domains, I found that adding implicit members was a pretty significant performance hit. Can you try running bash benchmark/http.sh with and without this change, to see if it causes problems? We're on a new V8 now, so it might be a bit less hazardous.

isaacs · 2012-06-27T23:01:59Z

Sorry, that won't do anything. Use this command:

NODE_USE_DOMAINS=1 bash benchmark/http.sh

But, running without domains should definitely not cause a performance impact, either :)

felixge · 2012-06-28T09:57:18Z

Did some benchmarks:

Not using domains:

$ bash benchmark/http.sh 
net.inet.ip.portrange.first: 12000 -> 12000
net.inet.tcp.msl: 1000 -> 1000
kern.maxfiles: 1000000 -> 1000000
kern.maxfilesperproc: 1000000 -> 1000000
pid 79936
Listening at http://127.0.0.1:8000/
create storedBytes[n]
Completed 3000 requests
Completed 6000 requests
Completed 9000 requests
Completed 12000 requests
Completed 15000 requests
Completed 18000 requests
Completed 21000 requests
Completed 24000 requests
Completed 27000 requests
Completed 30000 requests
Finished 30000 requests
Requests per second:    3608.00 [#/sec] (mean)

Domains as in 0.8.0:

$ NODE_USE_DOMAINS=1 bash benchmark/http.sh 
net.inet.ip.portrange.first: 12000 -> 12000
net.inet.tcp.msl: 1000 -> 1000
kern.maxfiles: 1000000 -> 1000000
kern.maxfilesperproc: 1000000 -> 1000000
pid 80170
Listening at http://127.0.0.1:8000/
create storedBytes[n]
Completed 3000 requests
Completed 6000 requests
Completed 9000 requests
Completed 12000 requests
Completed 15000 requests
Completed 18000 requests
Completed 21000 requests
Completed 24000 requests
Completed 27000 requests
Completed 30000 requests
Finished 30000 requests
Requests per second:    3496.04 [#/sec] (mean)

Domains as in 0.8.0 + implicit domain member tracking patch:

$ NODE_USE_DOMAINS=1 bash benchmark/http.sh 
net.inet.ip.portrange.first: 12000 -> 12000
net.inet.tcp.msl: 1000 -> 1000
kern.maxfiles: 1000000 -> 1000000
kern.maxfilesperproc: 1000000 -> 1000000
pid 79979
Listening at http://127.0.0.1:8000/
create storedBytes[n]
Completed 3000 requests
Completed 6000 requests
Completed 9000 requests
Completed 12000 requests
Completed 15000 requests
Completed 18000 requests
Completed 21000 requests
Completed 24000 requests
Completed 27000 requests
Completed 30000 requests
Finished 30000 requests
Requests per second:    2761.73 [#/sec] (mean)

You're right: There is a significant performance penalty for tracking domain members implicitly. But IMO domains are useless without it. I mean it's cool that I can match exceptions up to my http requests with them, but without being able to clean up, my processes will leak resources.

So even with the performance implications, I think domains should track implicit members by default. If there is a use case for domains without explicit member tracking (I don't think there is), we could easily make the behavior configurable on a per-domain basis.

Let me know what you think, and I'll finalize this patch to also take care of timers.

--fg

There were two problems with the way domains were used in the benchmark: a) A global domain was created. IMO this is not how one would normally use domains, as an error caught by a global domain would be just as difficult to handle as an 'uncaughtException' one. More importantly: Setting up a global domain makes implicit domain member tracking expensive as per-request domains will now also be tracked as members of the global domain. b) The per-request domains created in the benchmark were never entered. While it didn't have an impact on the performance of the benchmark, it does seem a bit pointless to not enter the domains, so this patch causes them to be entered.

felixge · 2012-06-29T09:55:21Z

@isaacs Please have a look at my commit above. IMO the benchmark used to determine the impact of domains was flawed, and after fixing it, my patch for adding implicit member tracking no longer causes any performance problems:

Fixed benchmark + without implicit member tracking patch:

Requests per second:    3332.68 [#/sec] (mean)

Fixed benchmark + implicit member tracking patch:

Requests per second:    3376.45 [#/sec] (mean)

So ... if you agree with my adjustment to the benchmark, what do you think about enabling implicit member tracking by default now? IMO it would make domains much more useful / easier to explain.

Let me know and I'll come up with a final set of patches / tests for this.

--fg

isaacs · 2012-06-29T20:00:14Z

@felixge That's fascinating. Let's discuss this in more detail after (or at) nodeconf.

The dom.enter() should be mostly unnecessary, but also harmless. The request and response objects are bound to it, so they'll enter during their events. However, it's probably the server's connection event entering the global domain that is the bottleneck, since it's added implicitly to the gdom domain.

I think it's clear that we need a better understanding of exactly where it's spending its time. Removing the explicit/implicit disparity would be nice, but we do have to be very careful we don't create performance regression landmines.

felixge · 2012-08-02T11:13:53Z

@isaacs the main thing 44a0f74 was fixing was essentially a leak in the benchmark. If you do implicit member tracking, and you have a global domain + domains for each request, the domains for each request will all be added as children to the global domain, and never cleaned up.

I remember our discussion at nodeconf where you said the global domain was useful in your scenario (where you'd shut down the process after any domain errors), but IMO the biggest use case I see for domains is keeping long-running connections alive. This means the process cannot be shutdown (to clean up leaks).

So from your perspective: What would you like to see to move this forward? You mentioned a more detailed analysis on where time is spend? What about making the implicit member tracking optional, this way you can ignore the performance issues for your use case (having a global domain), and I can get good performance on my use case (not having a global domain) at the same time.

tanguylebarzic · 2012-12-18T21:36:18Z

Hi,

Any news on this? I have a similar issue:
See gist https://gist.github.com/4332159.
Launching the server, then calling http://localhost:1337/ works the five first times, but after that calls to curl just hang.

From what I saw, it's because the underlying socket of the http.request is not destroyed, resulting in a starvation of the sockets pool. Calling newDomain.dispose inside process.nextTick 'fixes' the problem. From what I was able to understand, it's because domain._disposed is set to true later, allowing some EventEmitter to enter the domain and get cleaned better.

Would be great to be able to use domains without worries :).

Nodejs-Jenkins · 2013-03-13T23:28:08Z

Can one of the admins verify this patch?

isaacs · 2013-08-19T23:13:55Z

This patch is no longer relevant. @trevnorris is in the process of significantly refactoring how domains work in master, and domain.dispose() has been widely acknowledged as a failure of an abstraction (it doesn't do what it claims, and is unnecessary and unsafe.)

AndreasMadsen mentioned this pull request Sep 22, 2012

Keeping a domain.active fails #3982

Closed

isaacs closed this Aug 19, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Domains dispose problem #3559

Domains dispose problem #3559

felixge commented Jun 27, 2012

isaacs commented Jun 27, 2012

isaacs commented Jun 27, 2012

felixge commented Jun 28, 2012

felixge commented Jun 29, 2012

isaacs commented Jun 29, 2012

felixge commented Aug 2, 2012

tanguylebarzic commented Dec 18, 2012

Nodejs-Jenkins commented Mar 13, 2013

isaacs commented Aug 19, 2013

Domains dispose problem #3559

Domains dispose problem #3559

Conversation

felixge commented Jun 27, 2012

isaacs commented Jun 27, 2012

isaacs commented Jun 27, 2012

felixge commented Jun 28, 2012

felixge commented Jun 29, 2012

isaacs commented Jun 29, 2012

felixge commented Aug 2, 2012

tanguylebarzic commented Dec 18, 2012

Nodejs-Jenkins commented Mar 13, 2013

isaacs commented Aug 19, 2013