Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

fix error causing socket to not return to the http agent until timeout occurs. #5100

Closed
wants to merge 2 commits into
from

Conversation

Projects
None yet
3 participants

vshlos commented Mar 21, 2013

The Problem

I have found an issue where if there is an exception thrown in the handler of a http response, then this connection will not be returned to the pool until it times out a few minutes later. This causes the pool to backup and can lead to a denial of service attack or other issues in which node will not be able to talk to a host for a given amount of time.

Here is a simple proof of concept:

var http = require("http")

function test() {
    var options = {
        host: "google.com",
        port: 80,
        path: "/"
    }

    console.log("started")
    var req = http.request(options, function (res) {
        console.log("response")

        res.on("end", function () {
            throw new Error("HELLO")

        })
    })
    req.end()

}

process.on("uncaughtException", function () {
    console.log("uncaught");
})

for (var i = 0; i < 10; i++)
    test();

This code sample will display "started" 10 times and "response" 5 times (maxSockets is 5) then hang for a couple of minutes until those sockets timeout, then show "response" 5 more times.

This exception can be thrown on the response callback or in any of the emit events and will cause a similar issue.

This is especially noticeable when you are running a test framework like mocha and on the response you check the JSON object with an assert which throws an exception and causes the rest of the tests that talk to the same endpoint to fail or timeout.

For example if you have the following type of code

test.makeRequest(function (result) {
    assert(result.firstName != "")
})

After 5 of these tests fail, the rest will fail with a timeout even though the test might otherwise pass.

The Fix

There is now a try-catch wrapper around the emit handling code in http.js which overrides the emit event and ensures that responseEnd is always called. If there is an exception on getting the response or in the "data" event, then it will kill the socket and return it to the pool.

There is now 1 added test to test this exact problem in pummel/test-http-agent-exception.js

Vlad Shlosberg added some commits Mar 21, 2013

Fixing bug that causes socket to not return to http socket pool if th…
…ere is an exception thrown in one of the handlers such as 'end'

@bnoordhuis bnoordhuis commented on the diff Mar 21, 2013

lib/http.js
+ try {
+ handled = req.emit('response', res);
+ }
+ catch (ex){
+ req.shouldKeepAlive = false;
+ responseOnEnd.apply(res);
+ throw ex;
+ }
+
+ //res.on("end", responseOnEnd)
+
+ // Override the emit event to explicetly look for the "end" event of the request.
+ // This will make sure the socket is returned back to the pull if the "end" throws
+ // an exception.
+ var emit = res.emit;
+ res.emit = function () {
@bnoordhuis

bnoordhuis Mar 21, 2013

Member

That's rather hack-ish and probably expensive, performance-wise. Have you run make bench-http?

isaacs commented Mar 21, 2013

Thrown errors will generally always leave a system in an undefined state. That's what we have domains and such for. A re-throw like you have it set up here, will obscure the original source of the error, which is unacceptable.

The bug report is fine, but this approach is not acceptable, I'm sorry. Can you post the original problem as a regular old issue so that we can get it in the list of things to keep in mind as we refactor http in 0.11?

@isaacs isaacs closed this Mar 21, 2013

vshlos commented Mar 21, 2013

Rethrowing doesn't obscure the original source. The stack trace still points to the original exception source.

try {
    (function topLevel () {
        try {
            (function innerLevel () {
                throw new Error("HELLO")
            })()
        }
        catch (ex) {
            console.log("inner")
            throw ex;
        }
    })()
}
catch (ex) {
    console.log(ex.stack)
}

Will return

inner
Error: HELLO
    at innerLevel (/Users/vshlosbe/tmp/throwstack.js:5:23)
    at topLevel (/Users/vshlosbe/tmp/throwstack.js:6:15)
    at Object.<anonymous> (/Users/vshlosbe/tmp/throwstack.js:12:7)
    at Module._compile (module.js:449:26)
    at Object.Module._extensions..js (module.js:467:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.runMain (module.js:492:10)
    at process.startup.processNextTick.process._tickCallback (node.js:244:9)

so will this:

process.on("uncaughtException", function (err) {
        console.log(err.stack)
});

 (function topLevel () {
    try {
        (function innerLevel () {
            throw new Error("HELLO")
        })()
    }
    catch (ex) {
        console.log("inner")
        throw ex;
    }
})()

Also in response to why it was done this was done this way is because the general behavior in all languages across the board is when you have an event emitter and an exception is thrown in the first call, the event will not be emitted to the rest. Because of this I wanted to keep the logic the same and not try to catch exceptions in the event emitter. Especially since causes confusion in what to do with the exception.

Also since this is a problem only affecting the agent, I did not want to make a different function call that socket has to make. So this way the fix is confined to the agent only.

Instead of simply closing this request, can you simply give me some feedback on a better approach to fixing this or a better direction to go? Why would anyone want to contribute to node when instead of providing feedback on my hours of work in finding the issue, testing it out, figuring out a solution, running tests and finally submitting it to you, when you simply throw it away?

isaacs commented Mar 21, 2013

It doesn't obscure the stack, it obscures the printed line. Compare:

// t.js
function inner() {
  throw new Error('bark');
}

function outer() {
  inner();
}

try {
  outer();
} catch (er) {
  throw er;
}
$ ./node t.js

/Users/isaacs/dev/js/node-v0.10/t.js:12
  throw er;
        ^
Error: bark
    at inner (/Users/isaacs/dev/js/node-v0.10/t.js:2:9)
    at outer (/Users/isaacs/dev/js/node-v0.10/t.js:6:3)
    at Object.<anonymous> (/Users/isaacs/dev/js/node-v0.10/t.js:10:3)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Function.Module.runMain (module.js:497:10)
    at startup (node.js:119:16)
    at node.js:901:3
// t2.js
function inner() {
  throw new Error('bark');
}

function outer() {
  inner();
}

outer();
$ ./node t2.js

/Users/isaacs/dev/js/node-v0.10/t2.js:2
  throw new Error('bark');
        ^
Error: bark
    at inner (/Users/isaacs/dev/js/node-v0.10/t2.js:2:9)
    at outer (/Users/isaacs/dev/js/node-v0.10/t2.js:6:3)
    at Object.<anonymous> (/Users/isaacs/dev/js/node-v0.10/t2.js:9:1)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Function.Module.runMain (module.js:497:10)
    at startup (node.js:119:16)
    at node.js:901:3

We sometimes get around this by putting the cleanup code in a finally, without a catch:

// tf.js
function inner() {
  throw new Error('bark');
}

function outer() {
  inner();
}

var threw = true;
try {
  outer();
  threw = false;
} finally {
  console.error('finally, threw = %j', threw);
}
$ ./node tf.js
finally, threw = true

/Users/isaacs/dev/js/node-v0.10/tf.js:2
  throw new Error('bark');
        ^
Error: bark
    at inner (/Users/isaacs/dev/js/node-v0.10/tf.js:2:9)
    at outer (/Users/isaacs/dev/js/node-v0.10/tf.js:6:3)
    at Object.<anonymous> (/Users/isaacs/dev/js/node-v0.10/tf.js:11:3)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Function.Module.runMain (module.js:497:10)
    at startup (node.js:119:16)
    at node.js:901:3

Instead of simply closing this request, can you simply give me some feedback on a better approach to fixing this or a better direction to go?

Sorry, I thought I did, must have been unclear.

Please post the issue as a regular issue. It's a bit subtle, and any fix for it will cause behavior changes, and probably will have to be rolled into the more extensive http changes planned for 0.12. Then we can discuss the problem, the scope, etc., in the context of other changes planned.

Ie, I closed this pull req because it's premature. We should have a non-pull-req issue first, so that we can triage it and discuss the problem a bit more carefully.

vshlos commented Mar 21, 2013

OK I have created the bug request here: joyent#5107

Ill look for comments on that thread and hopefully my fix will make it into the module anyway before the http rework in 0.12. I hope this might make it in a smaller release since this is a pretty annoying issue for us (@linkedin) and am scared of what it can lead to. I can remove the rethrow code as you suggested by having a finally call and resubmit the pull request. Since I dont think the behavior is changing (besides the actual bug fix causing it to act as it should), can I work on a modified pull request and get that checked in before you rework the entire module for 0.12 so that I can get this feature included in a smaller code release?

richardlau pushed a commit to ibmruntimes/node that referenced this pull request Feb 18, 2016

crypto: add `pfx` certs as CA certs too
According to documentation all certificates specified in `pfx` option
should be treated as a CA certificates too. While it doesn't seem to be
logically correct to me, we can't afford to break API stability at this
point.

Fix: #5100
PR-URL: nodejs/node#5109
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Shigeki Ohtsu <ohtsu@iij.ad.jp>

richardlau pushed a commit to ibmruntimes/node that referenced this pull request Feb 18, 2016

test: disable gh-5100 test when in FIPS mode
This is a follow-up fix for half-broken test in 23196fe, and an attempt
to recover some dignity after breaking CI.

PR-URL: nodejs/node#5144
Reviewed-By: Rich Trott <rtrott@gmail.com>

richardlau pushed a commit to ibmruntimes/node that referenced this pull request Feb 18, 2016

crypto: add `pfx` certs as CA certs too
According to documentation all certificates specified in `pfx` option
should be treated as a CA certificates too. While it doesn't seem to be
logically correct to me, we can't afford to break API stability at this
point.

Fix: #5100
PR-URL: nodejs/node#5109
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Shigeki Ohtsu <ohtsu@iij.ad.jp>

richardlau pushed a commit to ibmruntimes/node that referenced this pull request Feb 18, 2016

test: disable gh-5100 test when in FIPS mode
This is a follow-up fix for half-broken test in 23196fe, and an attempt
to recover some dignity after breaking CI.

PR-URL: nodejs/node#5144
Reviewed-By: Rich Trott <rtrott@gmail.com>

richardlau pushed a commit to ibmruntimes/node that referenced this pull request Feb 18, 2016

crypto: add `pfx` certs as CA certs too
According to documentation all certificates specified in `pfx` option
should be treated as a CA certificates too. While it doesn't seem to be
logically correct to me, we can't afford to break API stability at this
point.

Fix: #5100
PR-URL: nodejs/node#5109
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Shigeki Ohtsu <ohtsu@iij.ad.jp>

richardlau pushed a commit to ibmruntimes/node that referenced this pull request Feb 18, 2016

test: disable gh-5100 test when in FIPS mode
This is a follow-up fix for half-broken test in 23196fe, and an attempt
to recover some dignity after breaking CI.

PR-URL: nodejs/node#5144
Reviewed-By: Rich Trott <rtrott@gmail.com>

richardlau pushed a commit to ibmruntimes/node that referenced this pull request Mar 9, 2016

crypto: add `pfx` certs as CA certs too
According to documentation all certificates specified in `pfx` option
should be treated as a CA certificates too. While it doesn't seem to be
logically correct to me, we can't afford to break API stability at this
point.

Fix: #5100
PR-URL: nodejs/node#5109
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Shigeki Ohtsu <ohtsu@iij.ad.jp>

richardlau pushed a commit to ibmruntimes/node that referenced this pull request Mar 9, 2016

test: disable gh-5100 test when in FIPS mode
This is a follow-up fix for half-broken test in 23196fe, and an attempt
to recover some dignity after breaking CI.

PR-URL: nodejs/node#5144
Reviewed-By: Rich Trott <rtrott@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment