This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

getaddrinfo ENOTFOUND in dns.js when doing a lot of HTTP requests #5545

Closed
eelcocramer opened this issue May 24, 2013 · 23 comments

Comments

@eelcocramer

This is the same issue as Issue #5488 but has a code example to reproduce.

On node v0.10.7 (on OS X 10.8.3) the error occurs for me after 240 requests (on node v0.8 after 244 requests).

var http = require('http');

var count = 0;

var makeRequest = function() {
  var req;

  req = http.get("http://www.google.com/index.html", function(res) {
    count++;
    console.log('STATUS: ' + res.statusCode);
    console.log('Request count = ' + count);
  });

  req.on('error', function(e) {
    console.log('problem with request: ' + e.message);
    console.log('Last successful request count = ' + count);
  }).on("socket", function (socket) {
    socket.emit("agentRemove");
  });
};

setInterval(makeRequest, 10);

If http.request is used instead of http.get the error does not occur. Browsing through the source of nodejs I cannot find an obvious reason why.

There are some more code examples in a gist I created:
https://gist.github.com/eelcocramer/5626801

The gist has 3 files:

error-case.js = script that results in the error for me after 240 requests
success-case.js = different approach to the same functionality but this does not result in the error
fixed-error-case.js = basically the same script as error-case.js but has the req.socket.close call (in line 12) as suggested by @brianseeders in Issue #5488.

@danmaz74

I confirm this issue on node 0.10.4
No problem with node 0.8.21

@eelcocramer
Author

Tested with 0.8.21 as well but the issue remains.

@danmaz74

So maybe it's not exactly the same issue. I was using the same script on two different servers; the one with 0.10.4 was giving this problem, the one with 0.8.21 wasn't. I just installed 0.8.21 on the server that was giving the problem, I'll update you if this solves it for me.

@jdurack

jdurack commented May 28, 2013

I have the same issue on 0.10.4, but it happens consistently between 1,002 and 1,014 requests. Tested ~10 times.

I noticed that the test above was building up a backlog of outstanding requests for me, so I slowed it down but had the same breaking point (just over 1,000 requests).

This is an issue we're running into intermittently with our production code. Very happy to see it reproduced here.

@danmaz74

I confirm that with v 0.8.21 I'm not having that problem, while with 0.10.4 I was.

@discretepackets

I'm getting this issue too, since at least April! I'm currently on v0.10.3 and it isn't working.

@bnoordhuis
Member

Upgrade, people! By the sound of it, you're hitting a bug that was fixed in v0.10.6.

@jdurack

jdurack commented Jun 4, 2013

Unfortunately that didn't help. I just upgraded to 0.10.10 and I'm seeing the same issue.

@eelcocramer
Author

The issue still remains on node v0.10.10 and v0.8.24.

@bnoordhuis
Member

Okay, there's two things here. One is a libuv bug that was fixed a while ago, hence my 'upgrade' comment.

The other is that it sounds like you're hitting the EMFILE limit, the per-process open file descriptor limit. Libuv uses the getaddrinfo() POSIX function under the hood and it more or less returns what that function returns.

Easy solution: crank up ulimit -n to something high. It's a silly limitation anyway.

@brianseeders

Cranking up ulimit -n is not really a solution. It probably just means that you are going to use way more resources and your app will run longer before experiencing the same issue.

This bug comes about by spawning requests faster than they are actually completed. It almost seems like when you do this, even if you stop spawning new ones, the old ones don't get cleaned up correctly. I could be wrong there, though.

If you want to spawn a request, and have that request spawn a new one once it is complete, there doesn't seem to be a reliable way to do it. I could not find an event that was fired after the response has been completely received and the socket has been closed. In my case, I don't actually need the response body, so as soon as I know that the response has started to be received, I just manually destroy the socket, then spawn a new request.

Does that help?
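
One possible way to chain requests so that each one starts only after the previous response has finished is to rely on the response's 'end' event, which fires only once the body has been consumed. The following is a minimal sketch under that assumption, not the approach described in this comment (which destroys the socket instead); `requestInSeries` and its parameters are hypothetical names chosen for illustration.

```javascript
var http = require('http');

// Hypothetical helper: issue `total` GET requests to `url`, one at a time.
// `done(err, completed)` is called after the last response has ended,
// or as soon as a request fails.
function requestInSeries(url, total, done) {
  var completed = 0;

  function next() {
    if (completed === total) {
      return done(null, completed);
    }

    var req = http.get(url, function (res) {
      // The body is not needed here, but it must still be consumed,
      // otherwise 'end' never fires and the socket stays open.
      res.resume();
      res.on('end', function () {
        completed++;
        next(); // start the next request only now
      });
    });

    req.on('error', function (err) {
      done(err, completed);
    });
  }

  next();
}
```

Because only one request is in flight at any time, the number of open sockets stays constant regardless of how many requests are made in total.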

@bnoordhuis
Member

If you're referring to the test case posted above, it emits 'agentRemove' on the socket, so it's no surprise it's leaking file descriptors: the fd stays around until one side closes the connection. If you're going down that road, you need to install 'close' and 'end' event listeners and move carefully.

So far I'm not convinced there's a bug in node.js. I'm closing the issue unless someone convinces me otherwise.

@eelcocramer
Author

OK, now I understand. The proper way would be to set agent: false in the options passed to the http.get call, like so:

var http = require('http');

var count = 0;

var options = {
  hostname: 'www.google.com',
  port: 80,
  path: '/index.html',
  method: 'GET',
  agent: false
};

var makeRequest = function() {
  var req;

  req = http.get(options, function(res) {
    count++;
    console.log('STATUS: ' + res.statusCode);
    console.log('Request count = ' + count);
  });

  req.on('error', function(e) {
    console.log('problem with request: ' + e.message);
    console.log('Last successful request count = ' + count);
  });
};

setInterval(makeRequest, 10);

@jdurack

jdurack commented Jun 14, 2013

I believe this is still an issue.

Here's my reproduction case, based off the one above:
https://gist.github.com/jdurack/5785783

I've upgraded to node v0.10.11.
I've tried setting agent to false.
I've tried setting my "ulimit -n" really high.
I've removed the "agentRemove" event used above.
None of these fixed the issue for me.

I still get "getaddrinfo ENOTFOUND" every time after ~1,000 http responses.

@isaacs

isaacs commented Jun 14, 2013

@jdurack You're not consuming the responses that you're receiving. From http://nodejs.org/docs/latest/api/http.html#http_class_http_clientrequest:

If no 'response' handler is added, then the response will be entirely discarded. However, if you add a 'response' event handler, then you must consume the data from the response object, either by calling response.read() whenever there is a 'readable' event, or by adding a 'data' handler, or by calling the .resume() method. Until the data is consumed, the 'end' event will not fire.

Since the data is never consumed, the fd is never read or destroyed, so it's never closed.

See the fix in my fork of your gist: https://gist.github.com/isaacs/5785971
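
For readers who cannot open the gist, the pattern described in the quoted documentation looks roughly like the sketch below. It is not a copy of the gist; `getAndDrain` is a hypothetical helper name, and it assumes the caller does not need the response body.

```javascript
var http = require('http');

// Hypothetical helper: GET `url`, discard the body, and call
// `callback(err, statusCode)` once the response has fully ended.
function getAndDrain(url, callback) {
  var req = http.get(url, function (res) {
    // resume() puts the response stream into flowing mode and throws the
    // data away, which is enough to let 'end' fire and the socket be released.
    res.resume();
    res.on('end', function () {
      callback(null, res.statusCode);
    });
  });

  // Connection and DNS failures are reported here, not via the response.
  req.on('error', function (err) {
    callback(err);
  });

  return req;
}
```

If the body is needed, a 'data' handler (or reading on 'readable') consumes the response just as well; the point is only that something must read it.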

@IIIEII

IIIEII commented Jul 4, 2013

I have the same issue with node v0.10.12
My application is run by forever, and sometimes I see messages like these in my log:

events.js:72
        throw er; // Unhandled 'error' event
              ^
Error: getaddrinfo ENOTFOUND
    at errnoException (dns.js:37:11)
    at Object.onanswer [as oncomplete] (dns.js:124:16)
error: Forever detected script exited with code: 8
error: Forever restarting script for 2 time

The strangest thing is that all of my code (including every callback) is wrapped in try-catch. In some cases the exception is caught, but in others (as in the log above) it is not caught and the application crashes.
I constantly check the number of fds with ls -a -p <PID>, but it never exceeds 30 (with ulimit -n = 1024).
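
The "Unhandled 'error' event" line in the trace above points at a likely explanation for the uncaught case: request errors such as getaddrinfo ENOTFOUND are emitted on the request object later, from the event loop, after any surrounding try block has already exited. A try-catch therefore cannot intercept them, while an 'error' listener on the request can. A minimal sketch, where `safeGet` is a hypothetical helper name:

```javascript
var http = require('http');

// Hypothetical helper: perform a GET and always report the outcome through
// `callback(err, statusCode)`, never through a thrown exception.
function safeGet(options, callback) {
  var finished = false;

  function finish(err, statusCode) {
    if (finished) return; // report only the first outcome
    finished = true;
    callback(err, statusCode);
  }

  var req = http.get(options, function (res) {
    res.resume(); // consume the body so that 'end' fires
    res.on('end', function () {
      finish(null, res.statusCode);
    });
  });

  // Without this listener, an asynchronous failure (DNS, refused connection,
  // reset socket) is re-thrown as "Unhandled 'error' event" and crashes the
  // process, no matter how many try-catch blocks wrap the call site.
  req.on('error', function (err) {
    finish(err);
  });

  return req;
}
```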

@IIIEII

IIIEII commented Jul 5, 2013

I've done some additional tests.
Every request error was caught and the request was retried. Every time, the second attempt (with the same params) was successful.
I have a request counter (covering all requests in the whole node application). Here are the counter values of the failed requests:

479,491,512,570,1169,1184,1291,1411,1478,1514,1596,1605,1645,1663,1712,1733,1824,1842,1929,1948,1952,2172,2185,2288,2333,2404,2413,2445,2450,2473,2488,2513,2550,2563,2571,2690,2693,2718,2835,2847,2927,3110,3156,3160,3227,3231,3389,3413,3419,3490,3524,2690,2693,2718,2835,2847,2927,3110,3156,3160,3227,3231,3389,3413,3419,3490,3524

So I can conclude that the request params are good and the max file descriptor count is not exceeded.
Any ideas about this behaviour?

@IIIEII

IIIEII commented Jul 7, 2013

As a workaround I added a manual, error-checked dns.resolve4 call (with one retry) before http.request.

dns.resolve4(url.parse(full_url).hostname, function(e, addresses) {
    if (e) {
        console.log(new Date().toISOString(), url.parse(full_url).hostname, '\n', e.stack, '\n');
        dns.resolve4(url.parse(full_url).hostname, function(e, addresses) {
            if (e) {
                console.log(new Date().toISOString(), url.parse(full_url).hostname, '\n', e.stack, '\n');
                throw e;
            } else {
                tryDownload.apply(this,[addresses[0]]);
            }
        });
    } else {
        tryDownload.apply(this,[addresses[0]]);
    }
});
var tryDownload = function(address) {
   // http.request with address here
};

Now I use hostname=<ip address> and path=<full http://... path> in the request options. Everything works without errors.
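
The nested retry in the workaround above can also be written as a small recursive helper. This is a sketch of the same idea, not the code used in the application; `resolveWithRetry`, `attempts` and the optional `resolver` argument are hypothetical names, and `resolver` exists only so that the lookup can be replaced in tests.

```javascript
var dns = require('dns');

// Hypothetical helper: resolve `hostname` to IPv4 addresses, trying up to
// `attempts` times before giving up. Calls `callback(err, addresses)`.
function resolveWithRetry(hostname, attempts, callback, resolver) {
  // Default to dns.resolve4, as in the workaround above.
  resolver = resolver || function (name, cb) {
    dns.resolve4(name, cb);
  };

  resolver(hostname, function (err, addresses) {
    if (!err) {
      return callback(null, addresses);
    }
    if (attempts <= 1) {
      // Out of attempts: report the last error instead of throwing.
      return callback(err);
    }
    resolveWithRetry(hostname, attempts - 1, callback, resolver);
  });
}
```

Calling `resolveWithRetry(hostname, 2, cb)` corresponds to the one retry shown above. Reporting the final error through the callback, rather than with `throw`, keeps a persistent DNS failure from crashing the process.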

@musamusa

musamusa commented Aug 8, 2014

@IIIEII, your workaround worked for me

@tiagoalves

In our case this was happening when we reached about 1000 sockets and 1024 file descriptors in total in Ubuntu. We had increased the ulimit value for open file descriptors to 8192 in /etc/security/limits.conf but it was having no effect. Turns out upstart scripts ignore the limits.conf settings. We set the new limits directly in the upstart script and the problem was solved. More details here: http://lzone.de/cheat-sheet/ulimit.

@sudhirbitsgoa

This error came up when we changed the host name through code and restarted the node service. We figured out that we were not updating /etc/hosts, so DNS resolution was not happening.

@prcongithub

What is the final conclusion on this?
I have also started facing the same problem on CentOS release 6.4 (Final).
node: v0.10.4
npm: 2.14.3

events.js:72
        throw er; // Unhandled 'error' event
              ^
Error: getaddrinfo ENOTFOUND
    at errnoException (dns.js:37:11)
    at Object.onanswer [as oncomplete] (dns.js:124:16)

This happens when I am load testing the app with just 10 requests per second.

@cjihrig

cjihrig commented Jan 21, 2016

node: v0.10.4

Please see #5545 (comment). You're using a pretty old version of Node. Try upgrading and see if your problem goes away.
