
Error: ETIMEDOUT, Operation timed out #3

Open
clehner opened this Issue Jul 7, 2011 · 8 comments

4 participants

@clehner
clehner commented Jul 7, 2011

My OscarConnections consistently time out after 2 hours, 11 minutes and 17 seconds. What can I do to prevent this? I have been just reconnecting them, but then after a while the program dies with an error about "too many open files".

@mscdex
Owner
mscdex commented Jul 7, 2011

Yes, this is a known issue and is in the TODO. It has to do with the fact that there currently isn't a timer started upon connection that sends keepalive packets to at least the main server connection.

I'm hoping to knock this and a few other TODO items off the list in the near future.

@clehner
clehner commented Jul 8, 2011

clehner@6250696

I tried adding a timer to send keepalive flaps on the main connection, but I still got timeouts.

I'll investigate some more and will try sending keepalives on all the connections. Any ideas?
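For reference, a keepalive FLAP is a bare 6-byte header with no payload. Here is a minimal sketch, assuming the usual OSCAR FLAP layout (0x2A start marker, channel 0x05 for keepalive, 16-bit big-endian sequence number, 16-bit data length). `buildKeepAliveFlap` is a hypothetical helper for illustration, not a function from this library:

```javascript
// Hypothetical helper: builds an OSCAR FLAP keepalive frame.
// Assumed layout: 0x2A | channel | seq (uint16 BE) | data length (uint16 BE).
var FLAP_START = 0x2a;
var FLAP_CHANNEL_KEEPALIVE = 0x05;

function buildKeepAliveFlap(seqNum) {
    var frame = Buffer.alloc(6);
    frame.writeUInt8(FLAP_START, 0);
    frame.writeUInt8(FLAP_CHANNEL_KEEPALIVE, 1);
    frame.writeUInt16BE(seqNum & 0xffff, 2); // sequence numbers wrap at 16 bits
    frame.writeUInt16BE(0, 4);               // keepalives carry no payload
    return frame;
}
```

Each connection would need to track its own sequence number and pass the next value in on every call.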

@mscdex
Owner
mscdex commented Jul 8, 2011

I think I found the culprit. I didn't read the error message carefully: it has nothing to do with the application-level keepalive. Instead, it has to do with the OS-level TCP keepalive timeout (http://www.kernel.org/doc/man-pages/online/pages/man7/tcp.7.html under tcp_keepalive_time).

I'd think that sending out application-level keepalive packets should cause the connection to no longer be idle though, keeping the OS from disconnecting with ETIMEDOUT. I'm not able to do any testing here yet, but temporarily lowering the default OS tcp_keepalive_time value might help during debugging, so you won't have to wait 2 hours to see whether a solution works.
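The timing reported at the top of this issue lines up with the Linux defaults documented on that man page: 7200 seconds of idle time (tcp_keepalive_time), followed by 9 unanswered probes (tcp_keepalive_probes) spaced 75 seconds apart (tcp_keepalive_intvl). A quick check, assuming those defaults are in effect on the affected machine:

```javascript
// Linux defaults from tcp(7); assumed to apply on the affected machine.
var keepaliveTime = 7200;  // seconds idle before the first probe
var keepaliveIntvl = 75;   // seconds between probes
var keepaliveProbes = 9;   // unanswered probes before giving up

function secondsUntilTimeout(time, intvl, probes) {
    return time + intvl * probes;
}

function formatHms(total) {
    var h = Math.floor(total / 3600);
    var m = Math.floor((total % 3600) / 60);
    var s = total % 60;
    return h + 'h ' + m + 'm ' + s + 's';
}

console.log(formatHms(secondsUntilTimeout(keepaliveTime, keepaliveIntvl, keepaliveProbes)));
// prints "2h 11m 15s"
```

That is within a couple of seconds of the observed 2 hours, 11 minutes and 17 seconds.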

EDIT: Perhaps try supplying an initialDelay argument for setKeepAlive() in _addConnection()?
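A minimal sketch of that suggestion, using Node's documented `socket.setKeepAlive([enable][, initialDelay])`. The helper name and the 60-second delay are illustrative assumptions, and how `_addConnection()` actually creates its sockets is not shown here:

```javascript
var net = require('net');

// Hypothetical helper; `initialDelayMs` is the idle time before the first probe.
function enableTcpKeepAlive(socket, initialDelayMs) {
    // setKeepAlive returns the socket itself, so the call can be chained.
    return socket.setKeepAlive(true, initialDelayMs);
}

// Stand-in for a socket created inside the library.
var socket = new net.Socket();
enableTcpKeepAlive(socket, 60 * 1000);
```

Note that `initialDelay` only shortens the idle time before the first probe. The probe interval and probe count stay at the OS defaults.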

@clehner
clehner commented Jul 9, 2011

I started sending keepalive packets on all the connections instead of just main, and now it looks like the ETIMEDOUTs are banished.
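The approach described here, one timer that pings every open connection, could look roughly like the sketch below. All of the names are hypothetical: `getConnections` returns whatever list of connections the application holds, and `sendKeepAlive(conn)` is whatever writes one keepalive packet to a connection.

```javascript
// Hypothetical sketch: send one keepalive per connection and report how many succeeded.
function sendKeepAliveToAll(connections, sendKeepAlive) {
    var sent = 0;
    connections.forEach(function(conn) {
        try {
            sendKeepAlive(conn);
            sent++;
        } catch (err) {
            // One dead connection should not stop keepalives on the others.
            console.error('keepalive failed: ' + err.message);
        }
    });
    return sent;
}

// Starts the periodic keepalive and returns a function that stops it.
function startKeepAliveTimer(getConnections, sendKeepAlive, intervalMs) {
    var timer = setInterval(function() {
        sendKeepAliveToAll(getConnections(), sendKeepAlive);
    }, intervalMs);
    timer.unref(); // do not keep the process alive just for keepalives
    return function stop() { clearInterval(timer); };
}
```

Taking a `getConnections` callback rather than a fixed array means connections opened later (for example a new service connection) are picked up on the next tick.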

But I'm still having trouble. In one of my apps, I now get "Error: ECONNRESET, Connection reset by peer" after about 15 minutes on each account.

In another app, I don't get the connection reset, but the bart connection closes, which makes the OscarConnection emit a close event, even though the account is still online.

@digitalgecko

I know this issue is ancient but did anyone ever sort this out?

I am not certain that this is my problem; however, if I disable the oscar part of my app it doesn't happen. With it enabled I get the following after about 15 minutes:

Error: read ECONNRESET
at errnoException (net.js:883:11)
at TCP.onread (net.js:539:19)
[ERROR] 23:00:43 Error

@mscdex
Owner
mscdex commented Jan 21, 2014

It's hard to say what's causing it. For all I know it's normal behavior for the protocol, for example if they're shifting traffic to different servers or something. shrug

@digitalgecko

Hmm, for now I am just handling the uncaught exceptions globally and it's working OK for me, and at least it gets things working so I can go to bed. Just thought I would drop this here in case anyone else runs into the same issue I was having. Even though the connection is getting reset, in my testing the overall connection is maintained.

I am going to have to look into this a bit further. I will report back with any results.

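// Last-resort workaround: log uncaught exceptions (such as the ECONNRESET above)
// instead of letting them crash the process.
process.on('uncaughtException', function(err) {
    console.log('Caught exception: ' + err);
    console.log(err.stack);
});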
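
A narrower alternative to a global handler is sketched below. It assumes the reset surfaces as an 'error' event on an EventEmitter. In Node, an 'error' event with no listener is thrown, which is how it ends up as an uncaught exception. Here `conn` is a plain EventEmitter standing in for the real connection object, and `ignoreConnectionResets` is a hypothetical helper:

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: log connection resets on one emitter, rethrow anything else.
function ignoreConnectionResets(emitter, log) {
    emitter.on('error', function(err) {
        if (err && err.code === 'ECONNRESET') {
            log('Connection reset: ' + err.message);
            return;
        }
        throw err; // unexpected errors still surface
    });
    return emitter;
}

// Stand-in for the real connection object.
var conn = ignoreConnectionResets(new EventEmitter(), console.log);
var reset = new Error('read ECONNRESET');
reset.code = 'ECONNRESET';
conn.emit('error', reset); // logged, not thrown
```

Unlike the global handler, this leaves unrelated exceptions elsewhere in the app free to fail loudly.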

@j6s
j6s commented Sep 12, 2014

@digitalgecko Thank you. You just saved me a lot of time. I was already recoding my chatbot in Python because of the connection resets.
Just catching the exception works well.

So far my bot has been running for 2 hours without a single reconnect.
