Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server closed the connection with pool #1070

Closed
AdriVanHoudt opened this issue Apr 30, 2015 · 44 comments
Closed

Server closed the connection with pool #1070

AdriVanHoudt opened this issue Apr 30, 2015 · 44 comments
Assignees

Comments

@AdriVanHoudt
Copy link

Error: Connection lost: The server closed the connection.
at Protocol.end (/home/***/node_modules/mysql/lib/protocol/Protocol.js:103:13)
at Socket.<anonymous> (/home/***/node_modules/mysql/lib/Connection.js:98:28)
at Socket.emit (events.js:117:20)
at _stream_readable.js:944:16
at process._tickDomainCallback (node.js:492:13)
    --------------------
    at Protocol._enqueue (/home/***/node_modules/mysql/lib/protocol/Protocol.js:135:48)
    at Protocol.handshake (/home/***/node_modules/mysql/lib/protocol/Protocol.js:52:41)
    at PoolConnection.connect (/home/***/node_modules/mysql/lib/Connection.js:119:18)
    at Pool.getConnection (/home/***/node_modules/mysql/lib/Pool.js:45:23)
    at Object.exports.register (/home/***/node_modules/hapi-plugin-mysql/lib/index.js:42:24)
    at /home/***/node_modules/hapi/lib/plugin.js:242:14
    at iterate (/home/***/node_modules/hapi/node_modules/items/lib/index.js:35:13)
    at done (/home/***/node_modules/hapi/node_modules/items/lib/index.js:27:25)
    at Object.exports.register (/home/***/node_modules/lout/lib/index.js:95:5)
    at /home/***/node_modules/hapi/lib/plugin.js:242:14

I will try to mock some reproducible project to test with. This is very hard since it only happens after quite some time and I don't know what causes it :/

@AdriVanHoudt
Copy link
Author

Timeout config of the db if it helps

@dougwilson
Copy link
Member

I'm looking to see what I can determine in the meantime. So does this error comes as an uncaught exception or through an error argument in a callback function? If the later, what method call did it come through?

@AdriVanHoudt
Copy link
Author

Given the stacktrace it comes from https://github.com/Salesflare/hapi-plugin-mysql/blob/master/lib/index.js#L42
So a caught exception.
Hmm is it possible that the pool doesn't get destroyed decently on server stop and when the server starts again immediately after the pool somehow tries to use bad connections?

@dougwilson
Copy link
Member

Possibly. I'm investigating :)

@dougwilson dougwilson self-assigned this Apr 30, 2015
@AdriVanHoudt
Copy link
Author

thanks! If you need more info let me know!

@dougwilson
Copy link
Member

Can you confirm the exact version of this module you have running when you get that for me :)?

@AdriVanHoudt
Copy link
Author

hapi-plugin-mysql? v2.0.0 but 2.0.2 doesn't change much

@dougwilson
Copy link
Member

I mean of this module :) mysql. That plugin depends on 2.x.x so I don't know what version you are using without you looking at your server :)

@AdriVanHoudt
Copy link
Author

npm list prints mysql@2.6.2 at top level and none under hapi-plugin-mysql so I guess that's the version used

@dougwilson
Copy link
Member

As far as I can tell, the server (or something) literally shut down the TCP connection during the connect attempt (likely some time during the handshake). I can reproduce the exact error if I do this manually, but it's hard to say for sure this is what you are experiencing without debugging on your server.

@AdriVanHoudt
Copy link
Author

it's weird since the stacktrace points to code that only runs on startup. Can I help with the debugging? e.g. provide certain logging data or something? Giving access to the code itself is somewhat harder :/ I do run on google cloud sql and cloud compute engine if that means anything

@dougwilson
Copy link
Member

I mean, it just points to code calling getConnection; there is nothing special about that call; the other getConnection in that library completely ignores all errors, so of course any error during the other getConnection you'd never see, thus why you think the error only "runs on startup".

It would be useful to add that Hoek.assert(!err, err); to https://github.com/Salesflare/hapi-plugin-mysql/blob/master/lib/index.js#L52 to know what is going on in the other getConnection as well, so it's easier to identify the frequency of the issue as a first step.

@AdriVanHoudt
Copy link
Author

I don't know if you are familiar with the hapi framework and the plugin system? The plugin's register function is executed at startup to bootstrap the server so that code will only be executed at startup.
Damn I will add that asap, that's actually a bug :O thanks for the catch!

@dougwilson
Copy link
Member

Yes, I am familiar and do use Hapi. I was just running on about how the only error handling that existed was in start up, so it wasn't that it "only happened on start up", but rather you'd only notice it on start up ;)

@AdriVanHoudt
Copy link
Author

nice, and I see, I'm updating the package as we speak/type

@AdriVanHoudt
Copy link
Author

updated the package and rebooted the server, will wait until the error pops up again and then I'll get back to you, thanks for all the help!

@AdriVanHoudt
Copy link
Author

Ok got the error again

Error: Connection lost: The server closed the connection.
at Protocol.end (/home/***/node_modules/mysql/lib/protocol/Protocol.js:103:13)
at Socket.<anonymous> (/home/***/node_modules/mysql/lib/Connection.js:98:28)
at Socket.emit (events.js:117:20)
at _stream_readable.js:944:16
at process._tickDomainCallback (node.js:492:13)
    --------------------
    at Protocol._enqueue (/home/***/node_modules/mysql/lib/protocol/Protocol.js:135:48)
    at Protocol.handshake (/home/***/node_modules/mysql/lib/protocol/Protocol.js:52:41)
    at PoolConnection.connect (/home/***/node_modules/mysql/lib/Connection.js:119:18)
    at Pool.getConnection (/home/***/node_modules/mysql/lib/Pool.js:45:23)
    at Object.exports.register (/home/***/node_modules/hapi-plugin-mysql/lib/index.js:40:27)
    at /home/***/node_modules/hapi/lib/plugin.js:242:14
    at iterate (/home/***/node_modules/hapi/node_modules/items/lib/index.js:35:13)
    at done (/home/***/node_modules/hapi/node_modules/items/lib/index.js:27:25)
    at Object.exports.register (/home/***/node_modules/lout/lib/index.js:95:5)
    at /home/***/node_modules/hapi/lib/plugin.js:242:14

An I also had this (not sure if related)

Error: Handshake inactivity timeout
at Handshake.sequence.on.on.on.on.on.self._connection._startTLS.err.code (/home/***/node_modules/mysql/lib/protocol/Protocol.js:154:17)
at Handshake.emit (events.js:92:17)
at Handshake._onTimeout (/home/***/node_modules/mysql/lib/protocol/sequences/Sequence.js:116:8)
at Timer.listOnTimeout [as ontimeout] (timers.js:112:15)
    --------------------
    at Protocol._enqueue (/home/***/node_modules/mysql/lib/protocol/Protocol.js:135:48)
    at Protocol.handshake (/home/***/node_modules/mysql/lib/protocol/Protocol.js:52:41)
    at PoolConnection.connect (/home/***/node_modules/mysql/lib/Connection.js:119:18)
    at Pool.getConnection (/home/***/node_modules/mysql/lib/Pool.js:45:23)
    at Object.exports.register (/home/***/node_modules/hapi-plugin-mysql/lib/index.js:40:27)
    at /home/***/node_modules/hapi/lib/plugin.js:242:14
    at iterate (/home/***/node_modules/hapi/node_modules/items/lib/index.js:35:13)
    at done (/home/***/node_modules/hapi/node_modules/items/lib/index.js:27:25)
    at Object.exports.register (/home/***/node_modules/lout/lib/index.js:95:5)
    at /home/***/node_modules/hapi/lib/plugin.js:242:14

And a new one, which seems to be only node-mysql related

Error: Connection lost: The server closed the connection.
at Protocol.end (/home/***/node_modules/mysql/lib/protocol/Protocol.js:103:13)
at Socket.<anonymous> (/home/***/node_modules/mysql/lib/Connection.js:98:28)
at Socket.emit (events.js:117:20)
at _stream_readable.js:944:16
at process._tickDomainCallback (node.js:492:13)
    --------------------
    at Protocol._enqueue (/home/***/node_modules/mysql/lib/protocol/Protocol.js:135:48)
    at Protocol.handshake (/home/***/node_modules/mysql/lib/protocol/Protocol.js:52:41)
    at PoolConnection.connect (/home/***/node_modules/mysql/lib/Connection.js:119:18)
    at Pool.getConnection (/home/***/node_modules/mysql/lib/Pool.js:45:23)
    at Pool.releaseConnection (/home/***/node_modules/mysql/lib/Pool.js:135:10)
    at Pool._removeConnection (/home/***/node_modules/mysql/lib/Pool.js:247:8)
    at Pool._purgeConnection (/home/***/node_modules/mysql/lib/Pool.js:228:8)
    at Ping.onPing (/home/***/node_modules/mysql/lib/Pool.js:90:12)
    at Ping.b [as _callback] (domain.js:183:18)
    at Ping.Sequence.end (/home/***/node_modules/mysql/lib/protocol/sequences/Sequence.js:96:24)

And this one, which seems to be an error on pool.getConnection()

Error: Connection lost: The server closed the connection.
at Protocol.end (/home/***/node_modules/mysql/lib/protocol/Protocol.js:103:13)
at Socket.<anonymous> (/home/***/node_modules/mysql/lib/Connection.js:98:28)
at Socket.emit (events.js:117:20)
at _stream_readable.js:944:16
at process._tickDomainCallback (node.js:492:13)
    --------------------
    at Protocol._enqueue (/home/***/node_modules/mysql/lib/protocol/Protocol.js:135:48)
    at Protocol.handshake (/home/***/node_modules/mysql/lib/protocol/Protocol.js:52:41)
    at PoolConnection.connect (/home/***/node_modules/mysql/lib/Connection.js:119:18)
    at Pool.getConnection (/home/***/node_modules/mysql/lib/Pool.js:45:23)
    at Object.internals.getConnection (/home/***/node_modules/hapi-plugin-mysql/lib/index.js:20:27)
    at /home/***/node_modules/hapi-plugin-mysql/lib/index.js:51:30
    at /home/***/node_modules/hapi/lib/handler.js:312:22
    at iterate (/home/***/node_modules/hapi/node_modules/items/lib/index.js:35:13)
    at Object.exports.serial (/home/***/node_modules/hapi/node_modules/items/lib/index.js:38:9)
    at /home/***/node_modules/hapi/lib/handler.js:307:15

And a case where it gives the error when I use a connection to query the db every minute.

@AdriVanHoudt
Copy link
Author

I'm still getting

Debug: internal, implementation, error 
     Error: Handshake inactivity timeout
     at Handshake.sequence.on.on.on.on.on.self._connection._startTLS.err.code (/home/***/node_modules/mysql/lib/protocol/Protocol.
js:154:17)
     at Handshake.emit (events.js:92:17)
     at Handshake._onTimeout (/home/***/node_modules/mysql/lib/protocol/sequences/Sequence.js:116:8)
     at Timer.listOnTimeout [as ontimeout] (timers.js:112:15)
     --------------------
     at Protocol._enqueue (/home/***/node_modules/mysql/lib/protocol/Protocol.js:135:48)
     at Protocol.handshake (/home/***/node_modules/mysql/lib/protocol/Protocol.js:52:41)
     at PoolConnection.connect (/home/***/node_modules/mysql/lib/Connection.js:123:18)
     at Pool.getConnection (/home/***/node_modules/mysql/lib/Pool.js:45:23)
     at Object.internals.getConnection (/home/***/node_modules/hapi-plugin-mysql/lib/index.js:20:27)
     at /home/***/node_modules/hapi-plugin-mysql/lib/index.js:51:30
     at /home/***/node_modules/hapi/lib/handler.js:312:22
     at iterate (/home/***/node_modules/hapi/node_modules/items/lib/index.js:35:13)
     at Object.exports.serial (/home/***/node_modules/hapi/node_modules/items/lib/index.js:38:9)

@dougwilson is there any way I can catch this and try getting a connection again or something?

@dougwilson
Copy link
Member

Sure, I mean, you're clearly already catching it: we are not printing Debug: internal, implementation, error, you must be catching the error and printing it out somewhere in your code, right?

@dougwilson
Copy link
Member

Looking at the referenced plugin, it seems like at https://github.com/Salesflare/hapi-plugin-mysql/blob/master/lib/index.js#L42 the plugin is purposely just giving up. The err above is getting there, but the plugin is programmed to give up. You could get logic added there to retry if you like.

@dougwilson
Copy link
Member

The actual meaning of Error: Handshake inactivity timeout, by the way, is that there is a MySQL server or network issue, because getting a new connection from the configured MySQL server is taking over 10 seconds. We offer various options as well in https://github.com/felixge/node-mysql#connection-options if you want to wait longer than 10 seconds, but depending on the root cause for your issues (would be be something you would have to determined), it may not fix the issue, just wait longer before erroring.

@AdriVanHoudt
Copy link
Author

Yeah its hapi that prints it, and yeah might be able to retry but that won't fix the source just the cause, as for waiting longer, that's not really an option, no api call should take like 30 seconds or even 10 for that matter
What can be a cause of mysql taking so long to give back a connection?

@dougwilson
Copy link
Member

I mean, that's starting to get outside of this module, and more into what oracle would help with. Unfortunately I just really can't say, especially because it could be a network or machine issue.

@AdriVanHoudt
Copy link
Author

True, I just figured you probably know a thing or 2 about MySQL ^^ and yeah it's super weird, we have the exact same env in staging (also google cloud compute engine --> google cloud SQL) and there it doesn't happen so the only thing I can imagine is load, some weird forgotten config gone bad or db can't handle the all the data/some queries are to complex for the amount of data

Also random thought, would increasing the pool size help?

Thanks anyway for taking the time and maintaining the module!

@dougwilson
Copy link
Member

Gotcha. Yes, I wish I could help, but just not sure what the issue is. What I'm going to do, however, is draw a little diagram here about the packet exchange:

client        server
------        ------

     TCP SYN
1 ---------------> 2

      TCP SYN-ACK
4 <--------------- 3

       TCP ACK
5 ---------------> 6

   MySQL handshake
7 ---------------> 8

 handshake response
10 <-------------- 9

In this diagram, the error you are encountering is because there is a timer between 7 and 10 that is timing out. This is a 10 second timer.

Also random thought, would increasing the pool size help?

It may not help, but am unsure. I know that I've heard people having "spin up" problems with Amazon RDS before, where RDS spins down their MySQL if it's idle for a while, and then the first time the client tries to connect/reconnect, the handshake can take a long time.

@AdriVanHoudt
Copy link
Author

Cool I love diagrams! And so the problem is definitely mysql taking to long or the network failing. I have the activation policy on always on so I would assume that means no shutdown ever :) also it happens at random when the server is being used so the db shutting down would be weird.
The only way to debug that I can think of is replacing each part and see what fixes it. Yay for production only errors haha

@dougwilson
Copy link
Member

Yay for production only errors

Yea, those are always the worst :) And still, though it's hard, since it perhaps needs investigation at the point of occurrence, I still cannot even completely rule out this module or it's interactions with Node.js networking, which is also why the issue is still open :)

Also, to note in the diagram, we also do run a timer from 1 to 5, which doesn't seem to fail for you (or you'd get a Error: connect ETIMEDOUT). This means that there isn't an issue bring up the TCP connection, but some kind of issue at the application layer, either in Node.js or in the MySQL server.

@AdriVanHoudt
Copy link
Author

hmm ok that helps, and since it doesn't happen in our staging env I would go with blaming mysql at this point

@dougwilson
Copy link
Member

Hey @AdriVanHoudt , did you find out any additional information so far?

@AdriVanHoudt
Copy link
Author

Not really, I removed a fairly heavy insert query and now it seems better but I will keep you posted

@dougwilson
Copy link
Member

Anyway, let me know if any new information comes to light :)

@AdriVanHoudt
Copy link
Author

Will do, I do think I was occupying the SQL to much at certain moments, after removing the heavy insert it definatlyhappens way less, I am in no way an sql expert so :D thanks for all the help and effort you put in (also in general for this lib and node community)!

@XavierGeerinck
Copy link

Just got the same problem:

Error: Handshake inactivity timeout
    at Handshake.<anonymous> (/home/ubuntu/sourcecode/Backend/node_modules/mysql/lib/protocol/Protocol.js:154:17)
    at Handshake.emit (events.js:104:17)
    at Handshake._onTimeout (/home/ubuntu/sourcecode/Backend/node_modules/mysql/lib/protocol/sequences/Sequence.js:116:8)
    at Timer.listOnTimeout (timers.js:119:15)
    --------------------
    at Protocol._enqueue (/home/ubuntu/sourcecode/Backend/node_modules/mysql/lib/protocol/Protocol.js:135:48)
    at Protocol.handshake (/home/ubuntu/sourcecode/Backend/node_modules/mysql/lib/protocol/Protocol.js:52:41)
    at PoolConnection.connect (/home/ubuntu/sourcecode/Backend/node_modules/mysql/lib/Connection.js:123:18)
    at Pool.getConnection (/home/ubuntu/sourcecode/Backend/node_modules/mysql/lib/Pool.js:45:23)
    at Object.module.exports.internals.connect (/home/ubuntu/sourcecode/Backend/src/middleware/db.js:22:14)
    at /home/ubuntu/sourcecode/Backend/src/middleware/db.js:37:23
    at tryCatcher (/home/ubuntu/sourcecode/Backend/node_modules/bluebird/js/main/util.js:24:31)
    at Promise._resolveFromResolver (/home/ubuntu/sourcecode/Backend/node_modules/bluebird/js/main/promise.js:427:31)
    at new Promise (/home/ubuntu/sourcecode/Backend/node_modules/bluebird/js/main/promise.js:53:37)
    at Object.module.exports.externals.query (/home/ubuntu/sourcecode/Backend/src/middleware/db.js:32:16)
    at /home/ubuntu/sourcecode/Backend/src/dao/user_charity.js:156:21
    at tryCatcher (/home/ubuntu/sourcecode/Backend/node_modules/bluebird/js/main/util.js:24:31)
    at Promise._resolveFromResolver (/home/ubuntu/sourcecode/Backend/node_modules/bluebird/js/main/promise.js:427:31)
    at new Promise (/home/ubuntu/sourcecode/Backend/node_modules/bluebird/js/main/promise.js:53:37)
    at Object.getFrontPageDetails (/home/ubuntu/sourcecode/Backend/src/dao/user_charity.js:147:20)
    at UserCharityController.getFrontPageDetails (/home/ubuntu/sourcecode/Backend/src/controllers/user_charity.js:12:28)

@XavierGeerinck
Copy link

I resolved this by changing my IP address from an internal class C address (192.168.x.x) to my loopback adapter (127.0.0.1). This solved it for me.

Since this resolved it I am not sure what caused the problem. However I think that a firewall issue or port issue might be causing this problem when using a class C address. (Not sure as I said it is just some thinking)

@AdriVanHoudt
Copy link
Author

Here we go again ;)

Debug: internal, implementation, error 
    Error: Connection lost: The server closed the connection.
    at Protocol.end (/home/adri_salesflare_com/Server/node_modules/hapi-plugin-mysql/node_modules/mysql/lib/protocol/Protocol.js:103:13)
    at Socket.<anonymous> (/home/adri_salesflare_com/Server/node_modules/hapi-plugin-mysql/node_modules/mysql/lib/Connection.js:102:28)
    at Socket.emit (events.js:117:20)
    at _stream_readable.js:944:16
    at process._tickDomainCallback (node.js:492:13)
    --------------------
    at Protocol._enqueue (/home/adri_salesflare_com/Server/node_modules/hapi-plugin-mysql/node_modules/mysql/lib/protocol/Protocol.js:135:48)
    at Protocol.handshake (/home/adri_salesflare_com/Server/node_modules/hapi-plugin-mysql/node_modules/mysql/lib/protocol/Protocol.js:52:41)
    at PoolConnection.connect (/home/adri_salesflare_com/Server/node_modules/hapi-plugin-mysql/node_modules/mysql/lib/Connection.js:123:18)
    at Pool.getConnection (/home/adri_salesflare_com/Server/node_modules/hapi-plugin-mysql/node_modules/mysql/lib/Pool.js:45:23)
    at Object.internals.getConnection (/home/adri_salesflare_com/Server/node_modules/hapi-plugin-mysql/lib/index.js:20:27)
    at /home/adri_salesflare_com/Server/node_modules/hapi-plugin-mysql/lib/index.js:51:30
    at /home/adri_salesflare_com/Server/node_modules/hapi/lib/handler.js:311:22
    at iterate (/home/adri_salesflare_com/Server/node_modules/hapi/node_modules/items/lib/index.js:35:13)
    at Object.exports.serial (/home/adri_salesflare_com/Server/node_modules/hapi/node_modules/items/lib/index.js:38:9)
    at /home/adri_salesflare_com/Server/node_modules/hapi/lib/handler.js:306:15

It is pointing to https://github.com/Salesflare/hapi-plugin-mysql/blob/master/lib/index.js#L20 so it fails on getting a new connection.

So would it be a solution to instead of throwing trying to get a new conn here https://github.com/Salesflare/hapi-plugin-mysql/blob/master/lib/index.js#L53 ? But doing that infinitely on error also seems wrong, maybe add a limit to it.

As for the cause, no idea, I could up the stack trace length in staging (happens there also) but not sure what that will help us.

@dougwilson
Copy link
Member

There is almost no way to recover from this, unfortunately. You may very likely have a networking issue at play here. Here is the play-by-play on that stack trace:

1. called get connection
2. no free connections in the pool, so create one and call connect
3. send the initial handshake packet to the mysql server
4. the TCP connection suddenly closed

@AdriVanHoudt
Copy link
Author

hmm weird, but fetching a new conn after 50ms or something, would that work?

@dougwilson
Copy link
Member

It may very well, but since the error seems like a network thing, you may just reduce the chance rather than fixing the underlying issue.

@AdriVanHoudt
Copy link
Author

Was the intention :P
Any suggestion on debugging this?

@AdriVanHoudt
Copy link
Author

Ok what I think is best, at least for now, is to first try fetching a new connection (up to 10 times) and if that doesn't work reply with the error instead of throwing and I will see if that helps
The best thing would be to switch to another db provider (on google cloud atm, server is on compute engine) and see if that fixes it to eliminate node from it but that is a bit tricky atm to do sadly

@chrisabrams
Copy link

I get the same issue with Google Cloud.

@AdriVanHoudt
Copy link
Author

Could be a GCloud issue then. I do not see it coming back in my staging env (grabs wood to not jinx it) but the load is waay less and it reboots more

@rfink
Copy link

rfink commented Nov 10, 2015

Getting this issue in AWS

@woodyhayday
Copy link

Was getting this issue on AWS, I believe it might be when operating between regions, (Elastic Beanstalk Node app in different region to RDS), though can't explain why (security groups should of allowed.) Anyhow, this works fine on AWS now with mysql ^2.10.2 and an RDS instance in same region.

dveeden pushed a commit to dveeden/mysql that referenced this issue Jan 31, 2023
Implements a Travis CI task that checks if the driver compiles on all platforms supported by Go.

Fixes mysqljs#1050
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

6 participants