New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don’t shutdown() shared sockets (fixes #213) #463
Conversation
eam
commented
Dec 18, 2013
The sockets really shouldn't be shared at all, if we can avoid it. In Ruby 2.0, sockets default to FD_CLOEXEC, which forces the socket closed in the child process. I wonder if we can do this across the board. As a thought experiment, does your approach work in this scenario:
Currently we may GC and close the socket at 3 and 5. In your patch, we would only close the socket at 3, but not 5? |
I think there's a use case for a connection-opening/pooling parent which does work in short-lived forked children, which is how Resque operates. I agree it's a delicate endeavor and close by default is probably appropriate. My specific goal in writing this patch is to support the Square use case in Resque. Perhaps this should be configurable behavior? I do think it should be possible to pass a handle to a child safely, even if the default behavior is to close on exec. In your scenario above, the behavior both currently and after my patch would be the same:
The trick here is that upon shutdown of a ruby process, a given individual object may not GC at all. But this behavior can be forced, to illustrate the nature of the problem. Try this, on an unpatched (or patched) mysql2:
|
Should we actively disable FD_CLOEXEC then, so that behavior is consistent going forward into Ruby 2.0? How is Resque handling Ruby 2.0? |
I reopened #213, and tentatively scheduling this for version 0.3.16. Mostly because 0.3.15 is almost ready to roll now. |
I imagine it should be acceptable to let apps such as Resque specifically ask to disable setting FD_CLOEXEC. On by default seems fine to me. I don't know much about Resque other than this specific bug, sorry. |
* common socket state. #close still functions in a child process if one | ||
* wishes to force the issue. | ||
*/ | ||
if (wrapper->refcount == 0 && getpid() == wrapper->origin_pid) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The child process will decrement the refcount but not free the socket, leaving the parent with a refcount < 0. This will cause an open socket leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this essentially disables GC collection for objects which were allocated in a parent. I believe this is the only use of refcount, so whether we decrement it or not may not matter -- the GC will never collect it if the pid differs. #close will still operate and does not check refcount.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sodabrew the child and parent each have a copy of the refcount, so the child will not decrement the parent's.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, refcount is still properly tracked in the parent. When the parent's instance reaches zero the socket will be destroyed as it is now. This only disables collection in the child.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, thank you for reminding me that these are processes :)
Well let's back up a minute. I might not be understanding the issue here fully but: After you fork, you have to create a new connection anyway right? If the children re-establish a connection after fork this problem goes away, no? |
After a fork Resque does not create a new connection. The processing model is similar to say, pre-fork Apache where the parent maintains a connection pool to databases to avoid the latency overhead of establishing a new database connection per-job or per-request. The latency change of using a connection-pool vs not can be considerable for short-lived jobs/requests. |
@eam right, it's up to you to establish the connection in an after_fork callback like unicorn. As far as I know Resque does no coordination with other children about who's using what sockets from a connection pool. |
For context: https://devcenter.heroku.com/articles/forked-pg-connections#resque-ruby-queuing. This documentation is referencing a postgres connection, but the semantics are the same here. |
The child will inherit the connection. I recently fixed this in an app after I upgraded to Ruby 2.0, where FD_CLOEXEC clued me in to the fact that I was regularly sharing a mysql connection eight ways. I'm leaning on the side of calling this a bug that needs to be fixed in the user code, not in mysql2. The nondeterministic part of the issue makes me want to do something, though ;) |
My area of expertise is actually not Ruby so I can't offer meaningful opinions about specific Ruby apps, but passing DB connections to child workers is a pretty common pattern for app design. Pointing fingers, I would say this is a bug in libmysqlclient as it should be close()ing the socket for the immediate process but not using shutdown() to destroy the shared state. It should be possible to call a mysql_close() and close a particular process' usage of a connection without impacting other processes which also hold the socket. I'm following up with mysql as well, though this is a much slower path to resolution and won't fix the immediate issue as it likely would entail an api change in their client library. The problem we're faced with is that we can't avoid having this object registered for collection by the Ruby GC. There is no user code we can add to enable connection reuse across forks in Ruby as things are today. While establishing a connection per-job or per-request will work is very slow as it can add several serialized round-trips over the network. At scale, connection reuse becomes an essential optimization. I would like connection reuse to be possible. |
I believe most other db layers provide an explicit #close method distinct from object de-allocation for this reason. All languages which support connection pooling in Apache mod_* implementations would have to. |
I think the issue is that there is no facility for which to hand off connections from a pool to children using Resque. If you're using ActiveRecord - each child process will have it's own connection pool and assume it's the only process using each connection in it. This is why every pre-fork modeled system in Ruby suggests that you close connections in the before_fork callback and requires that you reconnect (not just database connections, redis, memcached, anything) in the after_fork callback. That said, designing a system for Resque which would let the master process hand off connections from a pool would definitely be interesting. It would require redesigning Resque though. So unless you guys have already built that, I'm afraid your only option here is to use the before/after fork hooks. Supporting this in mysql2 itself will involve making a series of trade-offs that may benefit your use case but hurt others. The way things work today should be safe for use assuming you're closing and reopening connections before and after the fork. Unicorn and Rainbows have this same requirement. For reference we've been using mysql2 with ActiveRecord in Resque at GitHub for almost 4 years now with no issues related to connections. At peak we're processing upwards of 100,000 jobs per minute. |
Let's talk numbers: On my local machine: >> Benchmark.bmbm { |x| x.report('reconnect') { 1_000.times { ActiveRecord::Base.connection.disconnect!; ActiveRecord::Base.establish_connection } } }; nil
Rehearsal ---------------------------------------------
reconnect 135.430000 3.380000 138.810000 (140.439120)
---------------------------------- total: 138.810000sec
user system total real
reconnect 138.780000 3.450000 142.230000 (145.258256)
=> nil
>> In my datacenter: >> Benchmark.bmbm { |x| x.report('reconnect') { 1_000.times { ActiveRecord::Base.connection.disconnect!; ActiveRecord::Base.establish_connection } } }; nil
Rehearsal ---------------------------------------------
reconnect 145.580000 4.000000 149.580000 (150.080117)
---------------------------------- total: 149.580000sec
user system total real
reconnect 137.900000 3.510000 141.410000 (141.895566)
=> nil
>> 100ms overhead per resque job 😿 |
http://bugs.ruby-lang.org/issues/5446 is relevent here as well |
I forgot that we recently started using https://github.com/spraints/resqued - it's worth a look if you haven't already. |
What are the negative effects of this patch that raise concern? There were some more heavy-handed ways to make this happen but I'm fairly sure I avoided introducing anything problematic. |
Seems that we all agree that mysql2 should either |
Sorry, I'll spend some time making sure I understand the consequences of this patch and try to come to a decision. |
I did some thinking over the break: I'd like to float a different approach to this, mimicking the way Perl's DBI handles things by adding an option:
Implementation should be straightforward: turn the getpid() check into a check for whether this default-off value is enabled, and allow users to twiddle it from Ruby. The documentation for the new option would be a great place to point people to when they encounter this. |
* common socket state. #close still functions in a child process if one | ||
* wishes to force the issue. | ||
*/ | ||
if (wrapper->refcount == 0 && getpid() == wrapper->origin_pid) { | ||
nogvl_close(wrapper); | ||
xfree(wrapper->client); | ||
xfree(wrapper); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this cause a socket and memory leak in forked children which don't actually use the shared mysql connection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of. The GC will not collect the object, but #close can still be used explicitly.
The reason for this is a bit of an oddity due to how mysql_close() uses shutdown(). Normally in cleanup you'd want to free the related client memory and only call close(), which would release the fd reference but not disrupt its use in other processes.
But we don't own the cleanup routines: mysql_close() also calls shutdown(), disrupts all use of the client object and the related socket. Ideally libmysqlclient would supply a way to free the client struct and close the handle without calling shutdown(). But as best I can tell this isn't possible.
There are some mitigating factors here: This can't leak resources in a loop, it becomes a one-time cost of forking. Because mysql_close() calls shutdown(s, 2), the TCP connection will be fully terminated and cleaned up from the kernel state. And #close still does the job explicitly -- anyone writing a proper program is going to want to explicitly manage objects with non-memory resources anyway; relying on the GC to collect things like stale connections is problematic in and of itself.
The limitations in libmysqlclient make this process a bit dirty no matter what so long as inheriting descriptors occurs. FD_CLOEXEC is the only completely clean answer, and probably correct by default, though it's also functionally restrictive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, close(s) causes a process to drop its reference to a socket. shutdown(s,2) will terminate the socket regardless of what other processes reference it. So as long as one process (in this case the parent) still calls mysql_close(), the entire connection state will be cleaned up at that time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
anyone writing a proper program is going to want to explicitly manage objects with non-memory resources anyway; relying on the GC to collect things like stale connections is problematic in and of itself. -- We support GC as a first class use case. It is not acceptable to break it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely, let me clarify: The GC is necessarily unaware (or at least has limited awareness) of non-memory resource limits. These typically must be managed explicitly, regardless of GC behavior. I think a common example from Ruby is the use of constructs such as:
loop { IO.popen("echo hi") { |f| f.read } }
Instead of
loop { IO.popen("echo hi").read }
The former can clean up state after the yield, whereas the GC is unequipped to handle the external resources in the latter case. I'm only suggesting we face a similar issue here, where the mysql C library bits are the externalized resources. Compounding this, the externalized resources are shared. It's an otherwise correct cleanup in one GC instance which breaks another.
To be clear, the gc interaction is broken right now, any time a fork occurs. This is a possible fix (which I've amended a bit, I think this suggestion may be more palatable: #463 (comment) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Adding an option sounds reasonable to me. Update your PR and let's see how it comes out?
Poke for an update? |
This pull request is intentionally creating a memory leak in the child process to avoid disconnecting the socket for the parent process. Instead, it would make more sense to close the socket's file descriptor in the child process, then call mysql_close without worrying about it affecting the connection in the parent processes. This can already be done manually as follows without exceptions: client = Mysql2::Client.new
client.query("select 1")
fork do
IO.new(client.socket).close
client.close
end
Process.wait
client.query("select 1") Setting the FD_CLOEXEC flag won't fix the issue being addressed by this pull request, since there is no |
I agree. I don't know why I thought the socket wasn't accessible -- closing it before entering mysql_close() should prevent all of this bad behavior. |
I've updated this PR per dylanahsmith's insight. Much simpler, no more downside. Thanks! |
@@ -186,6 +186,7 @@ static void *nogvl_close(void *ptr) { | |||
fcntl(wrapper->client->net.fd, F_SETFL, flags | O_NONBLOCK); | |||
#endif | |||
|
|||
close(wrapper->client->net.fd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this portable with windows. We could we have this in the #ifndef _WIN32
block, since we don't have to worry about fork
being used on windows.
Fixed non-unix portability issues, thanks again. |
👍 |
This is awesome, thanks for staying with the issue to find a really clean fix. Ping when you're ready for a merge review. |
@sodabrew I think this is ready :) |
👍 |
@@ -184,6 +184,7 @@ static void *nogvl_close(void *ptr) { | |||
flags = fcntl(wrapper->client->net.fd, F_GETFL); | |||
if (flags > 0 && !(flags & O_NONBLOCK)) | |||
fcntl(wrapper->client->net.fd, F_SETFL, flags | O_NONBLOCK); | |||
close(wrapper->client->net.fd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest adding a comment above close()
- the git text is pretty good, here's an alternate condensed suggestion:
/* Call close() on the socket before calling mysql_close(). This
* prevents mysql_close() from calling shutdown() on the socket.
* The difference is that close() acts on the current process only,
* while shutdown() acts for all parent and child processes, which
* has impacts on applications using a forked worker strategy.
*/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think mysql_close() is calling shutdown() on the socket, I think the socket is being closed by the server because mysql_close() sends a QUIT message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does call shutdown -- we initially found this via strace. The actual call is in include/mysql/psi/mysql_socket.h which is invoked through vio_cancel()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial stacktrace from some of our earlier debugging -- how we found it:
#0 0x0000003a678e9190 in shutdown () from /lib64/libc.so.6
#1 0x0000003451a56e06 in vio_close () from /usr/lib64/mysql/libmysqlclient_r.so.16
#2 0x0000003451a5683e in vio_delete () from /usr/lib64/mysql/libmysqlclient_r.so.16
#3 0x0000003451a52962 in end_server () from /usr/lib64/mysql/libmysqlclient_r.so.16
#4 0x0000003451a54bee in cli_advanced_command () from /usr/lib64/mysql/libmysqlclient_r.so.16
#5 0x0000003451a52a7a in mysql_close () from /usr/lib64/mysql/libmysqlclient_r.so.16
#6 0x00007fffefde8059 in nogvl_close (wrapper=0x13103c0) at ../../../../ext/mysql2/client.c:190
#7 decr_mysql2_client (wrapper=0x13103c0) at ../../../../ext/mysql2/client.c:205
#8 0x00007fffefde924c in rb_mysql_result_free (ptr=0x1419c50) at ../../../../ext/mysql2/result.c:88
#9 0x00007ffff7c01961 in run_final () at gc.c:1415
#10 finalize_list () at gc.c:1429
#11 finalize_deferred () at gc.c:1450
#12 rb_gc () at gc.c:3112
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, I stand corrected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Digging into this a bit more - it does both. The shutdown() and then close() occurs immediately so we'd never see the reply to the mysql-level QUIT. I'll edit the comment a bit to reflect this.
I have one last concern, from the server's perspective, are we now leaving dangling sockets open, since nobody is calling |
@@ -184,6 +184,7 @@ static void *nogvl_close(void *ptr) { | |||
flags = fcntl(wrapper->client->net.fd, F_GETFL); | |||
if (flags > 0 && !(flags & O_NONBLOCK)) | |||
fcntl(wrapper->client->net.fd, F_SETFL, flags | O_NONBLOCK); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need this? Since we aren't sending a QUIT message anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can go away, close() should never block.
@sodabrew
This is why setting FD_CLOEXEC is a good idea, since it will avoid |
Updated to remove the unnecessary nonblocking code and to comment per above discussion. |
@sodabrew ping. Looks like this is ready to merge. |
Thanks! I have one test I'd like to run locally, which is creating and closing a large number of connections and checking to make sure that the MySQL server correctly handles the closed socket. |
#ifndef _WIN32 | ||
flags = fcntl(wrapper->client->net.fd, F_GETFL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flags
is no longer used, please also remove a few lines up:
#ifndef _WIN32
int flags;
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
Verified that MySQL 5.1 and 5.5 do clean up the connection when the socket is closed this way. For history buffs, the fcntl flags were originally added in ec00873. Reminder to self: move |
Close the mysql socket before calling mysql_close(). mysql_close() calls shutdown() on the attached socket. In a situation involving fork() and copies of an object spanning multiple interpreter instances this will result in the first instance to GC breaking all other instances sharing the socket. dylanahsmith notes that we can avoid this by simply closing the socket before entering mysql_close(), preventing the shutdown() from having any impact on the shared socket state.
Don’t shutdown() shared sockets (fixes #213)