Sequel fork safety #691
I don't see this as a problem that requires fixing. The existing solution is straightforward, and I don't think it is brittle. Both of your proposed solutions would cause a performance hit, and I don't think it's fair to penalize all Sequel users just to make things slightly easier for some of them. Note that if you want to implement such a thing yourself, it's trivial to create a custom connection pool class that does what you want.
Please see https://gist.github.com/eregon/89a0a7591721c3ce8c3c as an example of the problem.
You do not see data corruption caused by a forking webserver and usage of a default Sequel thread-safe connection pool (which is pretty common) as a problem which requires fixing?
But it is not exactly intuitive, and I wish to provide a good experience for the default environment and for newcomers. Obviously, not everyone knows the gory details of fd sharing and fork, and hopefully they do not always need to. I appreciate that this solution is documented, but I had never seen it before, as it is written under "Misc." (http://sequel.rubyforge.org/rdoc/files/doc/code_order_rdoc.html). Maybe it could benefit from better visibility in an introductory document.
I think the performance hit would not really be significant; do you have a good benchmark for this so I could test it myself?
I'll have a try. Finally, this is also very unexpected: you have well-separated connections per thread, but processes (which are groups of threads) do not always get them, so in a way one could say it is not thread-safe, since two (main) threads share the same connection without any particular configuration.
To be more specific, I don't see this as a problem that requires fixing in Sequel. Obviously, data corruption is a problem in applications, but I think that it is an application issue.
Well, I'll give you that it is not intuitive. However, not everything can be intuitive. If you are using a forking webserver and preloading code, you need to understand issues caused by file descriptor sharing.
If the forking webserver offers a before_fork hook, that should be used, as it is the best place to do something. I do not consider that brittle, since that's where all code that deals with shared file descriptors should go. The on_event(:starting_worker_process) is a little different, since I believe that is called after forking. That indeed is brittle. But that's a problem with Passenger, in that it doesn't offer a before_fork hook, only an after_fork hook.
True, it does require you put the code in the correct place. However, I have yet to come across an app where locating this place is difficult.
I disagree with the idea that every library that opens file descriptors should have code that always checks to see if a fork has happened. Do the ruby stdlib libraries that open file descriptors automatically handle this? Of course not, the onus is on the application user to handle file descriptors shared between processes.
If you send a pull request adding additional documentation in this area, I will definitely consider it.
If I had to guess, the performance hit would not be significant for most cases. Considering the correct way to handle this issue (before forking) doesn't cause any performance hit, I don't think taking any performance hit is acceptable.
Thread-safe doesn't mean fork-safe. Even when using the single connection pool, you can still run into issues with shared connections across processes. To sum up: I think file descriptors shared across fork are an application-level issue, not a library-level issue, and this should be dealt with in the application.
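To make the recommended application-level fix concrete, here is a minimal sketch, assuming a Unicorn-style `before_fork` hook and an application-level `DB` constant holding the Sequel::Database (hook names and signatures vary by server):

```ruby
# unicorn.rb (assumed config file name) -- close database sockets in the master
# before workers are forked, so no file descriptors are shared across processes.
# Each worker transparently opens fresh connections on its first query.
before_fork do |server, worker|
  DB.disconnect
end
```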
I'd like to share a comment here. In my opinion, this is related to the fact that Sequel gives the OP the illusion that he can abstract away from connection handling, through implicit connection boundaries. While very appealing at first glance, I personally think that such a feature is not a nice gift to users, as it generally leads to subtle bugs. @eregon provides a very good example IMHO. Just to be sure that my point is clear:

```ruby
DB = Sequel.connect(...) # does not actually connect
DB[:users].to_a          # will create a connection
```

instead of (explicit connection boundaries):

```ruby
DB = Sequel.database(...)
DB.connect do |conn|
  conn[:users].to_a
end
```

Note that this is not directly related to connection pooling per se, but more to the fact that connections are created and maintained automatically by Sequel, instead of being created and closed explicitly by users (and reused across threads/processes only after having been closed by them). Now, Sequel made that choice a long time ago and that's probably fine. But, if the illusion has to be maintained, I would be tempted to agree with @eregon that it would be more consistent for Sequel to be fork-safe as well as thread-safe. Otherwise, the abstraction tends to leak in very unpleasant and unexpected cases.
Suppose we do have code that looks for pid changes to detect forking (all such code is buggy if fork is called more than once, due to pid reuse). Let's say we detect a fork has occurred, and there are outstanding connections. What do we do? Do we ignore them? Do we attempt to close them? The former leaks connections. The latter isn't safe, since multiple child processes could be operating on the connections simultaneously. So any attempt to handle this automatically by detecting pid changes is broken.

The truth is, you cannot correctly handle things after forking, since by then you are already sharing the file descriptors and connection objects. The only way to correctly handle things is to close the file descriptors before forking. As ruby does not offer a generic before_fork hook, I don't see that being possible automatically currently. Even if ruby did offer a generic before_fork hook, I don't think Sequel would use it, as fork can be called safely in many situations, and automatically disconnecting before fork could break existing code (e.g. fork{exec 'some command'} in the middle of your script).

FWIW, PostgreSQL specifically recommends against forking while having open connections: http://www.postgresql.org/docs/9.0/interactive/libpq-connect.html. Note that they don't recommend forking and then fixing the problems, they just tell you not to fork with an open connection.

Listen, I can understand that automatically handling this would make some things easier. But there are corner cases in doing so. The issue we are dealing with here is very narrow (a forking server that preloads code, and doesn't use connections in the parent after forking). Even though such usage may be common, we only want to make this change if we know we are in that situation. The only person who knows that is the application author (or possibly the webserver). So I think this is always going to be an application/server-level decision. Since the only good way to handle it is to disconnect the connections before forking, I think the current recommendation of calling DB.disconnect before forking stands.

Is it common for other ruby libraries you use to specifically check for forking? I'd honestly be interested in how other libraries are handling this. AFAIK, ActiveRecord is similar to Sequel. Now, Passenger handles ActiveRecord disconnection/reconnection automatically when smart spawning, which is the reason it works automatically with ActiveRecord. But note how it is not ActiveRecord automatically reconnecting after fork, but Passenger doing so, which makes sense as Passenger is the one doing the forking. Maybe you want to ask the Passenger guys to automatically do a similar thing for Sequel that they do for ActiveRecord?
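For comparison, here is a sketch of the after-fork style hook that Passenger exposes (the `on_event(:starting_worker_process)` hook mentioned earlier); as discussed above, a true before-fork hook is preferable when the server offers one. The placement and the `DB` constant are assumptions:

```ruby
# config.ru or an initializer (assumed placement). With Passenger smart
# spawning, this block runs in each worker just after it has been forked.
# Dropping the inherited connections here forces the worker to open its own;
# as noted above, an after-fork hook is the more brittle of the two options.
if defined?(PhusionPassenger)
  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    DB.disconnect if forked # only relevant when smart spawning actually forked
  end
end
```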
Relative newbie to Ruby and Sequel. Here are my thoughts: would PID reuse really be an issue? The only time a PID can be reused is when the previous owner of that PID has died (and thus is no longer using the associated database connection). I've just spent a frustrating few days trying to figure out some of the weirdest behaviour I've ever seen, and it's essentially this problem. I'm with @eregon on this one. I think [pid, thread_id] would be an elegant solution. Not sure how it would be any more of a performance hit than calling DB.disconnect every time.

Environment: ubuntu 13.10 / nginx 1.4.3 / phusion_passenger 4.0.21 / ruby 2.0.0 / sequel 4.8.0 / postgres
The issue is not what key is used for the connection pool's @allocated hash. It is true that pid reuse is not a significant issue: to be a problem it requires a grandparent->parent->child process model where the grandparent sets up the connection, forks the parent and exits. I only mentioned it in passing; it wasn't a central point of my argument.

I'm sorry this took you a few days to track down. This type of issue is mentioned in the Sequel documentation, the Passenger documentation, and the PostgreSQL documentation. If you can think of other appropriate places to add it in Sequel's documentation, I will add additional documentation to those places.

Detecting pid changes and clearing the available connections is certainly possible. However, it requires an additional system call on every single connection checkout (which are very frequent, as connection checkouts should be short lived in Sequel), so it is not free from a performance standpoint. I'm sure you would have gladly paid the cost, but I don't think it's the correct decision to force the additional cost on all Sequel users. I could easily add an extension that makes the connection pool do such a check. However, since it wouldn't be the default, nobody would use it unless they thought they needed it, and the only people who could figure out they needed it would be better off just disconnecting before fork. So I don't think it makes sense to implement it as an extension.

I'm not aware of cases where DB.disconnect can fail silently. Even if the disconnection of the underlying connection failed silently, the connection would still be removed from the pool.
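As a rough illustration of the "custom connection pool class"/extension idea mentioned above, a pid-checking pool might look something like the sketch below. It assumes ThreadedConnectionPool internals (the `available_connections` array and the private `sync` mutex helper) and only forgets connections that are not checked out, which is exactly the limitation discussed later in this thread; treat it as illustrative, not as an official API:

```ruby
require 'sequel'

# Sketch: a pool that forgets connections inherited from a parent process.
# It pays the pid check on every checkout -- the cost discussed above.
class PidCheckingPool < Sequel::ThreadedConnectionPool
  def initialize(db, opts = {})
    super
    @owner_pid = Process.pid
  end

  def hold(server = nil)
    unless Process.pid == @owner_pid
      sync do
        # Drop (do not close) connections inherited across fork; the parent
        # may still be using the same file descriptors. Connections currently
        # checked out by other threads are not handled here.
        @available_connections.clear
        @owner_pid = Process.pid
      end
    end
    super
  end
end

DB = Sequel.connect('postgres://localhost/mydb', pool_class: PidCheckingPool)
```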
@jeremyevans, I find that this statement of yours is false (the claim that disconnecting "should never break code"). We use postgres LISTEN/NOTIFY.
@clord Without posting some example code, I can only guess what is happening. First, some background: if you are using the standard threaded connection pool, disconnecting only removes connections that are not currently in use. If you are using the single threaded pool, then it can delete in-use connections, so if you are using that, you'd want to change to the default pool.

Note that in order not to miss notifications, you need to be using the :loop option to listen. If you are doing something like:

```ruby
while true
  notification = DB.listen("channel")
  fork{do_something(notification)}
end
```

it's expected that you will lose notifications, and that is unrelated to disconnection. You would need to switch to:

```ruby
DB.listen("channel", :loop=>true) do |*notification|
  fork{do_something(notification)}
end
```

Note that you need to be careful to use this correctly. If you still think this is a bug and you can come up with a self-contained reproducible example that shows the problem, please post it on the sequel-talk Google Group or open a new issue and I'll definitely see if I can help.

FWIW, the statement I made is not falsifiable: I said "should never break code", I didn't say "can never break code". :) Not to mention that the statement was made about disconnection before running/serving requests, not about doing so at runtime in response to notifications.
I'm with @jeremyevans on this one, and admire his patience in educating about this caveat. FWIW this is documented everywhere it concerns fork-based app servers or background job frameworks (Resque, Unicorn and Puma, besides Passenger), and usually for Active Record. Check the Heroku documentation if you don't believe me. Everywhere there's a database involved and a fork, people warn against this sort of thing.

This does not seem like Sequel's problem to solve. If anything, ruby should clearly distinguish between the main thread and the "forked" main thread. But then again, should it? Fork is a Unix system feature. Ruby is not a Unix-only environment (it also supports Windows). JRuby goes to extreme lengths to be able to fork, and I don't know of any production case mixing fork with db pools. Even PostgreSQL recommends against it, as previously linked.

The reason Passenger abstracts things away for Active Record is just based on popularity and the common case (the main customer app is 95% of the time Rails-based), and to avoid yet another fork/db-connection-related issue. They're not alone in that. Even Sidekiq takes care of checking connections back in after jobs are done only for Active Record (this is probably not documented in Sequel, and probably shouldn't be). This does seem to be out of scope for Sequel.

The only thing I'd argue is that Sequel stores are not fiber-safe, as they use Thread#[] as thread-local storage, but that's a completely different issue.
@jeremyevans thanks, that was a very useful comment. It looks like the problem I'm seeing is that listen unlistens from the channel before it returns. The code I am porting from would look like this if possible:

```ruby
begin
  while running
    DB.listen("channel", timeout: 10)
    fork{do_something}
    update_state
    running = # Compute
  end
ensure
  DB.unlisten "channel"
end
```

And the current ported version will unlisten via line 3 instead of in the ensure block. I suppose I could manually exec the LISTEN command myself.

Edit: I was worried after writing this that the library I am porting from also unlistens after each call, but it does not. So my original code is correct in that regard: it's not necessary to unlisten after every listen.
Yes, unlistening on the channels before returning from listen is by design, and required if you want correct behavior with the following code:

```ruby
n1 = DB.listen("chan1")
n2 = DB.listen("chan2")
```

Without the unlisten, the second call to listen here could return a notification for chan1 even though you are only supposed to be listening on chan2.

You should be able to change your code to:

```ruby
running = true
DB.listen("channel", timeout: 10, :loop=>true) do |*notification|
  fork{do_something}
  update_state
  break unless running = # Compute
end
```

Note that if you want to just use the ruby-pg API directly, you can do that as well.
I tried this:

```ruby
DB.listen("channel", timeout: 10, loop: true) do
  puts "called"
end
```

and "called" is only printed after an event on the channel, not in the case of a timeout. Looking at the code for listen, it's pretty clear that &block will only be called when a notification is actually received.
Well, you can probably do it in pure sequel, but it would be uglier:

```ruby
called = false
DB.listen("channel", timeout: proc{called = false; 10}, loop: proc{puts "not called" unless called}) do
  called = true
  puts "called"
end
```
@jeremyevans Hello!
@eregon I wouldn't do it automatically, as it would break code. Let's say Sequel automatically uses
My reasoning hasn't changed in the last 5 years; it's still that trying to handle this automatically in Sequel is a bad idea. This should be handled in the application, as the application is the only place where it can be correctly handled. Yes, such handling may not be automatic. But I would rather have no automatic handling than poor automatic handling, or automatic handling that works for a particular use case but deals poorly with other use cases.
@jeremyevans I am not convinced yet this cannot be done automatically and sensibly for the vast majority of realistic use cases (maybe even all). Could you elaborate on a few points?
Yes, but who does that? Ruby has many utilities for spawning subprocesses and I see very little good reason to fork+exec manually instead of
Why not? If the parent closed (as in If you mean
Could you link to some documentation about that?
So intuitively this option sounds great to me. It's also similar to what Passenger uses to avoid file descriptor sharing automatically with ActiveRecord AFAIK. What are the drawbacks?
If I use Sequel and some forking Ruby webserver (most do fork, and many by default), I'd like to have a sane situation by default and, e.g., avoid end users of the website seeing one another's database results (and so leaking private information) due to unintentional connection sharing.
Could you show realistic use cases where automatic fork-safety would be harmful?
Not sure. However, I'm guessing the following is not that uncommon, since it is the recommended way to support fork without breaking things in the parent:

```ruby
fork do
  something
  exit!
end
```
You are talking about
Sequel doesn't have
The burden is not on me to show it can cause no harm. The burden is on the person who wants this added to prove doing so cannot cause harm. I've already described possible ways this can be harmful. This problem has a simple, manual solution with one line of code that causes no problems if used correctly. That is preferable in my opinion to a more complex automatic approach that will handle certain cases fine and break other cases.

As an analogy, there are many people using Sequel with multiple databases and model classes. If they create a model class, Sequel doesn't attempt to look at all databases to try to find which one is the best fit for the model class (i.e. has the related table). The burden is on the user to manually specify which database if the default database in use is not the desired one.
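As a small illustration of that analogy (hypothetical database URLs and table names), binding a model to a specific database is done explicitly via a dataset:

```ruby
# Two hypothetical databases; Sequel will not scan them to guess where a
# model's table lives -- the user states the database explicitly.
DB1 = Sequel.connect('postgres://localhost/app_db')
DB2 = Sequel.connect('postgres://localhost/reporting_db')

class Report < Sequel::Model(DB2[:reports]) # explicitly bound to DB2
end
```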
Thank you for the reply.
To clarify, what would break exactly? The connection would be disconnected in the parent on fork. It's not ideal that this happens on every fork.
This seems to be the meat of the issue, and the challenge is to do this correctly. So I think during the handling in the child we should ideally also close the underlying file descriptors (and free native allocations for the connection pointer), as they won't be used.
One solution for postgres involves close(2)-ing the socket returned by libpq, without communicating on it. They also propose just not caring about the opened socket and native PGconn allocation. This seems safe and of course portable/independent of the specific database. After all, the parent process likely doesn't have a thousand open connections which could lead to file descriptor exhaustion or a significant memory leak (often just one open connection, I would expect). FWIW, it seems ActiveRecord deals with this problem directly nowadays, by checking whether the process has changed.
Well...
I think you've answered your own question. Correct use of
Yes, it's way more complex than a single DB.disconnect call. I'm sorry, but you and I just have a different philosophy regarding this. You are willing to accept a large amount of complexity for an automated approach that benefits a particular case, without considering other cases too much. I think this is best solved by a simple manual solution in the particular case where it is needed, where a similar manual solution would be needed for most other libraries that use sockets.

I've been maintaining Sequel for 10 years. Looking at the issues, there are only a handful that are related to this issue. It's simply not a major issue for most Sequel users. It's true that new users are occasionally bitten by it because they don't understand fork and file descriptor sharing.
I'm trying to explore the available options. Agreed, the approach of closing the socket but not communicating on it is tricky and needs special handling for each DB driver, and is sometimes not even exposed directly by the native DB library. Probably removing the finalizer or disabling it is easier: we just need to know which object the finalizer is defined on (and use ObjectSpace.undefine_finalizer on it). From #1431 (comment):
Could preforking webservers use this? Specifically, since they fork, they would need to ensure all the forked processes end with exit!. That would avoid the double disconnect, but of course something else would still be needed so the connection is not reused in multiple processes.

```ruby
at_fork(:after_child) {
  Sequel::KEEP_ALIVE = DB.pool.available_connections.dup
  DB.pool.available_connections.clear
}
```

should be enough for that (if we assume the webserver/forked process uses the pool as usual).
I would not even consider messing with the internal state of the connection objects to try to implement this. A prerequisite for support in Sequel would be support in the drivers. That's a necessary condition, but not a sufficient one.
I don't think they could do so automatically or by default. Calling
That would only affect the case of currently checked out connections on the thread that forks, which would end up like currently checked out connections on threads other than the one that forked. It doesn't affect connections that are not currently checked out, which would still be available to be checked out in both the parent and child.
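For reference, the two groups of connections being distinguished here can be inspected on the threaded pool. This sketch pokes at pool internals (the readers are informational, not a stable API):

```ruby
# Idle connections sitting in the pool -- these are what a pid-check or a
# before-fork disconnect deals with.
DB.pool.available_connections

# Connections currently checked out, keyed by the thread holding them -- these
# cannot be safely touched after a fork.
DB.pool.allocated
```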
Thank you for your time and efforts. I totally agree with your position... However, I find myself in a conundrum where the existing Sequel API doesn't allow me to provide an automated solution to this problem. I want to offer Sequel the same support offered to ActiveRecord by the iodine web server (which I'm now in the process of updating/fixing)... However, unlike ActiveRecord, Sequel database connections are instance/driver specific. Is there any way to enumerate them? I think having a connection/database "registry" that would allow Sequel connections to be globally re-established could resolve this discussion by moving the responsibility to the framework/server using Sequel rather than to the developers. For your consideration. Kindly,
@boazsegev Sequel stores each Sequel::Database instance in Sequel::DATABASES. So you should be able to disconnect all Sequel database connections with:

```ruby
Sequel.synchronize do
  Sequel::DATABASES.each { |database| database.disconnect }
end
```
@janko-m, thanks! I committed a patch to automate this in the server... unless you have objections? I wouldn't want to introduce a default behavior that the Sequel team disapproves of. Kindly,

P.S. I think adding the code to the servers (or frameworks) will make it easier for developers... after all, by the time developers use Rack or lower-level features, they should already know about the issues related to fork.
@boazsegev The patch looks good, except that you should drop the usage of
@jeremyevans Thanks!
Hello,
I experienced some trouble due to the connection socket being shared between two processes (caused by fork and "smart spawning" of Phusion Passenger).
I read a couple of issues/threads on the list about it and looked at the code of the default connection pool, ThreadedConnectionPool.
The commonly proposed solution to this problem is to disconnect just before/after fork with a server-dependent hook. I find it unsatisfying because it is brittle code that depends on the environment, and it would seem logical for Sequel to also be fork-safe, as it is thread-safe.
Concretely, ThreadedConnectionPool uses Thread.current as the key for connections. Unfortunately, in MRI, Thread.current is identical for the main thread across fork, while the two are of course very different from the OS's point of view (two different native threads). This creates the sharing mentioned above.
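A minimal standalone script (not from the original report) illustrating this MRI behavior:

```ruby
# The forking thread continues as the child's main thread, so the Ruby-level
# Thread object -- and therefore any pool keyed on it -- is the same object
# across fork, even though the OS sees two distinct processes.
main = Thread.current
pid = fork do
  puts "child  pid=#{Process.pid} same Thread object? #{Thread.current.equal?(main)}" # => true
end
puts "parent pid=#{Process.pid}"
Process.wait(pid)
```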
I see two possible solutions:

- Use [pid, thread id] instead of just thread id as the key in @allocated, and ensure there is no connection leaking after fork.
- Detect fork by a change of PID (since there seems to be no hook in standard Ruby before/after fork, and decorating it might not be so nice). This could be done in #hold and do the needed disconnect/reconnect once.

What do you think?
I would really appreciate it if Sequel were fork-safe by default.