New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EventedFileUpdateChecker boots once per process #25302

Merged
merged 5 commits into from Jun 17, 2016

Conversation

Projects
None yet
5 participants
@schneems
Member

schneems commented Jun 6, 2016

We need one file checker booted per process as talked about in #24990. Before we do a check to see if any updates have been registered by the listener we first check to make sure that the current process has booted a listener.

We are intentionally not starting a listener when the checker is created. This way we can avoid #25259 in which puma warns of multiple threads created before fork. As written the listener for each process will be invoked by the ActionDispatch::Executor middleware when the updated? method is called. This is the first middleware on the stack and will be invoked before application code is read into memory.

The downside of this approach is that the API is a little less obvious. I.e. that you have to call updated? to get the listener to start is not intuitive. We could make boot! not private if we want to make the API a little nicer. Alternatively we could boot when the checker is initialized however this reintroduces the puma threads warning, and also means that in cases of rails server or when using preload! that we have extra threads notifying of changes on a process that we don't care about.

[close #24990] [close #25259]

schneems added some commits Jun 6, 2016

EventedFileUpdateChecker boots once per process
We need one file checker booted per process as talked about in #24990. Before we do a check to see if any updates have been registered by the listener we first check to make sure that the current process has booted a listener.

We are intentionally not starting a listener when the checker is created. This way we can avoid #25259 in which puma warns of multiple threads created before fork. As written the listener for each process will be invoked by the `ActionDispatch::Executor` middleware when the `updated?` method is called. This is the first middleware on the stack and will be invoked before application code is read into memory.

The downside of this approach is that the API is a little less obvious. I.e. that you have to call `updated?` to get the listener to start is not intuitive. We could make `boot!` not private if we want to make the API a little nicer. Alternatively we could boot when the checker is initialized however this reintroduces the puma threads warning, and also means that in cases of `rails server` or when using `preload!` that we have extra threads notifying of changes on a process that we don't care about.

[close #24990] [close #25259]
@schneems

This comment has been minimized.

Member

schneems commented Jun 6, 2016

@schneems schneems changed the title from Schneems/evented file boot at check time master to EventedFileUpdateChecker boots once per process Jun 6, 2016

@schneems schneems added this to the 5.0.0 milestone Jun 6, 2016

Listen earlier in EventedFileUpdateChecker
Some files like routes.rb may be very large and vary between the initialization of the app and the first request. In these scenarios if we are using a forked process we cannot rely on the files to be unchanged between when the code is booted and the listener is started. 

For that reason we start a listener on the main process immediately, when we detect that a process does not have a listener started we force the updated state to be true, so we are guaranteed to catch any changes made between the code initialization and the fork.
@schneems

This comment has been minimized.

Member

schneems commented Jun 6, 2016

Updated with logic to ensure we're never in a situation where a process has stale data.

end
def updated?
boot! unless @pid_hash[Process.pid]

This comment has been minimized.

@sgrif

sgrif Jun 7, 2016

Member

Does this need to be atomic? e.g.

pid = Process.pid
@pid_hash.compute_if_absent(pid) { boot! }
@pid_hash[pid] = true

This comment has been minimized.

@schneems

schneems Jun 8, 2016

Member

In practice it's fine if we accidentally boot more than one listener per process, we want to make sure we boot at least once though. Concurrent::Hash doesn't have a compute_if_absent locking method

This comment has been minimized.

@thedarkone

thedarkone Jun 9, 2016

Contributor

Concurrent::Hash doesn't have a compute_if_absent locking method

The method is from Concurrent::Map (Conc::Map is a Hash-like data structure with a few concurrency methods thrown in, Concurrecy::Hash is just a synchronized carbon copy of Ruby Hash).

Do we even need hash for this? Can't we just have @listen_started_on_pid and check Process.pid == @listen_started_on_pid, with @listen_started_on_pid = Process.pid in boot!?

This comment has been minimized.

@sgrif

sgrif Jun 9, 2016

Member

We should still use an atomic even if it's not a hash. While it's technically fine to have more than one listener, it's unnecessary and easy to avoid.

@sgrif

This comment has been minimized.

Member

sgrif commented Jun 7, 2016

I definitely think we need the API to be more explicit here. The fact that updated? is what causes the checker to boot makes the code path that causes this to work to become really nonlocal. (I was actually convinced that this didn't fix the issue at first, took about 20 minutes to see why it does).

It'd be good to add an integration test for this to the railties test suite as well.

@schneems

This comment has been minimized.

Member

schneems commented Jun 8, 2016

We need some hook that we know will be called in each process, in this case I know updated? will be executed in the middleware. If we had some kind of on_fork callback this logic would all be much simpler, but nothing like that exists and I don't feel like monkey patching fork is a great idea.

It'd be good to add an integration test for this to the railties test suite as well.

I can look into that.

Test how evented file checker handles forks
Pretty proud of this. We are testing distributed processes synchronized via pipes which makes it deterministic. Pretty cool.

We boot a listener in the parent process we then fork. Before we touch the file we verify the fork is booted using pipes. Then the parent process will touch the file while the fork waits on a pipe. Once the parent process signals that the file has been touched we continue inside of the fork.
@schneems

This comment has been minimized.

Member

schneems commented Jun 8, 2016

I added a test, i'm pretty pleased with the result. Using pipes to synchronize the two processes.

@schneems

This comment has been minimized.

Member

schneems commented Jun 9, 2016

Updated to have a threadsafe boot! process, also using a instance variable instead of a Concurrent::Hash

@schneems schneems force-pushed the schneems:schneems/evented-file-boot-at-check-time-master branch from 60c11b5 to 844af9f Jun 13, 2016

schneems added a commit to schneems/rails that referenced this pull request Jun 16, 2016

Experimental Alternative to rails#25302
With rails#25302 we have to assume that the files on disk have changed in a fork. This means that the first call to `updated? must always be true. Instead we can use 3 different IO.pipe to synchronize communications between between any number of forked processes.

We use 1 pipe for the primary communication, another pipe is used by children to notify the parent that a new fork has booted. Once a child is detected as being booted a value is written to the primary pipe either 'T' if the files have changed or 'F' if they haven't. Once the value is read the same pipe responds with on the final pipe with an ACK. This lets the parent know that 1 of the processes has received a message and it can begin waiting for the next process to boot.

There exists a scenario where a race condition is possible. Parent PID boots, and there's no changes on disk. Child process boots and asks the parent if there are any changes. The parent PID may send a "F" to the child but between the time that the "F" is sent and it is read the file may change. We can eliminate this race condition if we boot the listener on the child process before we tell the parent that the child has booted. That's exactly what this PR does, is hooks into the `before_listen` and uses that to boot a worker before we even tell the parent process we've booted.

As is this is a WIP and more of a proof of concept. I think it's an interesting solution that may possibly be viable. It seems more technically complete than rails#25302 however that completeness comes at additional complexity. cc @e2
@sgrif

This comment has been minimized.

Member

sgrif commented Jun 17, 2016

Going to go forward with this one. I think my previous concerns were rooted in thinking that the changed callback did more than it actually does. I think we should eventually consider an explicit Rails.application.forked! call which needs to occur if you're using a forking web server, but that is out of scope for 5.0.0 at this point.

@sgrif sgrif merged commit 30dd8b2 into rails:master Jun 17, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

sgrif added a commit that referenced this pull request Jun 17, 2016

Merge pull request #25302 from schneems/schneems/evented-file-boot-at…
…-check-time-master

EventedFileUpdateChecker boots once per process

sgrif added a commit that referenced this pull request Jun 17, 2016

Merge pull request #25302 from schneems/schneems/evented-file-boot-at…
…-check-time-master

EventedFileUpdateChecker boots once per process
@schneems

This comment has been minimized.

Member

schneems commented Jun 20, 2016

Yay! Thanks! Just saw this notification in my email.

My only concern is that it's kinda weird that update? returns true on the first call. It makes sense for our application of the checker but it's an odd implementation artifact that maybe shouldn't live in this class.

As an alternative we could implement a EventedFileChecker#forked? that takes a block and use that in the middleware. Something like:

evented_file_checker.forked? {|checker|
  checker.updated.make_true
  checker.boot!
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment