Zero-downtime restart support #1406

Merged
strugee merged 16 commits into master from zero-downtime-upgrades on Aug 18, 2017

strugee commented Aug 7, 2017

Ref #1295

strugee commented Aug 9, 2017

Minimum viable product is done. Still TODO:

  • Deny if we have in-flight restarts
  • Prevent workers getting killed by SIGUSR2 when a restart is in-flight
  • Implement a timeout at which workers will get killed
  • Gracefully signal clients to reconnect SockJS
  • Limit to MongoDB
  • Wait until the new worker is actually listening
  • Handle worker respawn errors
  • Blacklist dangerous config file changes (too hard)
  • Some type of sanity checking? E.g. requiring that the new code can respond to HTTP requests? (not necessary for the initial feature; push to another issue)
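To make the "deny if we have in-flight restarts" and "limit to MongoDB" items concrete, here is a minimal sketch of a gate that a restart request could pass through before any worker is touched. It is not taken from this PR: `checkRestartRequest`, `RestartState`, `RestartDecision`, and the driver string `"mongodb"` are all hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch; none of these names come from the PR itself.
type RestartDecision =
    | { allowed: true }
    | { allowed: false; reason: string };

interface RestartState {
    // True while a previously accepted restart is still replacing workers.
    restartInFlight: boolean;
}

// Decide whether a restart request may proceed.
// Assumption: the configured Databank driver is identified by a plain string.
function checkRestartRequest(state: RestartState, driver: string): RestartDecision {
    if (driver !== "mongodb") {
        return { allowed: false, reason: "zero-downtime restart is limited to MongoDB" };
    }
    if (state.restartInFlight) {
        return { allowed: false, reason: "a restart is already in flight" };
    }
    return { allowed: true };
}

const restartState: RestartState = { restartInFlight: false };

// The first request is accepted and marks a restart as in flight.
const firstRequest = checkRestartRequest(restartState, "mongodb");
if (firstRequest.allowed) {
    restartState.restartInFlight = true; // would be cleared once all workers are replaced
}

// A second request arriving before the first finishes is denied.
const secondRequest = checkRestartRequest(restartState, "mongodb");
```

Returning a reason alongside the denial lets the caller log why a request was ignored instead of silently dropping it.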

strugee commented Aug 18, 2017

Note: we need to ship this preffed off (disabled by default). I also need to blog about it.

strugee commented Aug 18, 2017

Actually there's not much point to preffing this off, so we'll just ship it as-is. Will merge when Travis passes.

strugee added some commits Aug 7, 2017

  • Start work on gracefully shutting down workers
    Before this change, workers would just immediately shut down. Instead
    we want them to stop accepting connections and gracefully complete
    in-flight requests, which this patch starts to move towards.
  • Kill workers when the new one is *listening*
    As opposed to just doing it on a timeout.
  • Restrict zero-downtime restarts to MongoDB
    We need to test other popular Databank drivers before expanding
    support to everything.
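
The handoff these commit messages describe can be modelled as a small state machine: an old worker keeps serving until its replacement reports that it is listening, then stops accepting connections and drains in-flight requests, with a timeout as a backstop. The sketch below is a simulation under stated assumptions, not the PR's code: `SimWorker`, `onReplacementListening`, `pollDraining`, and the 30-second `DRAIN_TIMEOUT_MS` are hypothetical.

```typescript
// Hypothetical simulation; names and the timeout value are illustrative only.
type WorkerStatus = "listening" | "draining" | "stopped";

interface SimWorker {
    id: number;
    status: WorkerStatus;
    inFlightRequests: number;
    drainStartedAt: number | null; // timestamp in ms, null until draining starts
}

const DRAIN_TIMEOUT_MS = 30000; // assumed value

// Called when the replacement worker reports that it is listening.
// Only at this point is the old worker told to stop accepting connections.
function onReplacementListening(oldWorker: SimWorker, now: number): void {
    if (oldWorker.status !== "listening") return;
    oldWorker.status = "draining";
    oldWorker.drainStartedAt = now;
}

// Called periodically. A draining worker stops once its in-flight requests
// have completed, or once the drain timeout has elapsed, whichever is first.
function pollDraining(worker: SimWorker, now: number): void {
    if (worker.status !== "draining" || worker.drainStartedAt === null) return;
    const timedOut = now - worker.drainStartedAt >= DRAIN_TIMEOUT_MS;
    if (worker.inFlightRequests === 0 || timedOut) {
        worker.status = "stopped";
    }
}

// Example: a worker with two requests in flight drains cleanly.
const oldWorker: SimWorker = { id: 1, status: "listening", inFlightRequests: 2, drainStartedAt: null };
onReplacementListening(oldWorker, 0); // replacement is up; start draining
pollDraining(oldWorker, 1000);        // requests still in flight; keeps draining
oldWorker.inFlightRequests = 0;       // the in-flight requests complete
pollDraining(oldWorker, 2000);        // nothing left to serve; worker stops
```

Keying the shutdown on the replacement's listening event rather than on a fixed delay means there is always at least one worker accepting connections, which is the point of the second commit.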

strugee changed the title from "[WIP] Zero-downtime restart support" to "Zero-downtime restart support" on Aug 18, 2017

strugee added some commits Aug 18, 2017

strugee merged commit d236f48 into master on Aug 18, 2017

2 of 3 checks passed

  • continuous-integration/travis-ci/pr: The Travis CI build is in progress
  • Node Security: No known vulnerabilities found
  • continuous-integration/travis-ci/push: The Travis CI build passed

vxcamiloxv deleted the zero-downtime-upgrades branch on Nov 2, 2017
