Investigate zero-downtime upgrades #1295

Closed
strugee opened this issue Mar 15, 2017 · 0 comments

strugee commented Mar 15, 2017

It may be possible to perform zero-downtime upgrades for setups where clustering is configured with more than one worker. Essentially, the process would be:

  1. Admin sends e.g. SIGUSR1 to the pump master process
  2. A cluster worker is selected by the master process and told to shut down
  3. The worker stops accepting new connections and finishes handling its current connections (note this may be tricky because WebSocket connections are long-lived)
  4. Worker shuts down
  5. Master process starts a new worker as normal
  6. Repeat from step 2 with a different cluster worker, until all workers have been upgraded
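
To make the loop above concrete, here is a rough sketch of the master-side bookkeeping as a small state machine. It is only an illustration under assumptions: `WorkerLike`, `RollingRestart`, `start` and `onWorkerExit` are hypothetical names, not anything in pump.io, and the worker is assumed to finish its open connections on its own once `disconnect()` is called.

```typescript
// Hypothetical minimal view of a cluster worker. Node's cluster.Worker
// exposes a compatible `id` and `disconnect()`.
interface WorkerLike {
  id: number;
  disconnect(): void;
}

// Restarts workers one at a time. All names are illustrative only.
class RollingRestart {
  private queue: WorkerLike[] = [];
  private current: WorkerLike | null = null;
  private spawn: () => WorkerLike;

  constructor(spawn: () => WorkerLike) {
    this.spawn = spawn;
  }

  // Step 1: call this when the master receives the upgrade signal.
  // Returns false if a rolling restart is already running.
  start(workers: WorkerLike[]): boolean {
    if (this.current !== null) return false;
    this.queue = workers.slice();
    this.next();
    return true;
  }

  // Steps 4-6: call this from the master's worker-exit handler.
  // Returns true only if the exit belonged to the rolling restart.
  onWorkerExit(id: number): boolean {
    if (this.current === null || this.current.id !== id) return false;
    this.spawn(); // step 5: start a replacement running the new code
    this.current = null;
    this.next(); // step 6: move on to the next old worker
    return true;
  }

  get inProgress(): boolean {
    return this.current !== null;
  }

  // Steps 2-3: pick the next old worker and ask it to shut down.
  private next(): void {
    const worker = this.queue.shift();
    if (worker === undefined) return; // every old worker has been replaced
    this.current = worker;
    worker.disconnect();
  }
}
```

In a real master this would presumably be wired up with `new RollingRestart(() => cluster.fork())`, a signal handler calling `start()` with the current workers, and `cluster.on("exit", (worker) => restart.onWorkerExit(worker.id))`. Two caveats: Node.js reserves SIGUSR1 for starting the inspector, so SIGUSR2 might be a safer choice, and waiting for the replacement's `listening` event before moving on would keep capacity higher during the upgrade.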

Potential problems here:

  1. This leaves the master process out of date, but we don't update it that often. Will this be a problem? Could we possibly exec() a new master process?
  2. How do we deal with semver-major upgrades? This should work if there are only config changes, but database migrations will be sticky. Perhaps we could just document whether each release is compatible with the zero-downtime feature?
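
For the second problem, one possible shape for the "document whether each release is compatible" idea is a per-release flag that the master checks before starting a rolling restart. This is just a sketch: `ReleaseInfo`, `zeroDowntimeSafe` and `canUpgradeInPlace` are made-up names, and the fallback rule (same major version is fine, a major bump is not) is an assumption rather than a decided policy.

```typescript
// Hypothetical per-release metadata; the field names are assumptions.
interface ReleaseInfo {
  version: string; // e.g. "5.1.0"
  zeroDowntimeSafe?: boolean; // documented by maintainers, if known
}

// Extracts the major component of a plain "x.y.z" version string.
function majorOf(version: string): number {
  const major = parseInt(version, 10); // parsing stops at the first "."
  if (isNaN(major)) throw new Error("invalid version: " + version);
  return major;
}

// Decides whether a rolling (zero-downtime) upgrade should be attempted.
function canUpgradeInPlace(running: string, target: ReleaseInfo): boolean {
  // An explicitly documented flag always wins.
  if (target.zeroDowntimeSafe !== undefined) return target.zeroDowntimeSafe;
  // Assumed fallback: stay within the same major version.
  return majorOf(running) === majorOf(target.version);
}
```

With this, a semver-major release that only changes config could be marked `zeroDowntimeSafe: true`, while one that needs a database migration would fall back to a normal restart.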