Feature Request: Graceful restart #1033

Open
minichate opened this Issue May 29, 2017 · 6 comments

Contributor

minichate commented May 29, 2017

It'd be wonderful if ProxySQL could fork() and exec(), then gracefully drain existing connections on the old process, while accepting new connections on the new process. Ideally we'd be able to signal the old process to do this via SIGHUP.

This would allow upgrading the ProxySQL binary without dropping any in-flight client queries.

I may be able to contribute some time developing/testing this feature over the next few months, if you think it's worth having.
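
The fork() and exec() handover requested above could look roughly like the following. This is a minimal Python sketch of the general pattern, not ProxySQL code: the environment variable `SKETCH_LISTEN_FD` and the helpers `adopt_listener`, `spawn_successor` and `install_sighup_handover` are hypothetical names chosen for illustration. The child inherits the already-listening socket from its parent, so in this sketch both processes share one socket rather than binding the port twice.

```python
import os
import signal
import socket
import sys

# Hypothetical variable name used to hand the listening fd to the new process.
ENV_FD = "SKETCH_LISTEN_FD"


def adopt_listener(environ=os.environ):
    """Return the listening socket inherited from a predecessor, or None."""
    raw = environ.get(ENV_FD)
    if raw is None:
        return None
    # Wrap the inherited descriptor; family and type are detected from the fd.
    return socket.socket(fileno=int(raw))


def spawn_successor(listener, argv):
    """fork() and exec() argv, passing the listening socket to the child."""
    fd = listener.fileno()
    os.set_inheritable(fd, True)  # keep the fd open across exec()
    env = dict(os.environ)
    env[ENV_FD] = str(fd)
    pid = os.fork()
    if pid == 0:
        try:
            os.execve(argv[0], argv, env)
        finally:
            os._exit(127)  # only reached if exec() failed
    return pid


def install_sighup_handover(listener, state):
    """On SIGHUP, start a successor and tell the accept loop to stop."""
    def on_sighup(signum, frame):
        spawn_successor(listener, [sys.executable] + sys.argv)
        state["draining"] = True  # old process: finish in-flight work, then exit
    signal.signal(signal.SIGHUP, on_sighup)
```

In this sketch the old process would stop calling `accept()` once `state["draining"]` is set, close its copy of the listener, and exit after its existing client connections finish, while the new process calls `adopt_listener()` at startup and accepts from the inherited socket.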

DrTyrell commented May 29, 2017

+1, I'm no dev, so thanks @minichate!

Contributor

minichate commented May 29, 2017

For context: What we currently do is have a set of standby ProxySQL instances on different machines. Since our applications can gracefully restart, we reconfigure those apps to point at the standby ProxySQL instances and then restart them.

Once there is no more traffic pointed at the local ProxySQL instances, we can stop, upgrade and restart them.

Finally, once we've verified that the upgraded binaries have started, we reverse the procedure above to shift traffic back to the local ProxySQL instances.

The problem is that this takes considerable time when you have dozens/hundreds of machines. Furthermore, it is tricky to automate using tools like Puppet/Ansible. Finally, it incurs the cost of an additional network hop, which affects performance of the applications.

Contributor

renecannao commented May 29, 2017

Somewhat related to #1018.
ProxySQL has a PROXYSQL STOP command that is meant to perform a graceful shutdown.
This command has not been tested for a very long time, and I need to verify whether it still works as expected: it used to work in 1.1, but the codebase has changed so much that I think it doesn't work anymore.
Once PROXYSQL STOP is confirmed to work, it can be combined with --reuseport (see #997) for a real online upgrade.
The only requirement outside ProxySQL is a kernel >= 3.9.

Contributor

minichate commented May 29, 2017

@renecannao I'm wondering if there is a way to not require SO_REUSEPORT. Selfishly, our production servers don't have that kernel feature, so I was hoping that there is a way to gracefully upgrade without requiring it.

Contributor

renecannao commented May 29, 2017

@minichate, the way I see a graceful upgrade is to have two processes (let's not go into the details of how these two processes were started, whether by fork() + exec() from the same parent or from different parents) listening on the same port for some time, then the old proxy closes the port while the new one continues accepting incoming requests.
For this to happen, SO_REUSEPORT is required.
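
To illustrate the two-listeners-on-one-port idea, here is a small Python sketch, assuming Linux with SO_REUSEPORT support. It is not ProxySQL code: `make_listener` and `handover_demo` are hypothetical helpers, and both "proxies" are simulated as two sockets inside one process purely to show the bind behaviour.

```python
import socket


def make_listener(port, host="127.0.0.1"):
    """Listening TCP socket that allows other SO_REUSEPORT binds to the same port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock


def handover_demo():
    """Bind two listeners to one port, close the old one, serve from the new one."""
    old = make_listener(0)      # "old proxy"; port 0 lets the kernel pick a free port
    port = old.getsockname()[1]
    new = make_listener(port)   # "new proxy" binds the same port while old still listens
    old.close()                 # old proxy stops accepting new connections
    client = socket.create_connection(("127.0.0.1", port))
    conn, _ = new.accept()      # the port never stopped being served
    conn.close()
    client.close()
    new.close()
    return port


if __name__ == "__main__":
    print("handed over port", handover_demo())
```

The second `bind()` succeeds only because both sockets set SO_REUSEPORT before binding; a socket without the option would be refused with an "address already in use" error.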

Another option is to have proxysql1 close the port, and then start proxysql2 while proxysql1 is still running. This implementation doesn't require SO_REUSEPORT, but you will have a small "downtime" (maybe very small, but still...).

Contributor

renecannao commented Aug 1, 2017
