Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hook to know when unit wants to shut down #581

Open
krburkhart opened this issue Sep 13, 2021 · 12 comments
Open

Hook to know when unit wants to shut down #581

krburkhart opened this issue Sep 13, 2021 · 12 comments
Assignees

Comments

@krburkhart
Copy link

I am running Django Channels under Unit 1.23.

If I have an ASGI application running with connected web sockets, that application does not shut down.

Is there a way from within my ASGI app to know that the Unit wants to shut down so that I can notify connected clients, clean up and let the new Unit instance take over?

I thought I might be able to use lifespan messages to manage this, but the lifespan shutdown message only occurs after all websockets have disconnected and it's too late to do anything. Perhaps if this message was sent immediately after listeners are removed rather than after all connections are closed?

Thank you,

-Kelly

@mar0x
Copy link
Contributor

mar0x commented Sep 14, 2021

Hello,

It seems I understand what you are trying to achieve. According to ASGI Spec, the lifespan.shutdown event should be

Sent to the application when the server has stopped accepting connections and closed all active connections.

And there is no events in specification to send before active connections closed :(

For investigation purpose, I've added logging to sample ASGI WebChat application (see attached asgi_chat.zip ) and run it under uvicorn-0.15.0, daphne-3.0.2 and hypercorn-0.11.

  • uvicorn: closes all connections before lifespan.shutdown;
  • daphne: does not support lifespan at all;
  • hypercorn: leave all connections opened (looks like what you need).

I assume there can be issues with Unit's ASGI implementation, so could you please describe the solution which works for other ASGI server(s).

@krburkhart
Copy link
Author

Based on above, I believe uvicorn follows the spec properly (and would fix my issue). Hypercorn and Unit do not.

Sent to the application when the server has stopped accepting connections and closed all active connections.

The closing of connections clause above implies that the server will actively close connections, not passively wait for them to close.

My issue is that the Unit server does not close the sockets and my application has no way of knowing when we enter this state, so my application cannot close them. Any clients connected to the wanting-to-die instance will stay connected and prevent the process from shutting down.

I suggest that Unit either should adopt uvicorn's lifespan behavior, or provide an alternate facility for application code to know when the server wants to shut down.

-K

@mar0x
Copy link
Contributor

mar0x commented Sep 14, 2021

To reproduce your issue should I stop Unit or reconfigure application?

@krburkhart
Copy link
Author

Reconfigure application. What I am doing is:

  • deploy new code
  • set up alternate listener for new code, verify working
  • move default listener to new code

This should cause old unit instances to shut down. Reconfigure with ENV change or similar should do the same thing.

Thanks,

-K

@krburkhart
Copy link
Author

I am using the channels chat example. After reconfigure, the old processes are still running. If I refresh my browser, causing websocket to disconnect/reconnect, the old unit processes immediately exit.

@mar0x
Copy link
Contributor

mar0x commented Sep 15, 2021

Correct. Unit should keep the application process running while WebSocket connections served by the process are opened.

Unfortunately, there is a bug in Unit code, which makes established WebSocket connections hang if the listener removed. Please try this patch to fix the issue.

@krburkhart
Copy link
Author

That did not work. I am running 1.22.0. Let me know if I need to patch a newer version or HEAD. I have your example running:

kelly@nkcdev11:~/download/nginx/unit-1.22.0.patched $ curl -X GET http://localhost:8000/config
{
"listeners": {
"*:8086": {
"pass": "applications/asgi_chat"
}
},

"applications": {
	"asgi_chat": {
		"environment": {
			"ver": "3"
		},

		"module": "asgi_chat",
		"path": "/home/kelly/download/asgi",
		"processes": 1,
		"type": "python 3"
	}
},

"access_log": "/var/log/unit/access.log"

}

I hit this with a browser, thus establishing a web socket connection.
I re-PUT the config, with updated environment. This causes a new instance of asgi_chat to start, but the old version does not shut down.
If I refresh the web browser, then the old process shuts down.

I need the old version to shut down when I reconfigure. OR I need some way to know in my application code that reconfigure has happened so I can shut down web sockets. Otherwise old asgi_chat instances will never shut down.

-K

@mar0x
Copy link
Contributor

mar0x commented Sep 16, 2021

... old version does not shut down.
If I refresh the web browser, then the old process shuts down.

This is how it supposed to work. Unit keeps the process running to serve established WebSocket connections. The patch only fixes the functionality of these connections.

There is an "application restart" feature (see Process Management ) introduced in 1.25.0. During restart, WebSocket connections are closed.

@mar0x
Copy link
Contributor

mar0x commented Sep 16, 2021

I would suggest the following sequence for test and deployment using restart feature introduced in 1.25.0:

  • deploy new code;
  • set up exact copy of the application with"-test" suffix, bound it to "test" listener;
  • verify working on test port;
  • if does not work, re-deploy code, restart "test" app, go to verify stage;
  • restart the main app (should stop all processes, close WebSockets);
  • remove "-test" application and "test" listener.

@krburkhart
Copy link
Author

The new restart feature allows me to accomplish my goal. Thank you.

nginx-hg-mirror pushed a commit that referenced this issue Sep 20, 2021
Because the configuration values were read from the listener's configuration,
an established WebSocket connection was unable to work properly (i. e. stuck)
if the listener was removed.  The correct source of configuration values is the
request config joint.

This is related to issue #581 on GitHub.
@VBart
Copy link
Contributor

VBart commented Nov 25, 2021

@krburkhart the event, that would help to gracefully resolve this situation was discussed here: django/asgiref#199

@krburkhart
Copy link
Author

@VBart, I had not seen that, thanks.

The lifespan.shutdown.notice event is exactly what I would like to see. However it appears to have been vetoed. Disappointing for me, however I'm reluctant to squander goodwill by beating a dead horse. Your restart functionality with my shenanigans is sufficient for me to achieve my goals.

While not entirely in Unit's court, it would IMO be useful if the asgiref people tightened up the shutdown spec and uvicorn, daphne, hypercorn, unit implemented lifespan consistently.

Thanks,

-K

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants