Calling Presence.Track in a channel process changes the behavior of the channel when the server is stopped #68
It's not clear from your description why the second channel is erroneously closing. If it closes for a non-normal reason, you will necessarily receive error logs about the reason. Do you have any relevant logs for BChannel? We link to the tracker because our presence is necessarily tied to the tracker. If it goes down, we should also fail, and when we recover and the client reconnects, the presence will be re-tracked.
The channels are closing because the Application is being stopped. The issue is that they're closing with different reasons: AChannel exits with reason :shutdown, whereas BChannel exits with reason :killed. The way I reproduced this was:
Thanks for following up; that was very well detailed. If we are going to fix this, we will likely do it by making BChannel exit with a :shutdown reason too. Therefore, could you let us know why your application is being restarted?
For deployment I build releases using distillery. We're not doing hot upgrades, so to deploy I just untar the new release and run: This cleanly stops the application and then starts it with the new code. My interim solution to this problem is to just

Incidentally, the behavior I'd prefer is for AChannel to exit with :killed :) That way, after the restart, the client will reconnect to the channel. I can see why that doesn't make sense in a clean shutdown; I just wonder how many people already rely on that reconnect behavior and would be thrown off by the client not reconnecting.

Perhaps an alternate solution is to have the Presence process monitor the channel process instead of linking to it. Then if the channel process dies, the presence process can also kill itself, but if the presence process dies, the supervisor can restart it and it can monitor the channel process again?
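The monitor-based alternative suggested above could be sketched roughly like this. This is illustrative only — the module name and state shape are invented, and it is not how phoenix_pubsub is actually implemented:

```elixir
defmodule PresenceSketch do
  use GenServer

  def start_link(channel_pid), do: GenServer.start_link(__MODULE__, channel_pid)

  def init(channel_pid) do
    # A monitor is one-directional: if the channel dies we receive a
    # :DOWN message, but the channel is unaffected if this process dies.
    ref = Process.monitor(channel_pid)
    {:ok, %{channel: channel_pid, ref: ref}}
  end

  def handle_info({:DOWN, ref, :process, pid, _reason}, %{ref: ref, channel: pid} = state) do
    # The channel died: untrack it and stop this process normally,
    # without affecting the channel's own exit reason.
    {:stop, :normal, state}
  end
end
```

If this process crashed instead, a supervisor could restart it and re-establish the monitor, which is the recovery path described in the comment above.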
Maybe the answer is to force a client reconnect unless the channel exits with a normal reason.
…:left} and {:shutdown, :closed}. Fixes phoenixframework/phoenix_pubsub#68
I put together what I think is the least invasive approach that makes sense. You can see the change here: hanrelan/phoenix@2a3b38e. Essentially I treat

Happy to submit this as a PR in a topic branch if you think this makes sense.
This should already happen today. If the channel exits ungracefully, the client will auto-rejoin with exponential backoff. @hanrelan, are you using the phoenix.js channels client?
I am using the phoenix.js channels client. I think Jose meant However,

The diff I sent before changes it so only explicitly closing the channel (via the client's

This keeps the behavior between processes that are linked to a presence and processes that aren't the same with respect to shutdowns and reconnect attempts. Happy to discuss on Slack as well if that's easier; I'm @hanrelan in the phoenix channel.
Ah I see. We could change the behavior of the phx_close event to differentiate
What should happen in that case? We won't be able to determine the difference between this graceful exit and a "graceful restart" where the client should re-establish a connection.
@chrismccord I think we should consider it the same as a link, which means that we will break on everything that is not
So I attempted a fix for this, and we use
Yes, that is my concern, that
Another thought — though you are obviously much more familiar with this than I am, so feel free to ignore this if it makes no sense. The other case to deal with is the Tracker process erroring out for some reason: you don't want presence to silently stop working for a channel process. My understanding is fuzzier here, but maybe this could be resolved by using a
@hanrelan the tracker traps exits exactly because we don't want the tracker to crash if a "client" crashes. But we need to link to solve your last paragraph; otherwise there is no way for the "presence client" to be brought down if the presence server crashes. The client could monitor and raise, but links are more appropriate here. I believe @chrismccord is working on a solution though. :)
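For readers less familiar with OTP, the distinction being made here can be shown with plain processes (an illustrative snippet, not tracker code): a linked process that traps exits survives a peer's crash and receives it as a message, while without trap_exit the crash would propagate and kill it.

```elixir
parent = self()

tracker =
  spawn(fn ->
    # Trapping exits converts a linked process's crash into a message.
    Process.flag(:trap_exit, true)
    send(parent, :ready)

    receive do
      {:EXIT, _client, reason} -> send(parent, {:client_exited, reason})
    end
  end)

# Wait until the tracker has set the flag before linking to it.
receive do
  :ready -> :ok
end

spawn(fn ->
  Process.link(tracker)
  exit(:boom)
end)

# The tracker keeps running and reports {:client_exited, :boom};
# without trap_exit, the :boom exit would have killed it too.
```

The link is still bidirectional, which is why the tracked channel is taken down if the tracker itself dies — the behavior the thread is discussing.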
Ah, got it — I misunderstood the purpose of the link. Thanks for the explanation and the quick response to this issue, @chrismccord and @josevalim!
ref: phoenixframework/phoenix_pubsub#68 Previously, a channel could gracefully terminate using the stop semantics of a regular GenServer; however, when restarting an application for deploys, the shutdown of the transport and channel processes would be indistinguishable from an intentional channel shutdown, causing clients to incorrectly not reconnect after a server restart. This commit adds a {:graceful_exit, channel_pid, %Phoenix.Socket.Message{}} contract to distinguish an intentional channel exit from what should be regarded as an error condition on the client.
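A transport consuming such a contract might pattern-match on the tuple roughly as below. This is a sketch, not the actual Phoenix transport code; the helper functions (`delete_channel/2`, `encode/1`, `error_message/2`) are hypothetical:

```elixir
# Intentional channel exit: the channel sent
# {:graceful_exit, pid, %Phoenix.Socket.Message{}}, so relay a close
# event; the client will treat it as final and not retry.
def handle_info({:graceful_exit, channel_pid, %Phoenix.Socket.Message{} = close_msg}, state) do
  state = delete_channel(state, channel_pid)
  {:reply, encode(close_msg), state}
end

# Any other exit (crash, app shutdown): relay an error event,
# prompting the client to rejoin with backoff.
def handle_info({:EXIT, channel_pid, _reason}, state) do
  {:reply, encode(error_message(state, channel_pid)), state}
end
```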
This has been fixed on phoenix master and will ship with the 1.3 release. Check the PR commit for details, but the tl;dr is: your app restarts will be seen as errors by the client, and it will reconnect. Cheers!
If you have two channels, call them AChannel and BChannel, defined as:
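The original channel definitions were not captured in this copy of the thread. A minimal pair consistent with the description might look like the following — module names, topics, and the presence key are assumptions; the only essential difference is that BChannel calls Presence.track/3 after joining:

```elixir
defmodule MyApp.AChannel do
  use Phoenix.Channel

  # Plain channel: no presence tracking, no link to the tracker.
  def join("a:lobby", _params, socket) do
    {:ok, socket}
  end
end

defmodule MyApp.BChannel do
  use Phoenix.Channel

  def join("b:lobby", _params, socket) do
    send(self(), :after_join)
    {:ok, socket}
  end

  def handle_info(:after_join, socket) do
    # Presence.track/3 links the channel process to the tracker,
    # which is what changes the channel's exit reason on shutdown.
    {:ok, _ref} = MyApp.Presence.track(socket, "some_key", %{})
    {:noreply, socket}
  end
end
```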
And you deploy this server in a release, then stopping or restarting the release (e.g. bin/myapp restart) has subtly different behavior:

In the AChannel case, when the server is restarted, the client will receive an onClose event.

In the BChannel case, when the server is restarted, the client will receive an onError event!

This results in the client for AChannel not attempting to reconnect to the channel when the server comes back up, but the client in BChannel will reconnect.
I believe this is due to this line:
https://github.com/phoenixframework/phoenix_pubsub/blob/master/lib/phoenix/tracker.ex#L402
I think this causes the process in AChannel to be shut down cleanly when the supervision trees are brought down, so it sends a close message, whereas people are probably expecting the behavior exhibited by BChannel.
Not sure what the fix is here; setting trap_exit on the channel process seems like overkill. Perhaps the Tracker should only set up a monitor?