Desktop and mobile apps lose connection to server #3449
Comments
We got some log files and the last line on the worker log shows a "failed request", PUT, |
Turns out it is related to the sleep mode. Steps to reproduce:
This issue is also occurring on Windows devices and mobile apps.
I struggle to reproduce it on macOS, but I was able to reproduce it on Linux with sleep mode. I would suggest detecting sleep mode with a timer and forcibly reconnecting the WebSocket. The WebSocket spec defines pings and pongs for keeping a connection alive, but somehow it takes a long time for it to recover.
The issue seems to be caused by how Chromium handles device suspension (or rather, doesn't). After the device resumes, Chromium does not detect that the connection might have been broken until the ping interval expires, which is minutes for us. During this time the WebSocket is in a peculiar state: it thinks it is connected, but it has missed events that the server sent and the client never acknowledged. Once the ping interval elapses, the server re-sends all the events the client missed, which is ideally what we want, but this seems too slow to be reliable. In the meantime users don't see the new state, and the results of their actions are not displayed.

To address this we introduced SleepDetector. It schedules a recurring task and measures the time between invocations. Our time period is hopefully loose enough that slight throttling by browsers won't affect it. Once we notice that more time than expected has passed, we force a reconnect. Alternatively, we could send an event over the socket to force re-synchronization, but that might not produce the result we want; restarting the connection seems to be the most reliable method.
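For readers who want to see the timer idea concretely, here is a minimal sketch of that kind of detector. It is not the project's actual SleepDetector: the class name, the interval and threshold values, and the injectable clock and scheduler are all assumptions made for illustration.

```typescript
// Hypothetical sketch: all names and numbers below are assumptions,
// not taken from the project's real SleepDetector.
type Scheduler = (task: () => void, intervalMs: number) => () => void

const CHECK_INTERVAL_MS = 5_000 // assumed: how often the recurring task runs
const SLEEP_THRESHOLD_MS = 15_000 // assumed: gap that counts as "we were suspended"

class SleepDetectorSketch {
	private readonly now: () => number
	private readonly schedule: Scheduler
	private lastTick: number | null = null
	private cancel: (() => void) | null = null

	constructor(
		now: () => number = () => Date.now(),
		schedule: Scheduler = (task, intervalMs) => {
			const id = setInterval(task, intervalMs)
			return () => clearInterval(id)
		},
	) {
		this.now = now
		this.schedule = schedule
	}

	/** Starts the recurring check; onSleep runs when a suspiciously long gap is seen. */
	start(onSleep: () => void): void {
		this.stop()
		this.lastTick = this.now()
		this.cancel = this.schedule(() => this.check(onSleep), CHECK_INTERVAL_MS)
	}

	/** Stops the recurring check and forgets the last tick. */
	stop(): void {
		this.cancel?.()
		this.cancel = null
		this.lastTick = null
	}

	private check(onSleep: () => void): void {
		const current = this.now()
		const elapsed = current - (this.lastTick ?? current)
		this.lastTick = current
		// A gap far above the interval suggests the timer was frozen by suspension.
		// The threshold is deliberately loose so mild browser throttling does not trigger it.
		if (elapsed > SLEEP_THRESHOLD_MS) {
			onSleep()
		}
	}
}
```

A caller would pass a callback that closes and reopens the socket, for example `detector.start(() => reconnect())`, where `reconnect` is whatever function the client already uses to re-establish the connection. Injecting the clock and scheduler is only a convenience that lets the logic be exercised without real timers.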
Reconnect works, but it seems a little noisy with ERR_NOT_RESOLVED errors before it manages to get back on track. Just in case that's problematic: log
This does not occur every time; suspend during signup was fine.
I think the |
on windows, it doesn't show the |
Tested on iOS: no change in behavior, immediate reconnect.
We are getting reports from macOS desktop app users that the app stops getting updates and they are unable to move/delete mails.
App versions: at least 3.85.10 - 3.87.1
Possibly related to #2337
Test notes
Websocket Spec for regression