Excessively long timeout on stream reconnect #696
Comments
Gosh, I remember the day (long ago) that I set it to 30 minutes, but I don't remember why I chose that value. I agree, it seems excessively long. I will circulate the idea of reducing it to 30 seconds and see if anyone objects. I suppose ideally we'd use an even shorter retry time on platforms where battery life or data charges are known not to be an issue.
@sixolet noticed that HTML5 has navigator.onLine which lets us detect good times to reconnect! Cool demo: One direction here would be to keep the max timeout relatively long (still probably not 30 minutes) but force an immediate retry when we get the HTML5 event telling us that we've just reestablished connectivity.
Couple issues with navigator.onLine ... on mobile devices, it detects whether there's a mobile data or WiFi connection, but not the quality of the connection. I can be on the edge of the range of a WiFi router and appear to be online, but not actually be able to make connections. On the desktop, navigator.onLine is fairly useless... it detects if the user has manually put the browser into offline mode, but not whether the laptop has an active Internet connection at all. What makes doing retries at the same interval as the longpoll interval worse than longpolls? Wouldn't they involve about the same amount of network activity?
- Lower reconnect timeout to 5m
- Respond to the 'online' event to reconnect immediately, e.g. in case you switched from 3G to Wi-Fi
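Reconnecting on the 'online' event, as the commit above describes, might look roughly like the sketch below. `reconnectOnOnline` and `reconnectNow` are hypothetical names, not the actual stream_client code; in a browser the target would be `window`, and it is passed as a parameter here only so the sketch is not tied to one environment.

```javascript
// Calls `reconnectNow` whenever `target` fires an 'online' event.
// In a browser, `target` would be `window`. Both names are hypothetical.
function reconnectOnOnline(target, reconnectNow) {
  const handler = () => reconnectNow();
  target.addEventListener('online', handler);
  // Return a function that removes the listener again.
  return () => target.removeEventListener('online', handler);
}
```

The event only says that the device believes it has a network interface up, so the normal back-off retries would still be needed as a fallback.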
One thing I learned while doing this - navigator.onLine on desktop does detect whether you have a Wi-Fi connection open (at least on Chrome on Mac OS). The reason we need to make sure retries on reconnect are less frequent than long polling is that we might reach a disconnected state specifically because of a server load issue, in which case we must make sure to connect less frequently to enable the server to recover.
This is an improvement, but the issue is not resolved. Five minutes to reconnect (after the Internet connection has been poor enough that the browser was unable to connect to the server, without actually being fully offline and notified of that state) is still a miserable experience. The fix for server overload is to throttle connections at the server and return a 503 from the server when it is busy. I understand that you wouldn't want to resolve this issue by means of reducing the client timeout until you have something like that implemented on the server, but that just means that addressing your server architecture is a prerequisite for fixing this issue. A note for comparison is that in my informal testing with your competition the recovery time is usually around 30 seconds.
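The server-side throttling suggested above could be sketched roughly as follows, assuming a Node-style HTTP request handler. `limitConnections`, `maxActive`, and `retryAfterSeconds` are hypothetical names and not part of Meteor; this only illustrates the "return a 503 when busy" idea, not how Meteor's server works.

```javascript
// Wraps a Node-style (req, res) handler so that at most `maxActive`
// requests are in flight at once. Extra requests are rejected with
// 503 and a Retry-After hint instead of being queued.
// All names here are hypothetical.
function limitConnections(handler, maxActive, retryAfterSeconds) {
  let active = 0;
  return function (req, res) {
    if (active >= maxActive) {
      res.writeHead(503, { 'Retry-After': String(retryAfterSeconds) });
      res.end('Server busy');
      return;
    }
    active += 1;
    // Release the slot exactly once, whether the response finishes
    // normally or the connection closes early.
    let released = false;
    const release = () => {
      if (!released) {
        released = true;
        active -= 1;
      }
    };
    res.on('finish', release);
    res.on('close', release);
    handler(req, res);
  };
}
```

With Node's built-in `http` module this could be used as `http.createServer(limitConnections(app, 1000, 30))`, where `app` and the numbers are placeholders. A client that honors the 503 could then afford a shorter maximum retry interval.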
Gmail has a similar timeout.
30 sec or 5 min?
Much longer.
I'm sorry, I'm not clear on what you're saying... it's OK to have a bad mobile experience on Meteor because gmail is also bad? :) Or that you intend to implement gmail's "Try Now" UI, which, while crappy, at least makes it possible for the user to reconnect? If you want to have a seamless "just works" experience without user intervention, 30 seconds is about as long as you can wait before the user starts to wonder why the application isn't working when they've walked back into range of the router and other sites are working. 30 seconds is also the recovery time I noticed when I was trying out some of the other real-time services, though my testing was informal. On the other hand, maybe it's not your intention to provide the same level of real-time service on *.meteor.com (for free :-)? And perhaps you'll have a sample "reconnect now" UI, similar to the accounts-ui package? Which is also fine, though to close out this issue I'd suggest documenting it somewhere, whether in the wiki or on the roadmap or wherever.
Oops, I think I may be guilty of projecting my own goals onto Meteor ^_^
All I am saying is that if we were to support 30 second timeouts we would have to make sure to do something intelligent on the server to make sure we don't overload our servers in certain failure cases. We might want to end up doing that, but we don't have that at the moment. GMail and Asana do similar things, for similar reasons as far as I can tell. I think a package that shows an overlay when you're disconnected, with a timer and a "reconnect now" button is a great idea. Would you like to build that? Alternatively, you're welcome to add this to your list of issues on mobile. I don't think such a thing should be on our roadmap as the roadmap is intended to be at a much higher level.
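The text for such an overlay could be derived from a connection status object, as in the rough sketch below. It assumes a status shaped like the one documented for `Meteor.status()` (`connected`, `status`, and `retryTime` as a millisecond timestamp); `overlayText` is a hypothetical helper, and the field names should be checked against the Meteor version in use.

```javascript
// Returns the message an overlay should show, or null when no overlay
// is needed. `status` is assumed to look like the object returned by
// Meteor.status(); `now` is the current time in milliseconds.
// `overlayText` is a hypothetical helper, not a Meteor API.
function overlayText(status, now) {
  if (status.connected) {
    return null;
  }
  if (status.status === 'waiting' && typeof status.retryTime === 'number') {
    // Round up so the countdown never shows 0s while still waiting.
    const seconds = Math.max(0, Math.ceil((status.retryTime - now) / 1000));
    return 'Disconnected. Retrying in ' + seconds + 's.';
  }
  if (status.status === 'connecting') {
    return 'Connecting...';
  }
  return 'Disconnected.';
}
```

A "reconnect now" button next to this text would presumably call `Meteor.reconnect()`, which is an assumption about the API rather than something stated in this thread.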
That'd be easy to write with When I said I was "guilty of projecting my own goals onto Meteor" what I meant was that personally for my own projects I would prefer something that was better than gmail along this particular dimension. But that's something I could buy if I wanted to: I could run my own server, or see if one of the commercial messaging services will do what I want. "As good as gmail" is a fine goal (awesome, actually) for a service which is a) free and b) an order of magnitude easier to implement with. Sorry for the rant! :)
stream_client uses an exponential back-off algorithm when attempting reconnects, maxing out at an interval of 30 minutes between reconnect attempts (!). This is a miserable experience on mobile devices with an intermittent Internet connection, as the Internet connection returns but the app doesn't attempt to reconnect for long periods.
For reconnects unassisted by a "retry now" UI, it's best if the reconnect attempt interval is no longer than 30 seconds for a reasonable experience. 30 seconds is actually a long time to wait if one has gotten back on the Internet and is waiting for the app to wake up again, but it's bearable... longer wait times get frustrating.
Of course, if the potential impact on the server cluster weren't an issue, it would be an easy fix: just change RETRY_MAX_TIMEOUT on the client to 30000 :-)
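For illustration, a capped exponential back-off of the kind described here might look like the sketch below. Only `RETRY_MAX_TIMEOUT` is named in this issue; `RETRY_BASE_TIMEOUT`, `RETRY_EXPONENT`, and `retryDelay` are assumed names and values, not the actual stream_client code.

```javascript
// Only RETRY_MAX_TIMEOUT appears in the issue; the other two constants
// and their values are assumptions made for this sketch.
const RETRY_BASE_TIMEOUT = 1000;  // assumed first delay, in ms
const RETRY_EXPONENT = 2;         // assumed growth factor per attempt
const RETRY_MAX_TIMEOUT = 30000;  // 30 seconds, the cap proposed above

// Delay in ms before reconnect attempt number `attempt` (0-based).
// Grows exponentially and is capped at RETRY_MAX_TIMEOUT.
function retryDelay(attempt) {
  const delay = RETRY_BASE_TIMEOUT * Math.pow(RETRY_EXPONENT, attempt);
  return Math.min(delay, RETRY_MAX_TIMEOUT);
}
```

With these assumed values the delays run 1s, 2s, 4s, 8s, 16s, and then stay at 30s, so a client that has been offline for a long time still retries within 30 seconds of the connection returning.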