Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce connectivity check #3947

Open
tobiasKaminsky opened this issue May 3, 2019 · 11 comments
Open

Reduce connectivity check #3947

tobiasKaminsky opened this issue May 3, 2019 · 11 comments
Assignees
Labels
connectivity DNS, TLS, proxies, network connection, etc. related matter enhancement high technical debt

Comments

@tobiasKaminsky
Copy link
Member

Currently we use connectivity check very often, but e.g. if the data connection did not changed, we do not have to check again.

Also it was reported that some users fire 6x status.php checks in a second:

99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:06 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:06 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:06 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:07 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:50 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:50 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:22 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:22 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:22 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:22 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:53 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:53 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:53 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
@ezaquarii
Copy link
Collaborator

I have a refactoring of this ready on my branch somewhere.

@ezaquarii
Copy link
Collaborator

ezaquarii commented May 12, 2019

The main challenge is that the check uses synchronous getter that is called liberally all over the place. To fix this problem we'd probably need to migrate to a newer, event-based network status API.

Synchronous getters are marked as @Deprecated BTW.

PR #4008 should lay a foundation for this.

@tobiasKaminsky
Copy link
Member Author

a newer, event-based network status API.

That would be a good idea.
Do I understand this correct: every time we get a notification about a network change we try to reach server and store this info.
We assume that the info is valid until a network change occurs again?

@ezaquarii
Copy link
Collaborator

ezaquarii commented May 13, 2019 via email

@ezaquarii ezaquarii self-assigned this Aug 5, 2019
@ezaquarii
Copy link
Collaborator

I started working on an event based network listener.

@ezaquarii
Copy link
Collaborator

I started working on it, but I need to consult you before I move forward.

The check is called synchronously in few background operations. That means that whenever we run a remote op:

  1. pro: we have a current state of network and server connectivity
  2. con: this is rather expensive

My initial simplistic approach was to simply wait for platform network status updates, but I see that this check is also verifying server availability by GET-ting status.php. Generally this looks for me like re-invented 503 Service Unavailable HTTP code. I'll assume there was a reason for this.

I'm trying to understand:

  1. why we do this check in the first place
  2. what value it brings for the user to have it
  3. what is the UI/UX implication of it or lack of it?
  4. can we drop it from mobile client?

The issues I see with it right now:

  1. it is unreliable, as it can change between network ops (upstream network outages, admin actions, etc)
  2. moving from polled to event-based API will make this check event more unreliable
  3. polling status.php just after network change is unreliable as many public networks blocks connections until user accepts TOS (walled garden) or grant you access to the world few seconds after device registers itself (ex. by updating firewall rules), so I expect a lot of edge issues around it

My proposed solution:

  1. use platform API to listen for network changes
  2. assume the server is available if network is available
  3. ops fail gracefully if it's not reachable (including "maintenance mode")

I'm not sure if our UX is ready for 3.

@tobiasKaminsky @mario @AndyScherzinger Can you brainstorm this?

@mario
Copy link
Contributor

mario commented Aug 14, 2019

We poke status.php to see if we're behind a walled garden on not pretty much.

@tobiasKaminsky
Copy link
Member Author

We use status.php for old (<NC13) or /index.php/204 for new ones to check if we can access the server.

My proposed solution:

  1. use platform API to listen for network changes
  2. assume the server is available if network is available
  3. ops fail gracefully if it's not reachable (including "maintenance mode")

In 2. we will get problems if user is connected to wifi, but not to internet (e.g. walled garden, or simply a local network)

  1. we do support "server offline", "server in maintenance mode", both in uploads and if you browse files.

Sorry, if I did not get your point, but why cannot we use event-based check:
If platform api notifies us about a network change, we test with status.php/204 once and use the result of this until a new network change is detected

This way we do not have calls that often?
Also we could re-trigger uploads if we detect a switch from "no connection to server" to "server is available".

@mario
Copy link
Contributor

mario commented Aug 15, 2019

@tobiasKaminsky this approach will not work, reason being that network change will not be trigger if for example your "walled garden" session timeouts.

@ezaquarii
Copy link
Collaborator

ezaquarii commented Aug 15, 2019

In 2. we will get problems if user is connected to wifi, but not to internet (e.g. walled garden, or simply a local network)

What kind of issues do we expect? I guess that the op will simply fail to be re-scheduled during next sync window.

In comparision, what happens if we're in walled garden? If I understand the code correctly, op is aborted and re-scheduled.

It looks for me like it's the same thing, except that we make 1 redundant call to status.php on happy path:

Here is the pseudo-code of what I believe is happening:

isOnline = checkStatus()
if !isOnline {
    reshedule()
    return
}
success = sync()
if !success {
    reschedule()
} else {
   showUpdatesOnScreen()
}

I think we could do it a bit simpler:

success = sync()
if !success {
    reschedule()
} else {
   showUpdates()
}
  1. we do support "server offline", "server in maintenance mode", both in uploads and if you browse files.

I'm trying to understand what this means for the user.

Is there any existing UX that depends on this status that we're at risk of breaking?

This is exactly what I'm trying to avoid here - ie. UX regressions - but also I'm trying to figure out how much space for change we have.

Sorry, if I did not get your point, but why cannot we use event-based check:
If platform api notifies us about a network change, we test with status.php/204 once and use the result of this until a new network change is detected

Internet access might depend on other factors, such as acceptance of terms-of-service or delayed firewall updates In such case:

  1. device connects to temporarily walled network
  2. app checks status.php and caches "not accessible" state
  3. user accepts TOS, firewall updates, etc
  4. internet access is granted
  5. app is not aware of it

To mitigate 5. we'd need to poll the status again, which means we're back at square 1 or we risk random complaints of "doesn't-work-at-my-workplace-but-works-in-train"-type.

This way we do not have calls that often?

As you see above, we'd still need to call it often. We can throttle it of course, but:

  1. it adds extra complexity (let's think about it from QA perspective for a moment)
  2. is still not reliable, with occasional hiccups that are tricky to reproduce

Also we could re-trigger uploads if we detect a switch from "no connection to server" to "server is available".

This would not be reliable because "having network" and "having clear connection to server" are separate concepts and the latter one ("server") is not signalled asynchronously. We need to perform expensive polling again. Making polling less frequent makes it less and less reliable.

I'd honestly rely on other mechanisms to handle that and treat "walled garden" simply as yet another random network issue. If app can recover from that gracefully, status.php would not be needed for rescheduling.

But I don;t know what UX relies on that status.php. If this is only sync restart, than I'd say we're safe to explore alternatives.

@tobiasKaminsky
Copy link
Member Author

Thanks for your very detailed answer!

  1. we do support "server offline", "server in maintenance mode", both in uploads and if you browse files.

I'm trying to understand what this means for the user.

Is there any existing UX that depends on this status that we're at risk of breaking?

On FDA we check result of sync and show matching info:

switch (synchResult.getCode()) {
case SSL_RECOVERABLE_PEER_UNVERIFIED:
showUntrustedCertDialog(synchResult);
break;
case MAINTENANCE_MODE:
showInfoBox(R.string.maintenance_mode);
break;
case NO_NETWORK_CONNECTION:
showInfoBox(R.string.offline_mode);
break;
case HOST_NOT_AVAILABLE:
showInfoBox(R.string.host_not_available);
break;
default:
// nothing to do
break;

Thanks for explaining why my idea is not working :-)

I'd honestly rely on other mechanisms to handle that and treat "walled garden" simply as yet another random network issue. If app can recover from that gracefully, status.php would not be needed for rescheduling.

How could we recover gracefully from walled garden / no internet connection without polling to status.php/204?

What about this

  • we do not check for status.php/204 on uploads/calls
  • if we detect that we have no internet connection/server offline/server maintenance mode
    • we set matching UI info
    • we do exponential polling
    • if we have server connection again, we restart pending uplodas

@joshtrichards joshtrichards added technical debt connectivity DNS, TLS, proxies, network connection, etc. related matter labels Oct 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
connectivity DNS, TLS, proxies, network connection, etc. related matter enhancement high technical debt
Projects
None yet
Development

No branches or pull requests

4 participants