Couldn't connect to server #72

Closed
rfc2822 opened this issue Nov 11, 2022 · 21 comments

@rfc2822
Member

rfc2822 commented Nov 11, 2022

Updating the calendars doesn't work regularly. Error message: "Couldn't connect to server."

Seems like the Internet connection is not ready yet when the work manager initiates the synchronization.

Similar to #24

@rfc2822 rfc2822 added the bug label Nov 11, 2022
@rfc2822
Member Author

rfc2822 commented Nov 12, 2022

Screenshot_20221112-151740_ICSx.png

Maybe related to square/okhttp#3974 ?
Maybe also square/okhttp#6611

@rfc2822 rfc2822 added this to the 2.0.4 milestone Dec 1, 2022
@rfc2822
Member Author

rfc2822 commented Dec 4, 2022


Unable to resolve host "zen.dev001.net": No address associated with hostname

java.net.UnknownHostException: Unable to resolve host "zen.dev001.net": No address associated with hostname
	at java.net.Inet6AddressImpl.lookupHostByName(Inet6AddressImpl.java:156)
	at java.net.Inet6AddressImpl.lookupAllHostAddr(Inet6AddressImpl.java:103)
	at java.net.InetAddress.getAllByName(InetAddress.java:1152)
	at okhttp3.Dns$Companion$DnsSystem.lookup(Dns.kt:49)
	at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.kt:169)
	at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.kt:132)
	at okhttp3.internal.connection.RouteSelector.next(RouteSelector.kt:74)
	at okhttp3.internal.connection.RealRoutePlanner.planConnect(RealRoutePlanner.kt:147)
	at okhttp3.internal.connection.RealRoutePlanner.plan(RealRoutePlanner.kt:67)
	at okhttp3.internal.connection.SequentialExchangeFinder.find(SequentialExchangeFinder.kt:30)
	at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:267)
	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:84)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:65)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:205)
	at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:533)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:637)
	at java.lang.Thread.run(Thread.java:1012)
Caused by: android.system.GaiException: android_getaddrinfo failed: EAI_NODATA (No address associated with hostname)
	at libcore.io.Linux.android_getaddrinfo(Native Method)
	at libcore.io.ForwardingOs.android_getaddrinfo(ForwardingOs.java:133)
	at libcore.io.BlockGuardOs.android_getaddrinfo(BlockGuardOs.java:222)
	at libcore.io.ForwardingOs.android_getaddrinfo(ForwardingOs.java:133)
	at java.net.Inet6AddressImpl.lookupHostByName(Inet6AddressImpl.java:135)
	... 23 more

@devvv4ever devvv4ever modified the milestones: 2.0.4, 2.1 Dec 4, 2022
@ArnyminerZ
Member

Well, if this is thrown when refreshing from the UI, I think it makes sense, since we are calling the SyncWorker with the force parameter set to true:

This makes the worker omit the NetworkType.CONNECTED constraint, so it doesn't check whether the network is available, does it?

@devvv4ever
Member

devvv4ever commented Dec 5, 2022

Actually it's the other way around :-/ This error mostly occurs when a sync is done in the background (manually refreshing via the UI works) and the network is not ready yet. So the question is why the system reports the state NetworkType.CONNECTED while ICSx5 still can't resolve the host at that time.

EDIT: Ah I think I understand what you mean now. Maybe this has an effect on this. Let's wait for @rfc2822 to look at your findings!

@ArnyminerZ
Member

Android is horrible at checking whether it has an Internet connection. When the signal strength is low, or when the mobile network repeater reports a wrong connectivity speed (e.g. reporting LTE with E speeds), it just breaks and doesn't know whether the Internet works or not.

In any case, I don't think that should cause major issues, only in some specific cases, so I don't think that's the issue here.

@ArnyminerZ
Member

I have been thinking a bit about this. Since we check that the calendar exists before adding it, we can assume that the server is not going to be deleted.

Then, in the worker:

    @SuppressLint("Recycle")
    override suspend fun doWork(): Result {
        applicationContext.contentResolver.acquireContentProviderClient(CalendarContract.AUTHORITY)?.let { providerClient ->
            try {
                return withContext(Dispatchers.Default) {
                    performSync(AppAccount.get(applicationContext), providerClient)
                }
            } finally {
                providerClient.closeCompat()
            }
        }
        return Result.failure()
    }

We catch UnknownHostException and, if it's thrown, return Result.retry(). Something like this:

    override suspend fun doWork(): Result {
        val forceResync = inputData.getBoolean(FORCE_RESYNC, false)
        var serverUnreachable = false
        applicationContext.contentResolver.acquireContentProviderClient(CalendarContract.AUTHORITY)?.let { providerClient ->
            try {
                return withContext(Dispatchers.Default) {
                    performSync(AppAccount.get(applicationContext), providerClient, forceResync)
                }
            } catch (e: UnknownHostException) {
                Log.w(Constants.TAG, "Could not reach server. Trying again later.", e)
                serverUnreachable = true
            } finally {
                providerClient.closeCompat()
            }
        }
        return if (serverUnreachable)
            Result.retry()
        else
            Result.failure()
    }

I think WorkManager checks all the constraints again. If not, we can add a backoff policy. And maybe that helps.

@rfc2822
Member Author

rfc2822 commented Dec 11, 2022

I also think that we should try a defined number of times (regardless of the actual I/O exception) before reporting the error.
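
A minimal sketch of treating every I/O failure as retryable, rather than only UnknownHostException. `SyncOutcome` and `runSync` are hypothetical names made up for this illustration and are not part of ICSx5 or WorkManager. The sketch only relies on the fact that both UnknownHostException and SocketTimeoutException are subclasses of java.io.IOException.

```kotlin
import java.io.IOException
import java.net.SocketTimeoutException
import java.net.UnknownHostException

// Hypothetical stand-in for the worker result; not the WorkManager Result class.
enum class SyncOutcome { SUCCESS, RETRY, FAILURE }

// Runs the given sync step and maps what happened to an outcome:
// any IOException (DNS failure, timeout, ...) asks for a retry,
// any other exception is treated as a permanent failure.
fun runSync(step: () -> Unit): SyncOutcome =
    try {
        step()
        SyncOutcome.SUCCESS
    } catch (e: IOException) {
        SyncOutcome.RETRY
    } catch (e: Exception) {
        SyncOutcome.FAILURE
    }

fun main() {
    println(runSync { })
    println(runSync { throw UnknownHostException("example") })
    println(runSync { throw SocketTimeoutException("example") })
    println(runSync { throw IllegalStateException("example") })
}
```

Catching the common supertype keeps the retry path independent of which network layer happened to fail first.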

However, I wonder what the real cause of the problem is (no Internet connection although WorkManager says there is one) and why I can't find anything about it "on the Internet". Maybe it's only related to okhttp?

@ArnyminerZ
Member

Well, it's a DNS error. It can't find an IP address associated with that domain. What comes to my mind is a bad Internet connection, so that when WorkManager checks for connectivity, it's available, but a moment later, when the actual code runs, that connection is no longer available.
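
To reproduce the DNS step in isolation, a small JVM-only check like the following could be used. `canResolve` is a hypothetical helper written for this illustration, not code from ICSx5. It calls the same InetAddress.getAllByName lookup that appears in the stack trace, and a successful lookup does not guarantee that a request made a moment later will succeed.

```kotlin
import java.net.InetAddress
import java.net.UnknownHostException

// Hypothetical helper: returns true if the system resolver currently
// returns at least one address for the host, false on UnknownHostException.
fun canResolve(host: String): Boolean =
    try {
        InetAddress.getAllByName(host).isNotEmpty()
    } catch (e: UnknownHostException) {
        false
    }

fun main() {
    println(canResolve("localhost"))
    // The .invalid top-level domain is reserved and should never resolve.
    println(canResolve("nonexistent.invalid"))
}
```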

@ArnyminerZ
Member

> I also think that we should try a defined number of times (regardless of the actual I/O exception) before reporting the error.

We can use an extra data key for this, incrementing it and returning Result.retry() until it reaches 5 attempts, for example, and then return Result.failure().

@ArnyminerZ
Member

Just found out there's a thing called runAttemptCount for the worker. We can just use that 😄

@ArnyminerZ
Member

I've committed a94e7ad, which adds a constant for the maximum number of attempts:

    /**
     * The maximum number of attempts to make until considering the server as "unreachable".
     * @since 20221212
     */
    const val MAX_ATTEMPTS = 5

And then, when the job fails, we return a failure or a retry depending on this count:

        }
        return if (runAttemptCount >= MAX_ATTEMPTS)
            Result.failure()
        else
            Result.retry()
    }
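
The effect of that check can be traced with a plain Kotlin model. `Decision` and `decide` are hypothetical names for this sketch, and it assumes that runAttemptCount is 0 for the first run of a work request, which is worth verifying against the WorkManager documentation.

```kotlin
// Same value as the MAX_ATTEMPTS constant quoted above.
const val MAX_ATTEMPTS = 5

// Hypothetical stand-in for the two worker results used in the fragment.
enum class Decision { RETRY, FAILURE }

// Mirrors the `runAttemptCount >= MAX_ATTEMPTS` check from the fragment.
fun decide(runAttemptCount: Int): Decision =
    if (runAttemptCount >= MAX_ATTEMPTS) Decision.FAILURE else Decision.RETRY

fun main() {
    // Assumption: runAttemptCount is 0 for the first run of a work request.
    for (attempt in 0..MAX_ATTEMPTS) {
        println("runAttemptCount=$attempt -> ${decide(attempt)}")
    }
}
```

Under that assumption, runs with counts 0 to 4 ask for a retry and the run with count 5 reports the failure, so the sync may run up to six times before the error is shown.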

However, even though this might fix the issue, we should still add some kind of backoff policy.

@rfc2822
Member Author

rfc2822 commented Dec 12, 2022

> Well, it's a DNS error. It can't find an IP address associated with that domain. What comes to my mind is a bad Internet connection, so that when WorkManager checks for connectivity, it's available, but a moment later, when the actual code runs, that connection is no longer available.

The other exception is a timeout. I guess the first TCP packet is sent while there's still no real connection, and then it waits and times out. However, I wonder why there are not many reports about that on StackOverflow etc., because almost every app I can imagine that uses WorkManager will require Internet for its work, so shouldn't this be a very big topic? And I couldn't find anything about it.

@ArnyminerZ
Member

> The other exception is a timeout. I guess the first TCP packet is sent while there's still no real connection, and then it waits and times out. However, I wonder why there are not many reports about that on StackOverflow etc., because almost every app I can imagine that uses WorkManager will require Internet for its work, so shouldn't this be a very big topic? And I couldn't find anything about it.

I've seen that most of the examples out there add an initial delay. Maybe that gives the network time to finish coming up, so people don't even notice. However, Google should be aware of this, and I couldn't find any related issues on their issue tracker.

@ArnyminerZ
Member

I'd add the retry logic, some backoff policy, and a short initial delay (maybe 10-20 seconds), and check if the issue is gone. Otherwise we can open an issue at Google and check whether they have any more info.
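
To get a feel for how an initial delay and an exponential backoff combine, here is a rough timeline calculation in plain Kotlin. All values and names are assumptions for illustration and are not taken from ICSx5 or #83. The real start times under WorkManager also depend on constraints and system scheduling.

```kotlin
// Assumed values for illustration only; not taken from ICSx5 or PR #83.
const val INITIAL_DELAY_SECONDS = 20L
const val BACKOFF_BASE_SECONDS = 30L
const val MAX_ATTEMPTS = 5

// Exponential backoff: the wait before retry number `retry` (1-based)
// doubles each time, starting at the base delay.
fun backoffSeconds(retry: Int): Long {
    require(retry >= 1) { "retry must be >= 1" }
    return BACKOFF_BASE_SECONDS shl (retry - 1)
}

// Earliest start time (seconds after enqueueing) of each run,
// ignoring the time the run itself takes.
fun schedule(): List<Long> {
    val starts = mutableListOf(INITIAL_DELAY_SECONDS)
    for (retry in 1 until MAX_ATTEMPTS) {
        starts += starts.last() + backoffSeconds(retry)
    }
    return starts
}

fun main() {
    println(schedule()) // prints [20, 50, 110, 230, 470]
}
```

With these assumed numbers, five runs are spread over roughly eight minutes, which should be long enough for a network that is only briefly unavailable after the constraint is met.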

@ArnyminerZ
Member

All the suggested changes are implemented in #83

@rfc2822
Member Author

rfc2822 commented Dec 14, 2022

Thanks! I'd still want to know what's the cause of the problem instead of only working around it but maybe we will never know…

@ArnyminerZ
Member

> Thanks! I'd still want to know what's the cause of the problem instead of only working around it but maybe we will never know…

Yeah... It's hard to know without further inspection of the internals of WorkManager.

@rfc2822
Member Author

rfc2822 commented Dec 14, 2022

@devvv4ever
Member

devvv4ever commented Jan 9, 2023

Another report for this came in, but I think no additional information can be gathered from the error details (basically the same as above):

(S21 Ultra, Android 13)

Unable to resolve host "status.gwdg.de": No address associated with hostname

java.net.UnknownHostException: Unable to resolve host "status.gwdg.de": No address associated with hostname
	at java.net.Inet6AddressImpl.lookupHostByName(Inet6AddressImpl.java:156)
	at java.net.Inet6AddressImpl.lookupAllHostAddr(Inet6AddressImpl.java:103)
	at java.net.InetAddress.getAllByName(InetAddress.java:1152)
	at okhttp3.Dns$Companion$DnsSystem.lookup(Dns.kt:49)
	at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.kt:169)
	at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.kt:132)
	at okhttp3.internal.connection.RouteSelector.next(RouteSelector.kt:74)
	at okhttp3.internal.connection.RealRoutePlanner.planConnect(RealRoutePlanner.kt:147)
	at okhttp3.internal.connection.RealRoutePlanner.plan(RealRoutePlanner.kt:67)
	at okhttp3.internal.connection.SequentialExchangeFinder.find(SequentialExchangeFinder.kt:30)
	at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:267)
	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:84)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:65)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
	at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:205)
	at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:533)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:637)
	at java.lang.Thread.run(Thread.java:1012)
Caused by: android.system.GaiException: android_getaddrinfo failed: EAI_NODATA (No address associated with hostname)
	at libcore.io.Linux.android_getaddrinfo(Native Method)
	at libcore.io.ForwardingOs.android_getaddrinfo(ForwardingOs.java:133)
	at libcore.io.BlockGuardOs.android_getaddrinfo(BlockGuardOs.java:222)
	at libcore.io.ForwardingOs.android_getaddrinfo(ForwardingOs.java:133)
	at java.net.Inet6AddressImpl.lookupHostByName(Inet6AddressImpl.java:135)
	... 23 more

@rfc2822 rfc2822 modified the milestones: 2.1, 2.2 Feb 11, 2023
@rfc2822
Member Author

rfc2822 commented Jul 28, 2023

BTW is this still being reported @devvv4ever ? I haven't had the problem myself for a long time now… maybe Samsung has fixed it?

@devvv4ever
Member

No reports like this for quite a few months! We can re-open if necessary.

@rfc2822 rfc2822 closed this as not planned Jul 28, 2023