Apache HTTP Client often hangs forever (deadlocks) in async block #770

Closed
MartinHaeusler opened this issue Dec 2, 2018 · 10 comments

@MartinHaeusler

Ktor Version

1.0.0

Ktor Engine Used(client or server and name)

Apache HTTP Client

JVM Version, Operating System and Relevant Context

Oracle JDK 1.8.0_161-b12 (same issue on OpenJDK 8)
OS: Windows 10, 64bit

Feedback

When running the Apache HTTP client (through the ktor client API), the client sometimes just "freezes" and hangs indefinitely, leaving the application deadlocked. The effect is especially noticeable when calling the client from an async block.

Please consider the following self-contained JUnit test:

import io.ktor.client.HttpClient
import io.ktor.client.engine.apache.Apache
import io.ktor.client.engine.cio.CIO
import io.ktor.client.request.get
import kotlinx.coroutines.*
import org.junit.Test

class CoroutineTest {

    @Test
    fun testWithGlobalScope(){
        println("WITH GLOBAL SCOPE")
        runBlocking {
            runTest(GlobalScope)
        }
    }

    @Test
    fun testWithCoroutineScope(){
        println("WITH COROUTINE SCOPE")
        runBlocking {
            runTest(this)
        }
    }

    @Test
    fun testWithIndependentScope(){
        println("WITH BLOCK SCOPE")
        runBlocking {
            coroutineScope{
                runTest(this)
            }
        }
    }

    private suspend fun runTest(scope: CoroutineScope): Int{
        // launch 50 concurrent requests, each with its own client, then await them all
        var sum = 0
        val tasks = mutableListOf<Deferred<Int>>()
        for(i in 0 until 50){
            println("Launching async task #${i+1}")
            tasks += scope.async {
                val client = HttpClient(Apache)
                try{
                    log("HTTP client open. Performing GET")
                    val string = client.get<String>("https://www.google.com")
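                    // use a Regex (a plain String argument is matched literally) and take(20) to avoid failing on short bodies
                    log("GET result received: ${string.replace(Regex("\r?\n"), " ").take(20)}...")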
                }finally{
                    log("Closing HTTP client.")
                    client.close()
                    log("HTTP client closed.")
                }
                return@async 1
            }
        }
        for(i in 0 until tasks.size){
            sum += tasks[i].await()
            log("Task #${i + 1} completed.")
        }
        return sum
    }

    private fun log(message: String){
        println("[${Thread.currentThread().name}] ${message}")
    }
}

In particular, testWithGlobalScope (Apache HTTP, called from GlobalScope.async(...)) freezes at some point on almost every run. I observed that the target URL does make a difference, so perhaps it is a race condition that depends on the response time. The other two variants work fine for me.

As a side note, I have observed that in a test run where the HTTP client causes a freeze, the freeze always occurs after the n-th request, where n is the number of CPU cores on the machine. My guess is that Kotlin sizes its executor pools based on this number, and once the pools run out of usable threads, the remaining coroutines can no longer execute.
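
The pool-size guess can be checked against the machine running the test. The kotlinx.coroutines documentation describes the default dispatcher's parallelism as equal to the number of CPU cores, with a minimum of two. The following is a minimal sketch of that rule, assuming the default has not been overridden; expectedDefaultParallelism is a hypothetical helper written for illustration, not part of ktor or kotlinx.coroutines.

```kotlin
// Hypothetical helper (not a library API): mirrors the documented sizing rule
// for the default dispatcher, i.e. the number of CPU cores but at least two.
fun expectedDefaultParallelism(
    cores: Int = Runtime.getRuntime().availableProcessors()
): Int = maxOf(2, cores)

fun main() {
    val cores = Runtime.getRuntime().availableProcessors()
    println("CPU cores: $cores")
    println("Expected default dispatcher threads: ${expectedDefaultParallelism(cores)}")
}
```

If the request number at which the test freezes matches the second printed value, that is consistent with the thread-starvation guess above, though it does not prove it.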

Interestingly, if we replace HttpClient(Apache) by HttpClient(CIO), then things work like a charm. I would argue that this strongly points towards the Apache HTTP Client as the culprit of the deadlocks.

@MartinHaeusler
Author

Here's a thread dump of the deadlock situation with GlobalScope.async on a 12-core machine:
https://pastebin.com/Ur6fvs6X

After this point, the JVM is completely idle. JVisualVM reports no CPU activity on any thread whatsoever.

@e5l self-assigned this Dec 2, 2018
@e5l added the bug label Dec 2, 2018

@ps-feng

ps-feng commented Dec 19, 2018

I believe I've run into a similar issue, also while using the Apache engine. In my case I'm using the Gson serializer, and when it freezes it hangs within GsonSerializer.read(), in the response.readText() call. Internally, that calls ByteReadChannel.readRemaining(), which is where my tests hang.

@e5l
Member

e5l commented Feb 14, 2019

Hi @MartinHaeusler, thanks for the report.
We fixed the Apache engine in 1.1.2. Could you recheck?

@MartinHaeusler
Author

@e5l Thanks for the heads-up. I just ran the JUnit test again with Ktor 1.1.2, and unfortunately the issue is still present. It does not happen on every run, but on roughly every second run the test case testWithGlobalScope still hangs. It seems to be a race condition.

@e5l
Member

e5l commented Feb 14, 2019

Thanks for the report, I'll investigate and report the details.

@e5l
Member

e5l commented Feb 18, 2019

It looks like we are hanging in the ByteChannel. I'm trying to reproduce it without ktor-client.

@e5l
Member

e5l commented Feb 27, 2019

Fixed in master.

@e5l closed this as completed Feb 27, 2019

@MartinHaeusler
Author

Forgive me for being curious, but was the issue in the Apache HTTP client itself, or in the Ktor client wrapper?

@e5l
Member

e5l commented Feb 27, 2019

The issue was in the kotlinx.coroutines ExperimentalDispatcher.close() method.
When we try to close the dispatcher from an external thread, it kills the closing thread.

So a single client.close() call kills a single thread from the DefaultDispatcher.
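
This explanation fits the earlier observation that the hang appears after the n-th request on an n-core machine: if every close costs one pool thread, the pool is empty after n closes. The sketch below is a toy model of that effect using only the JDK. It is not the kotlinx.coroutines scheduler and not the actual fix; simulateLeakyPool and its parameters are hypothetical names chosen for illustration.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger
import kotlin.concurrent.thread

// Hypothetical toy model, NOT the real dispatcher: a fixed set of worker
// threads in which every task takes its worker down with it, the way each
// client.close() is described as killing one dispatcher thread.
fun simulateLeakyPool(workers: Int, tasks: Int): Int {
    val queue = LinkedBlockingQueue<() -> Unit>()
    val completed = AtomicInteger()
    // No more than min(workers, tasks) tasks can ever run in this model.
    val finished = CountDownLatch(minOf(workers, tasks))

    repeat(tasks) {
        val task: () -> Unit = {
            completed.incrementAndGet()
            finished.countDown()
        }
        queue.put(task)
    }

    repeat(workers) {
        thread(isDaemon = true) {
            val task = queue.take() // blocks while the queue is empty
            task()
            // The worker exits here and is never replaced.
        }
    }

    // Wait for the tasks that can run; the remaining ones stay queued forever.
    finished.await(5, TimeUnit.SECONDS)
    return completed.get()
}

fun main() {
    val done = simulateLeakyPool(workers = 4, tasks = 50)
    println("Completed $done of 50 tasks; the rest are never picked up.")
}
```

In this model, a call such as simulateLeakyPool(workers = 12, tasks = 50) completes exactly 12 tasks, which mirrors the freeze after the 12th request on the 12-core machine from the thread dump.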

@MartinHaeusler
Author

Oh okay... well, I'm glad to see this issue fixed, thanks for the explanation 😃 👍
