Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

withContext from inside a cancelled coroutine should not be executed #962

Closed
abusalimov opened this issue Jan 31, 2019 · 13 comments
Closed

Comments

@abusalimov
Copy link

Consider the following example:

runBlocking {
    println("before withContext")
    withContext(EmptyCoroutineContext) { println("withContext 1") }
    println("cancelling")
    coroutineContext.cancel()
    withContext(EmptyCoroutineContext) { println("withContext 2") }
    withContext(EmptyCoroutineContext) { println("withContext 3") }
    println("after withContext")
}

Currently this outputs the following:

before withContext
withContext 1
cancelling
withContext 2

kotlinx.coroutines.JobCancellationException: Job was cancelled

That is, the second withContext is executed despite the fact the parent coroutine has already been cancelled by the time the child is created.

This behavior seems a bit counterintuitive, and it is also inconsistent with how a similar example using a Default coroutine dispatcher works:

runBlocking {
    println("before withContext")
    withContext(Dispatchers.Default) { println("withContext 1") }
    println("cancelling")
    coroutineContext.cancel()
    withContext(Dispatchers.Default) { println("withContext 2") }
    withContext(Dispatchers.Default) { println("withContext 3") }
    println("after withContext")
}

Output:

before withContext
withContext 1
cancelling

kotlinx.coroutines.JobCancellationException: Job was cancelled

Note that async { ... }.await() works fine with either of the dispatchers:

runBlocking {
    println("before async.await")
    async(EmptyCoroutineContext) { println("async.await 1") }.await()
    println("cancelling")
    coroutineContext.cancel()
    async(EmptyCoroutineContext) { println("async.await 2") }.await()
    async(EmptyCoroutineContext) { println("async.await 3") }.await()
    println("after async.await")
}

Output:

before async.await
async.await 1
cancelling

kotlinx.coroutines.JobCancellationException: Job was cancelled

My suggestion is to make the behavior of child coroutines created using different builders consistent so that a child doesn't start executing if a parent has already been cancelled by the time the child is created.

BTW IDEA has an quick fix suggesting to replace async { ... }.await() with withContext, which now happens to change the semantics w.r.t. cancellation when a child uses the same dispatcher.

@elizarov
Copy link
Contributor

elizarov commented Feb 1, 2019

This happens so by design. The reason for such behavior is to enable withContext(NonCancellable) { ... } to execution suspending code inside the coroutine despise its cancellation. Do this explanation help?

@LouisCAD
Copy link
Contributor

LouisCAD commented Feb 4, 2019

@elizarov Couldn't a special case be added for NonCancellable instead? Maybe it could also work by checking if resulting context (from current + passed argument) is active?

@abusalimov
Copy link
Author

@elizarov thanks, the explanation makes sense. Although I would expect the NonCancellable thing to be a special-case check, just as @LouisCAD suggests, so that the general behavior is consistent across the contexts/builders.

@Dougrinch
Copy link

Dougrinch commented Feb 16, 2019

@elizarov If withContext should call always, then why third withContext(EmptyCoroutineContext) {...} did not executed? Moreover this is very unexpectedly that async(ctx) { body }.await() and withContext(ctx) { body } has different behaviour. For example, I did spent 30 minutes just now to find a bug. The reason was different behaviour between async(NonCancellable) { body } and withContext(NonCancellable) { async { body } }. I don't understand why first hangs as expected, but the next one just ignoring NonCancellable context and completes instantly (same with launch instead of async).

fun main() = runBlocking<Unit> {
    launch {
        withContext(NonCancellable) {
            async {
                delay(10000000000)
            }
        }
        coroutineContext.cancel()
    }
}
fun main() = runBlocking<Unit> {
    launch {
        async(NonCancellable) {
            delay(10000000000)
        }
        coroutineContext.cancel()
    }
}

@elizarov
Copy link
Contributor

elizarov commented Feb 22, 2019

Unfortunately, the ship might have sailed for this particular decision, but I do see the problem it causes.

The way it was originally designed is that withContext(NonCancellable) { ... } is a synonym for withContext(Job()) { ... }, for you can use any of them to get a context that is detached for the parent's cancellation. It is also consistent with how you can use async and launch in that you can do either launch(Job()) { ... } or launch(NonCancellable) { ... } to launch a new coroutine in a scope, yet detach it from parent.

In the experimental version we also had an optional start parameters to withContext, so that you can use for both cancellable and non-cancellable start at will. However, the devil is in defaults. Just adding an option does not protect users from making mistakes.

The reason your third block is not executed, is because withContext checks on cancellation on completion of the block, before returning.

I do not have easy solution out of this, but to provide a separate withContext { ... } builder somehow aptly named with a different behavior that checks for cancellation before running the block. Maybe we could, indeed, tweak the behavior of existing withContext, so that it checks the cancellation of the new context before it executes the code. We'll have to study the impact of this change.

@LouisCAD
Copy link
Contributor

@elizarov Maybe this could be the behavior of the new invoke operator currently in PR #980?

@elizarov
Copy link
Contributor

I've investigated what happens if we change this behavior and check for cancellation on the entrance to withContext. No test that we have had failed. I interpret this as, although this is a breaking change, it should have a minor impact and would do it mostly for good. withContext is already "cancellable" (on the return path when switching dispatcher), so making it more cancellable will just make it more explicit, actually helping people to avoid subtle bugs they might have. So I'm adding PR with a change.

elizarov added a commit that referenced this issue Feb 22, 2019
* Immediately check is the new context is cancelled and throw
  CancellationException as needed
* Properly explain cancellation behavior in documentation

Fixes #962
@uliluckas
Copy link

uliluckas commented Feb 22, 2019

While we talk about the cancelation semantics of withContext, an issue came up on slack yesterday:
https://kotlinlang.slack.com/archives/C1CFAFJSK/p1550766435021300

The issue is, that canceling a coroutine at the end of a withContext block leaves the returned object in an unowned state. This makes failure to release a ressource in the following code 1) surprising and 2) implicit:

import kotlinx.coroutines.*

class Ressource {

    init {
        println("Initializing ressource")
        Thread.sleep(1000)
        println("Slow initialization done")
    }

    fun release() {
        println("Released")
    }

}

suspend fun initWithContext() {
    var ressource: Ressource? = null
    try {
        ressource = withContext(Dispatchers.Default) {
            Ressource()
        }
    } finally {
        ressource?.run {
            release()
        } ?: println("Ressource not released")
    }
}

fun main() = runBlocking {
    val initJob = launch {
        initWithContext()
    }
    yield()
    initJob.cancel()
}

Which outputs:

Initializing ressource
Slow initialization done
Ressource not released

@LouisCAD
Copy link
Contributor

LouisCAD commented Feb 22, 2019

@uluckas Having this YouTrack issue resolved would allow to avoid the resource leak demonstrated above the following way:

import kotlinx.coroutines.*

class Resource {

    init {
        println("Initializing resource")
        Thread.sleep(1000)
        println("Slow initialization done")
    }

    fun release() {
        println("Released")
    }

}

suspend fun initWithContext() {
    val resource: Resource
    try {
        withContext(Dispatchers.Default) {
            resource = Resource()
        }
    } finally {
        resource.release()
    }
}

fun main() = runBlocking {
    val initJob = launch {
        initWithContext()
    }
    yield()
    initJob.cancel()
}

In fact, you can already resolve this issue by changing back the local resource to a nullable var.
Assigning a resource after another suspension point will always be risky.

EDIT: My bad, even with contracts for suspend functions, this needs to be a var because the variable may have not been initialized the compiler will say.

elizarov added a commit that referenced this issue Feb 22, 2019
* Immediately check is the new context is cancelled and throw
  CancellationException as needed
* Properly explain cancellation behavior in documentation

Fixes #962
Fixes #785
@elizarov
Copy link
Contributor

@uluckas withContext was supposed to be cancellable by design and this issue goes into opposite direction -- make it more cancellable. I've also added a better documentation as a part of this PR. See:

* Calls the specified suspending block with a given coroutine context, suspends until it completes, and returns

This is also related to a bunch of other problems (see #279 and #845).

@Dougrinch
Copy link

Dougrinch commented Feb 22, 2019

@elizarov I see the problem that NonCancellable in launch/async not only tells us about cancellation, but also implicitly detaches the coroutine scope from parent. As I understand your posts about structured concurrency, the key point of scopes is "parent scope will be completed only after all child scopes". And it is true with coroutineScope { launch { ... } }. But it is not with coroutineScope { launch(NonCancellable) { ... } }. Although child coroutine still running, parent will be completed instantly. I think this is very unintuitive behaviour.

@elizarov
Copy link
Contributor

@Dougrinch this is indeed the case. Let's move discussion on overriding Job in the context into #1001

@uliluckas
Copy link

@uluckas withContext was supposed to be cancellable by design and this issue goes into opposite direction -- make it more cancellable. I've also added a better documentation as a part of this PR. See:

kotlinx.coroutines/kotlinx-coroutines-core/common/src/Builders.common.kt

Thanks @elizarov for the update

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants