Can't join a canceled task #64

blastrock · 2017-05-31T19:45:35Z

Hi,

In the following code, I want to make sure my code blocks until the task is really finished:

        val job = async(CommonPool) {
            while (isActive);
            Thread.sleep(1000)
            System.out.println("end of async")
        }

        job.cancel()
        job.join()
        System.out.println("end of test")

The result is that I see "end of test" instantly, and "end of async" appears a second after, even though I asked to join() the task. Am I misusing the library or is this a bug?

cy6erGn0m · 2017-06-02T09:21:18Z

According to join() it only waiting for completion. Job and running coroutine are not the same things and have different life-cycles. Once you cancel job it becomes completed: you can see it if you add job.invokeOnCompletion { println("completed") }. However even after job completion a running coroutine could continue execution.

cy6erGn0m · 2017-06-02T09:22:43Z

@elizarov However I see why it is confusing as async {}, run {} and launch {} looks like a thread {} while job.join() looks like thread.join() this is why users get confused.

cy6erGn0m · 2017-06-02T09:26:41Z

@blastrock so to get execution synchronized you can need some different way, consider the following


import kotlinx.coroutines.experimental.*
import kotlinx.coroutines.experimental.channels.*
import java.util.concurrent.*

fun main(args: Array<String>) {
    runBlocking {
        val c = RendezvousChannel<Boolean>()

        val job = async(CommonPool) {
            while (isActive);

            Thread.sleep(1000)

            System.out.println("end of async")
            c.send(isActive)
        }

        job.invokeOnCompletion {
            println("completed")
        }

        delay(1, TimeUnit.SECONDS)
        job.cancel()

        c.receive()

        System.out.println("end of test")

        delay(1, TimeUnit.SECONDS)
    }
}

Notice that it this solution is not the same as join()

elizarov · 2017-06-02T12:42:08Z

The difference in behaviour between Thread.join and Job.join is indeed confusing. Thanks for pointing it out. Maybe we should find a better name.

blastrock · 2017-06-02T14:14:46Z

Thanks for the answer.

If I may, even if you rename join to make it less confusing, I think a join() API would be very useful. I came up with a workaround similar to this:

        @Volatile
        var keepRunning = true
        val job = async(CommonPool) {
            while (keepRunning);
            Thread.sleep(1000)
            System.out.println("end of async")
        }

        keepRunning = false
        job.join()
        System.out.println("end of test")

Both your workaround and mine seem cumbersome to me.

It happens very often to me that I need to cancel a coroutine and be sure it is finished to ensure there is no race condition, for example when I am shutting down, or if I want to start a new coroutine which will access the same data as the one that is running.

I think the join() primitive would be very useful with cancelation, and probably not much harder to implement than what you have now.

elizarov · 2017-06-05T10:20:33Z

The reason launch/async behaves like it is now was intentional to make the state of Job/Deferred objects simpler. Right now, there have essentially two states: active and completed, and a join/await wait for completion. Introducing a join operation that waits until the coroutine has finished it execution (let's call it a terminated state) makes their state more complex. A new state appears -- "cancelled by not yet terminated". I'm not convinced yet that the presented use cases warrant the complication of API that is already not so simple.

However, the workarounds presented so far are not ideal either. Channel is too heavy-weight for such light-weight task as tracking job termination. I'll think on how it can be done easier.

elizarov · 2017-06-05T11:30:55Z

@blastrock As a better workaround for cases where you need to wait until the task is fully terminated I'd suggest the following utility function:

fun launchWithTermination(
    context: CoroutineContext,
    block: CoroutineScope.() -> Unit
): Job {
    val terminated = Job()
    launch(context) {
        try {
            block()
        } finally {
            terminated.cancel()
        }
    }
    return terminated
}

You can use it instead of launch and the Job that it return completes when the code is really finished. You cannot cancel the running coroutine though the resulting Job object, though. It is used only to signal termination. You can still cancel it via the parent Job in the context.

Note, that is the next release of kotlinx.coroutines (version 0.16) you'll need to change launch(context) in the above code to launch(context, CoroutineStart.ATOMIC) to make sure its body starts to execute even if it is cancelled before start.

cy6erGn0m · 2017-06-05T17:16:06Z

@elizarov exception also need to be propagated, so it requires a lot of preparation to get this fake job work as intended

elizarov · 2017-06-05T21:47:29Z

I would also highly recommend using an actor for the case where you need to ensure that at most one action is running at the same time. However, changing join behavior so that it waits until the job fully finishes on cancel is also on the table (with the downside of having a more complicated state machine).

blastrock · 2017-06-06T19:19:27Z

Indeed it would introduce a new state in the Job class, but maybe this state doesn't need to be exposed. I have yet to find a use-case, in kotlin or another language, where I need to know if (or when) a job is "completed", I usually only need to know when it is "terminated" (as you called it). Of course the "canceled but running"-state must still be exposed inside the coroutine through the isActive call.

Maybe I can make a wrapper with a workaround, but maybe I would need to expose my own Job class to do that. I'll think about it.

As for the Actor pattern, I don't think I can implement job cancelation easily with it. But the one-task-at-a-time guarantee is there indeed.

elizarov · 2017-06-26T10:37:31Z

I've pushed the corresponding changes to the develop branch. Now jobs (including Deferred) have an additional cancelling state. So, on invocation of cancel they go into cancelling state. isActive starts returning false, but isComplete is still false in this state. Both join and await wait for completion, that is they wait until the coroutine completes it execution.

It is important to note, that in current implementation a cancelled coroutine is doomed to complete exceptionally. Even if CancellationException is caught inside a coroutine and is replaced with any kind of result, it will still go into cancelled state on completion.

Note, that the naming of various states (see docs on Job and Deferred) are still under consideration and may change: https://github.com/Kotlin/kotlinx.coroutines/blob/develop/kotlinx-coroutines-core/src/main/kotlin/kotlinx/coroutines/experimental/Job.kt

Nemikolh · 2017-07-12T12:47:13Z

@elizarov Thanks a lot for working on this issue, it is truly appreciated. I am also facing a similar issue.
Do you know when would a new release with this new change would come?

I'm trying to figure out whether or not it's best for us to wait on this fix or use one of the workaround.

elizarov · 2017-07-12T12:48:18Z

I plan to get it released soon (this or next week)

kingsleyadio · 2017-07-17T09:09:43Z

@elizarov I'm thinking. Maybe we don't need to change the behavior of join. Or perhaps, maybe there could be another method that continues to behave like the current implementation.
The early return in the current implementation could sometimes be desirable.

Meanwhile. What will be the behavior in both situations below with the new implementation.

Join immediately after cancel
Cancel immediately after join

Thank you.

elizarov · 2017-07-18T20:56:41Z

The reason to change the implementation of join and await is two-fold. One is consistency with Thread.join as was mentioned above, but that would not be so compelling if not for the other. The other is to ensure that async(context) { doSomething() }.await() is really working just as run(context) { doSomething() } without any spurious concurrency on cancellation. This is what really bothers me in the current (version <= 0.16) implementation of join and await, it is that when I cancel the coroutine, then the other coroutine that was waiting for it is going to be executed concurrently with whatever code that was still running in the coroutine that was cancelled, but has not completed just yet. This concurrency would be so rare, that it would be a source of very hard-to-diagnose bugs.

elizarov · 2017-07-24T14:22:13Z

Released in version 0.17

matejdro · 2018-01-19T21:57:08Z

Is there a way to get old behavior in the new version? Maybe with another method instead of await/join?

For example if I'm wrapping coroutine around blocking task that is outside my control (task itself cannot be cancelled mid-execution) and I cancel the coroutine, I would want to return execution immediately and let that dangling blocking function just finish in the background instead of blocking everything.

elizarov · 2018-01-20T06:51:04Z

You can just use cancel. It does not block. If you don't want to wait, then just don't invoke await.

matejdro · 2018-01-20T06:57:45Z

Yes, but what if cancel is triggered by another thread?

For example, on Android, coroutine is await-ing in the background, but in the meantime, user closes down the application, so cancel is called from Android UI thread. I want to cleanup as fast as possible instead of waiting for that to finish.

elizarov · 2018-01-20T07:07:02Z

You can install callback that gets triggered as soon as cancel is invoked with job.invokeOnCompletion(onCancelling = true) { ... }. Using this callback you can define an extension that waits for a job to become cancelling (as soon as cancel is invoked) in the usual way:

suspend fun Job.awaitCancelling() = 
    suspendCancellableCoroutine<Unit> { cont -> 
        val handle = this@awaitCancelling.invokeOnCompletion(onCancelling = true) {
            cont.resume(Unit) 
        }
        disposeOnCompletion(handle)
    }

matejdro · 2018-01-20T07:10:52Z

that sounds like good workaround. Thanks!

elizarov closed this as completed Jul 24, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Can't join a canceled task #64

Can't join a canceled task #64

blastrock commented May 31, 2017

cy6erGn0m commented Jun 2, 2017

cy6erGn0m commented Jun 2, 2017

cy6erGn0m commented Jun 2, 2017 •

edited

Loading

elizarov commented Jun 2, 2017

blastrock commented Jun 2, 2017

elizarov commented Jun 5, 2017 •

edited

Loading

elizarov commented Jun 5, 2017

cy6erGn0m commented Jun 5, 2017

elizarov commented Jun 5, 2017

blastrock commented Jun 6, 2017

elizarov commented Jun 26, 2017 •

edited

Loading

Nemikolh commented Jul 12, 2017

elizarov commented Jul 12, 2017

kingsleyadio commented Jul 17, 2017

elizarov commented Jul 18, 2017 •

edited

Loading

elizarov commented Jul 24, 2017

matejdro commented Jan 19, 2018 •

edited

Loading

elizarov commented Jan 20, 2018

matejdro commented Jan 20, 2018

elizarov commented Jan 20, 2018 •

edited

Loading

matejdro commented Jan 20, 2018

Can't join a canceled task #64

Can't join a canceled task #64

Comments

blastrock commented May 31, 2017

cy6erGn0m commented Jun 2, 2017

cy6erGn0m commented Jun 2, 2017

cy6erGn0m commented Jun 2, 2017 • edited Loading

elizarov commented Jun 2, 2017

blastrock commented Jun 2, 2017

elizarov commented Jun 5, 2017 • edited Loading

elizarov commented Jun 5, 2017

cy6erGn0m commented Jun 5, 2017

elizarov commented Jun 5, 2017

blastrock commented Jun 6, 2017

elizarov commented Jun 26, 2017 • edited Loading

Nemikolh commented Jul 12, 2017

elizarov commented Jul 12, 2017

kingsleyadio commented Jul 17, 2017

elizarov commented Jul 18, 2017 • edited Loading

elizarov commented Jul 24, 2017

matejdro commented Jan 19, 2018 • edited Loading

elizarov commented Jan 20, 2018

matejdro commented Jan 20, 2018

elizarov commented Jan 20, 2018 • edited Loading

matejdro commented Jan 20, 2018

cy6erGn0m commented Jun 2, 2017 •

edited

Loading

elizarov commented Jun 5, 2017 •

edited

Loading

elizarov commented Jun 26, 2017 •

edited

Loading

elizarov commented Jul 18, 2017 •

edited

Loading

matejdro commented Jan 19, 2018 •

edited

Loading

elizarov commented Jan 20, 2018 •

edited

Loading