Async builder and cancellation in structured concurrency #763

qwwdfsad · 2018-10-25T17:48:19Z

Background

After async integration with structured concurrency (#552), any failure in async cancels its parent.
For example, the following code

try {
  async { error("") }.await()
} catch (e: Throwable) {
  ...
}

cancels the outer scope, which usually leads to undesired consequences.
The rationale behind this change was simple: it was previously hard or impossible to make a parallel decomposition fail properly when one of the decomposed computations failed.
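
A minimal, self-contained sketch of the behaviour described above, assuming kotlinx.coroutines 1.x on the JVM. The function name awaitInsideScope and the choice of IOException are illustrative and not part of the original report. The failure is caught at await(), yet the enclosing coroutineScope is still expected to fail with the same exception.

```kotlin
import kotlinx.coroutines.*
import java.io.IOException

// Hypothetical helper: reports how the enclosing scope ends.
suspend fun awaitInsideScope(): String =
    try {
        coroutineScope {
            val deferred = async<Int> { throw IOException("boom") }
            try {
                deferred.await()
            } catch (e: IOException) {
                // Caught here, but the failed child has already cancelled the scope.
            }
            "scope completed normally"
        }
    } catch (e: IOException) {
        // coroutineScope rethrows the child's failure once the scope finishes.
        "scope failed: ${e.message}"
    }

// Usage: runBlocking { println(awaitInsideScope()) }
```

Under these assumptions the helper should return "scope failed: boom" rather than "scope completed normally", which is the surprise this issue is about.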

While there is nothing wrong with that reasoning (=> we won't revert this change), we now have a more serious problem with user's intuition of this behaviour.
async is a way too polluted keyword which neither reflects its purpose in kotlinx.coroutines nor matches similar concepts in other paradigms or programming languages, so any newcomer will be confused with this behaviour and thus has no or incorrect mental model about it.

Moreover, if someone already understands concept of kotlinx.coroutines async, (s)he still may want to have future-like behaviour (and according to our slack, demand for that is quite high).
And there is no simple answer to that.
To have proper semantics (one-way cancellation), one has to write something like async(SupervisorJob(coroutineContext[Job])) { ... } (really?!), which completely defeats the goal of having a clear and easily understandable API. coroutineScope is not applicable for that purpose because it awaits all its children, and GlobalScope is just unsafe.

We should address this issue and educating users with "this is intended 'async' behaviour" without providing an alternative is just wrong. I can't imagine the situation where someone asks about this behaviour in Slack and community responds with "yes, this is how it works, you actually need async(SupervisorJob(coroutineContext[Job])) { ... }"

Possible solutions

In my opinion, it is necessary to provide future-like builder (aka "old" async) and it would be nice if its name won't clash with anything we already have.
For example, deferred builder. It is not something newcomers will start to use immediately (while async is kinda red flag "please use me") due to its name, but it is a simple concept, it is clean, short and easy to explain (see my "can't imagine the situation" rant).

Another possible solution (please take it with a grain of salt, it requires a lot more design/discussion with other users) is to deprecate async altogether. As I have mentioned, this name is useless, polluted and does not reflect its semantics. Even with a deferred builder, we would still have to make a tremendous effort educating users that this is not the async you are familiar with, this is a completely different async.
But if we deprecate it and introduce another primitive with a more intuitive name, for example, decompose {} (naming here is a crucial part, decompose is just the first thing that popped into my mind), then we will have no problems with async. Newcomers won't see their familiar async, but deferred and decompose, and will then choose the right primitive consciously.

Reported:
https://discuss.kotlinlang.org/t/caught-exceptions-are-still-propagated-to-the-thread-uncaughtexceptionhandler/10170
#753
#787
#691

fvasco · 2018-10-25T20:59:41Z

@qwwdfsad I agree with your concerns.

Unfortunately I don't fully understand your consideration about async, personally I like the old pattern async ... await.

My proposal is to consider fork instead of decompose.
The word decompose suggests data structure decomposition to me, whereas fork reminds me of the fork-join pattern, so we fork the current scope for a future join.

My second consideration is to avoid any join or await phases; this avoids any try { await() } catch { ... } block. In my example below it is not clear how to handle errors inside the coroutine scope.

suspend fun <T : Comparable<T>> List<T>.quickSort(): List<T> = 
    if(size <= 1) this
    else coroutineScope {
        val pivot = first()
        val (smaller, greater) = drop(1).partition { it <= pivot}
        val smallerSorted by fork { smaller.quickSort() }
        val greaterSorted by fork { greater.quickSort() }
        smallerSorted + pivot + greaterSorted
    }

LouisCAD · 2018-10-25T21:07:49Z

@fvasco This would need to allow suspend val for delegated properties.

fvasco · 2018-10-26T07:05:01Z

Yes @LouisCAD, I like the idea.
Do you have any consideration against the suspending delegates?

suspend operator fun <T> Fork<T>.getValue(thisRef: Any?, property: KProperty<*>): T

Edit: suspending delegates do not require suspend val

SolomonSun2010 · 2018-10-26T08:06:36Z

@qwwdfsad I agree with your concerns.

Unfortunately I don't fully understand your consideration about async, personally I like the old pattern async ... await.

My proposal is to consider Erlang style spawn / link / monitor instead of decompose.
spawn(modulename, myfuncname, []) % Erlang
please see also:
http://erlang.org/doc/reference_manual/processes.html

Also could consider go but for safety better Go style go ?
go(CTX) { myfunc() }

LouisCAD · 2018-10-26T12:40:04Z

@fvasco Wow, I didn't know about this trick!
In Kotlin 1.2.71, this compiles, and works as you would expect, apart from the IDE not showing the suspension point for delegated property/local access, but on declaration instead:

private suspend operator fun <T> Deferred<T>.getValue(
    thisRef: Any?,
    property: KProperty<*>
): T = await()

val a by async { // IDE shows suspend call here
    delay(5000)
    1
}
val b by async { // IDE shows suspend call here too
    delay(5000)
    2
}
val c = a + b // Suspends, but IDE doesn't show it.

However, according to play.kotl.in, this no longer compiles in Kotlin 1.3, as suspend operator getValue seems to have been marked as unsupported.

fvasco · 2018-10-27T05:10:53Z

Hi @LouisCAD,
the above example is my preferred way of building a suspending lazy property (#706):
The fork function is heavily inspired by the lazy one, so I consider it as single solution for launch, async and lazy in a coroutine scope (for decomposition use case).

In my opinion async is a good operator to start a deferred computation and handle its result later (i.e. the awaitFirst problem #424). async lives in the scope but it should not fail the scope.

A different consideration is for launch, for my use case it is very uncommon to use a scoped launch, it exposes only a boolean exit status (isCancelled) and it is pretty unclear how to handle it, the most common solution is to steal the error using a CompletionHandler (#345).

The last operator is produce: a producer should live in the scope (like async) without cancelling it; instead, both produce and async should be cancelled when the scope exits (#645).

As a quick recap of my idea:

- launch is a top-level function and it is not scoped, like a Thread
- async is always scoped (GlobalScope is OK), it does not influence the scope but it is cancelled at the scope termination (it can no longer be awaited), it is similar to RX Single
- produce is always scoped (GlobalScope is OK), it does not influence the scope but it is cancelled at the scope termination (it can no longer be consumed), it is similar to RX Observable
- fork is always scoped (I don't see a use case for GlobalScope), if a fork fails then the whole scope fails, if the scope fails then all forks fail. The scope terminates successfully if all forks complete the task without errors (or lazy forks are not started at all). It implements the fork-join model.

LouisCAD · 2018-10-27T08:29:48Z

@fvasco I don't agree with making launch top level again without scope, there's a reason it moved to structured concurrency with mandatory scopes. You can still write a globalLaunch or alike method if you want, but scopes are really useful for launch, think Android apps, child coroutines that should not be forgotten, etc…

For an async alternative though, like fork or something, it would be great. Then, we could have an async or alike that uses a SupervisorJob so it doesn't cancel the parent on failure, but only throws when await() is called.

LouisCAD · 2018-10-27T08:31:54Z

@qwwdfsad Maybe async should be marked as experimental before 1.0.0 to allow changes regarding this?
Not so much code is currently using it anyway I think.

fvasco · 2018-10-27T09:17:35Z

@LouisCAD

child coroutines that should not be forgotten, etc…

async can do this job with no downside.

Can you give an example where it is not possible to use async instead of launch?
Please consider that any Deferred is a Job, and consider the following code:

val something: Deferred<*> = async { }
something.join() // job join
something.await() // check error

fun Deferred<*>.asJob() = object : Job by this

However I am not an Android developer, so I don't have any strong opinion against GlobalScope.launch.

LouisCAD · 2018-10-27T09:43:30Z

@fvasco There are two reasons:

async doesn't behave as launch. An uncaught exception in async doesn't crash the app (silent fail), while in launch it does. I don't want the app to continue working on uncaught exceptions.
launch has the benefit of not having async connotation that people may have because of async/await in C#, Dart, JS or other languages.

Finally, there's a lot of code that uses launch in Android apps, and moving to async exception handling could introduce many problems because of muted throwables.

The discussion is not about launch vs async which are completely different because of their exception/throwable handling, but about behavior of async when a Throwable occurs, and possible solutions to this.

fvasco · 2018-10-27T14:34:48Z

An uncaught exception in async doesn't crash the app (silent fail), while in launch it does.

I don't understand, launch and async look similar on 1.0.0-RC1

fun main(args: Array<String>) = runBlocking<Unit> {
    async { error("Scope crashed") }
}

@qwwdfsad proposed async(SupervisorJob(coroutineContext[Job])) { ... }, but the following code does not solve the issue

fun main(args: Array<String>) = runBlocking<Unit> {
    async(SupervisorJob(coroutineContext[Job]))  { error("Scope crashed") }
}

LouisCAD · 2018-10-27T15:10:03Z

@fvasco Did you read the docs for launch and async? Did you notice the IDE warning when calling async without using its resulting Deferred? Did you compare the results with async and launch? Did you try to understand the differences, beyond the fact that they "look similar"?

It is already clear from the docs, and clear if you run it, but I'll repeat it once and for all:
current async since structured concurrency cancels the scope, but the exception is still hidden, and only thrown when calling await(). launch on the other hand crashes the app with the original exception in addition to having the scope cancelled.

Again, this issue is not about launch vs async but about the fact that async cancels the scope even if exception is caught at await() call, and what solutions we have to get the old async behavior safely, and the naming for all of this.
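
The difference in exception handling between launch and async can be sketched in isolation as follows. This is only an illustration under stated assumptions: it uses supervisorScope so that neither failure cancels the scope, and a CoroutineExceptionHandler stands in for "crashing the app". The function name launchVersusAsync and the events list are hypothetical.

```kotlin
import kotlinx.coroutines.*
import java.io.IOException

// Hypothetical demo: records who gets to see each failure.
suspend fun launchVersusAsync(): List<String> {
    val events = mutableListOf<String>()
    // Stand-in for the uncaught-exception path that would normally crash the app.
    val handler = CoroutineExceptionHandler { _, e -> events += "handler saw: ${e.message}" }
    supervisorScope {
        val job = launch(handler) { throw IOException("from launch") }
        val deferred = async<Int>(handler) { throw IOException("from async") }
        job.join()
        try {
            deferred.await()
        } catch (e: IOException) {
            // async keeps its failure until someone calls await().
            events += "await threw: ${e.message}"
        }
    }
    return events
}

// Usage: runBlocking { println(launchVersusAsync()) }
```

In this sketch the handler is expected to see only the failure from launch, while the failure from async surfaces only at await().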

zach-klippenstein · 2018-10-27T22:25:45Z

This is probably a horrible misuse of completion handlers, but what if async did something like this:

1. When the async body throws an exception, hang onto it. If there is a job in the context, immediately register a completion handler on the job that will rethrow the exception. Hang onto the handler's disposable.
2. If await is called before the job completes, throw the exception there and consider the exception reported. Dispose the completion handler so it won't propagate to the parent.
3. If the parent is cancelled without await having been called, the completion handler from 1 will ensure the exception is not lost (it will get wrapped with a CompletionHandlerException).
4. If there's no parent job, and await is not called, the exception is dropped, but if you're opting out of using structured concurrency you're already using hard mode.

This allows async to work as expected in the happy case and when the async body wants to handle exceptions itself, but still ensures that it's impossible to forget to do so and lose an exception, as long as there's a parent job.

fvasco · 2018-10-31T11:09:43Z

I wish to revive this issue to reevaluate some previous considerations.

@qwwdfsad focused this issue on the async builder, but I wish to extend the issue to launch and produce.

Therefore, even if we introduce some new fancy builders like job, deferred and channel, we need to customize the behavior of each one (really similar to the CoroutineStart attribute).
I.e. I can start a supervisor job, but I want to build some child tasks as strictly related to this scope.

LouisCAD · 2018-11-09T10:31:29Z

Wild idea: a function named asyncCatching { ... } that calls async(SupervisorJob(coroutineContext[Job])) { ... } under the hood.
There may be a confusion with Result and xxxxCatching { … } methods though… Maybe a similar convention could be introduced with Deferred?

elizarov · 2018-12-06T13:36:33Z

Linking here a related question from Kotlin discussions https://discuss.kotlinlang.org/t/kotlin-coroutines-are-super-confusing-and-hard-to-use/10628

neworld · 2019-10-07T10:56:44Z

async is mainly used for parallelization. Wrapping parallelization into its own scope looks like a safe solution. I would like to suggest deprecating async and allowing it only in ParallelScope, like this:

launch {
  async { a() } // deprecated

  val result = parallel { // we are inside ParallelScope
    val a = async { a() }
    val b = async { b() }
    try {
      a.await() + b.await()
    } catch (e: Exception) {
      -1
    }
  }
}

parallel should work like coroutineScope, but only ParallelScope offers async.

mtimmerm · 2019-11-08T18:44:25Z

Please note that this workaround that is often mentioned above does not work: async(SupervisorJob(coroutineContext[Job])) { ... }
The SupervisorJob that is created is a CompletableJob that will never be completed.
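
The hang described here can be sketched as follows, assuming kotlinx.coroutines 1.x. The function name scopeWithDanglingSupervisor and the 200 ms timeout are illustrative; the timeout is there only so that the sketch terminates.

```kotlin
import kotlinx.coroutines.*

// Hypothetical demo: returns null if the scope is still open when the timeout fires.
suspend fun scopeWithDanglingSupervisor(): Int? = withTimeoutOrNull(200) {
    coroutineScope {
        // The SupervisorJob becomes a child of this scope and is never completed explicitly.
        val deferred = async(SupervisorJob(coroutineContext[Job])) { 42 }
        // The value is available almost immediately, but coroutineScope still
        // waits for all of its children, including the SupervisorJob.
        deferred.await()
    }
}

// Usage: runBlocking { println(scopeWithDanglingSupervisor()) }
```

Although the async body finishes right away, the call is expected to return null, because the scope only ends once the timeout cancels it together with the dangling SupervisorJob.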

I have fixed it like this:

fun <T> CoroutineScope.detach(block: suspend () -> T) : Deferred<T> {
    val result = CompletableDeferred<T>()
    launch {
        try {
            result.complete(block())
        } catch (e: Throwable) {
            result.completeExceptionally(e)
        }
    }
    return result
}

This is subtly different too, though, in that failure of a 2nd level subordinate job can still cancel the parent.
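
A possible usage of detach, as a hedged sketch: the detach function is repeated from the comment above so the block is self-contained, while detachDemo and the use of IOException are hypothetical additions. It illustrates that a failure of the detached block is observed only through await() and leaves the enclosing scope alone.

```kotlin
import kotlinx.coroutines.*
import java.io.IOException

// Copied from the comment above: the launched coroutine itself never fails,
// it only transfers the outcome into a parentless CompletableDeferred.
fun <T> CoroutineScope.detach(block: suspend () -> T): Deferred<T> {
    val result = CompletableDeferred<T>()
    launch {
        try {
            result.complete(block())
        } catch (e: Throwable) {
            result.completeExceptionally(e)
        }
    }
    return result
}

// Hypothetical demo: the scope survives the failure and returns normally.
suspend fun detachDemo(): String = coroutineScope {
    val deferred = detach<Int> { throw IOException("boom") }
    try {
        "value: ${deferred.await()}"
    } catch (e: IOException) {
        "caught: ${e.message}"
    }
}

// Usage: runBlocking { println(detachDemo()) }
```

Here the scope is expected to complete normally with "caught: boom", in contrast to a plain async in the same position.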

elizarov · 2020-04-01T09:17:29Z

The non-standard coroutineScope { async { ... } ... await() } seems to have settled. Closing this issue.

rocketraman · 2020-04-01T15:50:15Z

The non-standard coroutineScope { async { ... } ... await() } seems to have settled. Closing this issue.

Can someone add some color to this? What are the implications for those of us following this issue?

elizarov · 2020-04-01T16:02:12Z

It stays the way it is now.

rocketraman · 2020-04-01T16:53:25Z

It stays the way it is now.

I got that, but a lot of different approaches have been discussed in this issue. Your comment seemed to indicate the idiomatic approach now is:

coroutineScope { async { ... } ... await() }

however, your use of the term "non-standard" gave me pause. What did you mean by that?

elizarov · 2020-04-02T07:41:31Z

It simply means that people coming from other async/await ecosystems (C#, JS, etc) might get initially confused by the fact that async always cancels its parent scope. That is what I meant by "non-standard".

spyro2000 · 2020-04-16T16:34:53Z

This is so confusing. I have been reading about and testing Kotlin async exception handling for days now. And I am still not getting it. This is even worse than Java with its Future-API from hell (can't believe I'm writing this).

Wish we could just have simple async/await equivalents like in C# or Javascript. They just do their stuff as expected without having to deal with (global) scopes, coroutine builders, dispatchers, suspend functions, supervisors, caught exceptions bubbling up and crashing my app etc.

The current state is just - awful. Everyone is just confused about how to use all this stuff correctly. Sorry. In C# all of this works with async/await as expected, done.

In Kotlin it's rocket science.

LouisCAD · 2020-04-17T07:45:12Z

@spyro2000 In Kotlin, it's done the correct way:
async coroutines need to handle their errors themselves, or they propagate up the scope (coroutineScope { ... }), which needs to handle them.

Also, you need async and await much less in Kotlin since you can (should, unless you need parallel operations in that coroutine) just call suspending functions sequentially.

neworld · 2020-04-17T08:20:58Z

@spyro2000 In Kotlin, it's done the correct way:
async coroutines need to handle their errors themselves, or they propagate up the scope (coroutineScope { ... }), which needs to handle them.

Also, you need async and await much less in Kotlin since you can (should, unless you need parallel operations in that coroutine) just call suspending functions sequentially.

For me, it looks like a design flaw. async is needed only for parallel operations, but you always have to handle errors yourself. The simplest way I found to do it without using ugly try-catches:

launch {
  val a = async {
    runCatching {
      //your body
    }
  }

  val b = async {
    runCatching {
      //your body
    }
  }

  if (a.await().isSuccess && b.await().isSuccess) {
    //both requests are success. 
  }
}

I can hardly imagine cases without a double indentation level due to runCatching.

elizarov · 2020-04-17T11:18:08Z

@neworld Can you please clarify what you are trying to achieve? Why is your use case not handled by much simpler code like that shown below?

launch { 
    val dA = async { /* body a */ }
    val dB = async { /* body b */ }
    val a = dA.await()
    val b = dB.await()
    // both requests were success here, no need to write "if" to check
}

spyro2000 · 2020-04-17T11:19:32Z

@elizarov

Your code has no error handling at all. And the app would end itself right away (AFAIK) as launch() does not wait until anything has finished (maybe runBlocking() here would be better?)

In C# this would all just work fine and in parallel, like this:

try {
  var a = someAsyncMethod();
  var b = someOtherAsyncMethod();
  var c = await a + await b;
} catch (Exception e) {
  // handle
}

What's the Kotlin equivalent to this?

elizarov · 2020-04-17T11:51:48Z

Let's expand this example with a bit of context. I suppose that C# snippet is part of some async method (because it is using await), like this:

async Task<int> SumTwoThingsConcurrentlyAsync() {
    try {
      var a = someAsyncMethod(); 
      var b = someOtherAsyncMethod();
      var c = await a + await b;
      return c;
    } catch (Exception e) {
      // handle
    } 
}

In Kotlin the similar function looks like this:

suspend fun sumTwoThingsConcurrently() {
    try { 
        coroutineScope { 
            val a = async { someMethod() }
            val b = async { someOtherMethod() }
            val c = a.await() + b.await()
            return@coroutineScope c
        }
    } catch (e: Exception) {
        // handle
    }
}

I don't see how that's much different from the C# code, but it is clearly safer than the C# code. The problem with the C# solution is that if the call to await a fails, then you do handle the corresponding exception (write it to the log, retry something or whatever), but someOtherAsyncMethod() continues to work somewhere in the background.

No such problem with the code in Kotlin, even though the code looks quite similar. When sumTwoThingsConcurrently completes in any way it is guaranteed that no background operation leaks.
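
The "no background operation leaks" guarantee can be illustrated with a small sketch, assuming kotlinx.coroutines 1.x. The function name sumOrFallback, the delays, the fallback value -1 and the siblingCancelled flag are all illustrative choices, not part of the original answer.

```kotlin
import kotlinx.coroutines.*
import java.io.IOException

// Hypothetical demo: returns the computed (or fallback) sum and whether
// the slow sibling was cancelled when the first computation failed.
suspend fun sumOrFallback(): Pair<Int, Boolean> {
    var siblingCancelled = false
    val sum = try {
        coroutineScope {
            val a = async<Int> {
                delay(50)
                throw IOException("a failed")
            }
            val b = async {
                try {
                    delay(10_000) // long-running work that must not leak
                    2
                } catch (e: CancellationException) {
                    siblingCancelled = true
                    throw e // always rethrow cancellation
                }
            }
            a.await() + b.await()
        }
    } catch (e: IOException) {
        -1 // handled outside the scope, after all children have finished
    }
    return sum to siblingCancelled
}

// Usage: runBlocking { println(sumOrFallback()) }
```

The expected result is (-1, true): the failure of a is handled by the outer catch, and by that time b has already been cancelled instead of running on in the background.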

spyro2000 · 2020-04-17T12:00:08Z

Thank you @elizarov

So the magic part that prevents the caught exception from bubbling up is basically return@coroutineScope c and an additional coroutineScope at the beginning? So what's the correct way to wrap the main method then for a complete example?

elizarov · 2020-04-17T12:01:41Z

My original answer above had a bug (I flipped the nesting of coroutineScope and try/catch, fixed now) which highlights the real problem existing Kotlin API has. There are a number of ideas on the table on how to make it less error-prone and easier for novices to learn. Update: One of them is the inspection that should have caught a mistake in my original code: https://youtrack.jetbrains.com/issue/KT-30294

elizarov · 2020-04-17T12:02:30Z

@specherkin It depends on what you are doing in your main method and why you need to wrap it.

neworld · 2020-04-17T12:20:18Z

@elizarov, things get tricky with complex business logic that involves many parallel IO operations organized in a tree. For example, a content feed with multiple data sources. Doing that properly is not hard, and my point is not that coroutines have a bad API. It just looks like a design flaw because the code fills up with many async { runCatching { ... }} and supervisorScope/coroutineScope calls to make sure the application keeps running. It makes it easy to slip up.

elizarov · 2020-04-17T12:23:14Z

@neworld I still don't understand why you would ever need to write async { runCatching { ... } }. On the other hand, you cannot miss writing coroutineScope, because in a suspend fun sumTwoThingsConcurrently() you cannot simply write either async or launch. It will not compile. You will be basically forced to use coroutineScope { ... }. No way to miss it. So, as long as you follow the practice of writing your logic in suspending functions, the correct code should be somewhat forced onto you automatically.

mtimmerm · 2020-04-17T12:46:41Z

still don't understand why would ever need to write async { runCatching { ... } }.

I think this happens when you write a suspending function that actually produces or returns a Deferred.

The case that comes up fairly often for me is memoizing asynchronous results. When someone wants the value, I would return or await a previously cached Deferred, or if there isn't one, I would kick off an async task and cache the Deferred it produces. If the task fails (sometime later), it is no problem -- completing the Deferred with an exception is the desired behaviour.

mtimmerm · 2020-04-17T12:59:28Z

The problem with C# solution is that if the call to await a fails, then you do handle the corresponding exception (write it to the log, retry something or whatever), but someOtherAsyncMethod() continues to work somewhere in background.

It's great that a failure of the enclosing scope cancels the async tasks. The problem is that exceptions thrown by the async tasks cancel the enclosing scope.

spyro2000 · 2020-04-17T13:31:26Z

@elizarov Thank you again.

I tried to adapt your example in my simple test setup:

fun main() {
    System.setProperty("kotlinx.coroutines.debug", "on")
    runBlocking {

        val value1Def = async { getValueThrowsExceptionAsync() }
        val value2Def = async { getValueAsync() }

        val sum = try {
            println("Awaiting results...")
            value1Def.await() + value2Def.await()
        } catch (e: Exception) {
            println("Catched exception")
            0
        }
        
        println("Our result: $sum")
    }
}


suspend fun getValueAsync() = coroutineScope {
    println("Calling without exception...")
    delay(2000)
    println("Calling without exception - DONE...")
    return@coroutineScope 1
}


suspend fun getValueThrowsExceptionAsync() = coroutineScope {
    println("Calling with exception...")
    delay(3000)
    println("Throwing exception...")
    throw RuntimeException("Argh!")
    return@coroutineScope 1 // this is actually dead code but is enforced by compiler
}

But even that results in the following output:

Awaiting results...
Calling with exception...
Calling without exception...
Calling without exception - DONE...
Throwing exception...
Catched exception
Our result: 0
Exception in thread "main" java.lang.RuntimeException: Argh!
...

So the exception is still not caught :(

Also tried the following (out of sheer desperation):

suspend fun main() = coroutineScope {
    System.setProperty("kotlinx.coroutines.debug", "on")
    val sum = try {
        val a = async { getValue() }
        val b = async { getValueThrowsException() }
        a.await() + b.await()
    } catch (e: Exception) {
        println("Catched")
    }
    
    println("Sum: $sum")
}

Still the same. Exception crashes the app.

This, however, seems to work as expected:

suspend fun main() {
    System.setProperty("kotlinx.coroutines.debug", "on")

    supervisorScope {  
        val a = async { getValueThrowsException() }
        val b = async { getValue() }

        // do something else
        
        try {
            println(a.await() + b.await())
        } catch (e:Exception) {
            println("catched")
        }
    }
}

Output:

Calling with exception...
Calling without exception...
Calling without exception - DONE...
Throwing exception...
catched

So, is this the preferred pattern to avoid exceptions breaking out from coroutines?
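
For reference, a self-contained variant of the supervisorScope snippet above, written as a sketch under the assumption of kotlinx.coroutines 1.x. The function name sumWithSupervisor, the delays and the use of IOException are illustrative. It shows only what supervisorScope does; whether this is the preferred pattern is what the replies below discuss.

```kotlin
import kotlinx.coroutines.*
import java.io.IOException

// Hypothetical demo: inside supervisorScope a failed async child does not
// cancel the scope or its siblings, so the failure can be handled at await().
suspend fun sumWithSupervisor(): String = supervisorScope {
    val a = async<Int> {
        delay(50)
        throw IOException("Argh!")
    }
    val b = async {
        delay(100)
        1
    }
    try {
        "sum = ${a.await() + b.await()}"
    } catch (e: IOException) {
        // b is still running (or already done) and can be awaited normally.
        "caught: ${e.message}, b = ${b.await()}"
    }
}

// Usage: runBlocking { println(sumWithSupervisor()) }
```

The expected result is "caught: Argh!, b = 1". Note that b keeps running after a fails, which is exactly the difference from coroutineScope.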

LouisCAD · 2020-04-17T14:15:39Z

@mtimmerm

It's great that a failure of the enclosing scope cancels the async tasks. The problem is that exceptions thrown by the async tasks cancel the enclosing scope.

That's not a problem, but a feature, and I like and use that feature, personally. For the cases where I don't want it, I catch the errors inside the async block. Catching the errors at the call site of await() is incorrect. You need to catch them inside the async block, or outside of the enclosing coroutineScope { ... }, as @elizarov showed.

LouisCAD · 2020-04-17T14:19:05Z

@spyro2000 Here's the correct version of your last snippet:

suspend fun main() {
    System.setProperty("kotlinx.coroutines.debug", "on")

    try {
        val result = coroutineScope {  
            val a = async { getValueThrowsException() }
            val b = async { getValue() }

            // do something else
        
            a.await() + b.await() // The last expression of the lambda is its return value, which ends up in result
        }
        println(result)
    } catch (e:Exception) {
        println("catched")
    }
}

spyro2000 · 2020-04-17T20:34:43Z

@LouisCAD Thank you.

Are there any disadvantages / pitfalls to making the main method suspending? In other words - can I always do that? And why does this actually work at all (coroutineScope() seems to behave like runBlocking() and waits for a result before terminating the app)?

rocketraman · 2020-04-17T21:16:31Z

And why does this actually work at all (coroutineScope() seems to behave like runBlocking()) and waiting for an result before terminating the app?

Getting OT here, but like runBlocking, coroutineScope will wait for all its child coroutines to finish before execution continues. However, one difference is that coroutineScope is itself a suspending function, whereas runBlocking is not i.e. the former leaves the calling thread free to do other work, whereas the latter blocks the calling thread. IOW, coroutineScope is blocking but asynchronous, and runBlocking is blocking and synchronous.

LouisCAD · 2020-04-17T21:37:20Z

@spyro2000 coroutineScope and runBlocking have only one thing in common: they accept a suspending lambda, and provide a local CoroutineScope that you can use to launch other coroutines (using launch, or async if you need to await a value they would produce later), so that you can do parallel operations in that scope.

Now, it's best to look at the KDoc of both to understand the differences (that reply above summarizes a little).

In the case of the main function, there's a few things to know as of Kotlin 1.3 (or gotchas if you prefer):

- suspend fun main() runs on the JVM main thread, but has no CoroutineDispatcher in its coroutineContext. That means that any coroutine you launch into it (using launch or async) will then use Dispatchers.Default, because, as its name implies, it's the default CoroutineDispatcher. The JVM main thread will be blocked but not used (although using it in Dispatchers.Default in that specific case could be a nice efficiency evolution).
- fun main() = runBlocking { ... } allows calling suspending functions in the lambda, and there's a scope to launch other coroutines. That scope and its coroutineContext will reuse the blocked main JVM thread as an "event loop" to run any coroutines that don't specify another dispatcher.

spyro2000 · 2020-04-18T12:10:07Z

@rocketraman @LouisCAD Many thanks to both of you.

taras-i-fedyk · 2020-05-04T15:52:44Z

@elizarov Could you please answer the following questions?

If an async coroutine A has failed and some coroutine B is failing or is being cancelled automatically due to that (as a result of Structured Concurrency), then wouldn’t it be better if awaiting the coroutine A’s result within the coroutine B threw a corresponding CancellationException (not the original exception generated by the coroutine A)?

That way it would conform to the paradigm in accordance with which a suspending function throws a CancellationException when the current coroutine is failing or is being cancelled. Plus, it would not create the illusion that you can catch the exception generated by the coroutine A for the sake of preventing the coroutine A’s parent coroutine from failing automatically. Not to mention, it would be consistent with ReceiveChannel.receive and Job.join which already work like that.

For clarity, wouldn’t it be better if the API for determining Job states was implemented in the following way?
1. The isCompleted property returns true solely if the Job has finished successfully (not also when it has failed or has been cancelled). Consequently, the Completed state represents solely such situations.
2. The isFailed property returns true if the Job is failing / has failed. Consequently, the Failing / Failed states represent such situations. In addition to that, the fail function replaces completeExceptionally.
3. The isCancelled property returns true solely if the Job is being cancelled / has been cancelled (not also when it is being failed / has been failed). Consequently, the Cancelling / Cancelled states represent solely such situations.
4. The isFailedOrCancelled property returns true if isFailed or isCancelled returns true.
5. The isFinished property returns true if isCompleted or isFailed or isCancelled returns true.

elizarov · 2020-05-06T09:01:44Z

@taras-i-fedyk We used to have a different Job state machine in the past (before the stable 1.0 release), but we simplified it to avoid other problems it caused. The "illusion that you can catch the exception generated by the coroutine A for the sake of preventing the coroutine A’s parent coroutine from failing automatically" is a complicated one and is hard to solve in its entirety (you can always write wrong code!). However, we are looking at different ways to prevent users from making some specific common mistakes. Please take a look at this issue, for one: https://youtrack.jetbrains.com/issue/KT-30294

As for your other question and comments, I cannot quite grasp the context of what you are trying to achieve here. Can you please elaborate with an example snippet of code whose behavior you'd like to improve?

taras-i-fedyk · 2020-05-07T13:56:36Z

@elizarov Here's my response. It'll be a bit long because I have to explain what exactly I meant within my initial post.

I gave three reasons why it’d make sense to make it so that await doesn’t throw the original exception but a corresponding CancellationException (which wraps the original exception), if the current coroutine is failing or is being cancelled as a result of the async coroutine’s failure (which takes place if the async coroutine has a parent coroutine that is not a supervisor one). I’ll list those reasons again and try to elaborate on them a bit:
- the proposed behavior of await would conform to the paradigm in accordance with which a suspending function throws a CancellationException when the current coroutine is failing or is being cancelled.
  
  (The current behavior of await doesn’t conform to that paradigm. Meaning it is not in line with the rest of the design of coroutines.)
- the proposed behavior of await would be consistent with how Job.join and ReceiveChannel.receive work. Because they work in exactly such a way.
  
  (And the current behavior of await is not consistent with that way. Which means it is not in line with the rest of the design of coroutines.)
- the proposed behavior of await would not create the illusion that you can catch the exception generated by a failed async coroutine for the sake of preventing its parent coroutine from failing automatically.
  
  Because if you're told that await can throw only a CancellationException under the given circumstances, you understand that once some exception has been thrown, Structured Concurrency has been already applied and you can’t undo it. Thus you know beforehand that the effect of catching an exception is limited, and such a limited effect doesn't contradict common sense.
  
  (The current behavior of await contradicts common sense in the described context and does create the above-mentioned illusion.)

So you don’t agree those are sufficient reasons for changing how await works, right? If so, what counter-reasons exactly do you have?
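For reference, the difference between await and Job.join described in the list above can be observed with a small sketch like the following. It assumes a plain coroutineScope (not a supervisor one) inside runBlocking; BoomException and thrownWhileWaiting are hypothetical names used only for this illustration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical exception type, so it cannot be confused with CancellationException.
class BoomException : RuntimeException("boom")

// Returns whatever the waiting call threw inside a plain (non-supervisor) scope.
fun thrownWhileWaiting(useAwait: Boolean): Throwable? {
    var seen: Throwable? = null
    try {
        runBlocking {
            coroutineScope {
                val deferred = async<Unit> { throw BoomException() }
                try {
                    if (useAwait) deferred.await() else deferred.join()
                } catch (e: Throwable) {
                    // Caught here, but the enclosing scope is already failing.
                    seen = e
                }
            }
        }
    } catch (e: BoomException) {
        // coroutineScope rethrows the child's failure even though it was caught above.
    }
    return seen
}
```

With useAwait = true the caught exception is the original BoomException, while with useAwait = false it is a CancellationException. In both cases the enclosing scope still fails, which is exactly the "illusion" in question: catching what await throws does not undo the parent's failure.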

(And the link you shared within your response is a bit out of context. It’s about how to educate the user so that he knows how the API works. While I raised a different and a more fundamental question: about the fact that there’s likely a drawback in the API’s design and about how that drawback could be eliminated.)

By my second question, I meant that the current API for determining Job states is a bit messy. And I proposed how it could be implemented so that it’s clearer.

The problem boils down mainly to the following two things:
- within the current API for determining Job states, there’s no clear distinction between a failure and a cancellation of a Job. Everything is too blurred in that context. Hence it’s difficult to reason about things.
- within the current API for determining Job states, the isCompleted property is for determining if a Job has finished for whatever reason. But at the same time, the Completed state is solely for a successful finish (not also for a failure or a cancellation). Hence a contradiction.

So do you agree that the problem does exist and that it could be solved in the proposed way (or something along those lines)? Or do you have a different opinion?
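A minimal sketch of the two points above, using only the existing isActive / isCompleted / isCancelled properties. Flags, flags and flagsByOutcome are hypothetical helper names introduced just for this illustration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical snapshot of the three state properties that Job exposes today.
data class Flags(val isActive: Boolean, val isCompleted: Boolean, val isCancelled: Boolean)

fun Job.flags() = Flags(isActive, isCompleted, isCancelled)

fun flagsByOutcome(): Map<String, Flags> = runBlocking {
    // supervisorScope keeps the failing child from cancelling its siblings.
    supervisorScope {
        val ok = launch { }
        val failed = async<Unit> { throw RuntimeException("boom") }
        val cancelled = launch { delay(60_000) }
        cancelled.cancel()
        // Wait until all three jobs have reached a final state.
        joinAll(ok, failed, cancelled)
        mapOf(
            "success" to ok.flags(),
            "failure" to failed.flags(),
            "cancellation" to cancelled.flags()
        )
    }
}
```

The failed job and the cancelled job end up with identical flags (isCancelled is true for both), and isCompleted is true for all three jobs, including the two that did not finish successfully.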

taras-i-fedyk · 2020-06-11T13:59:07Z

Since my posts above haven’t been fully addressed here, I’ve created two dedicated issues for those topics. I hope the discussion will be continued there.

Usually, await throws a wrong exception for representing a failure of a Deferred (e.g., of an async coroutine)

The API for determining Job states is a bit messy

If we don't do this, catching the exception isn't enough; it'll still bubble to its parent and kill everything. See Kotlin/kotlinx.coroutines#763 for an extended discussion of Kotlin misdesigns. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

qwwdfsad added the design label Oct 25, 2018

jcornaz mentioned this issue Oct 26, 2018

App exits after catching exception thrown by coroutine #753

Closed

gildor mentioned this issue Oct 29, 2018

New integration: AndroidX Lifecycle #760

Closed

Tolriq mentioned this issue Nov 2, 2018

Documentation improvement about async.await cancelling outer scope #787

Closed

elizarov mentioned this issue Dec 14, 2018

Throwing exception in channel sharing context with the consumer will also trigger uncaught exceptioon handler #891

Closed

elizarov mentioned this issue Feb 25, 2019

Structured concurrency for Completable/Listenable futures #1007

Closed

fvasco mentioned this issue Mar 26, 2019

Coroutines Handle Exception #1058

Closed

elizarov closed this as completed Apr 1, 2020

Munzey mentioned this issue Aug 4, 2020

Concurrent failing binds causes uncaught BindException michaelbull/kotlin-result#28

Closed

BenHenning mentioned this issue Aug 28, 2020

Allow coroutines to fail without destabilizing the app oppia/oppia-android#1741

Open
