Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crashes on effect shift when using Arrow with ktor client #2760

Closed
jrgonzalezg opened this issue Jul 12, 2022 · 23 comments
Closed

Crashes on effect shift when using Arrow with ktor client #2760

jrgonzalezg opened this issue Jul 12, 2022 · 23 comments

Comments

@jrgonzalezg
Copy link
Member

Hi!. I'm trying to migrate an Android app's network layer to the new Effect / effect APIs in Arrow but there seems to be some issue or incompatibility with ktor's HttpClient. The code works fine on the happy path, but as soon as it gets to a shift the app crashes and it does not even stop on catch blocks or fold error handling so it seems to be bypassing the error handling mechanism.

The function I'm trying to implement looks like this:

suspend inline fun <reified R, reified A> get(
    url: Url,
    acceptedErrorCodes: List<Int> = emptyList()
  ): Effect<BackendFailure<R>, A> =
    effect {
      try {
        val httpResponse: HttpResponse = getHttpResponse(url)

        val httpStatusCode: HttpStatusCode = httpResponse.status

        if (httpStatusCode.isSuccess() || httpStatusCode.value in acceptedErrorCodes) {
          when (val backendResponse: BackendResponse<R, A> = httpResponse.body()) {
            is BackendResponse.Failure -> shift(BackendFailure.Known(backendResponse.data))

            is BackendResponse.Success -> backendResponse.data
          }
        } else {
          shift(BackendFailure.Unknown(null))
        }
      } catch (throwable: Throwable) {
        shift(BackendFailure.Unknown(throwable.nonFatalOrThrow().message))
      }
    }

but same issue also happened with more minimal code without the try-catch and simplified error type. This happens both with arrow-stack 1.1.2 and 1.1.3-alpha.29. The stacktraces of the crashes look like this:

kotlinx.coroutines.CoroutinesInternalError: Fatal exception in coroutines machinery for AwaitContinuation(DispatchedContinuation[Dispatchers.Main.immediate, Continuation at io.ktor.client.engine.HttpClientEngine$DefaultImpls.executeWithinCallContext(HttpClientEngine.kt:100)@96aed1b]){Completed}@4a75eb8. Please read KDoc to 'handleFatalException' method and report this incident to maintainers
        at kotlinx.coroutines.DispatchedTask.handleFatalException(DispatchedTask.kt:144)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:115)
        at android.os.Handler.handleCallback(Handler.java:938)
        at android.os.Handler.dispatchMessage(Handler.java:99)
        at android.os.Looper.loopOnce(Looper.java:201)
        at android.os.Looper.loop(Looper.java:288)
        at android.app.ActivityThread.main(ActivityThread.java:7839)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1003)
        Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Active}@5e7e8cd, Dispatchers.Main.immediate]
    Caused by: ShiftCancellationException(Shifted Continuation)
        at com.example.screens.main.MainViewModel$load$1$invokeSuspend$$inlined$effect$1$2.shift(Effect.kt:776)
        at arrow.core.continuations.EffectScope$bind$2.invoke(EffectScope.kt:57)
        at arrow.core.continuations.EffectScope$bind$2.invoke(EffectScope.kt:57)
        at arrow.core.continuations.FoldContinuation$resumeWith$2$f$1.invokeSuspend(Effect.kt:715)
        at arrow.core.continuations.FoldContinuation$resumeWith$2$f$1.invoke(Unknown Source:8)
        at arrow.core.continuations.FoldContinuation$resumeWith$2$f$1.invoke(Unknown Source:2)
        at arrow.core.continuations.FoldContinuation.resumeWith(Effect.kt:716)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
        at io.ktor.util.pipeline.SuspendFunctionGun.resumeRootWith(SuspendFunctionGun.kt:138)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:112)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(SuspendFunctionGun.kt:14)
        at io.ktor.util.pipeline.SuspendFunctionGun$continuation$1.resumeWith(SuspendFunctionGun.kt:62)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
        at io.ktor.util.pipeline.SuspendFunctionGun.resumeRootWith(SuspendFunctionGun.kt:138)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:112)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(SuspendFunctionGun.kt:14)
        at io.ktor.util.pipeline.SuspendFunctionGun$continuation$1.resumeWith(SuspendFunctionGun.kt:62)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
        at io.ktor.util.pipeline.SuspendFunctionGun.resumeRootWith(SuspendFunctionGun.kt:138)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:112)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(SuspendFunctionGun.kt:14)
        at io.ktor.util.pipeline.SuspendFunctionGun$continuation$1.resumeWith(SuspendFunctionGun.kt:62)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
        at io.ktor.util.pipeline.SuspendFunctionGun.resumeRootWith(SuspendFunctionGun.kt:138)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:112)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(SuspendFunctionGun.kt:14)
        at io.ktor.util.pipeline.SuspendFunctionGun$continuation$1.resumeWith(SuspendFunctionGun.kt:62)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
        at io.ktor.util.pipeline.SuspendFunctionGun.resumeRootWith(SuspendFunctionGun.kt:138)
        at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:112)
        at io.ktor.util.pipeline.SuspendFunctionGun.access$loop(SuspendFunctionGun.kt:14)

So it seems like some kind of incompatibility maybe because both Arrow and ktor try to tap on the coroutine mechanisms. Apparently the ShiftCancellationException is somehow handled by ktor internal code and that in turn produces a crash before the exception is handled by Arrow code.

@nomisRev
Copy link
Member

Hey @jrgonzalezg,

This seems very weird 🤔 I am not sure this is being caused by Ktor, I cannot see any clear indication in the stacktrace at least. Those Android packages got me worried more :D It looks here was able to catch the shift bubbling through the code, which is a CancellationException that should never be caught.

Do you have a reproducible example of this code I could look at? Is there any surrounding code that could be capturing the CancellationException?

@jrgonzalezg
Copy link
Member Author

@nomisRev The Android code is only reached after the exception bubbles all the way up to the activity, so I don't think it plays a role on it. Shouldn't the ShiftCancellationException be captured at least by Arrow to be part of the effect?. Neither the catch on the try-catch in my sample code above nor any effect code or the final fold (error variant) catch it at all. My impression is that either Arrow somehow is unable to capture the ShiftCancellationException when it interacts with ktor code or that the code at the SuspendFunctionGun on ktor side is somehow jumping over the Arrow continuations and pushing it further up the stack from where it should be handled. Therefore, it then ends up reaching Android code and crashing the whole app as nothing catches it there. I will try and see if I can make a non-Android example, but I'm not used to non-app code 😅.

@nomisRev
Copy link
Member

Hey @jrgonzalezg,

Shouldn't the ShiftCancellationException be captured at least by Arrow to be part of the effect?.

That is definitely the case, but something is going wrong here. I just noticed you have suspend inline, there are some known issues with this (incorrect binary codegen). What Kotlin version are you on?

Could you try 2 things?

  • I noticed this code has suspend but returns Effect, so there should no reason to use suspend here. It can simply be inline + reified, since Effect models the suspend part already.

  • If that doesn't work, could you try removing inline, and thus also reify and test the code if it works like that? Manually providing the R::class and T:class? This might fix the incorrectly generated code, and then we know that it's related to one of the bugs in the Kotlin Compiler (which should be fixed soon).

@nomisRev
Copy link
Member

The reason I suspect this is because I don't see anything unexpected in the stack-trace.

Potentially ShiftCancellationException is escaping Effect because of the "conflicting" suspend modifier, and the Kotlin compiler generates the bytecode in such a way that the exception goes to suspend and not to Effect. This is behavior that we've already witnessed, and reported to YouTrack. Should be fixed in 1.7.20 IIRC.

@jrgonzalezg
Copy link
Member Author

@nomisRev Neither removing suspend nor inline / reified seemed to work. But I have also tried to reproduce it by adding some minimal code on the console application template made by IntelliJ and it does not seem to have the issue on that one. I'm using Kotlin 1.7.0 in all cases, since I can only use that exact version on the app code due to being blocked by Jetpack Compose restrictions. But I don't understand how the Android side of things may be causing this though. I'm simply calling the effect on a viewModelScope.launch { } block, there is nothing special at all, it is kinda like the most standard way of launching coroutines from apps using MVVM. Maybe the dependencies are being resolved to something slightly different on app and console codes and that could be the cause, I will try and look at the dependencies output on both to see if I can spot any difference.

@jrgonzalezg
Copy link
Member Author

I have spent a bit of time trying to figure out the differences between the variants that work and the ones that doesn't and it seems it can be minimized to this fairly minimal code:

val httpClient = HttpClient()

suspend fun main() {
    effect {
        get(url = Url("https://jsonplaceholder.typicode.com/todos/1")).bind()
    }
        .fold(
            recover = { it.toString() },
            transform = { it.toString() },
        ).let {
            println("Result was: $it")
        }

        exitProcess(0)
}

fun get(
    url: Url, acceptedErrorCodes: List<Int> = emptyList()
): Effect<BackendFailure, HttpResponse> = effect {
    val httpResponse: HttpResponse = getHttpResponse(url)

    val httpStatusCode: HttpStatusCode = httpResponse.status

    if (httpStatusCode.isSuccess() || httpStatusCode.value in acceptedErrorCodes) {
        httpResponse
    } else {
        shift(BackendFailure(null))
    }
}

suspend fun getHttpResponse(url: Url): HttpResponse = httpClient.get(url)

data class BackendFailure(val error: String?)

That fails both on console application and Android application. The inline / reified and so on don't seem to play a role after all so I removed it. That code on Android side crashes with the exception while on console application it just hangs forever but does not crash.

If instead I replace the:

    effect {
        get(url = Url("https://jsonplaceholder.typicode.com/todos/1")).bind()
    }
        .fold...

with

    get(url = Url("https://jsonplaceholder.typicode.com/todos/1")).fold...

it does work. So somehow it seems that the issue is related to trying to combine effects inside an enclosing effect { } block. When the enclosing block is added and a shift occurs it crashes on Android / hangs forever on console. When the enclosing effect {} block is removed it works both on Android and console application both on success and on failure (shift). Does that help? Isn't it possible to bind other effects inside an effect block? How it should be done instead if that is not the right way?

@jrgonzalezg
Copy link
Member Author

Another way that does work is this:

    effect {
        get(url = Url("https://jsonplaceholder.typicode.com/todos/1")).toEither().bind()
    }.fold...

so maybe the issue is the Effect<R, B>.bind() at EffectScope?

@nomisRev
Copy link
Member

nomisRev commented Jul 13, 2022

Oh... I've seen this before, but I thought it was related to context receivers at the time.

If .toEither().bind() works that is still a strong indication that something is going wrong during code generation.
The reason it hangs on cli is probably due to a suspension point not being released, and thus it keeps the main thread occupied.

Using fun main(): Unit = runBlocking { you might get the same result as on Android,
since suspend fun main() doesn't have any of the KotlinX Coroutines functionality installed you won't see kotlinx.coroutines.CoroutinesInternalError.

I am going to try these examples for latest versions of Arrow, Kotlin and KotlinX since we should try to pin-point the exact issue so we can fix it in Arrow, or create a YouTrack issue for the compiler.

@jrgonzalezg
Copy link
Member Author

Yes, you are right, using runBlocking the console app also crashes with the same exception as the Android app. BTW, all my tries above where done with both arrow-stack 1.1.2 and arrow-stack 1.1.3-alpha.29 which behave identically in case that helps.

@nomisRev
Copy link
Member

@jrgonzalezg another similar bug has been reported, and I shared some findings here.
#2764 (comment)

It's indeed going wrong inside of the compiler, since toEither.bind() results in exactly the same at runtime,
and fold({ shift(it) }, { it }) works fine as well.

You can use toEither().bind() or fold({ shift(it) }, { it }) as a current workaround.
I'll report back asap with a fix 🤞

@jrgonzalezg
Copy link
Member Author

jrgonzalezg commented Jul 14, 2022

Thanks @nomisRev! But note that in my examples only toEither().bind() works as a workaround. Using fold({ shift(it) }, { it }) is something I tried before toEither and it still leads to the crashes on the ktor example cases.

@nomisRev
Copy link
Member

That is very strange 🤔

I've just changed the implementation of bind to fold({ shift(it }, { it }) and it fixes the issue for me locally. Let me try your example as well to see if it works for me. Then we can raise a PR and you can test the alpha to verify the fix.

@nomisRev
Copy link
Member

I have reduced the issue further down to the following, so it's not related to Ktor.
Ktor however most likely uses a dispatcher like IO for scheduling the get request.

suspend fun main() {
  effect {
    failure().bind()
  }.fold(
    recover = { println("Recovered $it") },
    transform = { println("Result $it") },
  )
}

fun failure(): Effect<Unit, HttpResponse> = effect {
  // Different dispatcher then where the code is running
  withContext(Dispatchers.IO) {}
  shift(Unit)
}

@jrgonzalezg
Copy link
Member Author

Nice simplification!, that one behaves just like the code I had albeit being much simpler. It also does not work with the bind() changed to .fold({ shift(it) }, { it }) but only if changed to toEither().bind().

@nomisRev
Copy link
Member

I found a fix, you can check out the PR here. #2765
I tested your Ktor scenario as well, and it also fixes the issue there.

@jrgonzalezg
Copy link
Member Author

Nice!. I will try it once a new alpha is out 🎉

nomisRev added a commit that referenced this issue Jul 16, 2022
@nomisRev
Copy link
Member

@jrgonzalezg if you have some time to test the fix in version 1.1.3-alpha.31 that would be great! 🙏

@jrgonzalezg
Copy link
Member Author

@nomisRev Yes, it seems to be working now 🎉 . Closing this one.

@nomisRev
Copy link
Member

nomisRev commented Aug 3, 2022

@jrgonzalezg the actual issue was not fixed before, only the symptom...

The actual fix is available in 1.1.3-alpha.38
If you have some time could you check it, and report back please? 🙏

@jrgonzalezg
Copy link
Member Author

Thanks!, I knew there were still issues as I mentioned on some comments I made on #2769 but I had not yet been able to reduce it to a minimal failing test like #2779. I will try alpha 38 (maybe better 39?) and see if it works in more cases. For instance I had crashes on a tapLeft that I also had to rewrite in a way that did not cause coroutine exceptions. I will try the original implementation I did for that too and see if it works now. I will report back once I have some insights.

@nomisRev
Copy link
Member

nomisRev commented Aug 3, 2022

Awesome, thanks @jrgonzalezg! Any version after alpha.38 includes the fix.

@jrgonzalezg
Copy link
Member Author

jrgonzalezg commented Aug 3, 2022

@nomisRev Both the original code where I was using Arrow with the Ktor client and my initial tapLeft implementation (a simple fold+shift which also crashed when mixed with Ktor) work now. I had some additional (different) code failing that sadly I did not kept a copy to test against... But the code I still have plus some additional code that I could recreate from memory all seem to work now 🎉 .

@nomisRev
Copy link
Member

nomisRev commented Aug 3, 2022

Awesome, thank you so much for taking the time to test this and reporting back @jrgonzalezg!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants