Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fold restriction not allowing any Function<*> to be returned is too strict #3391

Closed
Zordid opened this issue Mar 4, 2024 · 20 comments
Closed

Comments

@Zordid
Copy link
Contributor

Zordid commented Mar 4, 2024

Hi there!
Coming from Arrow 1.2.1, I was quite surprised to get some exceptions in my unit tests like this:

java.lang.IllegalStateException: Returning a lazy computation or closure from 'fold' breaks the context scope, and may lead to 
leaked exceptions on later execution.
Make sure all calls to 'raise' and 'bind' occur within the lifecycle of nullable { }, either { } or similar builders.

In my case, my code creates a "Validator", which happens to be a just a function signature.
I do not leak anything from within, I simply need error detection when creating my independent validator.

I think I understand what you want to prevent, but simply disallowing return values to be a Function<*> is too much.
I will now rework my interface to not be a functional interface, but a normal one - just to circumvent this error.

This worked fine with 1.2.1:

import arrow.core.Either
import arrow.core.getOrElse
import arrow.core.raise.Raise
import arrow.core.raise.either
import arrow.core.raise.ensure

typealias NumValidator = (Int) -> Either<String, Int>

context(Raise<String>)
fun createLessThanNumValidator(lt: Int): NumValidator {
    ensure(lt > 0) { "lt must be a positive number" }
    return { i ->
        either {
            ensure(i < lt) { "$i is not less than $lt" }
            i
        }
    }
}

fun main() {
    val lessThanThree = either { createLessThanNumValidator(3) }.getOrElse { error(it) }

    for (i in 1..10) {
        println(lessThanThree(i))
    }
}

The creation of a validator could have a logical error if the given threshold is not greater than 0. If it is, you get a proper validator that validates any int agains the threshold.
There is nothing wrong with the code, nothing can and will ever leak. It is just a double-stacked validation.

Outputs:

Either.Right(1)
Either.Right(2)
Either.Left(3 is not less than 3)
Either.Left(4 is not less than 3)
Either.Left(5 is not less than 3)
Either.Left(6 is not less than 3)
Either.Left(7 is not less than 3)
Either.Left(8 is not less than 3)
Either.Left(9 is not less than 3)
Either.Left(10 is not less than 3)

Going from Arrow 1.2.1 to 1.2.3 makes this code throw the error on the first line of main - in a very bad way, as the error is a runtime error and not a compile time error. I would strongly advise not to create runtime errors...

@Zordid
Copy link
Contributor Author

Zordid commented Mar 4, 2024

@nomisRev just adding you to see this issue quickly... ;-)

@Zordid
Copy link
Contributor Author

Zordid commented Mar 4, 2024

To make the same code work with Arrow 1.2.3 it is necessary to give up the typealias for the validator and create your own interface, which must not inherit the functional type, but can only mimic the old interface by an operator fun invoke like this:

fun interface NumValidator {
    operator fun invoke(int: Int): Either<String, Int>
}

context(Raise<String>)
fun createLessThanNumValidator(lt: Int): NumValidator {
    ensure(lt > 0) { "lt must be a positive number" }
    return NumValidator { i ->
        either {
            ensure(i < lt) { "$i is not less than $lt" }
            i
        }
    }
}

This will circumvent the error thrown by either (a.k.a. fold) while doing the exact same thing.

@kyay10
Copy link
Collaborator

kyay10 commented Mar 4, 2024

Use foldUnsafe instead, which bypasses those checks. Maybe a comment should be added about it in the docs

@Zordid
Copy link
Contributor Author

Zordid commented Mar 4, 2024

Hm. I really do not like to use functions called ...unsafe at all. It is like using !! everywhere, basically stating "I know what I am doing"... :(
But yes, this hint needs to be added not only to the docs, but as well to the error thrown! It was really, really baffling to me.

The problem with the error is:
I am doing nothing wrong in my code. Not breaking any boundaries at all. Just using the tools from Arrow in a correct way. If it weren't for my unit tests that run this code, I could have easily been deploying a new version of our product without catching this problem until it's too late. That's why I am so hard against introducing runtime exceptions that feel more like they should be compile time checks.

@Zordid
Copy link
Contributor Author

Zordid commented Mar 4, 2024

Another catch people will stumble upon is: fold is used as a basis in many other higher level functions, like recover in my case. It even took me a while to realize what is going on here, as I did not do any calls to fold at all. So, the exception thrown here inside fold is very user unfriendly...

@Zordid
Copy link
Contributor Author

Zordid commented Mar 4, 2024

...just realizing, in my case, as I am using recover and not fold itself, I would need to call an recoverUnsafe, right? Does that exist...? ;-)

@Zordid
Copy link
Contributor Author

Zordid commented Mar 4, 2024

I played around with it and I do understand what you wanted to achieve with this error. Alas, I am not convinced this is the right way to go forward, as this change breaks compatibility for users like me in an unintended way, whereas the old error was clearly an error provoked by abuse of the nested raises.

This code failed at runtime with Arrow 1.2.1 - and it should. The inner function returned has nothing to do with the outer scope at its creation time and as such must not call raise.

fun main() {
    val i = 5
    val r = recover({
        ensure(i < 10) { "not less than 10" };
        { i: Int ->
            if (i == 3) raise("BOOM!")
            i * 2
        }
    }) { error(it) }
    println(r(3))
}

In Arrow 1.2.3 this correctly throws an error - sadly at runtime, which is not good enough as this is a static problem. Because of today's limitations you cannot tackle this correctly at compile time, and as such the addition to disallow any functional or closure type as a return type was added. But this is, IMHO, not a good idea, it even looks awkward as you have to specify explicitly some types you had thought of today:
if (it is Function<*> || it is Lazy<*> || it is Sequence<*>)
But a) is this all? and b) will you add more types that potentially could leak the Raise context?

I would argue this change is a little too far fetching and does not help as abuse has also been detected at runtime before, so there is no difference at all - other than punishing the correct use of fold and all derived methods.

@kyay10
Copy link
Collaborator

kyay10 commented Mar 4, 2024

I see your point! Would you be satisfied if it was changed to an assert call so that it wouldn't trigger in production? Also, I think there could be some way to make this error static by using @OverloadResolutionByLambdaReturnType, but that'll need further investigation. In fact, with this type of solution, we can have it marked with an OptIn annotation so that you have a simple "I know what I'm doing" switch off

@Zordid
Copy link
Contributor Author

Zordid commented Mar 4, 2024

If there's any way to make it more subtle, that would be a good way. I know it's a very tricky situation to work around, as basically you need the compiler to "know" not to make the raise context part of the closure created in the returned function. If that can be done with the said annotation, fine. Otherwise, I personally would go back to 1.2.1 behaviour. If a user really does that, it breaks, but it should. Then, telling in the exception what was done wrong is best you can do.

@kageru
Copy link

kageru commented Mar 8, 2024

Would you be satisfied if it was changed to an assert call so that it wouldn't trigger in production?

Please don’t do that. You’re still replacing a system that threw runtime exceptions when used improperly with a system that throws runtime exceptions even when used properly, just that this alternative will “only” break unit tests. I agree with Zordid. If you can make it work correctly and at compile time, that’s good, but if that’s not possible, I’d rather have no checks than bad checks.

And in the future, I’d appreciate you not making breaking changes that produce runtime exceptions in patch releases. 😅

@nomisRev
Copy link
Member

nomisRev commented Mar 8, 2024

And in the future, I’d appreciate you not making breaking changes that produce runtime exceptions in patch releases. 😅

@kageru I'm so sorry about that 😓 😭That was never our intention!
This problem has kept me up at night for more problems than I care to admit in the last 2 years 😅

I think part of our rationale was here that we imagined almost no1 doing this, and we were afraid people were having subtle bugs right now. Clearly some of you really now what you're doing here 😁
Since for example in @Zordid example, it was perfectly valid. Since he nested a secondary either inside his Function. The problem actually lies in raise being leaked, but we have another "fix" for that.

So this is what is actually problematic:

val x = either<String, () -> Int> {
   { raise("Hello") }
}

x.getOrElse { fail("Boom!") }
  .invoke() // RaiseLeakedException

That's happening here:


So it's already kind-of covered, but we were still wondering if we should be more strict. Which resulted in this change in the patch, we incorrectly figured all cases of this would been subtle bugs (the thought of introducing subtle/hard to find bugs keeps me up at night 😓).

So I guess what I am looking for is feedback, it seems it's hindering at least a couple of people and perhaps we can/should figure out a better default for 2.0.0.

cc\ @wfhartford you also raised this topic on Slack so including you here 😉

@Zordid
Copy link
Contributor Author

Zordid commented Mar 8, 2024

I second @kageru's opinion. Don't exchange a system that breaks when not used correctly with a system that breaks when used correctly. ;-) I now had to work my way around a correct solution by either either replacing the use of the recover function which internally uses fold with my own handcrafted recoverUnsafe which uses foldUnsafe instead or (this is what I did) not using a functional type at all, but my own interface that does is not considered "unsafe" by fold, but in effect could equally be doing the wrong thing which you cannot catch by the explicit check against Function, Lazy or Sequence at all.

So, this should be enough to not leave the if check in fold. Cause it does not do what it should do in too many cases and punishes the correct use of the tools.

@nomisRev
Copy link
Member

nomisRev commented Mar 8, 2024

second @kageru's opinion. Don't exchange a system that breaks when not used correctly with a system that breaks when used correctly. ;-)

I totally agree. Shouldn't have happened.

So, this should be enough to not leave the if check in fold. Cause it does not do what it should do in too many cases and punishes the correct use of the tools.

Very good point. I'm open to revert it, and publish another patch or do it directly in 2.x.x.

@wfhartford
Copy link

Thanks for copying me here Simon. I don't have much to add that hasn't been brought up by others. My use case was validating user input to produce an instance of a sealed class extending Predicate (typealias Predicate<T> = (T) -> Boolean). I don't know what the correct solution here is, but I don't think that an exception at runtime is it. Would it be possible to use something like detekt to identify the problem more precisely?

@nomisRev
Copy link
Member

Would it be possible to use something like detekt to identify the problem more precisely?

I think so, but only using a compiler plugin I am afraid. So I guess we cannot count that as a solution.

@kyay10 experimented with special overloading syntax, to detect when a DSL returns Function1 as value but as discussed in this issue it doesn't really solve the problem for other lazy computation types. So I am afraid that both the overload solution, and runtime solution is not a good option...

@Zordid
Copy link
Contributor Author

Zordid commented Mar 13, 2024

Has this really been an issue for anyone? I mean, yes, it breaks at runtime when you use the Raise context during the computation at a later time - but hey, yes, that hurts. If we cannot tell this at compile time though, we should not make good code break to protect people who did not understand.
I would revert to what we had.

@nomisRev
Copy link
Member

I agree @Zordid. I am okay with reverting. Does anyone need a patch on the latest version, or are you okay with using a workaround/older version until 2.0(-RC) lands?

Any other thoughts @kyay10 @serras

@serras
Copy link
Member

serras commented Mar 13, 2024

This thread definitely gives enough reasons to revert the change. I think we should do a 1.2.4 bug fix release, given that we haven’t completely moved main to 2.0 yet.

@nomisRev
Copy link
Member

Yes, I was thinking the same @serras 👍

If we want to switch 2.0 to main we can, we can just make a new branch from the latest commit before doing that and release 1.2.4 from there. So doesn't have to be blocking, but let's plan some dates anyway!

@serras
Copy link
Member

serras commented Apr 3, 2024

This problem is fixed as per this commit. Version 1.2.4 is on its way to Maven Central.

@serras serras closed this as completed Apr 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants