Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

restrict traverse_ and friends to require Unit #4352

Closed
wants to merge 6 commits into from

Conversation

jpassaro
Copy link

@jpassaro jpassaro commented Nov 18, 2022

When using traverse_ or sequence_ to evaluate some applicative effect G[B] within the context of a foldable structure, any B value is thrown away. Per the scaladoc for traverse_, these functions expect that the G[B] is primarily a side effect or action. The scala convention for expressing this expectation is to require a Unit parameter.

Teach traverse_, sequence_, and their parallel and nonempty analogues to follow this convention by accepting only G[Unit] instead of simply any G[B].

Furthermore, update the implementations of traverse_ and nonEmptyTraverse to use foldMapA and reduceMapA respectively. Since requiring Unit gives us a (trivial) Monoid for free, it turns out that in the presence of such a monoid, the ***MapA functions are doing exactly the same thing as the existing implementations of traverse_ and nonEmptyTraverse_.

@jpassaro
Copy link
Author

by the way -- the PR template asked me to run sbt +prePR. Doing so my GC got pushed to its limit. I thought this would be controlled via .sbtopts (i.e. in source-controlled code) but if there's something I need to do on my end to make this work, please let me know.

I also think @djspiewak had some insight about a potentially better way to accomplish this.

@djspiewak
Copy link
Member

So I have a more controversial idea: what if we just change traverse_ to do this? It would be entirely binary compatible but source incompatible, but we do that all the time in minor releases.

@johnynek
Copy link
Contributor

I do think it would be better for these G[Unit] returning function should accept functions that return G[Unit], so it is more clear that there is no result in the G[_].

Also, it composes nicer: traverse_: (A => G[Unit]) => (F[A] => G[Unit])

that said... it is also just foldMap with the Monoid coming from Applicative.unit and Applicative.productR(_, _)

/**
* Variant on travarse_ which enforces the "action or effect" expectation.
*/
def traverseEach[G[_], A](fa: F[A])(f: A => G[Unit])(implicit G: Applicative[G]): G[Unit] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't love these aliases where two functions are actually always exactly the same.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hence why I think we should just replace traverse_ in place, since it's binary compatible.

@djspiewak
Copy link
Member

that said... it is also just foldMap with the Monoid coming from

If you happen to have the free Monoid in scope for any Applicative, sure. :-) But a lot of applicatives don't expose that implicitly.

When using `traverse_` or `sequence_` to evaluate some applicative
effect `G[B]` within the context of a foldable structure, any `B` value
is thrown away. Per the scaladoc for `traverse_`, these functions expect
that the `G[B]` is primarily a side effect or action.  The scala
convention for expressing this expectation is to require a `Unit`
parameter.

Teach `traverse_`, `sequence_`, and their parallel and nonempty
analogues to follow this convention by accepting only `G[Unit]` instead
of simply any `G[B]`.

Furthermore, update the implementations of `traverse_` and
`nonEmptyTraverse` to use `foldMapA` and `reduceMapA` respectively.
Since requiring Unit gives us a (trivial) `Monoid` for free, it turns
out that in the presence of such a monoid, the `***MapA` functions are
doing exactly the same thing as the existing implementations of
`traverse_` and `nonEmptyTraverse_`.
@jpassaro jpassaro changed the title implement traverse_ variant which requires G[Unit] restrict traverse_ and friends to require Unit Nov 20, 2022
@jpassaro
Copy link
Author

I'm convinced and have updated the branch. It's a bigger change now of course but it did pass +prePR (after boosting memory in .jvmopts considerably!).

To the point about foldMap(), I realized that even closer is foldMapA, whose implementation is nearly identical to the existing implementation of traverse_ in light of the Monoid on Unit. I went ahead and replaced the original implementation with foldMapA for that reason.

Because foldMapA recommends using foldMapM if possible for perf reasons, I wonder if the same thing makes sense here -- can or should cats add traverseEachM which does the same thing but requires Monad instead of Applicative?

def traverse_[G[_], A, B](fa: F[A])(f: A => G[B])(implicit G: Applicative[G]): G[Unit] =
foldRight(fa, Always(G.pure(()))) { (a, acc) =>
G.map2Eval(f(a), acc) { (_, _) =>
()
}
}.value
def traverse_[G[_], A](fa: F[A])(f: A => G[Unit])(implicit G: Applicative[G]): G[Unit] =
foldMapA(fa)(f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid this change is not compatible. Although we can change the signature, we need to keep the current implementation. See the test I added in 41439fc.

sbt:root> binCompatTest/test
catsBC.MimaExceptionsTest:
==> X catsBC.MimaExceptionsTest.is binary compatible  0.566s java.lang.ClassCastException: class java.lang.String cannot be cast to class scala.runtime.BoxedUnit (java.lang.String is in module java.base of loader 'bootstrap'; scala.runtime.BoxedUnit is in unnamed module of loader sbt.internal.ScalaLibraryClassLoader @15aec8f1)
    at cats.instances.EitherInstances$$anon$2.$anonfun$map2Eval$2(either.scala:95)
    at scala.util.Either.map(Either.scala:382)
    at cats.instances.EitherInstances$$anon$2.$anonfun$map2Eval$1(either.scala:95)
    at cats.Eval.$anonfun$map$1(Eval.scala:78)
    at cats.Eval$.loop$1(Eval.scala:379)
    at cats.Eval$.cats$Eval$$evaluate(Eval.scala:384)
    at cats.Eval$FlatMap.value(Eval.scala:305)
    at cats.instances.ListInstances$$anon$1.traverse_(list.scala:163)
    at cats.instances.ListInstances$$anon$1.traverse_(list.scala:37)
    at cats.Foldable$Ops.traverse_(Foldable.scala:1050)
    at cats.Foldable$Ops.traverse_$(Foldable.scala:1049)
    at cats.Foldable$ToFoldableOps$$anon$6.traverse_(Foldable.scala:1075)
    at catsBC.MimaExceptions$.isBinaryCompatible(MimaExceptions.scala:58)
    at catsBC.MimaExceptionsTest.$anonfun$new$1(MimaExceptionsTest.scala:28)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or possibly, this implementation could work:

foldMapA(fa)(a => G.void(f(a)))

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yikes -- neither G.void nor even the original implementation seems to make this go away, at least on my local machine. Does that mean the change can't be made with binary compatibility at all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn, really? I thought it should be possible, if done carefully. But maybe not.

Also, maybe it's not worth the risk since clearly this is not easy to reason about.

Copy link
Author

@jpassaro jpassaro Nov 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed my branch with (credited) breaking test, and attempt to fix it by restoring original implementations. It didn't work. Wondering if:

a) there is some path to binary compatibility that I don't understand;
b) is it better to add new functions (traverseEach, sequenceEach, maybe traverseEachM for monads) and possibly deprecate the unrestricted variants (traverse_ and sequence_)?

happy to proceed however y'all think best. many thanks @armanbilge for spotting this issue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw: what is the motivation for making this change?

foldMapA is relatively rarely used, I think. While we have worked out many stack overflow issues around traverse and sequence. Touching these implementations has an excellent chance of introducing stack overflow errors for users I think.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the motivation was that the implementation is identical so why not reduce repetition; but since it broke binary compatibility (see Arman's added MIMA test, now added to the PR) I reverted it.

Comment on lines 47 to 53
ArbFM: Arbitrary[F[M]],
ArbXM: Arbitrary[X[M]],
ArbYM: Arbitrary[Y[M]],
ArbFGA: Arbitrary[F[G[A]]],
ArbGU: Arbitrary[G[Unit]],
ArbFGU: Arbitrary[F[G[Unit]]],
ArbFXM: Arbitrary[F[X[M]]],
ArbGB: Arbitrary[G[B]],
ArbGM: Arbitrary[G[M]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes will also cause compatibility problems.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed the correct answer here would be to leave the prior implicit params alone and just add mine at the end of arglist. the mima check is still complaining in 2.12 and I'm not sure how to resolve it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately we cannot change these signatures at all: no changes, no additions, no removals. It's not specific to Scala 2.12, that one probably just happened to fail first.

See #4324 (comment) for a possible strategy.

Comment on lines 38 to 39
ArbFGA: Arbitrary[F[G[A]]],
ArbGB: Arbitrary[G[B]],
ArbFGU: Arbitrary[F[G[Unit]]],
ArbGU: Arbitrary[G[Unit]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Co-authored-by: Arman Bilge <armanbilge@gmail.com>
Comment on lines +50 to +51
def sequence_(implicit F: Foldable[F], G: Applicative[G], ev: A <:< Unit): G[Unit] =
F.sequence_(fga.asInstanceOf[F[G[Unit]]])
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one is also causing binary compatibility issues, and in this case I don't see how this function can still exist in its current form (i.e. without said evidence) -- I guess it would have to move or something and that would for sure break binary compatibility? not sure if there is a fix here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we cannot change the signature of this method. What we can do is add a new ops class:

final class NestedUnitFoldableOps[F[_], G[_]](private val fgu: F[G[Unit]])

Copy link
Author

@jpassaro jpassaro Nov 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this one has to change then to F.traverse_(fga)(G.void(_))?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems so.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That still allows people to call .sequence on non-unit structures -- I would suggest deprecating it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though even that seems so knotty I'm not sure it's worthwhile.

honestly, this is my feeling about this PR as a whole. It's obvious at the moment we don't even have existing test coverage to safely make this change.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we can't change signatures, what about adding new methods with an appropriate signature and deprecating the old ones?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall I did something similar – i.e. was tuning a method with a very subtle change in its signature while preserving its name: #3997. Not exactly the same though, but there are some similarities apparent.

if we can't change signatures, what about adding new methods with an appropriate signature and deprecating the old ones?

Personally, I don't think it is a good idea, especially because the old methods do not have such bugs that would hinder their usage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@armanbilge

It's obvious at the moment we don't even have existing test coverage to safely make this change.

Do you mean, there's no tests for traverse_ nor sequence_ or do you mean some specific test scenarios?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

val gb = f(a)
G.void(gb)
val fb = f(a)
G.void(fb)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need this now. fb: G[Unit] so the void is a no-op.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We absolutely do, or the change is not compatible. See #4352 (comment).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow you. My claim is that f(a): G[Unit] already. While G.void(f(a)) is still G[Unit], it isn't needed and is likely to be wasteful.

In fact, unconditionally calling void internal to a method is a code smell that should have alerted us the type was poor to begin with.

That said, there could be ergonomic reasons to not have the users required to call this void.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow you. My claim is that f(a): G[Unit] already

Except, it's not really :) sure, that's what the current signature claims, and a signature of G[Unit] is binary-compatible with a signature of G[A] for an unbounded type parameter A (due to type erasure).

But Cats still has to retain compatibility with calling code that was passing e.g. a G[String] there. Otherwise we will get class cast exceptions, exactly as demonstrated in #4352 (comment).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnynek just to spell it out: once the MIMA test was added, returning f(a) without void caused it to break.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be pedantic :) actually, that's not a MiMa test. MiMa compares the binary-compatibility of signatures. MiMa can only detect linking errors but not other errors such as class casting exceptions. In this case there is no linking error, because due to type erasure the method signature is the same.

What I added was a runtime-compatibility test. Cats has an internal test project, that is compiled against Cats 2.0.0 but run against the latest development version of Cats. It shows that code compiled with an old version of Cats, that is not passing G[Unit] to these methods will encounter a class cast exception if we rewrite the code here to assume we are indeed getting a G[Unit].

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for the explanation!

I'd love to understand the team's thoughts and @armanbilge 's in particular on:
a) what's your evaluation of the problem? to be precise: is there value, if we can do it in a binary-compatible way, in alerting users to the code-smell of calling traverse_ and friends with non-Unit values?
b) what do you think of the approach to that problem of adding aliases with the correct signature and deprecating the original methods?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is value. See this issue I opened proposing a lint for flatTap, which currently suffers from a similar problem.

The linting route offers a safe way to opt-in to this change.

However, there are different perspectives. Quoting @SystemFw from the Discord discussion linked in that issue.

this discussion has happened a few times
it boils down to safety vs boilerplate, because it applies also to >> <*, etc
I wouldn't say that flatTap having Unit makes it more safe in practice, because the purpose of flatTap is to throw the B away

Copy link
Contributor

@SystemFw SystemFw Nov 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I can make the same exact point here. The whole point of traverse_ is the avoid the boilerplate of calling void, so if I have to do list.traverse_(a => f(a).void), why don't I just do list.traverse(f).void? (And if the answer is, to avoid allocating the list, if you really care about that there is still foldMapA)

That being said, I don't feel strongly about this change :)

val gb = f(a)
G.void(gb)
val fb = f(a)
G.void(fb)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment, this void is not needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants