[SPARK-46939][CORE] Remove isClosureCandidate check from getSerializationProxy function in IndylambdaScalaClosures #44977

Closed
LuciferYang wants to merge 2 commits into apache:master from LuciferYang:get-serialization-proxy

@LuciferYang
Contributor

What changes were proposed in this pull request?

This PR simplifies the getSerializationProxy function in IndylambdaScalaClosures. The main change is to remove the isClosureCandidate check when determining whether a class is a closure, because Apache Spark 4.0 no longer supports Scala 2.11.

Why are the changes needed?

The reason for removing the isClosureCandidate check is that in Scala 2.12 and later versions, closures are implemented as synthetic classes generated by the Java LambdaMetafactory. These synthetic classes do not need to implement the scala.Function interface, which is what the isClosureCandidate check was originally checking for. Therefore, the isClosureCandidate check is not necessary for identifying closures in Scala 2.12 and later versions.
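
The mechanism described above can be observed directly. The following is a minimal sketch, assuming Scala 2.12 or later on a standard JVM; `LambdaProxyDemo` and `serializationProxy` are hypothetical names used only for illustration and are not Spark's implementation. It checks that a Scala lambda's runtime class is synthetic and serializable, then reads its `java.lang.invoke.SerializedLambda` through the private `writeReplace` method that serializable lambda classes carry.

```scala
import java.lang.invoke.SerializedLambda

// Hypothetical illustration, not Spark's code.
object LambdaProxyDemo {

  // Returns the SerializedLambda behind `obj` if it looks like a
  // compiler-generated serializable lambda, otherwise None.
  def serializationProxy(obj: AnyRef): Option[SerializedLambda] = {
    val cls = obj.getClass
    if (!cls.isSynthetic || !obj.isInstanceOf[java.io.Serializable]) {
      None
    } else {
      try {
        // Serializable lambda classes expose a private writeReplace method.
        val m = cls.getDeclaredMethod("writeReplace")
        m.setAccessible(true)
        m.invoke(obj) match {
          case sl: SerializedLambda => Some(sl)
          case _                    => None
        }
      } catch {
        case _: NoSuchMethodException => None
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val f: Int => Int = x => x + 1
    // Prints the compiler-chosen implementation method name, if any.
    println(serializationProxy(f).map(_.getImplMethodName))
    // A regular object is not synthetic, so no proxy is found.
    println(serializationProxy("plain string"))
  }
}
```

Note that passing the synthetic and `Serializable` checks does not by itself prove that an object is a Scala lambda, which is the concern raised later in this conversation.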

Does this PR introduce any user-facing change?

No

How was this patch tested?

Pass GitHub Actions

Was this patch authored or co-authored using generative AI tooling?

No

@LuciferYang LuciferYang marked this pull request as draft February 1, 2024 06:27
  case c if !c.isSynthetic || !maybeClosure.isInstanceOf[Serializable] => None

- case c if isClosureCandidate(c) =>
+ case _ =>
Contributor Author

cc @rednaxelafx May I ask if this modification attempt is correct? Thanks

@LuciferYang LuciferYang changed the title from "[SPARK-46939][CORE] Simplify IndylambdaScalaClosures#getSerializationProxy" to "[SPARK-46939][CORE] Remove isClosureCandidate check from getSerializationProxy function in IndylambdaScalaClosures" Feb 1, 2024
@rednaxelafx
Contributor

I'm neutral on lifting the isClosureCandidate restriction (as I already mentioned in the comment, it was expected that someone would lift it in the future ;-) ). What I had in mind at the time was something like "what if we need to also clean Java lambdas?", which only matters if we add JShell support to Spark.
All I'd ask is to implement it right instead of doing it half-baked.

My original intent with the isClosureCandidate check was to use a "quick" check to reduce the number of candidates flowing into the later parts. It had nothing to do with Scala 2.11 compatibility/support (so please correct that in your PR description). I had only been developing and testing with Scala 2.12's simple lambda cases, and I knew that the ones I was testing with all implemented the scala.FunctionN traits. So, being conservative and limiting the support to the cases I had tested, I added isClosureCandidate and a few other restrictions (like the $anonfun$ check that immediately followed, which again isn't a perfect filter, but one that likely filters out regular named functions rather than lambdas).
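
A rough sketch of the kind of quick filters described here, assuming Scala 2.12 or later. The names `ClosureFilterSketch`, `allInterfaces`, `looksLikeScalaFunction`, and `looksLikeAnonFun` are hypothetical and chosen for illustration; this is not Spark's actual code. The first predicate asks whether a class implements any `scala.FunctionN` trait, directly or through a superinterface; the second asks whether an implementation method name contains the `$anonfun$` marker.

```scala
// Hypothetical illustration of conservative "quick" filters; not Spark's code.
object ClosureFilterSketch {

  // Collects every interface implemented by `cls`, including inherited ones.
  def allInterfaces(cls: Class[_]): List[Class[_]] =
    if (cls == null) Nil
    else {
      val direct: List[Class[_]] = cls.getInterfaces.toList
      direct ++ direct.flatMap(i => allInterfaces(i)) ++ allInterfaces(cls.getSuperclass)
    }

  // Quick check: does the class implement some scala.FunctionN trait?
  def looksLikeScalaFunction(cls: Class[_]): Boolean =
    allInterfaces(cls).exists(_.getName.startsWith("scala.Function"))

  // Quick check: does the method name carry the anonymous-function marker?
  // Imperfect by design: a hand-written method could use the same marker.
  def looksLikeAnonFun(implMethodName: String): Boolean =
    implMethodName.contains("$anonfun$")

  def main(args: Array[String]): Unit = {
    val f: Int => Int = x => x + 1
    println(looksLikeScalaFunction(f.getClass))
    println(looksLikeScalaFunction(classOf[String]))
    println(looksLikeAnonFun("compute"))
  }
}
```

Both predicates are cheap heuristics: they narrow the set of candidates but cannot prove that a class was generated by the compiler.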

The special thing about language-level lambdas is that you can have higher confidence in that the class is generated by the compiler (or at least the recipe of the class comes from the compiler), so they'll follow certain patterns and won't have surprises with fields that are hand-added.

It certainly is possible to abuse the imperfect checks/filters I had put in place and trick the cleaner into trusting a non-compiler-generated class as a Scala lambda... there's quite some room for improvement here.

@vsevolodstep-db Seva had been working on ClosureCleaner improvements like this one (#42995) which found quite some cases that I didn't handle right...

@LuciferYang
Contributor Author

@rednaxelafx I haven't thought of many of the issues you mentioned, thank you very much for your explanation. At the same time, I think it's better to keep the current state, so let me close this pull request first. Thanks ~

@LuciferYang LuciferYang closed this Feb 1, 2024