In sc.runJob, actually clean the inner closure
If a closure is passed into another closure as a reference, the inner closure will be a field of the outer closure. In sc.runJob, we used to only clean the outer closure, leaving the inner one uncleaned.

Simple reproduction: Wrap RDD#take in a closure. For instance:

    // The body "..." here contains a return statement
    def take(num: Int): Array[T] = (1 to 1).foreach { _ => ... }

Now if you call `sc.parallelize(1 to 10).take(5)`, the closure cleaner will not be able to find the return statement in the `foreach` closure, because it is not even cleaning the `foreach` closure. Instead, the job will fail with a not-serializable exception complaining that the internal java.lang.Object $nonLocalReturnKey is not serializable.
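The failure mode described above can be sketched without Spark. The example below is a minimal illustration, not the code from this commit: it assumes Scala 2.12 or later (where function literals are serializable), and the names `NonLocalReturnKeyDemo`, `isSerializable`, `plainClosure`, `closureCapturingKey`, and `nestedClosure` are hypothetical. A plain `new Object` stands in for the compiler-generated `$nonLocalReturnKey`, and an outer closure that references the inner one shows why serializing the outer closure drags the inner closure's captured state along with it.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object NonLocalReturnKeyDemo {
  // Hypothetical helper: true if Java serialization of `obj` succeeds.
  def isSerializable(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(obj); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  // A closure that captures nothing.
  def plainClosure(): Int => Boolean = (x: Int) => x > 0

  // Stand-in for a closure containing a non-local return: it captures a
  // plain java.lang.Object, as a closure holding a return key would.
  def closureCapturingKey(): Int => Boolean = {
    val key = new Object
    (x: Int) => key.hashCode() == x
  }

  // An outer closure that references the inner closure. The inner closure
  // becomes captured state of the outer one, so serializing the outer
  // closure also attempts to serialize the inner closure and its key.
  def nestedClosure(): Int => Boolean = {
    val inner = closureCapturingKey()
    (x: Int) => inner(x)
  }

  def main(args: Array[String]): Unit = {
    println(s"plain closure serializable:  ${isSerializable(plainClosure())}")
    println(s"inner closure serializable:  ${isSerializable(closureCapturingKey())}")
    println(s"nested closure serializable: ${isSerializable(nestedClosure())}")
  }
}
```

Under these assumptions, only the plain closure is expected to serialize. The nested case mirrors the bug: inspecting or cleaning only the outer closure leaves the non-serializable object inside the inner closure untouched.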