[SPARK-7318][Streaming] DStream cleans objects that are not closures #5860
Merged build triggered.
Merged build started.
Test build #31655 has started for PR 5860 at commit
Test build #31655 has finished for PR 5860 at commit
Merged build finished. Test FAILed.
Test FAILed.
Fantastic catch! Though this does not merge.
Yeah I know. I conflicted with myself.
…re-cleaner Conflicts: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala
Merged build triggered.
Merged build started.
Test build #31673 has started for PR 5860 at commit
Test build #31673 has finished for PR 5860 at commit
Merged build finished. Test FAILed.
Test FAILed.
This breaks a valid use case where the user code passes a case class to `map`. See ml.NormalizerSuite.
@pwendell I don't think we can throw an exception if it's not a closure after all. The user code may look like
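For illustration only (this is not the snippet from the comment above, which is not preserved here), a minimal sketch of the kind of user code being described: a case class that is itself a function and gets passed to `map`. The names `Scale` and `CaseClassAsFunction` are hypothetical, and a plain `Seq` stands in for an RDD so the sketch is self-contained.

```scala
// Hypothetical user code: a case class that extends Function1,
// so an instance can be passed anywhere a Double => Double is expected.
case class Scale(factor: Double) extends (Double => Double) {
  override def apply(x: Double): Double = x * factor
}

object CaseClassAsFunction {
  // Scale(2.0) is a valid argument to map, but it is a named class instance,
  // not a compiler-generated anonymous function.
  def scaled(values: Seq[Double]): Seq[Double] = values.map(Scale(2.0))
}
```

Code like this is legitimate, so a cleaner that throws on anything that is not an anonymous function would reject it.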
Merged build triggered.
Merged build started.
Test build #31681 has started for PR 5860 at commit
Test build #31681 has finished for PR 5860 at commit
Merged build finished. Test PASSed.
Test PASSed.
@@ -179,6 +179,11 @@ private[spark] object ClosureCleaner extends Logging {
      cleanTransitively: Boolean,
      accessedFields: Map[Class[_], Set[String]]): Unit = {

    if (!isClosure(func.getClass)) {
      logWarning("Expected a closure; got " + func.getClass.getName)
@pwendell Is it okay to log a warning here?
Any reason not to make this an assertion? Isn't it simply invalid if we call this on something that is not a closure?
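To make the trade-off in this thread concrete, here is a rough sketch of the warn-and-skip guard as opposed to an assertion. It is not the actual `ClosureCleaner` code: `ClosureGuardSketch`, `looksLikeClosure`, and `cleanOrWarn` are hypothetical names, the class-name heuristic is an assumption standing in for the private `isClosure` check (whose body is not shown in this diff), and the `warn` callback stands in for `logWarning`.

```scala
object ClosureGuardSketch {
  // Assumption: a name-based heuristic. Compiler-generated function classes
  // typically carry "$anonfun$" or "$Lambda" in their JVM class name.
  def looksLikeClosure(cls: Class[_]): Boolean = {
    val name = cls.getName
    name.contains("$anonfun$") || name.contains("$Lambda")
  }

  // Returns true if cleaning would proceed, false if it is skipped.
  // A non-closure produces a warning and is left alone instead of failing,
  // so user code that passes a named function object keeps working.
  def cleanOrWarn(func: AnyRef, warn: String => Unit): Boolean = {
    if (!looksLikeClosure(func.getClass)) {
      warn("Expected a closure; got " + func.getClass.getName)
      false
    } else {
      true
    }
  }
}
```

An assertion in place of the warning would surface internal misuse sooner, but it would also fail for valid user-supplied function objects, which is the concern raised earlier in the conversation.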
LGTM, provided @pwendell also LGTMs the ClosureCleaner change.
I LGTM too - I had a comment but realize it was already addressed.
Thanks for all the LGTMs. I'm merging this into master and 1.4.
I added a check in `ClosureCleaner#clean` to fail fast if this is detected in the future. tdas

Author: Andrew Or <andrew@databricks.com>

Closes #5860 from andrewor14/streaming-closure-cleaner and squashes the following commits:

8e971d7 [Andrew Or] Do not throw exception if object to clean is not closure
5ee4e25 [Andrew Or] Fix tests
eed3390 [Andrew Or] Merge branch 'master' of github.com:apache/spark into streaming-closure-cleaner
67eeff4 [Andrew Or] Add tests
a4fa768 [Andrew Or] Clean the closure, not the RDD

(cherry picked from commit 57e9f29)
Signed-off-by: Andrew Or <andrew@databricks.com>
I added a check in `ClosureCleaner#clean` to fail fast if this is detected in the future. @tdas