-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-22328][Core] ClosureCleaner should not miss referenced superclass fields #19556
Conversation
Test build #82971 has finished for PR 19556 at commit
|
Test build #82973 has finished for PR 19556 at commit
|
Test build #82975 has finished for PR 19556 at commit
|
Test build #82987 has finished for PR 19556 at commit
|
setAccessedFields(outerClass, clone, obj, accessedFields) | ||
|
||
var superClass = outerClass.getSuperclass() | ||
while (superClass != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be something more like ...
var currentClass = outerClass
do {
setAccessedFields(currentClass, clone, obj, accessedFields)
currentClass = currentClass.getSuperclass()
} while (currentClass != null)
Just avoids repeating the key method call here. Same above and below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Looks good.
cc @cloud-fan for review too. |
Test build #82999 has finished for PR 19556 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel very expert on the ClosureCleaner, but have been looking at it a lot recently, and this looks reasonable to me at a glance.
/** Initializes the accessed fields for outer classes and their super classes. */ | ||
private def initAccessedFields( | ||
accessedFields: Map[Class[_], Set[String]], | ||
outerClasses: Seq[Class[_]]): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if multiple outer classes have the same parent class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think from the view of closure, even multiple outer classes have the same parent class, the access of the fields in the parent class shouldn't conflict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a related test. Please see if it can clarify your concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Test build #83016 has finished for PR 19556 at commit
|
outerClasses: Seq[Class[_]]): Unit = { | ||
for (cls <- outerClasses) { | ||
var currentClass = cls | ||
do { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel it's better to use while
loop here. Programmatically the loop requires currentClass != null
, even for the first loop. To completely keep the previous behavior, we can add a assert(cls != null)
before the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Updated.
|
||
var currentClass = outerClass | ||
do { | ||
setAccessedFields(currentClass, clone, obj, accessedFields) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assume we have class A
and B
having the same parent class P
. P
has 2 fields a
and b
. The closure accessed A.a
and B.b
, so when we clone A
object, we should only set field a
, when we clone B
object, we should only set field b
. However here seems we set field a
and b
for A
and B
object, which is sub-optimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this is also a issue for the outerClasses
, maybe I missed something...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems that is true. For a closure that only accessed A.a
, we clone the whole A
object which contains both a
and b
fields. This is the fact in existing ClosureCleaner
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is not a regression, IIUC, will it block this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea let's leave it
Test build #83037 has finished for PR 19556 at commit
|
val clone = instantiateClass(outerClass, parent) | ||
|
||
var currentClass = outerClass | ||
do { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please update this do while loop too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. It's late, I will update this and below tomorrow. Thanks.
new FieldAccessFinder(fields, findTransitively, Some(m), visitedMethods), 0) | ||
|
||
var currentClass = cl | ||
do { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too
@cloud-fan Two remaining do while loop are updated. |
Test build #83069 has finished for PR 19556 at commit
|
thanks, merging to master/2.2! |
…ass fields When the given closure uses some fields defined in super class, `ClosureCleaner` can't figure them and don't set it properly. Those fields will be in null values. Added test. Author: Liang-Chi Hsieh <viirya@gmail.com> Closes #19556 from viirya/SPARK-22328. (cherry picked from commit 4f8dc6b) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…ass fields When the given closure uses some fields defined in super class, `ClosureCleaner` can't figure them and don't set it properly. Those fields will be in null values. Added test. Author: Liang-Chi Hsieh <viirya@gmail.com> Closes apache#19556 from viirya/SPARK-22328. (cherry picked from commit 4f8dc6b) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
When the given closure uses some fields defined in super class,
ClosureCleaner
can't figure them and don't set it properly. Those fields will be in null values.How was this patch tested?
Added test.