[SPARK-14679] [UI] Fix UI DAG visualization OOM. #12437
The DAG visualization can cause an OOM when generating the DOT file. This happens because clusters are not deduplicated correctly: the contains check relies on equals, and the cluster class uses the default equals implementation.
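To make the failure mode concrete, here is a minimal sketch of how a contains-based dedup behaves with and without a value-based equals. The classes `PlainCluster` and `ValueCluster` and the helper `addIfAbsent` are hypothetical stand-ins invented for illustration; they are not Spark's `RDDOperationCluster` or its actual dedup code.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in: relies on the default (reference) equals.
class PlainCluster(val id: String, val name: String)

// Hypothetical stand-in: same fields, but value-based equals/hashCode.
class ValueCluster(val id: String, val name: String) {
  override def equals(other: Any): Boolean = other match {
    case that: ValueCluster => id == that.id && name == that.name
    case _ => false
  }
  override def hashCode(): Int = 31 * id.hashCode + name.hashCode
}

object DedupDemo {
  // Append the item only if an equal element is not already present.
  // `contains` compares with ==, which delegates to equals.
  def addIfAbsent[A](buf: ArrayBuffer[A], item: A): Unit = {
    if (!buf.contains(item)) {
      buf += item
    }
  }

  def main(args: Array[String]): Unit = {
    val plain = ArrayBuffer.empty[PlainCluster]
    addIfAbsent(plain, new PlainCluster("stage_1", "map"))
    addIfAbsent(plain, new PlainCluster("stage_1", "map"))
    println(plain.size) // 2: the logical duplicate is kept

    val byValue = ArrayBuffer.empty[ValueCluster]
    addIfAbsent(byValue, new ValueCluster("stage_1", "map"))
    addIfAbsent(byValue, new ValueCluster("stage_1", "map"))
    println(byValue.size) // 1: the duplicate is rejected
  }
}
```

With reference equality every freshly constructed instance looks new, so the buffer keeps growing; repeated across nested clusters, that kind of duplication is one plausible way the generated output could become very large.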
Test build #55986 has finished for PR 12437 at commit
override def equals(other: Any): Boolean = other match {
  case that: RDDOperationCluster =>
    (that canEqual this) &&
Do we need `canEqual` since `that` is already known to be a `RDDOperationCluster`?
This check ensures that equality is symmetric if subclasses have a different definition of equals. As I understand Scala best practices, it should be there if the class can be extended. Right now this class isn't extended, but I thought it may be in the future, so I included it. I'm fine either way.
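As a rough illustration of the symmetry argument, here is a sketch of the `canEqual` pattern with a subclass that adds state. The classes `Node` and `LabeledNode` are hypothetical and exist only for this example; they are not part of the patch.

```scala
// Hypothetical base class using the canEqual pattern.
class Node(val id: Int) {
  def canEqual(other: Any): Boolean = other.isInstanceOf[Node]

  override def equals(other: Any): Boolean = other match {
    // Ask the other object whether it is willing to equal `this`.
    case that: Node => that.canEqual(this) && id == that.id
    case _ => false
  }

  override def hashCode(): Int = id.hashCode
}

// Hypothetical subclass that adds a field and narrows canEqual.
class LabeledNode(id: Int, val label: String) extends Node(id) {
  override def canEqual(other: Any): Boolean = other.isInstanceOf[LabeledNode]

  override def equals(other: Any): Boolean = other match {
    case that: LabeledNode =>
      that.canEqual(this) && super.equals(that) && label == that.label
    case _ => false
  }

  override def hashCode(): Int = 31 * super.hashCode + label.hashCode
}

object CanEqualDemo {
  def main(args: Array[String]): Unit = {
    val base = new Node(1)
    val labeled = new LabeledNode(1, "x")
    println(base == labeled) // false: labeled.canEqual(base) rejects it
    println(labeled == base) // false: base is not a LabeledNode
    println(labeled == new LabeledNode(1, "x")) // true
  }
}
```

Without the `canEqual` call in `Node.equals`, `base == labeled` would be true while `labeled == base` would be false, which is the asymmetry the check is meant to prevent.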
OK, this is another way of saying "this.getClass == that.getClass", essentially? I think it is OK.
It's similar, but it's calling a method on the other object to see if it would reject equality. There's a good explanation of `canEqual` on StackOverflow.
The only benefit you get with `canEqual` over `this.getClass == that.getClass` is that it's a little bit less strict.
See https://www.artima.com/lejava/articles/equality.html (look for "getClass" in the text to avoid reading the whole thing)
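A small sketch of what "less strict" can mean in practice: an anonymous subclass that adds no state compares unequal under a `getClass` check but equal under `canEqual`. `StrictTag` and `LenientTag` are hypothetical classes made up for this comparison.

```scala
// Hypothetical class whose equals requires identical runtime classes.
class StrictTag(val name: String) {
  override def equals(other: Any): Boolean = other match {
    case that: StrictTag => this.getClass == that.getClass && name == that.name
    case _ => false
  }
  override def hashCode(): Int = name.hashCode
}

// Hypothetical class whose equals uses the canEqual pattern instead.
class LenientTag(val name: String) {
  def canEqual(other: Any): Boolean = other.isInstanceOf[LenientTag]
  override def equals(other: Any): Boolean = other match {
    case that: LenientTag => that.canEqual(this) && name == that.name
    case _ => false
  }
  override def hashCode(): Int = name.hashCode
}

object StrictnessDemo {
  def main(args: Array[String]): Unit = {
    val strict = new StrictTag("a")
    // Anonymous subclass: same state, different runtime class.
    val strictAnon = new StrictTag("a") { override def toString: String = "anon" }
    println(strict == strictAnon) // false: runtime classes differ

    val lenient = new LenientTag("a")
    val lenientAnon = new LenientTag("a") { override def toString: String = "anon" }
    println(lenient == lenientAnon) // true: the subclass did not narrow canEqual
    println(lenientAnon == lenient) // true: still symmetric
  }
}
```

A subclass that changes the meaning of equality can still opt out by overriding `canEqual`, so the extra leniency only applies to subclasses that leave it alone.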
@rdblue Can this PR fix a case like this?
2016-02-24 15:40:20,260 | ERROR | [qtp1927776715-4120] | Failed to make dot file of stage 619 | org.apache.spark.Logging$class.logError(Logging.scala:96)
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
Test build #56123 has finished for PR 12437 at commit
@SaintBacchus, I think that error message is essentially the same as the one we were seeing; it's just hitting a different limitation when expanding:
9764b43 to f05edc1
Test build #56125 has finished for PR 12437 at commit
The test failure looks unrelated to my changes.
Test build #2828 has finished for PR 12437 at commit
Merged to master/1.6
## What changes were proposed in this pull request?

The DAG visualization can cause an OOM when generating the DOT file. This happens because clusters are not correctly deduped by a contains check because they use the default equals implementation. This adds a working equals implementation.

## How was this patch tested?

This adds a test suite that checks the new equals implementation.

Author: Ryan Blue <blue@apache.org>

Closes #12437 from rdblue/SPARK-14679-fix-ui-oom.

(cherry picked from commit a345111)
Signed-off-by: Sean Owen <sowen@cloudera.com>
Thank you @srowen!
## What changes were proposed in this pull request?

The DAG visualization can cause an OOM when generating the DOT file. This happens because clusters are not correctly deduped by a contains check because they use the default equals implementation. This adds a working equals implementation.

## How was this patch tested?

This adds a test suite that checks the new equals implementation.

Author: Ryan Blue <blue@apache.org>

Closes apache#12437 from rdblue/SPARK-14679-fix-ui-oom.

(cherry picked from commit a345111)
Signed-off-by: Sean Owen <sowen@cloudera.com>
(cherry picked from commit 17b1384)