-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-5656] Fail gracefully for large values of k and/or n that will ex... #4433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… exceed max int. Large values of k and/or n in EigenValueDecomposition.symmetricEigs will result in array initialization to a value larger than Integer.MAX_VALUE in the following: var v = new Array[Double](n * ncv)
|
Can one of the admins verify this patch? |
|
ok to test |
|
Test build #26929 has started for PR 4433 at commit
|
|
Test build #26929 has finished for PR 4433 at commit
|
|
Test FAILed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable, though I have a few minor suggestions. I suppose <= is OK, technically? and if this size is worth checking, it's probably worth checking ncv * (ncv + 8) later. And then, maybe it's best to keep the check by the allocation so they don't get out of sync if someone changes them later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback. I committed your suggestions into this PR.
Move the size check closer to array allocation, set to '<=' and add additional check.
|
Test build #26972 has started for PR 4433 at commit
|
|
Test build #26972 has finished for PR 4433 at commit
|
|
Test FAILed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, one other tiny thing: the problematic product is not necessarily 2_k_n here. I think you could simplify these error messages to just say that the k and n are too large, rather than explain the computation. Right now the expressions are repeated 3 places. Might be nice to keep that to 2, and keep them together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just want to confirm. Are you suggesting just a single 'require' statement that will AND the two conditions? Then just provide a generic comment that either n or k are two large?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could, sure. Yes, I'm saying the caller may just need or want to know that, basically, k or n is too big. It would be simpler and easier than maintaining a longer explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I think i figured out what you are saying. Take a look. The check on k may be overkill, since by definition it must be less than n. Therefore k_n is always greater than k_k. There could always be that weird scenario though. Thanks for the help.
|
Test build #26977 has started for PR 4433 at commit
|
|
Test build #26977 has finished for PR 4433 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mean to belabor this, but the new second error message implies that k alone is the problem, and that k exceeds 2^31-1, and those aren't necessarily true. Concretely I'm suggesting something simple like
require(n * ncv.toLong <= Integer.MAX_VALUE && ncv * (ncv.toLong + 8) <= Integer.MAX_VALUE,
s"k = $k and/or n = $n are too large to compute an eigendecomposition")
|
Test build #27002 has started for PR 4433 at commit
|
|
Test build #27002 has finished for PR 4433 at commit
|
|
Test PASSed. |
...ceed max int.
Large values of k and/or n in EigenValueDecomposition.symmetricEigs will result in array initialization to a value larger than Integer.MAX_VALUE in the following: var v = new Array[Double](n * ncv)