[SPARK-31976][ML][PYSPARK] LinearSVC use MemoryUsage to control the size of block #28974
Test build #124865 has finished for PR 28974 at commit
ping @WeichenXu123 @mengxr
cc @huaxingao @srowen too
Oops, mistake.
@Since("3.1.0")
def setBlockSize(value: Int): this.type = set(blockSize, value)
setDefault(blockSize -> 1)
def setMaxBlockMemoryInMB(value: Int): this.type = set(maxBlockMemoryInMB, value)
nit: add a scala doc?
@zhengruifeng Did you have a chance to do a benchmark test to verify the performance gain?
@huaxingao since this change was suggested by @mengxr and @WeichenXu123, I prefer to append the performance tests after they think the current design is OK. In current commit, if the Existing performance tests were done against
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
In LinearSVC, use maxBlockMemoryInMB instead of blockSize.
Why are the changes needed?
According to the performance test in https://issues.apache.org/jira/browse/SPARK-31783, the performance gain is mainly related to the number of non-zero entries (nnz) in each block.
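As a rough illustration of why a memory bound tracks the number of non-zero entries (nnz) more closely than a fixed row count does, here is a minimal sketch that packs sparse rows into blocks by estimated size. It is not the code in this PR: `MemoryBoundedBlocks`, `SparseRow`, `estimatedBytes`, `blockify`, and the 12-bytes-per-non-zero cost model are all assumptions made for the example.

```scala
// Hypothetical sketch only: none of these names come from Spark's implementation.
object MemoryBoundedBlocks {
  // A sparse row, reduced to the count of its non-zero entries (nnz).
  final case class SparseRow(nnz: Int)

  // Assumed cost model: each non-zero stores an Int index (4 bytes)
  // and a Double value (8 bytes).
  def estimatedBytes(row: SparseRow): Long = 12L * row.nnz

  // Greedily pack consecutive rows into blocks whose estimated size stays
  // within maxBytes. A single row larger than the budget still gets a block
  // of its own, so no row is ever dropped.
  def blockify(rows: Seq[SparseRow], maxBytes: Long): Vector[Vector[SparseRow]] = {
    require(maxBytes > 0, "maxBytes must be positive")
    val blocks = Vector.newBuilder[Vector[SparseRow]]
    var current = Vector.empty[SparseRow]
    var currentBytes = 0L
    for (row <- rows) {
      val size = estimatedBytes(row)
      // Close the current block if adding this row would exceed the budget.
      if (current.nonEmpty && currentBytes + size > maxBytes) {
        blocks += current
        current = Vector.empty
        currentBytes = 0L
      }
      current = current :+ row
      currentBytes += size
    }
    if (current.nonEmpty) blocks += current
    blocks.result()
  }
}
```

With a fixed row count, a block of very sparse rows and a block of dense rows would hold very different amounts of data. Under the byte budget above, the number of rows per block adapts to nnz instead.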
Does this PR introduce any user-facing change?
Yes, blockSize is changed to maxBlockMemoryInMB.
How was this patch tested?
Added test suites.