[SPARK-7542] [SQL] Support off-heap index/sort buffer #9477
Test build #45072 has finished for PR 9477 at commit
Test build #45076 has finished for PR 9477 at commit
Test build #45070 has finished for PR 9477 at commit
@JoshRosen This is ready for review.
try {
  acquireMemory(needed); // could trigger spilling
  // could trigger spilling
  page = taskMemoryManager.allocatePage(used * 2, this);
Implicit here is the fact that the in memory sorter's only source of memory usage is the pointer array itself. That's fine, though.
LGTM overall, but I'd like to address one concern before merging: I'm worried that passing both the Note that the
I think that we should mark those TaskMemoryManager methods as methods that are only supposed to be called by the memory consumer and not directly by developers. The problem with calling them directly is that the bookkeeping in the MemoryConsumer itself won't have been updated. This current API has such a high potential for this type of misuse that I think we should look at fancy Java-isms to restrict the visibility / callability of those methods such that they can only be called from MemoryConsumer.
If you fix this by having those classes only allocate through their MemoryConsumers then feel free to merge as soon as this passes tests. I can take care of a followup to fix the documentation / code to avoid this type of misuse.
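To illustrate the bookkeeping concern raised above, here is a minimal Java sketch. It is not Spark's code: `Manager`, `Consumer`, `allocateThroughConsumer`, and `driftAfterDirectCall` are hypothetical stand-ins, assumed only for illustration. It shows how a consumer's own usage counter falls behind the manager's total when someone allocates through the manager directly.

```java
public final class ConsumerBookkeepingSketch {

    /** Hypothetical stand-in for a task-level memory manager. */
    public static final class Manager {
        private long granted = 0L;

        /** Grants the requested bytes and records them in the manager's total. */
        public long allocate(long bytes) {
            granted += bytes;
            return bytes;
        }

        public long granted() {
            return granted;
        }
    }

    /** Hypothetical stand-in for a memory consumer that tracks its own usage. */
    public static final class Consumer {
        private final Manager manager;
        private long used = 0L;

        public Consumer(Manager manager) {
            this.manager = manager;
        }

        /** The intended path: allocate via the consumer so its counter is updated. */
        public long allocateThroughConsumer(long bytes) {
            long got = manager.allocate(bytes);
            used += got;
            return got;
        }

        public long used() {
            return used;
        }
    }

    /**
     * Allocates some bytes through the consumer and some directly on the manager,
     * then returns how many granted bytes the consumer does not know about.
     */
    public static long driftAfterDirectCall(long viaConsumer, long direct) {
        Manager manager = new Manager();
        Consumer consumer = new Consumer(manager);
        consumer.allocateThroughConsumer(viaConsumer);
        manager.allocate(direct); // bypasses the consumer's bookkeeping
        return manager.granted() - consumer.used();
    }

    public static void main(String[] args) {
        System.out.println(driftAfterDirectCall(100L, 40L));
        System.out.println(driftAfterDirectCall(100L, 0L));
    }
}
```

In this sketch the drift equals exactly the bytes allocated behind the consumer's back, which is why routing every allocation through the consumer keeps the two counters in agreement.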
Test build #1983 has finished for PR 9477 at commit
@JoshRosen I should have addressed your comments; I will merge this once it passes the tests.
Test build #45096 has finished for PR 9477 at commit
Test build #45108 has finished for PR 9477 at commit
Test build #45110 has finished for PR 9477 at commit
Jenkins, retest this please.
try {
  acquireMemory(needed); // could trigger spilling
  // could trigger spilling
  array = allocateArray(used / 16 * 2);
Should this be / 8 * 2 instead, since we want to double the number of slots in the array and each slot requires 8 bytes of memory?
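The arithmetic behind this question can be checked with a small Java sketch. It assumes, as the comment states, that `used` is a byte count and that each slot takes 8 bytes; the class and method names are hypothetical and not taken from the PR.

```java
public final class PointerArrayGrowth {

    /** Assumption from the review comment: each array slot occupies 8 bytes. */
    public static final long BYTES_PER_SLOT = 8L;

    /** Slot count requested by the expression as written in the diff: used / 16 * 2. */
    public static long slotsAsWritten(long usedBytes) {
        return usedBytes / 16 * 2;
    }

    /** Slot count requested by the suggested expression: used / 8 * 2. */
    public static long slotsAsSuggested(long usedBytes) {
        return usedBytes / BYTES_PER_SLOT * 2;
    }

    public static void main(String[] args) {
        long currentSlots = 1024L;
        long usedBytes = currentSlots * BYTES_PER_SLOT; // 8192 bytes in use

        // Under the 8-bytes-per-slot assumption, dividing by 16 only recovers
        // the current slot count, while dividing by 8 doubles it.
        System.out.println("as written:   " + slotsAsWritten(usedBytes));
        System.out.println("as suggested: " + slotsAsSuggested(usedBytes));
    }
}
```

With 1024 slots in use (8192 bytes), the expression as written asks for 1024 slots again, while the suggested one asks for 2048, which is the doubling the reviewer has in mind.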
Test build #45125 has finished for PR 9477 at commit
Test build #45127 has finished for PR 9477 at commit
Test build #1989 has finished for PR 9477 at commit
The failed test is unrelated; I will re-run it.
Test build #1993 has finished for PR 9477 at commit
LGTM, so I'm going to merge this into master and 1.6. Thanks!
This brings the support of off-heap memory for array inside BytesToBytesMap and InMemorySorter, then we could allocate all the memory from off-heap for execution.
Closes #8068
Author: Davies Liu <davies@databricks.com>
Closes #9477 from davies/unsafe_timsort.
(cherry picked from commit eec74ba)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Test build #1994 has finished for PR 9477 at commit
This adds off-heap memory support for the arrays inside BytesToBytesMap and InMemorySorter, so that all execution memory can be allocated off-heap.
Closes #8068
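As a rough illustration of what "off-heap" means for an array of 8-byte entries, the following Java sketch backs a long array with a direct `ByteBuffer` instead of a `long[]`. This is a swapped-in technique chosen because it uses only the standard library; it is not claimed to be the mechanism this PR uses, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;

public final class OffHeapLongArraySketch {

    /** Allocates room for `slots` longs outside the Java heap via a direct buffer. */
    public static LongBuffer allocateOffHeap(int slots) {
        return ByteBuffer.allocateDirect(slots * Long.BYTES)
                .order(ByteOrder.nativeOrder())
                .asLongBuffer();
    }

    /** Writes a value into a slot and reads it back, to show array-like access. */
    public static long roundTrip(LongBuffer array, int index, long value) {
        array.put(index, value);
        return array.get(index);
    }

    public static void main(String[] args) {
        LongBuffer array = allocateOffHeap(16);
        System.out.println("direct:   " + array.isDirect());
        System.out.println("capacity: " + array.capacity());
        System.out.println("value:    " + roundTrip(array, 3, 42L));
    }
}
```

The point of the sketch is only that the same slot-indexed reads and writes work whether the backing memory lives on the heap or off it, which is what lets a sorter or hash map keep its pointer array in either place.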