
[WIP] Use DefaultBlockingPool for Global Processing Pool instead of StupidPool, which can allocate an arbitrary number of buffers and cause crashes. #5345

Closed
himanshug wants to merge 3 commits

Conversation

himanshug
Contributor

@himanshug himanshug commented Feb 5, 2018

Fixes #5319

Also, the following TODO remains:

TODO: check if hardcoded timeout of 1 minute in all places is OK.

@@ -188,7 +188,7 @@ protected Integer addToFacts(
           lastBuffer.capacity() - bufferOffset >= aggsTotalSize) {
         aggBuffer = lastBuffer;
       } else {
-        ResourceHolder<ByteBuffer> bb = bufferPool.take();
+        ResourceHolder<ByteBuffer> bb = bufferPool.takeOrFailOnTimeout(60000);
Contributor

Is this competing with the processing pool? That means you could accidentally choke out processing threads while incremental indexing is going on.

Contributor Author

Yes, that would be true.
I think it is more of a limitation of the current OffheapIncrementalIndex implementation, which cannot work with a fixed amount of resources and keeps allocating more and more buffers. Also, this implementation keeps dimensions etc. on-heap, so it doesn't really serve the purpose of being off-heap; things become too slow if dimensions are pushed off-heap, due to repeated serde.
FWIW, for the above reasons, no one actually uses the current OffheapIncrementalIndex implementation, and we could possibly remove it.
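
For readers following the discussion, the contrast being drawn is between an unbounded pool that allocates a new buffer whenever it runs dry and a blocking pool that owns a fixed number of buffers and makes callers wait. Below is a simplified sketch of the bounded variant; it is not Druid's actual StupidPool or DefaultBlockingPool, and the class and method names are illustrative assumptions.

import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative bounded pool: total direct memory is fixed up front, and a caller that
// cannot get a buffer within the timeout fails instead of triggering a new allocation.
class BoundedBufferPool {
  private final BlockingQueue<ByteBuffer> pool;

  BoundedBufferPool(int numBuffers, int bufferSizeBytes) {
    pool = new ArrayBlockingQueue<>(numBuffers);
    for (int i = 0; i < numBuffers; i++) {
      pool.add(ByteBuffer.allocateDirect(bufferSizeBytes));
    }
  }

  ByteBuffer takeOrFailOnTimeout(long timeoutMs) throws InterruptedException {
    ByteBuffer buf = pool.poll(timeoutMs, TimeUnit.MILLISECONDS);
    if (buf == null) {
      throw new RuntimeException("Timed out after " + timeoutMs + " ms waiting for a pooled buffer");
    }
    return buf;
  }

  void giveBack(ByteBuffer buf) {
    buf.clear();
    pool.offer(buf);
  }
}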

Contributor Author

I'm considering removing OffheapIncrementalIndex in this patch.

Contributor Author

Also, it was only ever used by the GroupBy v1 implementation when explicitly configured; it was not actually possible to use OffheapIncrementalIndex for indexing.

-      //check that stupid pool gives buffers that can hold at least one row's aggregators
-      ResourceHolder<ByteBuffer> bb = bufferPool.take();
+      //check that buffer pool gives buffers that can hold at least one row's aggregators
+      ResourceHolder<ByteBuffer> bb = bufferPool.takeOrFailOnTimeout(60000);
Contributor

This can also choke the processing pool, right?

@himanshug himanshug added this to the 0.13.0 milestone Feb 9, 2018
+   *
+   * @return a resource, or throw RuntimeException on timeout.
+   */
+  default ReferenceCountingResourceHolder<T> takeOrFailOnTimeout(long timeoutMs)
Contributor

Should this throw a checked Timeout exception of some kind?
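
For context, one way such a default method could be written on the pool interface, sketched under the assumption that the pool already exposes a timeout-aware take that returns null on timeout; that primitive, the Holder type, and the unchecked exception are illustrative assumptions rather than the actual patch.

// Hypothetical sketch only; not Druid's actual BlockingPool API.
interface TimedPool<T>
{
  // Assumed primitive: blocks up to timeoutMs, returns null if no resource became available.
  Holder<T> take(long timeoutMs);

  default Holder<T> takeOrFailOnTimeout(long timeoutMs)
  {
    final Holder<T> holder = take(timeoutMs);
    if (holder == null) {
      // The question above: a checked TimeoutException would force callers to handle
      // pool exhaustion explicitly, instead of an unchecked RuntimeException like this.
      throw new RuntimeException("Timed out waiting " + timeoutMs + " ms for a pooled resource");
    }
    return holder;
  }

  interface Holder<T> extends AutoCloseable
  {
    T get();

    @Override
    void close();
  }
}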

       new OffheapBufferGenerator("intermediate processing", config.intermediateComputeSizeBytes()),
-      config.getNumThreads(),
-      config.poolCacheMaxCount()
+      config.getNumThreads()
Contributor

Does this mean poolCacheMaxCount needs to be removed from the docs?
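
For readers following along, the practical effect of the diff above is that the processing pool is pre-sized to the thread count instead of growing on demand. A trivial sketch of the resulting bound follows; the numbers are made-up stand-ins for config.getNumThreads() and config.intermediateComputeSizeBytes(), not defaults.

public class ProcessingPoolBound {
  public static void main(String[] args) {
    // Stand-ins for config.getNumThreads() and config.intermediateComputeSizeBytes().
    int numThreads = 7;
    long bufferSizeBytes = 512L * 1024 * 1024;

    // With a blocking pool of exactly numThreads buffers, this is the upper bound on
    // direct memory held by the global processing pool, rather than an open-ended amount.
    long maxDirectBytes = numThreads * bufferSizeBytes;
    System.out.println("Processing pool capped at " + maxDirectBytes + " bytes");
  }
}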

@jihoonson
Contributor

Hi guys, I don't think this should be a blocking issue for the 0.13.0 release. I'll remove the milestone.

@jihoonson jihoonson removed this from the 0.13.0 milestone Sep 17, 2018
@stale

stale bot commented Feb 28, 2019

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

@stale stale bot added the stale label Feb 28, 2019
@stale

stale bot commented Mar 7, 2019

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@stale stale bot closed this Mar 7, 2019