
autosize processing buffers based on direct memory sizing by default #6588

Merged
7 commits merged into apache:master on Dec 4, 2018

Conversation

clintropolis
Member

This PR modifies DruidProcessingConfig property druid.processing.buffer.sizeBytes to compute a reasonable default based on the amount of direct memory, number of processing threads, and number of merge buffers instead of using a fixed 1GiB default buffer size. This should be much more friendly behavior out of the box, ensuring reasonably efficient usage of direct memory resources provided to the process, without interfering with operators who still wish to fine tune such things.

On process startup, DruidProcessingModule does a check:

memoryNeeded = druid.processing.buffer.sizeBytes * (druid.processing.numMergeBuffers + druid.processing.numThreads + 1)

to validate that the process has been given enough direct memory. The numThreads and numMergeBuffers configs produce reasonable defaults if not manually set, but sizeBytes has a fixed default of 1G, which may or may not work depending on the direct memory settings and core count. This formula is rearranged to produce a default value for sizeBytes. I'm not certain this is actually the optimal formula: having a lot of merge buffers effectively eats into the amount of space reserved for things like decompressing blocks of segments, while at the same time an increased number of merge buffers increases the processing throughput of simultaneous group-by queries, which is a sort of conflict. But I think adjustments can be made in a future PR.
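Rearranged for sizeBytes, the startup check above yields the default directly. A minimal sketch of that arithmetic (names here are illustrative, not Druid's actual code):

```java
// Illustrative sketch: solve the startup check
//   memoryNeeded = sizeBytes * (numMergeBuffers + numThreads + 1)
// for sizeBytes, given the process's max direct memory.
public class BufferSizing
{
    static long defaultSizeBytes(long maxDirectMemoryBytes, int numThreads, int numMergeBuffers)
    {
        // one buffer per processing thread, one per merge buffer, plus one extra
        return maxDirectMemoryBytes / (numMergeBuffers + numThreads + 1);
    }
}
```

For example, with 8GiB of direct memory, 7 processing threads, and 2 merge buffers, each buffer would default to 8GiB divided across 10 buffers.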

changes:

  • DruidProcessingConfig.intermediateComputeSizeBytes() now computes a default value based on -XX:MaxDirectMemorySize
  • Introduces a RuntimeInfo class that wraps Runtime.getRuntime() to expose available processors and memory sizing information, mostly to allow control over these things in unit tests without setting flags on the JVM process (it also nicely consolidates this stuff)
  • org.apache.druid.common.utils.VMUtils has been moved and renamed to org.apache.druid.utils.JvmUtils, which has a statically injected RuntimeInfo that is used throughout the Druid sources in place of direct calls to Runtime.getRuntime() methods
  • Introduces RuntimeInfoModule in the default injector to inject RuntimeInfo into JvmUtils
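The RuntimeInfo idea described above can be sketched as a thin, overridable wrapper over Runtime.getRuntime(); the method and class names here are assumptions for illustration, not necessarily the exact Druid API:

```java
// Sketch: wrapping Runtime so tests can fake processor counts and memory
// limits without setting flags on the JVM process.
public class RuntimeInfo
{
    public int getAvailableProcessors()
    {
        return Runtime.getRuntime().availableProcessors();
    }

    public long getMaxHeapSizeBytes()
    {
        return Runtime.getRuntime().maxMemory();
    }
}

// A unit test can subclass it to pin the values it cares about:
class FakeRuntimeInfo extends RuntimeInfo
{
    @Override
    public int getAvailableProcessors()
    {
        return 4;
    }
}
```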

int numProcessingThreads = getNumThreads();
int numMergeBuffers = getNumMergeBuffers();
// one buffer per processing thread plus one per merge buffer
int totalNumBuffers = numMergeBuffers + numProcessingThreads;
// the +1 mirrors the extra buffer in the startup memory check
int sizePerBuffer = (int) ((double) directSizeBytes / (double) (totalNumBuffers + 1));
Member

Just (int) ((double) directSizeBytes / (totalNumBuffers + 1)); would be ok.
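The suggestion works because widening just one operand to double already forces floating-point division; the second cast is redundant. A quick check (the input values are hypothetical):

```java
// Both forms divide as double and then truncate to int; casting only the
// numerator is enough, since the divisor is promoted to double automatically.
public class CastCheck
{
    static int withBothCasts(long directSizeBytes, int totalNumBuffers)
    {
        return (int) ((double) directSizeBytes / (double) (totalNumBuffers + 1));
    }

    static int withOneCast(long directSizeBytes, int totalNumBuffers)
    {
        // (totalNumBuffers + 1) undergoes binary numeric promotion to double
        return (int) ((double) directSizeBytes / (totalNumBuffers + 1));
    }
}
```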

@nishantmonu51
Member

I have seen users set maxDirectMemory to arbitrarily high values when they get the error that we have for not enough memory available. This change can potentially break such clusters.

I think to make it more useful we need to also have an upper threshold on the calculated values.

@clintropolis
Member Author

> I have seen users set maxDirectMemory to arbitrarily high values when they get the error that we have for not enough memory available. This change can potentially break such clusters.
>
> I think to make it more useful we need to also have an upper threshold on the calculated values.

It does have an upper threshold of 2GiB, is this too large?

@gianm
Contributor

gianm commented Nov 13, 2018

I haven't seen 2GB buffers be super useful except in niche cases (really big aggregator states). So it might be better to cap it at 1GB, the current default, which means nobody should run into the problem @nishantmonu51 brought up.

@nishantmonu51
Member

agree with @gianm an upper limit of 1G seems better.
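The cap the reviewers converge on can be sketched as clamping the computed default at 1GiB (illustrative only, not the exact Druid code):

```java
// Sketch: cap the auto-computed per-buffer size at 1GiB so that very large
// -XX:MaxDirectMemorySize settings don't produce oversized buffers.
public class CappedSizing
{
    static final long MAX_DEFAULT_SIZE_BYTES = 1L << 30; // 1GiB cap

    static long cappedDefaultSizeBytes(long maxDirectMemoryBytes, int totalNumBuffers)
    {
        long computed = maxDirectMemoryBytes / (totalNumBuffers + 1);
        return Math.min(computed, MAX_DEFAULT_SIZE_BYTES);
    }
}
```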

@nishantmonu51
Member

Generally LGTM.

One more change is needed in the exception messages in group-by queries when the processing buffer size is not enough. The current exception message mentions:

"Try increasing druid.processing.buffer.sizeBytes"

But when the size is auto-calculated, the user might not know what the current buffer size is. It would be nicer to also include the current processing buffer size in the exception message:

"Try increasing druid.processing.buffer.sizeBytes. Current processing buffer size is [auto-calculated-size]"

@@ -50,7 +51,7 @@ public MapInputRowParser(
 {
   final List<String> dimensions = parseSpec.getDimensionsSpec().hasCustomDimensions()
      ? parseSpec.getDimensionsSpec().getDimensionNames()
-     : Lists.newArrayList(
+     : new ArrayList<>(
Member

Please remove the now-unused import com.google.common.collect.Lists.

@clintropolis
Member Author
Nov 15, 2018

Whoops, this is an accidental side effect from when I was looking at something else; will revert.

@clintropolis
Member Author

one more change needed is in the exception messages in groupby queries when the processing buffer size is not enough.

I was going to do this, but then noticed it already prints the current buffer capacity at the start of that message, if you're referring to the exception in BufferArrayGrouper:

A record of size [%d] cannot be written to the array buffer at offset[%d] because it exceeds the buffer capacity[%d]. Try increasing druid.processing.buffer.sizeBytes

so it looks like someone beat me to it for that one at least. ParallelCombiner also has a similar message; I've added the current processing buffer size to its messaging.

@fjy fjy added this to the 0.13.1 milestone Nov 21, 2018
@dclim
Contributor

dclim commented Dec 4, 2018

👍

@dclim dclim merged commit a1c9d0a into apache:master Dec 4, 2018
clintropolis added a commit to implydata/druid-public that referenced this pull request Jan 10, 2019
@clintropolis clintropolis deleted the auto-buffers branch January 24, 2019 10:29