
autosize processing buffers based on direct memory sizing by default #6588

Merged
7 commits merged into apache:master on Dec 4, 2018

Conversation

clintropolis
Member

This PR modifies DruidProcessingConfig property druid.processing.buffer.sizeBytes to compute a reasonable default based on the amount of direct memory, number of processing threads, and number of merge buffers instead of using a fixed 1GiB default buffer size. This should be much more friendly behavior out of the box, ensuring reasonably efficient usage of direct memory resources provided to the process, without interfering with operators who still wish to fine tune such things.

On process startup, DruidProcessingModule does a check:

memoryNeeded = druid.processing.buffer.sizeBytes * (druid.processing.numMergeBuffers + druid.processing.numThreads + 1)

to validate that the process has been given enough direct memory. The numThreads and numMergeBuffers configs produce reasonable defaults if not manually set, but sizeBytes has a fixed default of 1G, which may or may not work depending on the direct memory settings and core count. This formula is rearranged to produce a default value for sizeBytes. I'm not certain this is actually the optimal formula: having a lot of merge buffers effectively eats into the amount of space reserved for things like decompressing blocks of segments, while at the same time an increased number of merge buffers increases the processing throughput of simultaneous group-by queries, which is a sort of conflict. But I think adjustments can be made in a future PR.
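Rearranged for sizeBytes, the startup check above yields the default directly. A minimal sketch of that arithmetic (names here are illustrative, not Druid's actual code):

```java
// Illustrative sketch: solve the startup check
//   memoryNeeded = sizeBytes * (numMergeBuffers + numThreads + 1)
// for sizeBytes, given the process's max direct memory.
public class BufferSizing
{
    static long defaultSizeBytes(long maxDirectMemoryBytes, int numThreads, int numMergeBuffers)
    {
        // one buffer per processing thread, one per merge buffer, plus one extra
        return maxDirectMemoryBytes / (numMergeBuffers + numThreads + 1);
    }
}
```

For example, with 8GiB of direct memory, 7 processing threads, and 2 merge buffers, each buffer would default to 8GiB divided across 10 buffers.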

changes:

  • DruidProcessingConfig.intermediateComputeSizeBytes() now computes a default value based on -XX:MaxDirectMemorySize
  • Introduces a RuntimeInfo class that wraps Runtime.getRuntime() to expose available processors and memory sizing information, mostly to allow control over these things in unit tests without setting flags on the JVM process (it also nicely consolidates this stuff)
  • org.apache.druid.common.utils.VMUtils has been moved and renamed to org.apache.druid.utils.JvmUtils, which has a statically injected RuntimeInfo that is used throughout the Druid sources in place of direct calls to Runtime.getRuntime() methods
  • Introduces RuntimeInfoModule in the default injector to inject RuntimeInfo into JvmUtils
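The RuntimeInfo idea described above can be sketched as a thin, overridable wrapper over Runtime.getRuntime(); the method and class names here are assumptions for illustration, not necessarily the exact Druid API:

```java
// Sketch: wrapping Runtime so tests can fake processor counts and memory
// limits without setting flags on the JVM process.
public class RuntimeInfo
{
    public int getAvailableProcessors()
    {
        return Runtime.getRuntime().availableProcessors();
    }

    public long getMaxHeapSizeBytes()
    {
        return Runtime.getRuntime().maxMemory();
    }
}

// A unit test can subclass it to pin the values it cares about:
class FakeRuntimeInfo extends RuntimeInfo
{
    @Override
    public int getAvailableProcessors()
    {
        return 4;
    }
}
```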

int numProcessingThreads = getNumThreads();
int numMergeBuffers = getNumMergeBuffers();
// one buffer per processing thread plus one per merge buffer
int totalNumBuffers = numMergeBuffers + numProcessingThreads;
// the +1 mirrors the extra buffer in the startup memory check
int sizePerBuffer = (int) ((double) directSizeBytes / (double) (totalNumBuffers + 1));
Member

Just (int) ((double) directSizeBytes / (totalNumBuffers + 1)); would be ok.
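The suggestion works because widening just one operand to double already forces floating-point division; the second cast is redundant. A quick check (the input values are hypothetical):

```java
// Both forms divide as double and then truncate to int; casting only the
// numerator is enough, since the divisor is promoted to double automatically.
public class CastCheck
{
    static int withBothCasts(long directSizeBytes, int totalNumBuffers)
    {
        return (int) ((double) directSizeBytes / (double) (totalNumBuffers + 1));
    }

    static int withOneCast(long directSizeBytes, int totalNumBuffers)
    {
        // (totalNumBuffers + 1) undergoes binary numeric promotion to double
        return (int) ((double) directSizeBytes / (totalNumBuffers + 1));
    }
}
```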

@nishantmonu51
Member

I have seen users set maxDirectMemory to arbitrarily high values when they get the error that we have for not enough memory available. This change can potentially break such clusters.

I think to make it more useful we need to also have an upper threshold on the calculated values.

@clintropolis
Member Author

> I have seen users set maxDirectMemory to arbitrarily high values when they get the error that we have for not enough memory available. This change can potentially break such clusters.
>
> I think to make it more useful we need to also have an upper threshold on the calculated values.

It does have an upper threshold of 2GiB, is this too large?

@gianm
Contributor

gianm commented Nov 13, 2018

I haven't seen 2GB buffers be super useful except in niche cases (really big aggregator states). So it might be better to cap it at 1GB, the current default, which means nobody should run into the problem @nishantmonu51 brought up.

@nishantmonu51
Member

agree with @gianm an upper limit of 1G seems better.
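The cap the reviewers converge on can be sketched as clamping the computed default at 1GiB (illustrative only, not the exact Druid code):

```java
// Sketch: cap the auto-computed per-buffer size at 1GiB so that very large
// -XX:MaxDirectMemorySize settings don't produce oversized buffers.
public class CappedSizing
{
    static final long MAX_DEFAULT_SIZE_BYTES = 1L << 30; // 1GiB cap

    static long cappedDefaultSizeBytes(long maxDirectMemoryBytes, int totalNumBuffers)
    {
        long computed = maxDirectMemoryBytes / (totalNumBuffers + 1);
        return Math.min(computed, MAX_DEFAULT_SIZE_BYTES);
    }
}
```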

@nishantmonu51
Member

Generally LGTM.

One more change is needed in the exception messages in group-by queries when the processing buffer size is not enough. The current exception message mentions:

"Try increasing druid.processing.buffer.sizeBytes"

But when the size is auto-calculated, the user might not know what the current buffer size is. It would be nicer to also include the current processing buffer size in the exception message:

"Try increasing druid.processing.buffer.sizeBytes. Current processing buffer size is [auto-calculated-size]"

@@ -50,7 +51,7 @@ public MapInputRowParser(
 {
   final List<String> dimensions = parseSpec.getDimensionsSpec().hasCustomDimensions()
      ? parseSpec.getDimensionsSpec().getDimensionNames()
-     : Lists.newArrayList(
+     : new ArrayList<>(
Member

Please remove the now-unused import com.google.common.collect.Lists.

@clintropolis
Member Author
Nov 15, 2018

Whoops, this is an accidental side effect from when I was looking at something else; will revert.

@clintropolis
Member Author

one more change needed is in the exception messages in groupby queries when the processing buffer size is not enough.

I was going to do this, but then noticed it already prints the current buffer capacity at the start of that message, if you're referring to the exception in BufferArrayGrouper:

A record of size [%d] cannot be written to the array buffer at offset[%d] because it exceeds the buffer capacity[%d]. Try increasing druid.processing.buffer.sizeBytes

so it looks like someone beat me to it for that one at least. ParallelCombiner also has a similar message; I've added the current processing buffer size to its messaging.

@fjy fjy added this to the 0.13.1 milestone Nov 21, 2018
@dclim
Contributor

dclim commented Dec 4, 2018

👍

@dclim dclim merged commit a1c9d0a into apache:master Dec 4, 2018
clintropolis added a commit to implydata/druid-public that referenced this pull request Jan 10, 2019
@clintropolis clintropolis deleted the auto-buffers branch January 24, 2019 10:29