Proxied requests query other nodes in parallel #2779

joschi merged 8 commits into 2.1 from issue-2764 on Sep 8, 2016



dennisoelkers commented Sep 6, 2016


This change makes the ProxiedResource base class send requests to the other nodes in the cluster in parallel whenever all nodes have to be queried. Previously the requests were executed sequentially in a single thread, so the total round trip time grew linearly with the cluster size. Running them in parallel reduces that growth.

The HTTP requests are executed on a shared thread pool whose maximum size is defined by the proxied_requests_max_threads config setting. The default is 64, which is rather arbitrary. Coming up with a sane default is hard for two reasons. First, the pool size is not related to the number of available CPUs: the threads spend most of their time sleeping or blocked on IO, so running more threads than CPUs is desirable for good performance. Second, the number of nodes in the cluster changes at runtime and may not be known at startup. Any recommendations for a good default are welcome.
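As a rough illustration of the approach described above, the following sketch fans one request out to every node on a bounded, shared pool and then collects the results. It is not the code from this PR: the class `ParallelFanOutSketch`, the method `queryAllNodes`, and the `request` function are hypothetical names, and the pool size of 4 in `main` is an arbitrary value chosen for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelFanOutSketch {
    // Hypothetical helper: submits one task per node to the shared pool,
    // then waits for every result. Results are keyed by node id and keep
    // the order of the input list.
    public static <T> Map<String, T> queryAllNodes(ExecutorService pool,
                                                   List<String> nodeIds,
                                                   Function<String, T> request)
            throws InterruptedException, ExecutionException {
        Map<String, Future<T>> pending = new LinkedHashMap<>();
        for (String nodeId : nodeIds) {
            Callable<T> task = () -> request.apply(nodeId);
            // All tasks are submitted before any result is awaited, so up to
            // "pool size" requests are in flight at the same time.
            pending.put(nodeId, pool.submit(task));
        }
        Map<String, T> results = new LinkedHashMap<>();
        for (Map.Entry<String, Future<T>> entry : pending.entrySet()) {
            results.put(entry.getKey(), entry.getValue().get());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a pool sized by proxied_requests_max_threads.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Map<String, String> metrics = queryAllNodes(
                    pool,
                    Arrays.asList("node-a", "node-b", "node-c"),
                    nodeId -> "metrics-from-" + nodeId); // placeholder for an HTTP call
            System.out.println(metrics);
        } finally {
            pool.shutdown();
        }
    }
}
```

With a pool smaller than the number of nodes, the remaining tasks simply queue until a thread becomes free, so the total round trip time degrades gradually rather than failing.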

Motivation and Context

For large cluster sizes, performing proxied requests sequentially could lead to large round trip times, which might exceed the defined timeout. This could lead to functionality being unavailable or even an overloaded Graylog server.

How Has This Been Tested?

Node cleanup was disabled and different ranges of dummy node table entries were created, with transport addresses pointing to a local http stub returning dummy metrics responses. Then, a cluster metrics request was sent to the Graylog server and round trip times were measured.
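A local stub of the kind described above could look roughly like the following. This is a minimal sketch and not the stub used for the measurements: it relies on the JDK's built-in `com.sun.net.httpserver.HttpServer`, and the class name `MetricsStubSketch`, its methods, and the dummy JSON body are assumptions made for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class MetricsStubSketch {
    // Starts a stub on an ephemeral loopback port that answers every
    // request with the same fixed, non-empty body.
    public static HttpServer startStub(String body) throws IOException {
        byte[] payload = body.getBytes(StandardCharsets.UTF_8);
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, payload.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(payload);
            }
        });
        server.start();
        return server;
    }

    // Fetches the stub's response body, standing in for one proxied request.
    public static String fetch(int port) throws IOException {
        URL url = new URL("http://127.0.0.1:" + port + "/");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer stub = startStub("{\"total\":0}"); // hypothetical dummy body
        try {
            long start = System.nanoTime();
            String body = fetch(stub.getAddress().getPort());
            long elapsedMicros = (System.nanoTime() - start) / 1_000;
            System.out.println(body + " in " + elapsedMicros + " us");
        } finally {
            stub.stop(0);
        }
    }
}
```

Pointing many dummy node entries at the same stub address is enough to exercise the fan-out, since each entry still triggers its own HTTP request.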

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)


  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

dennisoelkers added some commits Sep 5, 2016

@dennisoelkers dennisoelkers changed the title from Proxied requests query other nodes parallely to Proxied requests query other nodes in parallel Sep 6, 2016

@dennisoelkers dennisoelkers added this to the 2.1.1 milestone Sep 6, 2016




bernd commented Sep 6, 2016

I am not really happy about introducing another thread pool to the system. A small Graylog server already has around 160 threads in several pools (last time I checked). Most of them are idle, though. I would rather have a few configurable thread pools for different purposes than have every new or modified subsystem add its own thread pool.

But I guess that's a larger refactoring and we shouldn't do that right now.

So for the pool sizing I would like to avoid adding 64 threads by default. The HTTP requests in the proxied resources are currently single-threaded, so I would actually use a pretty low default like 2 or 4, which is already an improvement for smaller setups. In bigger setups where this is still a problem, the pool size needs to be adjusted.

@joschi joschi self-assigned this Sep 7, 2016

@@ -163,6 +163,10 @@
@Parameter(value = "web_tls_key_password")
private String webTlsKeyPassword;
@Parameter(value = "proxied_requests_max_threads", required = true, validator = PositiveIntegerValidator.class)
// TODO: this is a totally arbitrary number. this needs a better default based on ... something.
private int proxiedRequestsMaxThreads = 64;



joschi Sep 7, 2016


I agree with @bernd (#2779 (comment)) that a pool size of 64 threads is a bit too much for most setups.

8 or 16 threads should be fine for most workloads.

# For some cluster-related REST requests, the node must query all other nodes in the cluster. This is the maximum number
# of threads available for this.
proxied_requests_max_threads = 64


new ThreadFactoryBuilder()
.setUncaughtExceptionHandler(new Tools.LogUncaughtExceptionHandler(LoggerFactory.getLogger(ProxiedResource.class.getName())))



joschi Sep 7, 2016


Wrong class name?



joschi Sep 8, 2016


Uncaught exceptions will be logged with the name of the "consumer" of this thread pool (analogous to SchedulerBindings).
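To make the point about the logger name concrete, here is a minimal sketch of a thread factory that logs uncaught exceptions under the name of the class consuming the pool. It is not the PR's code: the PR uses Guava's `ThreadFactoryBuilder` with an SLF4J logger, whereas this sketch uses only the JDK (`java.util.logging`) so that it is self-contained, and the class name `NamedLoggingThreadFactory` is hypothetical.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

public class NamedLoggingThreadFactory implements ThreadFactory {
    private final AtomicInteger counter = new AtomicInteger();
    private final String prefix;
    private final Logger log;

    // "consumer" is the class that uses the pool; its name becomes the
    // logger name, so log lines point at the subsystem that owns the work.
    public NamedLoggingThreadFactory(String prefix, Class<?> consumer) {
        this.prefix = prefix;
        this.log = Logger.getLogger(consumer.getName());
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = new Thread(task, prefix + "-" + counter.getAndIncrement());
        thread.setDaemon(true);
        thread.setUncaughtExceptionHandler((t, e) ->
                log.log(Level.SEVERE, "Uncaught exception in thread " + t.getName(), e));
        return thread;
    }
}
```

Note that the handler only sees exceptions from tasks started with `execute`. Tasks started with `submit` capture their exception in the returned `Future`, where it surfaces on `get()`.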

joschi and others added some commits Sep 7, 2016




joschi commented Sep 8, 2016


@joschi joschi merged commit 87acd4a into 2.1 Sep 8, 2016

4 checks passed

ci-server-integration Jenkins build graylog2-server-integration-pr 1345 has succeeded
ci-web-linter Jenkins build graylog-pr-linter-check 828 has succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed

@joschi joschi deleted the issue-2764 branch Sep 8, 2016

joschi added a commit that referenced this pull request Sep 8, 2016

Proxied requests query other nodes in parallel (#2779)
* Call other nodes concurrently for proxied requests.
* Injecting ExecutorService in ProxiedResource + configured max pool size
* Explaining proxied_requests_max_threads config parameter in sample config.
* Making config value consistent, changing default, explaining sizing.
(cherry picked from commit 87acd4a)