
Rejected executions in QueuedThreadPool can lead to memory leaks #1705

Closed
julienmoumne opened this issue Aug 2, 2017 · 3 comments


julienmoumne commented Aug 2, 2017

This issue follows the discussion that took place with @gregw.

We are experiencing a memory leak in Jetty 9.4.6.

We are still unsure whether we are using Jetty improperly or if there is indeed a limitation in Jetty.

This behavior is observed under high load in a simulated environment using Gatling and bench-rest, where the QueuedThreadPool is saturated and rejects jobs.

We are getting the following WARN log from QueuedThreadPool:

org.eclipse.jetty.util.thread.QueuedThreadPool: dw{STARTED,4<=4<=4,i=0,q=4} rejected org.eclipse.jetty.io.ManagedSelector$$Lambda$72/1495355211@413f1bdf

(the issue is also reproducible with much larger thread pools and queue sizes)

The rejected lambda mentioned in the log is defined in ManagedSelector.destroyEndPoint.

Since the lambda is not called, AbstractConnector.onEndPointClosed is never called.

Therefore, EndPoint objects remain in AbstractConnector._endpoints, which leads to out-of-memory errors.
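
Not Jetty's actual internals, but a minimal sketch of the failure mode as we understand it: endpoint cleanup is dispatched as a job to a bounded, saturated pool, and when that job is rejected the endpoint is never removed from the tracking collection. All class and field names below are hypothetical.

    import java.util.Set;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.RejectedExecutionException;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // Illustration only -- not Jetty code.
    class LeakSketch {
        // stands in for AbstractConnector._endpoints
        static final Set<Object> endPoints = ConcurrentHashMap.newKeySet();

        // stands in for a tiny, saturated QueuedThreadPool
        static final ExecutorService pool = new ThreadPoolExecutor(
                4, 4, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(4));

        static void closeEndPoint(Object endPoint) {
            try {
                // cleanup is submitted as a job, like the rejected lambda in the log
                pool.execute(() -> endPoints.remove(endPoint));
            } catch (RejectedExecutionException rejected) {
                // if the job is rejected, remove() never runs and the endpoint
                // stays referenced in 'endPoints' -> the heap grows without bound
            }
        }
    }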

A screenshot of a memory dump showing 624,159 endpoint objects holding 5 GiB of heap memory: https://imagebin.ca/v/3VTm2n97Ntki

A screenshot of how we validated the behavior using a "logging breakpoint" in the IDE: https://imagebin.ca/v/3VTnTSgJDnXx

joakime added the Bug label on Aug 2, 2017
joakime (Contributor) commented Aug 2, 2017

The topic of tuning Jetty for load testing has come up often enough here, on the mailing lists, and the various stackexchange websites. We need to flesh out our documentation better, for sure.

Here are my thoughts (@gregw will have a clearer set of statements), focused on the load-testing configuration part of the question (the memory-leak bug is a separate part of the question and is not addressed in this response).

"much larger thread pools and queue sizes" is a vague statement and has many different meanings to different people. Please specify the configurations you consider "much larger" (include the ThreadPool implementation and configuration, if you have a custom queue, which one and how its configured, what ServerConnectors you have defined, and how those connectors are configured)

Ow ... dw{STARTED,4<=4<=4,i=0,q=4}: with a thread pool that small, I would have expected an error message at startup telling you about your system hardware configuration and that invalid thread pool setup.

E.g., with a typical dual-core CPU + 1 network interface and 1 connector, you've already consumed 3 of those 4 threads just for fundamental selector / acceptor handling.

Which means you have 1 thread, and only ever 1 thread, to handle requests.
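
For illustration, here's a sketch (not taken from the report) of how those acceptor and selector counts can be made explicit with the embedded-Jetty API, so the thread budget is visible; the port and counts are example values only.

    import org.eclipse.jetty.server.Server;
    import org.eclipse.jetty.server.ServerConnector;
    import org.eclipse.jetty.util.thread.QueuedThreadPool;

    public class ThreadBudgetExample {
        public static void main(String[] args) throws Exception {
            // 4 threads total: with 1 acceptor + 2 selectors on the connector,
            // only 1 thread is ever left over to handle requests.
            QueuedThreadPool threadPool = new QueuedThreadPool(4, 4);
            Server server = new Server(threadPool);

            // ServerConnector(server, acceptors, selectors): acceptor and
            // selector threads are taken from the same shared pool.
            ServerConnector connector = new ServerConnector(server, 1, 2);
            connector.setPort(8080);
            server.addConnector(connector);
            server.start();
        }
    }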

This is not an appropriate setup for load testing; it is a configuration for resource-constrained environments (think real-time OSes, sub-16MB memory footprints for the server, limited JavaSE support, simple webapps with tiny libs, only ever 1 HTTP client connecting to it, simple requests rather than whole web pages with resources, etc.).

If you set up a queue implementation that is fixed in size, you are also not configuring for load testing.

A non-Jetty example: Spring Boot has a "limited resources" configuration of 20 threads max on the thread pool for simple REST applications with 1 endpoint and 1 REST client, with a recommendation of 200 for typical low-traffic web servers serving full web pages, and 5,000 for high-load servers.

Since you see a bunch of rejection statements in the log, your thread pool configuration is definitely insufficient for your load testing.

If you are load testing, the QTP should either be default configuration (if you are using those threads intelligently with Servlet 3.x async behaviors) or many multiples higher (if you are using standard servlet behaviors).

For high load production systems, the QTP max is typically set in the thousands (2,000 all the way up past 8,000 in some extreme cases).

We often get asked to give people the numbers (for their configuration) that they should use, but the exact numbers you need for your system depend on factors specific to your setup, such as: the hardware environment, the behavior of your specific webapp, the behavior of your web clients, the capabilities of your network, and the servlet technologies you are using (to name a few common factors).

julienmoumne (Author) commented Aug 3, 2017

The log shows a pool of 4 threads because I wanted to replicate the issue on my box with a minimum of resources.

Here are the details you requested for a configuration with which the issue can be replicated (a programmatic sketch of this setup follows the list):

  • QueuedThreadPool, minThreads=500, maxThreads=500, default timeouts
  • BlockingArrayQueue, capacity=500, growBy=500, maxCapacity=20000
  • ServerConnector with default dropwizard configuration values except for idleTimeout set to 2 seconds, acceptors set to 3, selectors set to 2
  • Standard sync servlet performing IO on a database and JSON serialization
  • running on KVM/Ubuntu 16.04.2/Docker 17.06.0-ce, 3 cores, 8GiB (-Xmx set to 4GiB)
  • 7k rps (875 short-lived connections per sec) applied from a different machine
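
For reference, a rough embedded-Jetty sketch of the setup above (an approximation only; the real deployment uses Dropwizard's configuration, and the servlet itself is omitted):

    import org.eclipse.jetty.server.Server;
    import org.eclipse.jetty.server.ServerConnector;
    import org.eclipse.jetty.util.BlockingArrayQueue;
    import org.eclipse.jetty.util.thread.QueuedThreadPool;

    public class ReproducerConfig {
        public static void main(String[] args) throws Exception {
            // bounded job queue: capacity=500, growBy=500, maxCapacity=20000
            BlockingArrayQueue<Runnable> queue =
                    new BlockingArrayQueue<>(500, 500, 20000);
            // minThreads=500, maxThreads=500, default-ish idle timeout
            QueuedThreadPool threadPool =
                    new QueuedThreadPool(500, 500, 60_000, queue);

            Server server = new Server(threadPool);

            // 3 acceptors, 2 selectors, idleTimeout of 2 seconds, as in the report
            ServerConnector connector = new ServerConnector(server, 3, 2);
            connector.setIdleTimeout(2_000);
            connector.setPort(8080);
            server.addConnector(connector);

            // the application deploys a standard synchronous servlet doing
            // database IO and JSON serialization; that part is omitted here
            server.start();
        }
    }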

The size of the BlockingArrayQueue almost instantly reaches 20k, and rejected executions for destroyEndPoint start to appear.

Memory is full after less than 5 minutes.

Setting the capacity of the BlockingArrayQueue to the maximum, as recommended by @gregw, fixes the memory leak for this test. The queue size reaches 32k at most.
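
Assuming "the maximum" means raising the queue's maximum capacity so that it is effectively unbounded, the change amounts to swapping the queue construction in the sketch above for something like:

    // effectively unbounded job queue: executions are no longer rejected;
    // in this test the queue peaked at roughly 32k entries instead of
    // hitting the previous 20k cap
    BlockingArrayQueue<Runnable> queue =
            new BlockingArrayQueue<>(500, 500, Integer.MAX_VALUE);
    QueuedThreadPool threadPool =
            new QueuedThreadPool(500, 500, 60_000, queue);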

I understand that the tuning of the machine and of Jetty is very poor given the load applied.

Tuning advice is always welcome, but I understand this is not the place for it.

I guess the question is: is it ok for a badly tuned machine/Jetty to run out of memory?

Maybe this is fair and in this case the ticket can be closed.

Thanks for your help.

sbordet (Contributor) commented Sep 14, 2017

In Jetty 9.4.7, destroyEndPoint is now a non-blocking action, which means it will not be submitted to a thread pool for execution but will run directly in the calling thread; see #1804.

This is a legitimate leak: we should not leak memory no matter what. The thread pool size does not matter; it was a bug and it is fixed now.
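
Not the actual change from #1804, but a minimal sketch of the general pattern: a task that declares itself non-blocking is run inline in the calling thread rather than being handed to a (possibly saturated) pool, so it can never be rejected. Jetty exposes this notion via org.eclipse.jetty.util.thread.Invocable; the dispatcher class below is hypothetical.

    import java.util.concurrent.Executor;
    import org.eclipse.jetty.util.thread.Invocable;

    // Hypothetical dispatcher illustrating the idea behind the fix:
    // non-blocking tasks bypass the pool entirely, so a saturated
    // QueuedThreadPool can no longer cause their cleanup to be skipped.
    class Dispatcher {
        private final Executor pool;

        Dispatcher(Executor pool) {
            this.pool = pool;
        }

        void dispatch(Runnable task) {
            if (Invocable.getInvocationType(task) == Invocable.InvocationType.NON_BLOCKING)
                task.run();          // run inline in the calling thread: cannot be rejected
            else
                pool.execute(task);  // may be rejected when the pool is saturated
        }
    }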
