
Elasticsearch (0.90.2) fails in large core (Ex: ~48) machine #3478

Closed
pmanvi opened this Issue Aug 10, 2013 · 5 comments


pmanvi commented Aug 10, 2013

Elasticsearch creates a number of threads based on the number of processors available. So when the Elasticsearch client library is initialized, it tries to create 350+ threads, which is too much for the machine and results in an OOM.

Apparently org.elasticsearch.threadpool.ThreadPool sizes its pools based on the cores available on the machine.

Extract from the logs:
thread_pool [index], type [fixed], size [48], queue_size [null], reject_policy [abort], queue_type [linked]
[org.elasticsearch.threadpool] : [xxxxxxxx] creating thread_pool [bulk], type [fixed], size [48], queue_size [null], reject_policy [abort], queue_type [linked]
[org.elasticsearch.threadpool] : [xxxxxxxx] creating thread_pool [get], type [fixed], size [48], queue_size [null], reject_policy [abort], queue_type [linked]
[org.elasticsearch.threadpool] : [xxxxxxxx] creating thread_pool [search], type [fixed], size [144], queue_size [1k], reject_policy [abort], queue_type [linked]
[org.elasticsearch.threadpool] : [xxxxxxxx] creating thread_pool [percolate], type [fixed], size [48], queue_size [null], reject_policy [abort], queue_type [linked]
[org.elasticsearch.threadpool] : [xxxxxxxx] creating thread_pool [management], type [scaling], min [1], size [5], keep_alive [5m]

and the stack trace:

java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:640)
at java.util.concurrent.ThreadPoolExecutor.addThread(ThreadPoolExecutor.java:681)
at java.util.concurrent.ThreadPoolExecutor.addIfUnderMaximumPoolSize(ThreadPoolExecutor.java:727)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:655)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker.start(DeadLockProofWorker.java:38)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:343)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.<init>(AbstractNioSelector.java:95)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.<init>(AbstractNioWorker.java:53)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.<init>(NioWorker.java:45)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:45)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:28)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorkerPool.newWorker(AbstractNioWorkerPool.java:99)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorkerPool.init(AbstractNioWorkerPool.java:69)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:39)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:33)
at org.elasticsearch.transport.netty.NettyTransport.doStart(NettyTransport.java:240)
at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:85)
at org.elasticsearch.transport.TransportService.doStart(TransportService.java:90)
at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:85)
at org.elasticsearch.client.transport.TransportClient.<init>(TransportClient.java:179)
at org.elasticsearch.client.transport.TransportClient.<init>(TransportClient.java:119)
...

Contributor

s1monw commented Aug 10, 2013

Hey @pmanvi, can you provide some more info, like:

  • how much memory are you giving to the JVM?
  • how much memory are you using during application runtime?
  • can you paste the output of ulimit -Hn as well as ulimit -a?
  • how much traffic are you sending to your ES cluster when you see this problem?

@ghost ghost assigned s1monw Aug 10, 2013

Author

pmanvi commented Aug 10, 2013

I tried setting higher -Xmx values and also raising the ulimit -a limits; none of them helped. (We have 32 GB of RAM and are running many other Java services.)
This is test code, so there is no traffic as such. The error appears as soon as I instantiate TransportClient(), right when the thread pools are initialized, as you can see from the logs:

creating thread_pool [search], type [fixed], size [144],

I guess the primary reason is the upfront creation of threads based on the number of processors (48 in our case).
http://www.elasticsearch.org/guide/reference/modules/threadpool/
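The sizing in the log lines above can be reproduced arithmetically. This is a rough sketch inferred only from the logged output (48 cores gives index/bulk/get/percolate pools of 48 threads and a search pool of 144), not the actual Elasticsearch source; the authoritative logic lives in org.elasticsearch.threadpool.ThreadPool:

```java
// Sketch of the 0.90-era default pool sizes as they appear in the logs above.
// Formulas are inferred from the log output, not copied from ES source code.
public class PoolSizeSketch {
    // index, bulk, get, percolate: one thread per available processor
    static int fixedPoolSize(int availableProcessors) {
        return availableProcessors;
    }

    // search: three threads per available processor (48 cores -> size [144])
    static int searchPoolSize(int availableProcessors) {
        return availableProcessors * 3;
    }

    public static void main(String[] args) {
        int cores = 48; // the machine from the report
        System.out.println(fixedPoolSize(cores));  // 48
        System.out.println(searchPoolSize(cores)); // 144
    }
}
```

With five or more pools sized this way, plus netty workers, a 48-core machine quickly reaches the 350+ threads reported above.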

s1monw added a commit to s1monw/elasticsearch that referenced this issue Aug 14, 2013

Limit the number of created threads for machines with a large number of cores
For machines with lots of cores, i.e. >= 48, the number of threads
created by default might cause unnecessary memory pressure on the system
and can even lead to an OOM where the system is not able to create any
native threads anymore. This commit limits the number of available
CPUs on the system used for thread pool initialization to at most
24 cores.

Closes elastic#3478

@s1monw s1monw closed this in 0472bac Aug 14, 2013

s1monw added a commit that referenced this issue Aug 14, 2013

Limit the number of created threads for machines with a large number of cores
For machines with lots of cores, i.e. >= 48, the number of threads
created by default might cause unnecessary memory pressure on the system
and can even lead to an OOM where the system is not able to create any
native threads anymore. This commit limits the number of available
CPUs on the system used for thread pool initialization to at most
24 cores.

Closes #3478
Contributor

s1monw commented Aug 14, 2013

I limited the number of CPUs we take into account when sizing the ThreadPool to 24 processors. This should give most people good defaults and prevent crazy OOMs from too many threads. It would be great if you could give it a go.
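The essence of the fix, as described in the commit message, is to cap the processor count fed into pool sizing. A minimal sketch of that bounding (the cap was 24 here, later raised to 32 in elastic#3545):

```java
// Minimal sketch of the bounding described in the commit: cap the processor
// count used for thread pool sizing, so a 48-core machine sizes its pools
// as if it had 24 cores. Not the actual ES implementation.
public class BoundedProcessors {
    static final int MAX_PROCESSORS = 24; // raised to 32 in elastic#3545

    static int boundedProcessors(int availableProcessors) {
        return Math.min(MAX_PROCESSORS, availableProcessors);
    }

    public static void main(String[] args) {
        System.out.println(boundedProcessors(48)); // 24: large machines are capped
        System.out.println(boundedProcessors(8));  // 8: small machines unaffected
    }
}
```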

Member

kimchy commented Aug 14, 2013

Note that in the client library (if you are using a node as a client), most thread pools will not really be used (we don't pre-start those threads). The ones I see in the stack trace are the netty worker threads (which by default number num_proc * 2). If you want to control that explicitly, you can set the transport.netty.worker_count setting. The commit @s1monw pushed will also bound it to at most 24 * 2.
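For reference, the setting @kimchy mentions can be pinned explicitly instead of relying on the num_proc * 2 default; the value below is only an illustrative choice, not a recommendation:

```yaml
# elasticsearch.yml (or the client's Settings) — cap the netty I/O worker
# threads explicitly instead of relying on the num_proc * 2 default.
# The value 8 is just an example; pick what fits your hardware and load.
transport.netty.worker_count: 8
```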

Contributor

s1monw commented Aug 15, 2013

Thanks @kimchy for clarifying. My commit just tries not to go crazy with the number of threads; if you have strong hardware, you should adjust the sizes according to your needs.

kimchy added a commit that referenced this issue Aug 20, 2013

Bound processor-count-based calculations to 32
We use the number of processors to choose default thread pool sizes and the number of workers in networking (for HTTP and transport). Bound it to a max of 32 by default as a safety measure against creating too many threads.

This relates to #3478, where we set the default to 24, but 32 is probably a better default.

closes #3545

kimchy added a commit that referenced this issue Aug 20, 2013

Bound processor-count-based calculations to 32
We use the number of processors to choose default thread pool sizes and the number of workers in networking (for HTTP and transport). Bound it to a max of 32 by default as a safety measure against creating too many threads.

This relates to #3478, where we set the default to 24, but 32 is probably a better default.

closes #3545

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015

Limit the number of created threads for machines with a large number of cores

Closes elastic#3478

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015

Bound processor-count-based calculations to 32

closes elastic#3545