
User stacks starting before infra services end up in a deadlock because they exhaust the process-blocking thread pool #9680

Closed
aemneina opened this issue Aug 15, 2017 · 2 comments


@aemneina

Rancher versions:
rancher/server: v1.6.5

Steps to Reproduce:

  1. Have a large number of hosts and services per environment.
  2. Reboot all hosts.
  3. Observe that delayed processes grow indefinitely.
  4. Observe that the ProcessBlocking pool is fully consumed.

Results:
The ProcessBlocking thread pool was expanded so that processes could start processing again.
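The failure mode here is classic thread-pool starvation: every worker in the bounded ProcessBlocking pool is occupied by a process that is itself waiting on work queued behind it on the same pool. A minimal, self-contained JDK sketch of that pattern (illustrative only, not Cattle's actual code):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Starvation deadlock: each worker blocks on a child task that is
// queued behind it on the same bounded pool, so nothing ever runs.
public class PoolStarvationDemo {
    public static void main(String[] args) {
        ExecutorService blockingPool = Executors.newFixedThreadPool(2);

        for (int i = 0; i < 2; i++) {
            blockingPool.submit(() -> {
                // Child process queued on the same pool the parent occupies.
                Future<?> child = blockingPool.submit(() -> {});
                return child.get(); // parent holds its worker while waiting
            });
        }
        // Both workers now block on children that can never be scheduled:
        // the "delayed processes grow indefinitely" symptom from the steps
        // above. The JVM hangs here.
    }
}
```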

@deniseschannon

Available with rancher/server:v1.6.8-rc4

@moelsayed
Contributor

moelsayed commented Aug 18, 2017

I was able to reproduce the issue with the following parameters:

rancher v1.6.5
20 hosts
2 environments
50 services, scale 30 each
pool.processblockingexecutorservice.max.size = 5
pool.processblockingexecutorservice.core.size = 2
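For context, the core.size and max.size knobs correspond to the core and maximum pool sizes of a standard java.util.concurrent.ThreadPoolExecutor. A sketch of a pool configured with the values above (my reading of the settings, not Cattle's actual wiring; the queue bound is an assumption):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BlockingPoolConfig {
    // pool.processblockingexecutorservice.core.size = 2
    // pool.processblockingexecutorservice.max.size  = 5
    static final ExecutorService PROCESS_BLOCKING = new ThreadPoolExecutor(
            2,                                // core threads kept alive
            5,                                // hard cap on worker threads
            60, TimeUnit.SECONDS,             // idle timeout above core size
            new LinkedBlockingQueue<>(100));  // queue bound is an assumption;
                                              // the pool only grows past core
                                              // size once this queue is full
}
```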

Steps to reproduce:

  • Install v1.6.5.
  • Create 2 environments and add 10 hosts to each.
  • Create 25 stacks per environment at scale 30, running nginx containers.
  • Shut down all the hosts.
  • Wait for all hosts to show as disconnected on the server.
  • Start the hosts back up.

Results:
The services on the setup remained in the updating-active state for several hours.

To verify the merged fix, I tested with the following setup parameters:

rancher v1.6.8-rc4
20 hosts
2 environments
50 services, scale 30 each
pool.processblockingexecutorservice.max.size = 5
pool.processblockingexecutorservice.core.size = 2
pool.processblockingextraexecutorservice.core.size = 2
pool.processblockingextraexecutorservice.max.size = 5
pool.processblockingsystemexecutorservice.max.size = 5
pool.processblockingsystemexecutorservice.core.size = 2

Following the same reproduction steps, the services recovered within 30-35 minutes of rebooting the hosts.
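That recovery matches the shape of the fix the new properties imply: instead of one shared pool, system (infrastructure) processes and "extra" (user) processes get dedicated executors, so blocked user stacks can no longer starve infra services of worker threads. A hedged sketch of the idea (the dispatch logic is illustrative; the names mirror the properties above, not Cattle's classes):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SplitBlockingPools {
    private static ExecutorService pool(int core, int max) {
        // Bounded queue is an assumption; workers grow core -> max as it fills.
        return new ThreadPoolExecutor(core, max, 60, TimeUnit.SECONDS,
                                      new LinkedBlockingQueue<>(100));
    }

    // pool.processblockingsystemexecutorservice.{core,max}.size = 2 / 5
    static final ExecutorService SYSTEM = pool(2, 5);
    // pool.processblockingextraexecutorservice.{core,max}.size = 2 / 5
    static final ExecutorService EXTRA = pool(2, 5);

    static Future<?> submit(Runnable process, boolean infra) {
        // Infra (system) processes keep dedicated workers, so a flood of
        // blocked user-stack processes can no longer starve them.
        return (infra ? SYSTEM : EXTRA).submit(process);
    }
}
```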
