
Pod CPU usage on master2 node higher than on other nodes #82518

Closed
xieyanker opened this issue Sep 10, 2019 · 6 comments

Comments

@xieyanker

commented Sep 10, 2019

What happened:

Hello, my pods' CPU usage on the master2 node is higher than on the other nodes, even though the same pods are running there. And all of my nodes have identical CPUs.

root@csf-637f3666-83ac-4d8b-8781-4861df897ba65e53-master-1:~/xujingsong# kubectl top node
NAME      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
master1   433m         14%    3277Mi          50%       
master2   2642m        88%    2633Mi          40%       
master3   125m         4%     1846Mi          28%       
slave1    125m         3%     2641Mi          37% 

root@csf-637f3666-83ac-4d8b-8781-4861df897ba65e53-master-1:~/xujingsong# kubectl get pod -A -owide|grep master2
default       nginx-9498cf4bf-7lkjr                       2/2     Running   0          3d      10.151.208.45     master2   <none>           <none>
default       nginx-9498cf4bf-f5xz7                       2/2     Running   0          3d      10.151.208.44     master2   <none>           <none>
kube-system   calico-node-rsgmv                           1/1     Running   0          3d      192.168.202.110   master2   <none>           <none>
kube-system   coredns-9c98d6cbb-n24mp                     1/1     Running   0          3d20h   10.151.208.2      master2   <none>           <none>
kube-system   kube-apiserver-master2                      1/1     Running   0          4h57m   192.168.202.110   master2   <none>           <none>
kube-system   kube-controller-manager-master2             1/1     Running   1          3d20h   192.168.202.110   master2   <none>           <none>
kube-system   kube-proxy-master2                          1/1     Running   0          3d20h   192.168.202.110   master2   <none>           <none>
kube-system   kube-scheduler-master2                      1/1     Running   1          3d20h   192.168.202.110   master2   <none>           <none>
kube-system   metrics-server-5cb9b96667-zv82r             1/1     Running   0          3d20h   10.151.208.4      master2   <none>           <none>
kube-system   nginx-proxy-master2                         1/1     Running   0          3d20h   192.168.202.110   master2   <none>           <none>
kube-system   resource-reserver-master2                   1/1     Running   0          3d20h   10.151.208.1      master2   <none>           <none>
kube-system   vpa-updater-557c96bf46-wnmdd                1/1     Running   1          3d20h   10.151.208.10     master2   <none>           <none>
monitoring    alertmanager-67c75747cd-q7rv6               1/1     Running   0          3d      10.151.208.49     master2   <none>           <none>
monitoring    fluentd-8bqmz                               1/1     Running   0          3d20h   192.168.202.110   master2   <none>           <none>
monitoring    node-exporter-cp6k9                         1/1     Running   0          3d20h   10.151.208.3      master2   <none>           <none>

root@csf-637f3666-83ac-4d8b-8781-4861df897ba65e53-master-1:~/xujingsong# kubectl top pod -A
NAMESPACE     NAME                                        CPU(cores)   MEMORY(bytes)    node
default       nginx-9498cf4bf-7lkjr                       63m          35Mi             master2
default       nginx-9498cf4bf-9bk48                       6m           40Mi            
default       nginx-9498cf4bf-b6d78                       11m          35Mi            
default       nginx-9498cf4bf-f5xz7                       51m          35Mi             master2
default       nginx-9498cf4bf-mmlms                       5m           35Mi            
default       nginx-9498cf4bf-r7drw                       5m           35Mi            
kube-system   calico-kube-controllers-7b4cc87cc7-485hb    2m           15Mi            
kube-system   calico-node-896m5                           18m          52Mi            
kube-system   calico-node-gxhkv                           18m          25Mi            
kube-system   calico-node-rsgmv                           403m         24Mi             master2         
kube-system   calico-node-svtm8                           10m          24Mi            
kube-system   coredns-9c98d6cbb-gsspx                     12m          7Mi             
kube-system   coredns-9c98d6cbb-n24mp                     108m         7Mi              master2           
kube-system   db-daemon-server-7859f8794b-jgc6f           2m           875Mi           
kube-system   dns-autoscaler-6b9bbb69f4-ng48v             1m           8Mi             
kube-system   kube-apiserver-master1                      40m          280Mi           
kube-system   kube-apiserver-master2                      359m         127Mi            master2          
kube-system   kube-apiserver-master3                      34m          205Mi           
kube-system   kube-controller-manager-master1             1m           14Mi            
kube-system   kube-controller-manager-master2             308m         54Mi             master2          
kube-system   kube-controller-manager-master3             1m           14Mi            
kube-system   kube-proxy-master1                          5m           21Mi            
kube-system   kube-proxy-master2                          18m          20Mi             master2        
kube-system   kube-proxy-master3                          3m           22Mi            
kube-system   kube-proxy-slave1                           1m           21Mi            
kube-system   kube-scheduler-master1                      4m           18Mi            
kube-system   kube-scheduler-master2                      40m          16Mi             master2           
kube-system   kube-scheduler-master3                      1m           17Mi            
kube-system   metrics-server-5cb9b96667-zv82r             17m          22Mi             master2           
kube-system   nginx-proxy-master1                         1m           4Mi             
kube-system   nginx-proxy-master2                         52m          5Mi              master2            
kube-system   nginx-proxy-master3                         1m           6Mi             
kube-system   resource-reserver-master1                   0m           0Mi             
kube-system   resource-reserver-master2                   0m           0Mi              master2            
kube-system   resource-reserver-master3                   0m           0Mi             
kube-system   tiller-deploy-cdd5d96ff-rt26n               1m           12Mi            
kube-system   trigger-d85ff74f-49cxk                      1m           5Mi             
kube-system   vpa-admission-controller-77f89cb6d9-7b6dv   9m           16Mi            
kube-system   vpa-recommender-658cb5b88b-mrvmg            19m          25Mi            
kube-system   vpa-updater-557c96bf46-wnmdd                135m         21Mi             master2           
monitoring    alertmanager-67c75747cd-q7rv6               21m          6Mi              master2             
monitoring    blackbox-exporter-679b8bc8c-ghp82           0m           2Mi             
monitoring    fluentd-8bqmz                               53m          87Mi             master2          
monitoring    fluentd-pnczp                               4m           80Mi            
monitoring    fluentd-r6bwx                               10m          85Mi            
monitoring    fluentd-rjsfw                               8m           113Mi           
monitoring    kube-state-metrics-7f59b96454-mb82s         7m           27Mi            
monitoring    node-exporter-5gs75                         0m           11Mi            
monitoring    node-exporter-cp6k9                         0m           10Mi             master2           
monitoring    node-exporter-qn9pg                         0m           10Mi            
monitoring    node-exporter-wxcch                         1m           11Mi            
monitoring    prometheus-5c8cd4644d-sj94t                 41m          752Mi           
monitoring    prometheus-adapter-557df4fb4b-69gfz         3m           123Mi     

What you expected to happen:

In my opinion, a pod's CPU usage should be roughly the same no matter which node it is running on.

How to reproduce it (as minimally and precisely as possible):

It reproduces 100% of the time in my environment.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:36:19Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:36:19Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
    NAME="Ubuntu"
    VERSION="18.04.2 LTS (Bionic Beaver)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 18.04.2 LTS"
    VERSION_ID="18.04"
    HOME_URL="https://www.ubuntu.com/"
    SUPPORT_URL="https://help.ubuntu.com/"
    BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
    PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
    VERSION_CODENAME=bionic
    UBUNTU_CODENAME=bionic
  • Kernel (e.g. uname -a):
    Linux csf-637f3666-83ac-4d8b-8781-4861df897ba65e53-master-2 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@xieyanker

Author

commented Sep 10, 2019

/sig cli
/sig apps
/sig api-machinery
/sig node

@xiaoanyunfei

Contributor

commented Sep 11, 2019

By default, kube-scheduler and kube-controller-manager both run a leader election before executing their main loop, and only one instance of each can gain leadership.

The kube-scheduler and kube-controller-manager on master2 may have gained leadership and be getting their data from the local kube-apiserver (it depends on your configuration).
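
For reference, here is one way to check which master currently holds the locks (a sketch; in 1.14 the default resource lock for both components is an Endpoints object in kube-system, but this depends on the --leader-elect-resource-lock flag):

# With the default "endpoints" lock, the current holder is recorded in the
# control-plane.alpha.kubernetes.io/leader annotation of an Endpoints object.
$ kubectl -n kube-system get endpoints kube-controller-manager -o yaml | grep holderIdentity
$ kubectl -n kube-system get endpoints kube-scheduler -o yaml | grep holderIdentity
# If your cluster is configured to use Lease objects instead:
$ kubectl -n kube-system get lease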

@xieyanker

Author

commented Sep 11, 2019

@xiaoanyunfei Thanks for your reply. My configuration does set --leader-elect=true, and the leader is on master2. But I don't think that is the main reason: almost every pod on master2 is using about ten times more CPU than its counterparts on the other nodes.
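
A quick way to double-check the flags actually in effect (a sketch; the manifest path below is the usual static-pod location and may differ in your install):

# Confirm the leader-election flags on each master node.
$ grep leader-elect /etc/kubernetes/manifests/kube-controller-manager.yaml
$ grep leader-elect /etc/kubernetes/manifests/kube-scheduler.yaml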

@caesarxuchao

Member

commented Sep 12, 2019

/assign @cheftako

@cheftako

Member

commented Sep 13, 2019

So currently an uneven load across masters (in a multi-master cluster) is to be expected. The primary culprit is this line:
kube-system kube-controller-manager-master2 308m 54Mi master2

Many of our processes run with a leader and hot stand-bys. The primary systems which work this way are the kube-controller-manager, cloud-controller-manager, and kube-scheduler.
In general the kube-controller-manager tends to be the busiest and largest.

For each of these servers on a 3-master system we expect 1 leader and 2 stand-bys.
So for the kube-controller-manager there is one copy of the server which won the kube-controller-manager leader election.
Once it has won the leader election, it brings up all the linked controllers, establishes watches, and begins trying to bring the cluster into alignment with the goal state.
The two stand-bys meanwhile just loop trying to win a new round of the kube-controller-manager leader election.
As long as the current leader remains healthy they will fail and keep retrying.
As a result one master (master2 in this case) has a busy copy of the kube-controller-manager.
The other two masters have kube-controller-managers which are essentially idle.
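
This is also visible in the component logs; a sketch, assuming the messages emitted by client-go's leaderelection package (exact wording may differ by version):

# On the leader, expect a one-time acquisition message; on the stand-bys,
# expect repeated acquisition attempts while the leader stays healthy.
$ kubectl -n kube-system logs kube-controller-manager-master2 | grep -i acquire
# leader:   "successfully acquired lease kube-system/kube-controller-manager"
$ kubectl -n kube-system logs kube-controller-manager-master1 | grep -i acquire
# stand-by: "attempting to acquire leader lease ..." repeated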

@xieyanker

Author

commented Sep 13, 2019

@cheftako Thanks for your reply, it is very helpful to me!

xieyanker closed this Sep 13, 2019
