
pilot discovery cpu usage is too high #6962

Closed
hzxuzhonghu opened this issue Jul 10, 2018 · 21 comments

Comments

@hzxuzhonghu
Member

commented Jul 10, 2018

Describe the bug

pilot-discovery CPU usage is about 20% when running the bookinfo sample.

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
32607 root      20   0   96340  83696  26556 S  19.3  0.5 311:59.26 pilot-discovery

Expected behavior

not too much cpu

Steps to reproduce the bug

run sample bookinfo

Version
What version of istio and Kubernetes are you using? Use istioctl version and kubectl version

Version: 1.0.0-snapshot.0

root@szvp000201060:/home/paas# kubectl version
Client Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.0-alpha.0.1799+fd222e64412e4c-dirty", GitCommit:"fd222e64412e4c402759b49f1f130e1727ef7a77", GitTreeState:"dirty", BuildDate:"2018-07-09T06:05:40Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.0-alpha.0.1799+fd222e64412e4c-dirty", GitCommit:"fd222e64412e4c402759b49f1f130e1727ef7a77", GitTreeState:"dirty", BuildDate:"2018-07-09T06:05:40Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}

Is Istio Auth enabled or not?

no

Environment

linux

@ymesika

Member

commented Jul 10, 2018

Is it the peak CPU usage? Does it go down after the Bookinfo page is loaded?

@hzxuzhonghu

Member Author

commented Jul 10, 2018

No, it has been like this ever since I first noticed it.

@hzxuzhonghu

Member Author

commented Jul 11, 2018

/kind bug

@hzxuzhonghu

Member Author

commented Aug 10, 2018

It still exists in the 1.0 branch.


   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                        
105035 root      20   0   69808  58400  26196 S  17.1  0.4 221:15.49 /usr/local/bin/pilot-discovery discovery

@nmittler

Contributor

commented Aug 10, 2018

@costinm any update on the fix that you mentioned in #7595 (comment)?

@ibuildthecloud


commented Aug 21, 2018

Did a bit of digging around and I think the idle CPU usage is due to the queue in queue.go. The queue appears to run at the rate limit all the time, so something like 10 times a second it wakes up and does nothing. I changed the implementation of queue.go to use a channel and my CPU usage went to 0.3% from a consistent 12% on my system.

I'm not sure why the queue is implemented with a slice and sleeping as opposed to a channel, but the channel approach seems to work better at doing nothing :)
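
For readers following along, here is a minimal sketch of what a channel-based queue could look like. It is not the actual Istio code or the patch described above: `Task`, `ChanQueue`, `NewChanQueue` and `processAll` are hypothetical names, and the sketch leaves out rate limiting and retries. The point it illustrates is that a worker blocked in `select` is parked by the Go runtime and uses no CPU until a task or a stop signal arrives.

```go
package main

import "sync"

// Task is a unit of work. The name is hypothetical, chosen for this sketch.
type Task func() error

// ChanQueue hands tasks to a single worker over a buffered channel.
type ChanQueue struct {
	tasks chan Task
}

// NewChanQueue creates a queue that can hold up to buffer pending tasks.
func NewChanQueue(buffer int) *ChanQueue {
	return &ChanQueue{tasks: make(chan Task, buffer)}
}

// Push enqueues a task. It blocks if the buffer is full.
func (q *ChanQueue) Push(t Task) {
	q.tasks <- t
}

// Run processes tasks until stop is closed. While the queue is empty the
// goroutine is blocked in select, so there are no periodic wake-ups.
func (q *ChanQueue) Run(stop <-chan struct{}) {
	for {
		select {
		case <-stop:
			return
		case t := <-q.tasks:
			// Errors are ignored here; real code would likely log or retry.
			_ = t()
		}
	}
}

// processAll is a small demo helper: it pushes n tasks, waits for all of
// them to finish, stops the worker, and returns how many tasks ran.
func processAll(n int) int {
	q := NewChanQueue(n)
	stop := make(chan struct{})
	go q.Run(stop)

	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		done int
	)
	wg.Add(n)
	for i := 0; i < n; i++ {
		q.Push(func() error {
			defer wg.Done()
			mu.Lock()
			done++
			mu.Unlock()
			return nil
		})
	}
	wg.Wait()
	close(stop)
	return done
}
```

Calling `processAll(5)` from a `main` function pushes five tasks through the queue and returns once all of them have run. Compared with a slice plus a sleep loop, the trade-off is a fixed buffer size: `Push` blocks when the buffer is full.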

@andraxylia

Contributor

commented Aug 21, 2018

@ibuildthecloud this is a great find, ship it!

@hzxuzhonghu

Member Author

commented Aug 22, 2018

Maybe we should use sync.Cond to notify the worker instead of checking `if len(q.queue) == 0 {` every time.

I can test this today.
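
As a rough illustration of the `sync.Cond` idea, the sketch below keeps the slice-backed queue but makes the worker wait on a condition variable instead of polling. It is a sketch under assumptions, not the code from any Istio PR: `CondQueue`, `NewCondQueue`, `Close` and `runCondDemo` are hypothetical names, and rate limiting and retries are omitted.

```go
package main

import "sync"

// CondQueue is a slice-backed task queue whose worker sleeps on a
// sync.Cond while the queue is empty. All names here are hypothetical.
type CondQueue struct {
	mu      sync.Mutex
	cond    *sync.Cond
	queue   []func() error
	closing bool
}

// NewCondQueue returns an empty queue with its condition variable bound
// to the queue's mutex.
func NewCondQueue() *CondQueue {
	q := &CondQueue{}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// Push appends a task and wakes the worker. Tasks pushed after Close
// are dropped.
func (q *CondQueue) Push(task func() error) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.closing {
		return
	}
	q.queue = append(q.queue, task)
	q.cond.Signal()
}

// Close asks the worker to exit once the remaining tasks are drained.
func (q *CondQueue) Close() {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.closing = true
	q.cond.Broadcast()
}

// Run executes tasks in FIFO order and returns after Close has been
// called and the queue is empty.
func (q *CondQueue) Run() {
	for {
		q.mu.Lock()
		// Wait releases the mutex while sleeping and re-acquires it on
		// wake-up; the loop guards against waking with nothing to do.
		for len(q.queue) == 0 && !q.closing {
			q.cond.Wait()
		}
		if len(q.queue) == 0 {
			q.mu.Unlock()
			return
		}
		task := q.queue[0]
		q.queue = q.queue[1:]
		q.mu.Unlock()

		// Run the task outside the lock so Push is never blocked by it.
		_ = task()
	}
}

// runCondDemo pushes n tasks, closes the queue, waits for the worker to
// finish, and returns how many tasks ran.
func runCondDemo(n int) int {
	q := NewCondQueue()
	finished := make(chan struct{})
	go func() {
		q.Run()
		close(finished)
	}()

	count := 0 // only touched by the worker until finished is closed
	for i := 0; i < n; i++ {
		q.Push(func() error {
			count++
			return nil
		})
	}
	q.Close()
	<-finished
	return count
}
```

Calling `runCondDemo(3)` from a `main` function runs three tasks and returns after the worker exits. Unlike a buffered channel, the slice can grow without bound, so `Push` never blocks.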

@costinm

Contributor

commented Aug 22, 2018

@hzxuzhonghu

Member Author

commented Aug 22, 2018

I opened PR #8123 to fix this.

It reduces pilot CPU usage from about 20% to below 1%.

@hzxuzhonghu

Member Author

commented Aug 22, 2018

> We planned to get rid of it as
> we refactor the service registry to be event-based (there will be no queue),
> and for config, Galley would take over the k8s integration, so this is lower priority.

@costinm I am interested in this.

@jaygorrell

Contributor

commented Oct 17, 2018

@hzxuzhonghu is this in a released version by chance?

@andraxylia

Contributor

commented Oct 17, 2018

I see #8123 is only in master; it will be in 1.1, in the November time frame.

@jaygorrell

Contributor

commented Oct 17, 2018

Ahh thanks - I see now, it's in 1.1.0-snapshot.1, which uses docker.io/istio/pilot:1.1.0.snapshot.1. That's what I'm on, but I'm still getting pretty crazy CPU usage: around 2000m on the discovery container.

@dreadbird

Member

commented Nov 1, 2018

Yes!
I have upgraded to 1.0.2 and the CPU usage is still high.


@mnuttall


commented Jan 25, 2019

I'm seeing the same thing:

go get github.com/dpetzold/kube-resource-explorer/cmd/kube-resource-explorer
kube-resource-explorer -namespace istio-system
Namespace     Name                                                              CpuReq       CpuReq%  CpuLimit  CpuLimit%  MemReq          MemReq%  MemLimit     MemLimit%
---------     ----                                                              ------       -------  --------  ---------  ------          -------  --------     ---------
istio-system  istio-pilot-86bb4fcbbd-g96w8/discovery                            500m         8%       0m        0%         2048Mi          17%      0Mi          0%
istio-system  istio-pilot-86bb4fcbbd-qzxb7/discovery                            500m         8%       0m        0%         2048Mi          17%      0Mi          0%
istio-system  istio-pilot-86bb4fcbbd-jf87n/discovery                            500m         8%       0m        0%         2048Mi          17%      0Mi          0%

Docker Desktop for Mac is sitting at a continuous 30% CPU consumption while idle, with most of that consumed by istio-pilot. This is a problem since Istio is required for Knative development: I can't turn it off.

@Justin2997


commented Jan 25, 2019

@mnuttall Are you having the same problem, the one that was not fixed in the previous version?

@pc-rshetty


commented Feb 5, 2019

I am using 1.0.5 and see the same issue

kubectl describe nodes
Namespace     Name                                  CPU Requests  CPU Limits  Memory Requests  Memory Limits
---------     ----                                  ------------  ----------  ---------------  -------------
istio-system  istio-egressgateway-8666f9bdcc-ftf2h  10m (0%)      0 (0%)      0 (0%)           0 (0%)
istio-system  istio-pilot-86b6679ddf-k9nxc          510m (25%)    0 (0%)      2Gi (26%)        0 (0%)

@pc-rshetty


commented Feb 5, 2019

Sorry, I just read that 1.1 should have the fix. Cool!
