Fix issue with 100% CPU usage in logs.go. #704
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here (e.g. I signed it!). What to do if you already signed the CLA: see the instructions for individual and corporate signers.
I signed it!
Is that failing integration test correct? It looks like some kind of CI environment config issue: https://ci.appveyor.com/project/r2d4/skaffold/build/1.0.6
@d11wtq Can you try to rebase? That should fix the AppVeyor bug. You should also sign the CLA with the email you've used for the commits.
Thanks @dgageot. I'm pretty sure I logged into Google with my @w3style.co.uk address used in the commits. Will try again.
Force-pushed from b4b6c6f to f73f27b
CLAs look good, thanks!
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) has done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state.
@d11wtq I had to update the branch. Can you just confirm that you are ok with that?
I am cool with that 👍 |
Resolves: #531
See also: kubernetes/client-go#12
There is an issue in which the Pods watcher gets into an infinite tight loop and begins consuming 100% of the CPU. This happens after `skaffold dev` has been running for a while (~30 mins) and once it starts, it doesn't stop.
The issue was narrowed down by @ajbouh to the event polling loop in `logs.go`, which was not checking whether the `ResultChan()` is closed. Kubernetes actually closes the connection after a timeout (the default is in the range of 30-60 mins according to the related issue linked above). In this case, the intended solution is to start the watcher again.
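For illustration, the failure mode looks roughly like the sketch below. This is not skaffold's actual code: `client`, `process`, and `ctx` are placeholder names, and the `Watch(metav1.ListOptions{})` call mirrors the `client.Pods("").Watch()` named above. The key point is that a receive on a closed channel returns immediately with a zero value, so a loop that never checks the `ok` flag stops blocking once the server closes the watch.

```go
// Hypothetical sketch of the buggy polling loop described above.
watcher, err := client.Pods("").Watch(metav1.ListOptions{})
if err != nil {
    return err
}
for {
    select {
    case evt := <-watcher.ResultChan():
        // `ok` is never checked: after the API server times out and
        // closes the watch, this case fires instantly with a zero-value
        // event on every iteration, pinning a CPU core.
        process(evt)
    case <-ctx.Done():
        return nil
    }
}
```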
This refactors the polling into two nested loops: one to start (and restart) the Pods watcher itself, and another to receive and process the events from the watcher. If the `ResultChan()` is closed, the entire watcher loop is restarted and log tailing continues.
There is a subtle difference in error handling as a result of this change. Previously, any error returned from `client.Pods("").Watch()` would be immediately returned from the `Start()` func in `logs.go`. This is no longer possible since the watcher is initialized in the goroutine started by that func. As such, in the case the watcher cannot be initialized, we simply log the error and stop tailing logs. Open to suggestions as to a better way to handle this error; retrying in a tight loop seems potentially problematic in the error scenario.
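The shape described in the last two paragraphs is roughly the following (same placeholder names as the sketch above; `logrus` stands in for the project's logger, and this shows the pattern rather than the PR's exact diff):

```go
go func() {
    for {
        // Outer loop: start (and, after a server-side close, restart)
        // the Pods watcher.
        watcher, err := client.Pods("").Watch(metav1.ListOptions{})
        if err != nil {
            // We're inside the goroutine now, so this error can no
            // longer be returned from Start(): log it and stop tailing.
            logrus.Errorf("initializing pod watcher: %s", err)
            return
        }
    events:
        // Inner loop: receive and process events from the watcher.
        for {
            select {
            case evt, ok := <-watcher.ResultChan():
                if !ok {
                    // ResultChan() was closed (watch timeout): break out
                    // to restart the watcher so log tailing continues.
                    break events
                }
                process(evt)
            case <-ctx.Done():
                watcher.Stop()
                return
            }
        }
    }
}()
```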