[SURE-6990] fleet-controller consuming high cpu #1842
Comments
We are observing this same problem. It seems to occur when there are any Bundles out of sync. Once all the Bundles are back in sync, the CPU utilization calms back down. But if anything goes out of sync, fleet-controller will start using all the CPU it can get its hands on.
When any Bundle is out of sync: (screenshot not preserved)
When all Bundles are back in sync: (screenshot not preserved)
I was able to capture a core file from the fleet-controller process using Delve. If somebody can tell me what to do with such a file (or if there are other Delve commands I can run to extract the useful information for debugging this) please let me know. I cannot attach the core file itself.
If anybody else sees this and wants to do the same, here's how I attached to it: First get the container ID of fleet-controller (
Then attach to it with a container that has the
Once inside, the fleet-controller is running as PID 1 and you can attach to it. If you cannot attach, you may need to drop out of the container and temporarily set this on the host to get the attach to work:
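The exact commands from this comment were not preserved, so the following is only a rough sketch of the steps described above, not the original poster's commands. It assumes the node runs Docker, that `my-debug-image` is a hypothetical image that ships the `dlv` binary, and that the host setting in question is the Yama `ptrace_scope` sysctl. All function names are made up for illustration. By default the script only prints each command; set `RUN=` (empty) to actually execute them.

```shell
#!/bin/sh
# Sketch only: the original commands were not preserved in this thread.
# Assumptions (not from the thread): a Docker runtime on the node, and a
# hypothetical debug image that contains the dlv binary.

RUN="${RUN-echo}"                              # dry run by default; RUN= executes
DEBUG_IMAGE="${DEBUG_IMAGE:-my-debug-image}"   # hypothetical image name

# 1. List container IDs whose name contains "fleet-controller".
#    On Kubernetes nodes this may also match the pod's pause container.
find_container() {
    $RUN docker ps --filter "name=fleet-controller" --format "{{.ID}}"
}

# 2. Start a debug container that shares the target's PID namespace.
#    SYS_PTRACE is needed for a debugger to attach to another process.
debug_shell() {
    $RUN docker run --rm -it --pid="container:$1" --cap-add=SYS_PTRACE "$DEBUG_IMAGE" sh
}

# 3. Inside the debug container, fleet-controller is PID 1.
attach() {
    $RUN dlv attach 1
}

# 4. If the attach is refused, temporarily relax ptrace restrictions on the
#    host (assumed to be the setting the comment refers to).
allow_ptrace() {
    $RUN sysctl -w kernel.yama.ptrace_scope=0
}
```

For example, `debug_shell "$(RUN= find_container | head -n 1)"` would print the `docker run` command for the first matching container. Remember to restore `ptrace_scope` to its previous value once debugging is finished.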
/backport v2.7.x-next
SURE-6990
Issue description:
fleet-controller is consuming more than 5 GHz of CPU. The customer is using Azure DevOps with 300+ Git repos.
Business impact:
Fleet becomes unstable, and in some cases so does the Rancher management node.
Troubleshooting steps:
Scaled the nodes 4x.
Repro steps:
No repro steps yet.
Workaround:
Is a workaround available and implemented? Yes
What is the workaround: scaling the nodes vertically
Actual behavior:
fleet-controller consumes more than 5 GHz of CPU.