hotplug causing cloud-init to spike CPU usage #3912

Closed
ubuntu-server-builder opened this issue May 12, 2023 · 10 comments
Labels
launchpad Migrated from Launchpad priority Fix soon

Comments

@ubuntu-server-builder
Collaborator

This bug was originally filed in Launchpad as LP: #1946003

Launchpad details
affected_projects = ['cloud-init (Ubuntu)']
assignee = None
assignee_name = None
date_closed = 2021-11-02T19:54:30.086074+00:00
date_created = 2021-10-04T14:14:52.666764+00:00
date_fix_committed = 2021-10-27T14:44:26.549722+00:00
date_fix_released = 2021-11-02T19:54:30.086074+00:00
id = 1946003
importance = critical
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1946003
milestone = None
owner = falcojr
owner_name = James Falcon
private = False
status = fix_released
submitter = falcojr
submitter_name = James Falcon
tags = []
duplicates = [1948459]

Launchpad user James Falcon(falcojr) wrote on 2021-10-04T14:14:52.666764+00:00

In 21.3, we added udev rules to enable the cloud-init hotplug functionality. When a new device is detected, udev calls into cloud-init to check whether hotplug is supported and enabled, then proceeds based on the result. Some cloud users create and dispose of Docker containers at a very high rate, which creates and removes many virtual ethernet adapters. Each of these triggers a cloud-init event, and at high volume this consumes significant CPU. Even with hotplug disabled, the act of checking whether hotplug is enabled is enough to cause the CPU spikes.

The path taken is:
https://github.com/canonical/cloud-init/blob/main/udev/10-cloud-init-hook-hotplug.rules
to
https://github.com/canonical/cloud-init/blob/main/tools/hook-hotplug
to
https://github.com/canonical/cloud-init/blob/main/cloudinit/cmd/devel/hotplug_hook.py#L158

For more context, see the IRC conversations from 2021-10-01 and 2021-10-04:
https://irclogs.ubuntu.com/2021/10/01/%23cloud-init.html
https://irclogs.ubuntu.com/2021/10/04/%23cloud-init.html

@ubuntu-server-builder ubuntu-server-builder added launchpad Migrated from Launchpad priority Fix soon labels May 12, 2023
@ubuntu-server-builder
Collaborator Author

Launchpad user James Falcon(falcojr) wrote on 2021-10-04T14:19:34.996679+00:00

As far as a fix goes, I'm leaning towards not including the udev rule during install, then installing it during our normal boot process if we detect that hotplug has been enabled.

Another possible solution is to modify the script called by the udev event to only trigger if we detect a PCI device, but IIRC that won't work on all clouds as some clouds expose their new devices as virtual devices.
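
To make the PCI-versus-virtual distinction concrete, here is a minimal sketch of such a filter. It assumes the hook receives the kernel `DEVPATH` of the device and relies on virtual devices (veth pairs, for example) appearing under `/devices/virtual/` in sysfs. The function names and the `allow_virtual` switch are hypothetical and not part of cloud-init.

```python
from pathlib import PurePosixPath


def is_virtual_device(devpath: str) -> bool:
    """Return True if a sysfs DEVPATH sits under /devices/virtual/."""
    parts = PurePosixPath(devpath).parts
    # DEVPATH values look like "/devices/virtual/net/veth1a2b3c" or
    # "/devices/pci0000:00/0000:00:05.0/net/ens5".
    return parts[:3] == ("/", "devices", "virtual")


def should_consider_hotplug(devpath: str, allow_virtual: bool = False) -> bool:
    """Hypothetical filter: skip virtual devices unless the cloud needs them."""
    return allow_virtual or not is_virtual_device(devpath)
```

The `allow_virtual` flag reflects the caveat above: on clouds that expose new NICs as virtual devices, a PCI-only filter would drop real hotplug events, so the filter could not be unconditional.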

@ubuntu-server-builder
Collaborator Author

Launchpad user Ryan Harper(raharper) wrote on 2021-10-04T15:27:07.823519+00:00

> As far as a fix goes, I'm leaning towards not including the udev rule during
> install, then installing it during our normal boot process if we detect that
> hotplug has been enabled.

I think this makes a lot of sense. IIRC, we don't attempt to handle hotplug
events on firstboot, so it's reasonable to write the new udev rule if enabled
and reload rules (udevadm control --reload)
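
A rough sketch of that boot-time step, assuming the rule text and its destination path are supplied by the caller. The function name and arguments are hypothetical; only the `udevadm control --reload` command comes from the comment above, and the sketch returns it instead of executing it.

```python
from pathlib import Path
from typing import List, Optional

# The reload command is the one mentioned in the comment above; the function
# name and its arguments are hypothetical.
RELOAD_CMD = ["udevadm", "control", "--reload"]


def sync_hotplug_rule(
    enabled: bool, rules_text: str, rules_file: Path
) -> Optional[List[str]]:
    """Write or remove the rule file; return the reload command if it changed."""
    if enabled:
        if rules_file.exists() and rules_file.read_text() == rules_text:
            return None  # already installed, nothing to reload
        rules_file.parent.mkdir(parents=True, exist_ok=True)
        rules_file.write_text(rules_text)
        return RELOAD_CMD
    if rules_file.exists():
        rules_file.unlink()  # hotplug was turned off: drop the rule
        return RELOAD_CMD
    return None
```

With hotplug disabled, no rule is ever present, so udev never invokes the hook and device churn costs nothing on the cloud-init side.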

Another optimization for the rule would be to have it not invoke cloud-init
directly to determine if hotplug is enabled (python3 is a heavy exec).

When cloud-init checks the hotplug config, it can serialize the current status
of hotplug into /run/cloud-init. Like cloud-init.disabled, we could also have a
marker file that the hook checks in the shell script, avoiding any exec of
python at all.
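
A minimal sketch of the marker-file idea, written in Python for illustration. The marker name `hotplug.enabled` and both function names are assumptions, not something this thread specifies, and in the real hook the per-event check would be a plain shell file test so that no interpreter starts.

```python
from pathlib import Path

# Hypothetical marker name; the comment above only says "a marker file".
MARKER_NAME = "hotplug.enabled"


def record_hotplug_status(run_dir: Path, enabled: bool) -> None:
    """Boot-time step: create the marker if hotplug is enabled, else remove it."""
    marker = run_dir / MARKER_NAME
    if enabled:
        run_dir.mkdir(parents=True, exist_ok=True)
        marker.touch()
    else:
        marker.unlink(missing_ok=True)  # missing_ok needs Python 3.8+


def hook_should_run(run_dir: Path) -> bool:
    """Per-event check: a single existence test, no config parsing."""
    return (run_dir / MARKER_NAME).exists()
```

The expensive work (reading config, deciding whether hotplug is enabled) happens once per boot, and each udev event pays only for a file-existence check.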

@ubuntu-server-builder
Collaborator Author

Launchpad user Geoffrey Goodman(ggoodman) wrote on 2021-10-13T14:42:00.864998+00:00

As one of the users affected by this performance regression, I like the proposed solutions.

Certainly avoiding the udev rule entirely when cloud-init is otherwise configured to disregard hotplug events seems like the best long-term solution. However, I can also appreciate a less invasive short-term fix that might be scoped to avoiding the heavy python exec via shell scripting.

@ubuntu-server-builder
Collaborator Author

Launchpad user Launchpad Janitor(janitor) wrote on 2021-10-22T14:42:05.397039+00:00

Status changed to 'Confirmed' because the bug affects multiple users.

@ubuntu-server-builder
Collaborator Author

Launchpad user Chad Smith(chad.smith) wrote on 2021-10-22T16:07:42.691657+00:00

Upstream PR in flight on this: #1069

@ubuntu-server-builder
Collaborator Author

Launchpad user Chad Smith(chad.smith) wrote on 2021-10-27T14:48:54.087670+00:00

Upstream commit has landed addressing this issue:
1d01da5

Expect this to be available in cloud-init version 21.4.

@ubuntu-server-builder
Collaborator Author

Launchpad user James Falcon(falcojr) wrote on 2021-11-02T19:54:31.076358+00:00

This bug is believed to be fixed in cloud-init version 21.4. If this is still a problem for you, please add a comment and set the state back to New.

Thank you.

@ubuntu-server-builder
Collaborator Author

Launchpad user Launchpad Janitor(janitor) wrote on 2021-11-03T07:06:45.015913+00:00

This bug was fixed in the package cloud-init - 21.4-0ubuntu1~22.04.1


cloud-init (21.4-0ubuntu1~22.04.1) jammy; urgency=medium

 -- James Falcon <james.falcon@canonical.com>  Tue, 02 Nov 2021 18:07:49 -0500

@ubuntu-server-builder
Collaborator Author

Launchpad user Paolo Pettinato(p.pettinato) wrote on 2021-11-18T16:45:48.377693+00:00

Thank you @falcojr et al.
Any chance this fix/version will be backported to the "updates" stream of older LTS releases?

@ubuntu-server-builder
Collaborator Author

Launchpad user James Falcon(falcojr) wrote on 2021-11-18T17:01:09.975956+00:00

Yes, the fix will be backported to -updates in Bionic, Focal, Hirsute, and Impish. That could happen as soon as today or early next week. The tracking bug for that is https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1949521
